This repository has been archived by the owner on Apr 18, 2021. It is now read-only.

rla3rd/pgbquery


pgbquery

A query and aggregation framework for Bcolz and Postgresql

Bcolz is a lightweight package that provides columnar, chunked data containers that can be stored either in memory or on disk. The containers are compressed by default, not only to reduce memory and disk usage but also to improve I/O speed. Bcolz excels at storing and sequentially accessing large numerical data sets.

Pgbquery is largely based on bquery, a framework built on Bcolz that can be found on GitHub at https://github.com/visualfabriq/bquery, with changes made to support the Multicorn foreign data wrapper.

The pgbquery framework provides methods to perform query and aggregation operations on bcolz containers, a Multicorn wrapper that exposes those containers to Postgresql, and a way to accelerate these operations by pre-processing possible groupby columns. Currently, the real-life performance of sum aggregations using on-disk bcolz queries is normally between 1.5 and 3.0 times slower than similar in-memory Pandas aggregations.
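To make the idea of pre-processing groupby columns concrete, here is a minimal sketch in plain Python. It assumes the pre-processing amounts to factorizing a column once (mapping each distinct value to a small integer code) and reusing those codes for every later aggregation. The names `factorize` and `grouped_sum` are hypothetical and are not part of the pgbquery, bquery, or bcolz APIs; the sample data is made up for illustration.

```python
def factorize(column):
    """Pre-processing step (assumed): map each distinct value to an integer code."""
    codes = []    # one integer code per row
    labels = []   # labels[code] is the original value for that code
    index = {}    # value -> code lookup used while scanning
    for value in column:
        code = index.get(value)
        if code is None:
            code = len(labels)
            index[value] = code
            labels.append(value)
        codes.append(code)
    return codes, labels


def grouped_sum(codes, labels, measure):
    """Sum `measure` per group using precomputed codes instead of raw values."""
    totals = [0] * len(labels)
    for code, x in zip(codes, measure):
        totals[code] += x
    return dict(zip(labels, totals))


# Hypothetical sample columns.
region = ["eu", "us", "eu", "asia", "us"]
sales = [10, 20, 5, 7, 1]
units = [1, 2, 3, 4, 5]

# Factorize the groupby column once, then reuse the codes for each aggregation.
codes, labels = factorize(region)
sales_by_region = grouped_sum(codes, labels, sales)
units_by_region = grouped_sum(codes, labels, units)
```

The point of the design is that the relatively expensive scan over the groupby values happens once; each subsequent sum only indexes into a small totals array by integer code.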

About

Multicorn wrapper for bcolz ctables
