nested-pandas

An extension of pandas for efficient representation of nested associated datasets.

Nested-Pandas extends pandas with tooling and support for nested dataframes packed into the values of top-level dataframe columns. PyArrow is used internally to improve scalability and performance.

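Nested-Pandas allows data held in two flat tables, such as an object table and a "source_df" table of matched measurements, to instead be represented as a single nested frame ("object_nf"), where the nested data is represented as nested dataframes: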

   # Each row of "object_nf" now has its own sub-dataframe of matched rows from "source_df"
   object_nf.loc[0]["nested_sources"]

This allows powerful and straightforward operations, such as:

   # Compute the mean flux for each row of "object_nf"
   import numpy as np
   object_nf.reduce(np.mean, "nested_sources.flux")
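For comparison, here is a rough sketch of the flat two-table layout that the nested representation replaces, written in plain pandas only (it does not use the nested-pandas API). The table contents, the "ra", "dec", and "time" columns, and the "object_id" index name are made up for illustration, and matching sources to objects through a shared index is an assumption. It shows a stand-in for the per-object sub-dataframe lookup, followed by the groupby and join needed to get a per-object mean flux from flat tables.

```python
import pandas as pd

# Hypothetical object table: one row per astronomical object.
object_df = pd.DataFrame(
    {"ra": [10.0, 20.0, 30.0], "dec": [-5.0, 0.0, 5.0]},
    index=pd.Index([0, 1, 2], name="object_id"),
)

# Hypothetical source table: several measurements per object,
# tied to the object table through a shared (non-unique) index.
source_df = pd.DataFrame(
    {
        "time": [1.0, 2.0, 3.0, 1.0, 2.0, 1.0],
        "flux": [1.0, 2.0, 3.0, 10.0, 20.0, 5.0],
    },
    index=pd.Index([0, 0, 0, 1, 1, 2], name="object_id"),
)

# Plain-pandas stand-in for a nested lookup:
# all source rows that belong to object 0.
sources_of_object_0 = source_df.loc[[0]]

# Flat-table route to a per-object mean flux:
# group the sources by object, reduce, then join back onto the objects.
mean_flux = source_df.groupby(level="object_id")["flux"].mean()
result = object_df.join(mean_flux.rename("mean_flux"))
```

With the nested representation, the matched rows already sit alongside each object row, so the explicit groupby and join steps in this sketch are the part that the "reduce" call above is meant to replace.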

Nested-Pandas is motivated by time-domain astronomy use cases, where there are typically two levels of information: data about astronomical objects, and an associated set of N measurements for each of those objects. Nested-Pandas offers a performant and memory-efficient package for working with these types of datasets.

Its core advantages are:

hierarchical column access
efficient packing of nested information into inputs to custom user functions
avoiding costly groupby operations

This is a LINCC Frameworks project - find more information about LINCC Frameworks here.

Acknowledgements

This project is supported by Schmidt Sciences.
