Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implementation of group_by+mutate #242

Open
wants to merge 7 commits into
base: main
Choose a base branch
from
Open

Conversation

dafxy
Copy link

@dafxy dafxy commented Aug 28, 2024

The implementation makes it consistent with the tidyverse sintax.

@dafxy
Copy link
Author

dafxy commented Aug 28, 2024

This solves issue #201. Other functions to be applied to grouped data can be easily implemented in the same fashion.

assert isinstance(by, str) or isinstance(by, list), "Use list or string to group by."
super().__init__(df, by, *args, **kwargs)
self.df = df
self.by = by if isinstance(by, list) else list(by)
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Needs a little tweek here:
self.by = by if isinstance(by, list) else [by]

@markfairbanks
Copy link
Owner

Apologies for the long response time on this - I will take a look either this weekend or early next week. I think with the polars v1.0.0 release (now on v1.6.0) it's time to resurrect tidypolars and get it back to working form.

@markfairbanks
Copy link
Owner

Can you break this into multiple pull requests?

@markfairbanks
Copy link
Owner

markfairbanks commented Sep 11, 2024

I'm starting to review but there are multiple parts - it would be nice to address each addition separately. For each part we need to discuss code changes and tests (none of these functions have tests as of yet).

Can you pick one of these to start with at least? If you do group_by() feel free to include both mutate() and filter()
These are the additions:

  • group_by()
    • mutate()
    • filter()
  • nest()/unnest()
  • crossing()
  • distinct(keep_all = )
  • as_factor()
  • replace() (I don't think this belongs in tidypolars but we can discuss)

@dafxy
Copy link
Author

dafxy commented Sep 11, 2024

Feel free to cherry-pick each commit in the repo.

@markfairbanks
Copy link
Owner

markfairbanks commented Sep 11, 2024

We can try and do it all in one but this is an overly large PR at the moment. Moving forward each PR must have a (generally) unique purpose.

Some of these pieces could be merged tomorrow with very little effort whereas others are going to take a while to update - which means none of this is getting merged for a while (I could be wrong maybe it will go faster than I think).

Copy link
Owner

@markfairbanks markfairbanks left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here's my first pass at a review.

All functions also need tests and to be added to the changelog

tidypolars/funs.py Outdated Show resolved Hide resolved
tidypolars/funs.py Outdated Show resolved Hide resolved
tidypolars/funs.py Outdated Show resolved Hide resolved
tidypolars/tibble.py Outdated Show resolved Hide resolved
tidypolars/tibble.py Outdated Show resolved Hide resolved
tidypolars/tibble.py Outdated Show resolved Hide resolved
tidypolars/tibble.py Outdated Show resolved Hide resolved
tidypolars/tibble.py Show resolved Hide resolved
tidypolars/tibble.py Outdated Show resolved Hide resolved
tidypolars/tibble.py Show resolved Hide resolved
@dafxy
Copy link
Author

dafxy commented Sep 12, 2024

How do you think we should go about the changes above? They are pretty straightforward.

@markfairbanks
Copy link
Owner

Just make the commits on your repo. Once all the changes are done with tests and changelog we're good to merge.

@markfairbanks
Copy link
Owner

If you need ideas for tests my tidytable repository is good place to grab them
https://github.com/markfairbanks/tidytable/tree/main/tests/testthat

@markfairbanks
Copy link
Owner

If you haven't used pytest let me know and I can walk you through it

@dafxy
Copy link
Author

dafxy commented Sep 12, 2024

All done, except for the test. If you or someone could implement the tests, that would be great.

@markfairbanks markfairbanks self-requested a review September 12, 2024 03:13
@markfairbanks
Copy link
Owner

I can help a little with tests, but you’ll need to build some here (and in every pull request). In general for every PR you need the function, a test or a two, and to update the changelog.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants