New example glm ordinal features #717

jonsedar · 2024-10-27T11:14:09Z

New example notebook: GLM-ordinal-features

The current example notebook for ordinal regression, GLM-ordinal-regression.ipynb, shows how to handle ordinals in the target (endogenous) feature, but not in the predictor (exogenous) feature(s).

I wanted to treat an ordinal feature in a model myself and dug up an old example by Austin Rochford, itself based on a 2018 paper by Bürkner & Charpentier. The example was good, but I thought I'd take the opportunity to build out a full workflow, add more explanations and include it in pymc-examples.

I've a few other related ones in the pipeline, but starting simple.

Notebook follows style guide https://docs.pymc.io/en/latest/contributing/jupyter_style.html
PR description contains a link to the relevant issue: New example for handling ordinal predictor features #716
Check the notebook is not excluded from any pre-commit check: https://github.com/pymc-devs/pymc-examples/blob/main/.pre-commit-config.yaml

📚 Documentation preview 📚: https://pymc-examples--717.org.readthedocs.build/en/717/

review-notebook-app · 2024-10-27T11:14:14Z

Check out this pull request on ReviewNB to see visual diffs and provide feedback on Jupyter Notebooks.

jonsedar · 2024-10-27T11:17:56Z

@ maintainer team

Hello! It's been years since I submitted an examples notebook. This is the first and simplest of a handful of new notebooks that I'd like to add, and am starting simple and working upwards.

I've tried to add all the various formatting bits and pieces that seem to make the current standard, so I hope any changes required will be minor, but I'll happily edit as needed!

I've tested this running locally and in Google Colab.

jonsedar · 2024-10-27T11:32:25Z

WIP need to create a simple env with jupyter-black in pre-commit so we have the same validation. My env uses more and different validation (!), will return to this in a couple of hours after I get back from something

jonsedar · 2024-10-27T17:57:01Z

Alrighty - ready for review!

review-notebook-app · 2024-10-28T03:00:29Z

View / edit / reply to this conversation on ReviewNB

fonnesbeck commented on 2024-10-28T03:00:29Z
----------------------------------------------------------------

Probably better as paragraphs rather than bullet points (except where there are actual lists). For consistency with other notebooks, as well as readability.

The key issue with ordinal variables is that the distances between categories are unknown and potentially unequal (e.g. the difference between better and good vs. that between good and very good).
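To make the unequal-spacing point concrete, here is a minimal NumPy sketch of a monotonic effect in the spirit of the Bürkner & Charpentier approach mentioned in the PR description: the step between adjacent levels is a share of a simplex rather than a fixed unit. The function name `monotonic_effect`, the values of `beta` and `zeta`, and the exact scaling are illustrative assumptions, not the notebook's code.

```python
import numpy as np


def monotonic_effect(levels, beta, zeta):
    """Effect of an ordinal predictor whose steps between levels may be unequal.

    levels: integer codes 0..K for each observation
    beta:   total effect of moving from the lowest to the highest level
    zeta:   simplex of length K; zeta[i] is the share of beta gained
            when moving from level i to level i + 1 (hypothetical values)
    """
    zeta = np.asarray(zeta, dtype=float)
    if (zeta < 0).any() or not np.isclose(zeta.sum(), 1.0):
        raise ValueError("zeta must be non-negative and sum to 1")
    # Cumulative share reached at each level; level 0 contributes nothing
    cumulative = np.concatenate([[0.0], np.cumsum(zeta)])
    return beta * cumulative[np.asarray(levels)]


levels = np.array([0, 1, 2, 3, 4])
zeta = [0.1, 0.2, 0.3, 0.4]  # made-up, deliberately unequal steps
effects = monotonic_effect(levels, beta=2.0, zeta=zeta)
steps = np.diff(effects)  # the gap between adjacent levels is not constant
```

Treating the same codes as a plain numeric feature would instead force every step to be equal, which is exactly the assumption an ordinal feature does not justify.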

jonsedar commented on 2024-10-28T05:30:51Z
----------------------------------------------------------------

Thanks - I've updated to be more clear :)

review-notebook-app · 2024-10-28T03:00:30Z

fonnesbeck commented on 2024-10-28T03:00:30Z
----------------------------------------------------------------

Assume the target_accept comment is not necessary since the model runs fine with 0.8?

jonsedar commented on 2024-10-28T04:57:44Z
----------------------------------------------------------------

Yep fair enough - I was drawing attention to it for when I change it in Model B, but we can just have the implicit default

review-notebook-app · 2024-10-28T03:00:31Z

fonnesbeck commented on 2024-10-28T03:00:30Z
----------------------------------------------------------------

🔥

jonsedar commented on 2024-10-28T04:58:21Z
----------------------------------------------------------------

I couldn't resist ;)

review-notebook-app · 2024-10-28T03:00:32Z

fonnesbeck commented on 2024-10-28T03:00:31Z
----------------------------------------------------------------

A few typos here.

jonsedar commented on 2024-10-28T05:00:22Z
----------------------------------------------------------------

My tpying skills aren't what they were!

review-notebook-app · 2024-10-28T03:00:32Z

fonnesbeck commented on 2024-10-28T03:00:32Z
----------------------------------------------------------------

Cells above can be removed, no?

jonsedar commented on 2024-10-28T07:57:47Z
----------------------------------------------------------------

Stylistic thing... I quite like to preserve the workflow where possible, to show explicitly that data prep steps have been considered but aren't needed. I've added a note above in section 0: (NOTE some boilerplate steps are included but ~~struck through~~ with an explanatory comment, e.g. "Not needed in this simple example". This is to preserve the logical workflow, which is more generally useful)

Happy to remove completely if you prefer

review-notebook-app · 2024-10-28T03:00:33Z

fonnesbeck commented on 2024-10-28T03:00:33Z
----------------------------------------------------------------

A bit more exposition on the encoding of categoricals may be helpful for those unfamiliar.

jonsedar commented on 2024-10-28T07:54:04Z
----------------------------------------------------------------

Sure, have added more description and discussion of the data

review-notebook-app · 2024-10-28T03:00:34Z

fonnesbeck commented on 2024-10-28T03:00:33Z
----------------------------------------------------------------

A bit more exposition about the target variable would be helpful, either here or in the introduction. I'm not really sure what I'm looking at.

jonsedar commented on 2024-10-28T07:53:55Z
----------------------------------------------------------------

Sure, have added more description and discussion of the data

jonsedar · 2024-10-28T05:01:52Z

Well caught!

jonsedar · 2024-10-28T06:23:20Z

Thanks Chris - much appreciated! Nice to have all the software admin tools surrounding the process too - makes life much better.

Your request for more detail on the data encouraged me to read the Bürkner paper properly and see that the ordinal scales are supposed to be [0-4] on both features (d450, d455), but on d450 we only observe values [0-3]. So there's a missing data problem too, which makes this example richer and reinforces the need for ordinal handling (not numeric).

That paper doesn't actually handle the missing value, so I'll take the opportunity to improve upon it in our notebook here.
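As a rough illustration of the unobserved-level issue, the sketch below uses a pandas ordered `Categorical` with the full scale declared up front, so that a level absent from the data still gets a code slot. The labels `c0`..`c4` and the sample values are made up for the example; the notebook's actual encoding may differ.

```python
import pandas as pd

# Full ordinal scale, declared explicitly (hypothetical labels)
levels = ["c0", "c1", "c2", "c3", "c4"]

# Made-up observations in which the top level "c4" never appears
observed = pd.Series(["c0", "c2", "c1", "c3", "c2"])

# Declaring the categories keeps the unobserved level in the encoding
cat = pd.Categorical(observed, categories=levels, ordered=True)
codes = cat.codes                 # integer code per observation
n_levels = len(cat.categories)    # 5, even though only 4 levels are observed

# By contrast, factorize only knows about levels present in the data
f_codes, f_uniques = pd.factorize(observed, sort=True)
n_factorized = len(f_uniques)     # 4
```

Sizing a coefficient vector by `n_levels` rather than by the number of observed levels is what leaves room for a parameter on the level that never shows up in the data.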

jonsedar · 2024-10-28T07:50:45Z

Sure, added a title with a little explanation to encourage folk to dig more

jonsedar · 2024-10-28T07:52:12Z

Indeed! I've had this boilerplate for a while and can't remember why I was using it over and above factorize :D

I've changed it accordingly and suffered no impact here so maybe I'll have to change my boilerplate too

jonsedar · 2024-10-28T07:53:38Z

Poor naming on my part, I've tried to clarify

jonsedar · 2024-10-28T08:23:59Z

Huh? Cell 11 does come after 4... why is that a problem?

Check cells were executed sequentially........................................................Failed
- hook id: check-execution-order
- exit code: 1

Cell 11 comes after 4 in file 'examples/generalized_linear_models/GLM-ordinal-features.ipynb'
Cell 15 comes after 11 in file 'examples/generalized_linear_models/GLM-ordinal-features.ipynb'
Cell 20 comes after 16 in file 'examples/generalized_linear_models/GLM-ordinal-features.ipynb'
Cell 30 comes after 21 in file 'examples/generalized_linear_models/GLM-ordinal-features.ipynb'
Cell 35 comes after 30 in file 'examples/generalized_linear_models/GLM-ordinal-features.ipynb'

jonsedar · 2024-10-28T08:35:25Z

Aha, the checks are defeated! Over to you @fonnesbeck :)

jonsedar · 2024-11-21T07:07:13Z

This is ready to go, how do we get this merged? @fonnesbeck ? Cheers :)

fonnesbeck

LGTM

fonnesbeck · 2024-12-14T19:55:05Z

Sorry this took so long to get back to!

jonsedar · 2024-12-15T04:03:45Z

No worries - thanks for the merge!

jonsedar and others added 6 commits October 27, 2024 12:37

+ added new notebook GLM-ordinal-features.ipynb

067f013

+ already complete and created in another env

Created using Colab

eb4959d

minor updates for latest seaborn

23b6444

Tweaks for collab and included authors

c161968

added header

8e8836f

+ added new reference article and online

1270103

+ cited inside Notebook + adjusted errata

jonsedar added 2 commits October 27, 2024 15:22

+ ran black again...

12a7097

+ rep Burkner as B{\"u}rkner

4719a5f

jonsedar marked this pull request as draft October 27, 2024 11:30

jonsedar and others added 9 commits October 27, 2024 15:34

+ maybe fixed readthedocs complaint pybtex.database.InvalidNameString…

4d29dc4

…: Too many commas in 'B{\\"u}rkner, P., & Charpentier, E.'

+ ran black-jupyter, let's see

ab5ae12

+ another run of black-jupyter

f7e6d57

+ installed local pymc_examples env

5e3a987

+ ran pre-commit in full + autocreated *.myst file

+ update tags

2368614

+ possibly persuaded the precommits to all pass

abc1188

+ rerun on colab to confirm all good post new pre-commit process

5b1c653

+ okay, reran in colab again... lets see if this passes

9bc703f

+ added (again) the myst.md

e1453db

jonsedar marked this pull request as ready for review October 27, 2024 17:56

jonsedar added 5 commits October 28, 2024 11:58

+ minor updates post review

070753d

+ new work to positively include a coeff in mdlb for d450 = c4

+ reran precommit and readded myst.md

89d1865

+ rerun localyl e2e

3a9cd1e

+ added myst.md again

de7eccb

+ reran again to ensure cell execution order even for markdown cells

52754f6

+ reran again again to ensure order

8f224da

+ minor update: forced addtional level c4 into d450 categorical feature

37bdf03

+ fixed a couple of typos

review-notebook-app bot mentioned this pull request Nov 12, 2024

New example notebook for auto-imputation aka handle missing values with a simple dataset and full workflow #722

Closed

3 tasks

+ changed rating to intermediate

796c3d4

fonnesbeck approved these changes Dec 14, 2024

fonnesbeck merged commit a3f03a0 into pymc-devs:main Dec 14, 2024
2 checks passed

jonsedar deleted the new-example-glm-ordinal-features branch December 16, 2024 06:19

jonsedar mentioned this pull request Dec 19, 2024

New example for handling ordinal predictor features #716

Closed
