Request for feedback #22

eigenfoo · 2019-06-27T06:56:37Z

Following up from lindeloev/tests-as-linear#16, the port is ready for you to take a look - any feedback you have would be fantastic! You can view the page here: https://eigenfoo.xyz/tests-as-linear/.

Some specific notes for you:

1. Your cheatsheets are licensed under CC-BY 4.0. Can I assume that the entire R post is similarly licensed? I've gone ahead and licensed this project the same way, citing your post as the source work.
2. I've learnt that the Python statistics ecosystem really doesn't compare to R! scipy and statsmodels don't support:
   a. Two-way ANOVA
   b. ANCOVA
   c. One-way ANOVAs on GLMs

   I've just left these code snippets un-ported, with appropriate warning boxes.
3. I haven't ported your simulations/appendices - hopefully somebody else will pick up "Port appendices/simulations to Python" (#14).
4. I'm not sure about how to implement Welch's t-test with a linear model: you don't go into details about how to do that in your original post. Do you know of anywhere I can read up on that?

lindeloev · 2019-06-27T14:24:26Z

Awesome!

1. Yes, all good re license!
2. That's also where I got stuck. Since ANOVA, ANCOVA, etc. are linear models, statsmodels does support them (ols combined with anova_lm) - it just doesn't have built-in dedicated functions to do it. Actually, R's car::Anova also just calls lm in the background and then does some post-processing like anova_lm, so it's not all that different. pyvttbl.DataFrame.anova may be one alternative, but I don't know how popular it is.

   In any case, I would consider re-phrasing what you write in the yellow boxes to something like this:

   "Note on Python port: Unfortunately, scipy.stats does not have a dedicated function to perform XXXX, so we cannot demonstrate directly that it is fundamentally a linear model."

   In general, I'm still really surprised that Python's t-tests etc. do not provide confidence intervals. Could be worth some MORE yellow boxes but you decide :-)

3. No problem. I made the simulations to study equivalences. R was just a means to that end. I don't expect people to look at it to learn how to do these simulations.
4. Ugh, I fail to find (Google) Python packages that model independent variances (not just correlated). Here is the reasoning behind the nlme::gls approach and an lme4::lmer equivalent in R: https://stats.stackexchange.com/questions/142685/equivalent-to-welchs-t-test-in-gls-framework

Other ideas for the Python notebook:

I deliberately simulated data which created mid-sized values for ease of comparison. p = 3.321709e-07 and p = 9.83425e-08 look very different but are not. Easier to compare p = 0.43 vs. p = 0.45.
I wonder whether it would be easier to leave out sections 8-10, and simply link to the R tutorial? These sections are not Python-specific AFAICS. (Edit: hmm, it does link to the appropriate sections in the Python-version. What do you think?)
I think we should make the cheat sheets differ somewhat. Maybe include a clear R-logo and a Python-Logo somewhere in the two cheat sheets? Right now, I struggle to find an aesthetic solution - mostly because the R logo is so ugly! :-)

Ideas for the cheat sheet:

Vertical text to the left.
Fixing icons on the right.
Change lm to a Python solution in the ANOVA + Kruskal-Wallis row(s).
Extreme petitesse: Maybe the titles could be "Built-in function in scipy.stats" and "Equivalent linear model in smf.ols"?
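To make the two proposed column titles concrete, here is a minimal sketch of one cheat-sheet row, the one-way ANOVA, computed both ways. The data are simulated and the column names `group` and `value` are made up for illustration; this is not taken from the notebook itself.

```python
import numpy as np
import pandas as pd
import scipy.stats
import statsmodels.formula.api as smf

# Simulated data: three groups with slightly different means (illustrative only).
rng = np.random.default_rng(1)
df = pd.DataFrame({
    "group": np.repeat(["a", "b", "c"], 20),
    "value": np.concatenate([
        rng.normal(0.0, 1.0, 20),
        rng.normal(0.3, 1.0, 20),
        rng.normal(0.6, 1.0, 20),
    ]),
})

# "Built-in function in scipy.stats"
samples = [g["value"].to_numpy() for _, g in df.groupby("group")]
builtin = scipy.stats.f_oneway(*samples)

# "Equivalent linear model in smf.ols": the overall F-test of the regression
model = smf.ols("value ~ C(group)", data=df).fit()

print(f"f_oneway: F = {builtin.statistic:.4f}, p = {builtin.pvalue:.4f}")
print(f"smf.ols:  F = {model.fvalue:.4f}, p = {model.f_pvalue:.4f}")
```

Both lines should print the same F and p, since the one-way ANOVA F-test is the overall F-test of a linear model with dummy-coded groups.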

eigenfoo · 2019-06-28T06:00:58Z

Thanks for the feedback @lindeloev!

pyvttbl.DataFrame.anova may be one alternative, but I don't know how popular it is.

I also came across that solution, but I'd prefer not to use it. The last release of pyvttbl was back in 2013 and it doesn't look like it's actively maintained.

I guess the best thing to do would just be to leave notes in the yellow boxes.

In any case, I would consider re-phrasing what you write in the yellow boxes to something like this:

Done!

In general, I'm still really surprised that Python's t-tests etc. do not provide confidence intervals. Could be worth some MORE yellow boxes but you decide :-)

Probably not! I've written a short note describing the lack of CIs (it's when I first actually use a scipy.stats function), but I don't think it's worth putting it in its own yellow box 😄
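As an illustration of the point about confidence intervals, here is a small sketch for the one-sample t-test. It assumes the scipy.stats behaviour discussed in this thread (the test result is used only for its statistic and p-value) and uses simulated data with a made-up column name `y`.

```python
import numpy as np
import pandas as pd
import scipy.stats
import statsmodels.formula.api as smf

# Simulated sample (illustrative only).
rng = np.random.default_rng(2)
df = pd.DataFrame({"y": rng.normal(0.3, 1.0, 50)})

# Built-in one-sample t-test: statistic and p-value.
t_stat, p_value = scipy.stats.ttest_1samp(df["y"], 0)

# Same test as an intercept-only linear model, which also reports a 95% CI.
model = smf.ols("y ~ 1", data=df).fit()
ci_low, ci_high = model.conf_int(alpha=0.05).loc["Intercept"]

# The same interval computed by hand from the t distribution.
n = len(df)
half_width = scipy.stats.t.ppf(0.975, df=n - 1) * scipy.stats.sem(df["y"])
manual_ci = (df["y"].mean() - half_width, df["y"].mean() + half_width)

print(f"t = {t_stat:.4f}, p = {p_value:.4f}")
print(f"95% CI from smf.ols: ({ci_low:.4f}, {ci_high:.4f})")
print(f"95% CI by hand:      ({manual_ci[0]:.4f}, {manual_ci[1]:.4f})")
```

The two intervals agree, so the linear-model route supplies the interval alongside the test itself.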

Ugh, I fail to find (Google) Python packages that model independent variances (not just correlated). Here is the reasoning behind the nlme::gls approach and an lme4::lmer equivalent in R:

I'm not sure how to achieve this in statsmodels, but I've added this as a comment to the code.
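One possible workaround, offered only as a sketch and not as the nlme::gls approach from the linked answer: fit a weighted least squares model in which each observation is weighted by the inverse of its group's sample variance. Under that plug-in weighting the t statistic for the group coefficient coincides with Welch's, but the model's residual degrees of freedom (n - 2) differ from the Welch-Satterthwaite ones, so the p-value has to be recomputed by hand. The data and column names are simulated and hypothetical.

```python
import numpy as np
import pandas as pd
import scipy.stats
import statsmodels.formula.api as smf

# Two simulated groups with unequal variances and sizes (illustrative only).
rng = np.random.default_rng(3)
a = rng.normal(0.0, 1.0, 30)
b = rng.normal(0.5, 3.0, 45)
df = pd.DataFrame({
    "group": ["a"] * len(a) + ["b"] * len(b),
    "value": np.concatenate([a, b]),
})

# Weight each observation by 1 / (sample variance of its own group).
group_var = df.groupby("group")["value"].transform("var")  # ddof=1 by default
model = smf.wls("value ~ C(group)", data=df, weights=1.0 / group_var).fit()
t_wls = model.tvalues.iloc[1]  # second coefficient: difference b - a

# Built-in Welch's t-test for comparison.
welch = scipy.stats.ttest_ind(b, a, equal_var=False)

# Welch-Satterthwaite degrees of freedom, computed by hand.
va, vb = a.var(ddof=1) / len(a), b.var(ddof=1) / len(b)
welch_df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
p_wls = 2 * scipy.stats.t.sf(abs(t_wls), welch_df)

print(f"WLS:   t = {t_wls:.4f}, p (Welch df) = {p_wls:.4f}")
print(f"Welch: t = {welch.statistic:.4f}, p = {welch.pvalue:.4f}")
```

Note that `model.pvalues` would not match Welch's p-value here, because statsmodels uses n - 2 residual degrees of freedom for this fit.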

I wonder whether it would be easier to leave out sections 8-10, and simply link to the R tutorial? These sections are not Python-specific AFAICS. (Edit: hmm, it does link to the appropriate sections in the Python-version. What do you think?)

I think it's better to leave them in. As you say, it's not language-specific content, so there's no reason to keep them separate: it would be a pain to have to click around two web pages to read those sections!

Ideas for the cheat sheet:

I admit that I skimped on the effort for the cheatsheet! I'm not a big fan of cheatsheets, but it could definitely be better and prettier. I'll look into writing it up better ~~in LaTeX~~.

eigenfoo · 2019-06-28T07:43:20Z

Cheatsheets fixed (b962693)! I quickly gave up on LaTeX - I forget what a nightmare it is to write in LaTeX.

I'm not sure if adding the Python logo is wise - it might clutter the sheet even more, and it's already jam-packed with information! I think it should be obvious which cheatsheet it is, depending on how the viewer finds it (i.e. either through your blog or mine).

I think this might be ready to release and publicize - what do you think @lindeloev?

lindeloev · 2019-06-28T09:09:13Z

This all sounds reasonable and the updated cheat sheet looks great!

My major worry re the cheat sheet was that when your Python version (hopefully!) goes viral, people would think the "N/A"s mean "not possible in theory" when it is just a technical limitation of the Python modules right now. How about either:

1. Making these N/A links which point to https://lindeloev.github.io/tests-as-linear
2. Same as 1, but writing "N/A in Python, but see R version"

In addition, I just added links to your Python version in the R cheat sheet:

Maybe you could do the same below the title in "your" cheat sheet, pointing to the R version? (Also added links to the Python version in my Notebook).

With this, I think it's ready for prime time!

eigenfoo · 2019-06-28T10:02:29Z

Ah, valid concerns! I suppose I don't spend enough time thinking about how things could be misconstrued. I've fixed up the "N/A" comments on the cheatsheet, and also linked back to the original R version. I've also released this as v1.0.0.

Would you do the honors of publicizing it? You're the original author, after all! 🚀

EDIT: my Twitter handle is @_eigenfoo, if you'd prefer to tweet it out.

lindeloev · 2019-06-28T10:29:05Z

I'd be very happy to tweet it with all sorts of praise for your work here! Will do it this afternoon. https://eigenfoo.xyz/tests-as-linear/ does not seem to include the updated cheat sheet, though, so I'll hold off until then (if this is the link to be shared?).

lindeloev · 2019-06-28T11:11:09Z

Only if you feel like it is worth the time: Consider making a Twitter Card to show the cheat sheet: https://cards-dev.twitter.com/validator. This is the HTML I used to do so: https://github.com/lindeloev/tests-as-linear/blob/master/include/twitter_card.html (change it to your Twitter handle too). I haven't played with HTML in ipython notebooks or the export.

eigenfoo · 2019-06-28T11:26:04Z

Master Twitter user! I'll do this once I get back to a keyboard. I'll let you know!

eigenfoo · 2019-06-28T11:39:36Z

does not seem to include the updated cheat sheet

Hmm, it seems to be working for me. Could you refresh the link again?

eigenfoo · 2019-06-28T12:29:41Z

Twitter card created! I just embedded the HTML tags directly into the index.html. I'm realizing that I should invest some time into some basic web design skills... serving a single HTML file makes me feel bad.

Nevertheless, I think everything is ready for prime time!

(if this is the link to be shared?)

Yes! https://eigenfoo.xyz/tests-as-linear/ would be perfect.

lindeloev · 2019-06-28T12:48:53Z

It's live! https://twitter.com/jonaslindeloev/status/1144587998291464195

lindeloev · 2019-06-28T12:52:10Z

It took me a few hours to find not-too-ugly ways to do share buttons, twitter/facebook/linkedin cards, etc. From now on, I'll just copy-paste the header of the notebook you ported :-)

But it's worth it because people use it quite a lot and it's fun to get Twitter mentions so that you can follow the spread of your work. You should definitely put some share buttons on your blog/website :-)

eigenfoo · 2019-06-28T14:26:31Z

I'll try to find some time to take a look at that! In any case, thanks so much for your time! Perhaps we'll bump into each other again sometime.

eigenfoo mentioned this issue Jun 28, 2019

ENH: onboard lindeloevs feedback #23

Merged

eigenfoo mentioned this issue Jun 28, 2019

MAINT: add R version to cheatsheet, improve NA comments #24

Merged

eigenfoo closed this as completed Jun 28, 2019
