
Desparsified lasso (1/4): add comments and docstrings of the functions #127

Open

wants to merge 47 commits into main
Conversation

lionelkusch (Collaborator)

I improved the comments and the docstrings.
Some modifications are related to issues #112 and #60.
I left some points open to check or to validate:

  1. Do we want other functions for estimating the standard deviation of the noise? (L116 of desparsified_lasso.py)
  2. The lambda_max function could be used at lines L135 and L340 of desparsified_lasso.py. Do we use this function or not?
  3. The calculation of confint_radius inverts omega_diag twice. I don't see the reason for it. (L173 of desparsified_lasso.py)
  4. The function group_reid has an option for a "null model"; do we keep it or not? (L212 of noise_std.py)
  5. Why is the name of the method in group_reid different from the name of the "function"? (L231 of noise_std.py)
  6. Should I remove the comments at L273 of noise_std.py?


codecov bot commented Jan 14, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 81.99%. Comparing base (19e6cf7) to head (f9f6f0f).
Report is 1 commit behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #127      +/-   ##
==========================================
+ Coverage   81.70%   81.99%   +0.29%     
==========================================
  Files          43       43              
  Lines        2312     2322      +10     
==========================================
+ Hits         1889     1904      +15     
+ Misses        423      418       -5     


@jpaillard jpaillard (Collaborator) left a comment

Would it be possible / would it make sense to have both desparsified_lasso and desparsified_group_lasso in a single function?

For CPI / LOCO / PI, I used a group argument with default None, corresponding to the non-grouped case.
When group is not None, it should be a dict whose keys are group ids and whose values are the ids of the variables belonging to each group.
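A minimal sketch of the proposed group argument (the helper name is hypothetical, not the actual hidimstat signature): None falls back to the non-grouped case where each variable is its own singleton group, otherwise a dict maps group ids to variable indices.

```python
def normalize_groups(n_features, groups=None):
    """Resolve the proposed `groups` argument (hypothetical helper,
    not hidimstat code): None means the non-grouped case, where each
    variable forms its own singleton group; otherwise `groups` is a
    dict mapping group ids to the indices of variables in the group."""
    if groups is None:
        return {j: [j] for j in range(n_features)}
    return groups

# non-grouped default: three singleton groups
print(normalize_groups(3))  # {0: [0], 1: [1], 2: [2]}
# explicit grouping: two groups over three variables
print(normalize_groups(3, {"g1": [0, 2], "g2": [1]}))
```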

doc_conf/references.bib (outdated; resolved)
Comment on lines 250 to 251
fit_Y : bool, optional (default=True)
Whether to fit Y in noise estimation.
Collaborator

This is not so clear to me. Could you provide some more details?

Collaborator Author

This is linked to my question 4:

The function group_reid has an option for a "null model"; do we keep it or not? (L212 of noise_std.py)

In the Reid function, there is an option to not refit y. I don't understand the usage of this parameter either.

@lionelkusch (Collaborator Author)

Would it be possible / would it make sense to have both desparsified_lasso and desparsified_group_lasso in a single function?

For CPI / LOCO / PI, I used a group argument with default None, corresponding to the non-grouped case. When group is not None, it should be a dict whose keys are group ids and whose values are the ids of the variables belonging to each group.

The group has a different meaning in this case: it is related to time.
However, I grouped the functions to avoid duplicated code.
Can you tell me if the code is still readable?

@bthirion bthirion (Contributor) left a comment

Thanks for improving this. Please find some comments enclosed.

@@ -65,11 +67,6 @@ def hd_inference(X, y, method, n_jobs=1, memory=None, verbose=0, **kwargs):
n_jobs : int or None, optional (default=1)
Number of CPUs to use during parallel steps such as inference.

memory : str or joblib.Memory object, optional (default=None)
Contributor

Why do you remove the memory argument?

Collaborator Author

This argument optimises computation by memorising the results of a function call with the same arguments. I don't think the basic user requires it, and I haven't taken the time to check in detail whether it is very efficient.
I think it is interesting when the function is run multiple times on the same data, but I don't think it is important to keep for the moment, because that should rarely be the case.
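For context, the memory argument wraps joblib's on-disk memoization. A minimal sketch of what it provides (the cached function and cache location are illustrative, not hidimstat code):

```python
import tempfile
from joblib import Memory

# cache results on disk so a repeated call with the same arguments
# is loaded from the cache instead of being recomputed
memory = Memory(location=tempfile.mkdtemp(), verbose=0)

executions = []

@memory.cache
def expensive_step(n):
    executions.append(n)  # records actual (non-cached) executions
    return n * n

expensive_step(4)  # computed and written to the cache
expensive_step(4)  # served from the cache; the body does not run again
```

The trade-off discussed above: this only pays off when the same computation is repeated on identical data, at the cost of an extra argument in the public API.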

src/hidimstat/clustered_inference.py (outdated; resolved)
@@ -178,7 +169,6 @@ def clustered_inference(
method="desparsified-lasso",
seed=0,
n_jobs=1,
memory=None,
Contributor

Same comment: why get rid of memory?

Collaborator Author

see above

@@ -113,11 +112,6 @@ def ensemble_clustered_inference(
Number of CPUs used to compute several clustered inference
algorithms at the same time.

memory : str, optional (default=None)
Contributor

Again

Collaborator Author

see above

error_ratio = cov_hat / cov

assert_almost_equal(np.max(error_ratio), 1.0, decimal=0)
assert_almost_equal(np.log(np.min(error_ratio)), 0.0, decimal=1)


def test_reid_exception():
"test the exceptions of reid"
Contributor

Can you add a one-line docstring?

Collaborator Author

done

src/hidimstat/desparsified_lasso.py (outdated threads; resolved)
Lower bounds of confidence intervals
cb_max : array-like
Upper bounds of confidence intervals
If confidence_interval_only=False:
Contributor

Can we avoid this pattern of having different outputs? One solution is to always return the full version. Another possibility is to use a dictionary. But I think that systematically returning the full version is better.

Collaborator Author

I added a boolean for choosing between the different outputs, which lets the user choose what they want.
If you want to always have the full output, I can remove the boolean associated with these different outputs.
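The "always return the full version" option could look like this sketch (function name and placeholder values are hypothetical, not the actual hidimstat return signature): a caller that only needs part of the result unpacks selectively, instead of the function switching its return type on a boolean flag.

```python
def desparsified_inference(beta_hat):
    """Always return the full result tuple (hypothetical sketch):
    estimates, confidence bounds, and p-values, with no flag that
    changes the set of outputs."""
    cb_min = [b - 0.5 for b in beta_hat]  # placeholder lower bounds
    cb_max = [b + 0.5 for b in beta_hat]  # placeholder upper bounds
    pval = [0.05 for _ in beta_hat]       # placeholder p-values
    return beta_hat, cb_min, cb_max, pval

# a caller interested only in confidence intervals ignores the rest
_, cb_min, cb_max, _ = desparsified_inference([1.0, 2.0])
```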

src/hidimstat/desparsified_lasso.py (outdated threads; resolved)