-
Notifications
You must be signed in to change notification settings - Fork 6
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Desparsified lasso(1/4): add comments and docstring of the functions #127
base: main
Are you sure you want to change the base?
Conversation
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #127 +/- ##
==========================================
+ Coverage 81.70% 81.99% +0.29%
==========================================
Files 43 43
Lines 2312 2322 +10
==========================================
+ Hits 1889 1904 +15
+ Misses 423 418 -5 ☔ View full report in Codecov by Sentry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it be possible / would it make sense to have both deparsified_lasso
and deparsified_group_lasso
in a single function?
For CPI / LOCO / PI, I used a group
argument by default = none, corresponding to the non-grouped case.
When group
is not none, it should be a dict with keys corresponding to group ids and values to ids of variable belonging to the group.
src/hidimstat/desparsified_lasso.py
Outdated
fit_Y : bool, optional (default=True) | ||
Whether to fit Y in noise estimation. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not so clear to me. Could you provide some more details?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is linked to my question 4:
The function group_reid as an option for "null model", do we keep it or not? (L212 of noise_std.py)
In the Reid function, there is an option to not refit y. I don't understand either what is the usage of this parameter.
The group has a different meaning in this case. It's related to time. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thax for improving this. Please find some comments enclosed.
@@ -65,11 +67,6 @@ def hd_inference(X, y, method, n_jobs=1, memory=None, verbose=0, **kwargs): | |||
n_jobs : int or None, optional (default=1) | |||
Number of CPUs to use during parallel steps such as inference. | |||
|
|||
memory : str or joblib.Memory object, optional (default=None) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why do you remove the memory argument ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This argument is for optimisation of the calculation by memorising the results of a call of a function with the same arguments. I don't think the basic user requires it and I don't take the time to look in detail if it's very efficient.
I think that it's interesting when the function is run multiple times on the same data but I don't think that it's important to keep it for the moment because it should be the case.
@@ -178,7 +169,6 @@ def clustered_inference( | |||
method="desparsified-lasso", | |||
seed=0, | |||
n_jobs=1, | |||
memory=None, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same comment: why get rid of memory ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
see above
@@ -113,11 +112,6 @@ def ensemble_clustered_inference( | |||
Number of CPUs used to compute several clustered inference | |||
algorithms at the same time. | |||
|
|||
memory : str, optional (default=None) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
see above
test/test_noise_std.py
Outdated
error_ratio = cov_hat / cov | ||
|
||
assert_almost_equal(np.max(error_ratio), 1.0, decimal=0) | ||
assert_almost_equal(np.log(np.min(error_ratio)), 0.0, decimal=1) | ||
|
||
|
||
def test_reid_exception(): | ||
"test the exceptions of reid" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add a one-line docstring ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
Lower bounds of confidence intervals | ||
cb_max : array-like | ||
Upper bounds of confidence intervals | ||
If confidence_interval_only=False: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we avoid this pattern of having different outputs. One solution is to always send the full version. Another possibility is to use a dictionary. But I think that systematically doing the full version is better?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added a boolean for choosing between the difference of outputs and this allows the user to choose what it wants.
If you want to always have the full output, I can remove the boolean associate of these difference outputs.
Co-authored-by: bthirion <[email protected]>
remove optional in the docstring Co-authored-by: bthirion <[email protected]>
…h/hidimstat into PR_desparsified_lasso
I improved the comments and the docstring.
Some modifications are related to the issues #112 and #60 .
I let some elements to part to check or to validate: