Model template #104

lionelkusch · 2024-12-27T10:16:46Z

Based on PRs #58, #73, #100, #101, and #102, I propose the following requirements for each model:
acn: acronym
name: full name of the model

acn.py (python file)

import ...

__all__ = ["acn", "acn_..."]

def acn(
    X: np.ndarray[Any, np.dtype[np.float64]],
    y: np.ndarray[Any, np.dtype[np.float64]],
    karg_1: type = ...,
    karg_2: type = ...,
):
    """
    name

    short description :footcite:t:`.....`

    Parameters
    ----------
    X : ndarray, shape (n_samples, n_features)
        Data.

    y : ndarray, shape (n_samples,)
        Target.

    karg_1 : ....

    karg_2 : ...

    Returns
    -------
    .... : array, shape (n_features,)

    References
    ----------
    .. footbibliography::

    """
    assert ....

    # function implementation

    assert ....

    return ...

# additional functions (example: compute p-values)
def acn_...(
    arg_1: type,
    ...
    karg_1: type, ..
):
    """
    short description

    Parameters
    ----------
    arg_1 : type ...
        .......

    karg_1 : ....

    Returns
    -------
    .... : ....

    """
    assert ....

    # function implementation

    assert ....

    return ...

# private functions
def _prf_1(
    arg_1: type,
    ...
    karg_1: type, ..
):
    """
    short description

    Parameters
    ----------
    arg_1 : type ...
        .......

    karg_1 : ....

    Returns
    -------
    .... : ....

    """
    assert ....

    # function implementation

    assert ....

    return ...

test/test_acn.py (test for the methods)

import ...

# unit tests
def test_acn_...():
    """
    Description of the test (try to limit each test to a single call per function)
    """
    .....
    assert ...

# optional tests (functional, ....)
def test_acn_...():
    """
    Description of the test
    """
    .....
    assert ...
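A minimal sketch of what such a test module could look like with pytest-style functions. The function under test, `mcs`, is a hypothetical stand-in defined inline so that the snippet runs on its own; a real test module would import the function from the package instead.

```python
import numpy as np


def mcs(X, y):
    """Hypothetical stand-in for the function under test."""
    Xc, yc = X - X.mean(axis=0), y - y.mean()
    return np.abs(Xc.T @ yc)


# unit test: a single call to the function, a single property checked
def test_mcs_output_shape():
    """The output contains one score per feature."""
    rng = np.random.default_rng(0)
    X = rng.normal(size=(20, 3))
    y = rng.normal(size=20)
    scores = mcs(X, y)
    assert scores.shape == (3,)


# optional functional test
def test_mcs_ranks_informative_feature_first():
    """The feature that generates the target gets the highest score."""
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 3))
    y = 3.0 * X[:, 1] + 0.1 * rng.normal(size=200)
    scores = mcs(X, y)
    assert np.argmax(scores) == 1
```

Seeding the random generator inside each test keeps the tests deterministic and independent of their execution order.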

example/inference_models/acn.py (example of the methods)

"""
acn: name
==================================================================
short description
"""

#############################################################################
# Imports needed for this script
# ------------------------------
import ...

#############################################################################
# Dataset
# -------

...
plot_dataset()

#############################################################################
# Usage of the method
# -------------------
# See the API for more details about the optional parameters:
# :py:func:`hidimstat.acn`

output = acn(X, y)

#############################################################################
# Description of the output
#

#############################################################################
# Plot the results
# ----------------
#

....

#############################################################################
# Interpretation of the results
# .....

############################################################################
#
# Principle of the methods
# ------------------------
# .....

#############################################################################
# Assumptions, Advantages and Disadvantages
# -----------------------------------------
# 
# **Assumptions**:
# ...
# 
# **Advantages**:
# ....
# 
# **Disadvantages**:
# ....
#

#############################################################################
# References
# ----------
# .. footbibliography::


lionelkusch · 2024-12-27T10:19:23Z

This template works only for functions that don't require a fitting method.
When a fitting method is required, I propose to use a class, as for LOCO or CPI, and all the functions will be present in the class as methods.
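To make the class variant concrete, here is a minimal sketch of a class whose `fit` stores state that later methods reuse. `ToyImportance`, its parameters, and the permutation-style score are hypothetical and purely illustrative; this is not the library's LOCO or CPI implementation.

```python
import numpy as np


class ToyImportance:
    """Hypothetical class: state learned in fit() is reused by other methods."""

    def __init__(self, n_permutations: int = 10, random_state: int = 0):
        self.n_permutations = n_permutations
        self.random_state = random_state

    def fit(self, X, y):
        # Internal state: least-squares coefficients (last one is the intercept).
        A = np.column_stack([X, np.ones(X.shape[0])])
        self.coef_, *_ = np.linalg.lstsq(A, y, rcond=None)
        return self

    def predict(self, X):
        self._check_is_fitted()
        return np.column_stack([X, np.ones(X.shape[0])]) @ self.coef_

    def importance(self, X, y):
        """Mean increase in squared error when each feature is permuted."""
        self._check_is_fitted()
        rng = np.random.default_rng(self.random_state)
        base = np.mean((y - self.predict(X)) ** 2)
        scores = np.zeros(X.shape[1])
        for j in range(X.shape[1]):
            for _ in range(self.n_permutations):
                X_perm = X.copy()
                X_perm[:, j] = rng.permutation(X[:, j])
                scores[j] += np.mean((y - self.predict(X_perm)) ** 2) - base
        return scores / self.n_permutations

    def _check_is_fitted(self):
        if not hasattr(self, "coef_"):
            raise RuntimeError("call fit() before using this method")
```

Usage would be `ToyImportance().fit(X_train, y_train).importance(X_test, y_test)`: the fitted coefficients are computed once and shared by `predict` and `importance`.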

lionelkusch · 2024-12-27T10:21:09Z

@bthirion @jpaillard @AngelReyero.
What do you think?

AngelReyero · 2024-12-27T11:44:10Z

So is it only for Permutation Feature Importance? Which other methods do not need refitting? Is the fit function the only difference from the other methods?

lionelkusch · 2024-12-27T12:23:00Z

knockoff, clustered_inference, dcrt_zero, and desparsified_lasso are functions that don't need fitting.
From my analysis of the library, only two methods actually need a fit function: CPI and LOCO.

For me, a fit function means that the algorithm must keep an internal state that other functions rely on. This is not the case for most algorithms, either because they are fast or because their output contains all the information.

bthirion · 2024-12-27T22:19:07Z

I agree with the general outline.
One aspect is that we won't create one example for each function of the library. Examples are here to guide users toward the main information. They should not be exhaustive.
Please also remember that structure is here to help by providing guidelines and a common understanding. What matters first is functionality, clarity of the material and making maintenance easy.

AngelReyero · 2024-12-28T09:58:52Z

In the case of knockoffs, for instance, wouldn't we also need a similar internal state to keep the information estimated to generate the knockoffs? For example, when using the generating method from https://arxiv.org/abs/2407.06892, multiple regressors need to be fitted. The same holds for other generating methods based on covariance estimation, where the covariance estimate must be kept in order to generate knockoffs.
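A minimal sketch of the kind of state in question, assuming a generator that estimates a mean and a covariance once and reuses them for every draw. `GaussianGenerator` is hypothetical, and its draws are plain Gaussian samples, not valid knockoffs; the point is only that the estimate lives on the object between calls.

```python
import numpy as np


class GaussianGenerator:
    """Hypothetical generator that caches a covariance estimate (not a knockoff sampler)."""

    def fit(self, X):
        # State kept between calls: empirical mean and covariance of the features.
        self.mean_ = X.mean(axis=0)
        self.cov_ = np.cov(X, rowvar=False)
        return self

    def sample(self, n_samples, seed=None):
        # Every draw reuses the stored estimate; nothing is re-estimated here.
        rng = np.random.default_rng(seed)
        return rng.multivariate_normal(self.mean_, self.cov_, size=n_samples)
```

With a plain function, the covariance would have to be either re-estimated at each call or returned to the caller and passed back in, which is the bookkeeping a class avoids.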

lionelkusch · 2024-12-30T08:02:48Z

Yes, I realised that some methods require access to the estimator, such as knockoffs or permutation_test. This can be generalised to most of the methods.

If we stay with functions, which signature do you prefer?
def acn(
    X: np.ndarray[Any, np.dtype[np.float64]],
    y: np.ndarray[Any, np.dtype[np.float64]],
    estimator: sklearn.base.BaseEstimator,
    karg_1: type = ...,
    karg_2: type = ...,
)

def acn(
    X: np.ndarray[Any, np.dtype[np.float64]],
    estimator: sklearn.base.BaseEstimator,
    karg_1: type = ...,
    karg_2: type = ...,
)

In the second case, 'y' is computed at the beginning of the function.
For the estimator, should we require that it is already fitted?

However, if estimators are passed as parameters, I would prefer a class implementation to keep track of them, especially if we need to fit them.

lionelkusch added the labels "method implementation" (Question regarding methods implementations) and "management of project" (question regarding the policy of the project) on Dec 27, 2024

This was referenced Dec 30, 2024

Selection of model examples #106

Open

ADA-SVR (3/4) PR example of models #100

Open

Graphical representation for the users #51

Open

lionelkusch mentioned this issue Jan 14, 2025

Estimation threshold(1/4): add comments and docstring of the functions #122

Open
