Folder organisations #86
Comments
My detailed vision of it is:
.
+--ReadMe.txt
+--pyproject.toml
+--LICENCE
+--codecov.yml
+--.gitignore
+--doc_conf
|   +--conf.py (configuration file)
|   +--Makefile (make file for creating the library)
|   +--documentation (folder for additional documentation)
|   |   +--api.rst (index of the API)
|   |   +--index.rst (home page of the documentation)
|   |   +--... (specific pages such as how to contribute, ....)
|   +--references.bib (bibliography reference)
|   +--docs (folder where to build the documentation)
+--examples (folder with examples)
|   +--__init__.py
|   +--figures (folder to save all the generated figures and additional figures)
|   |   +--generates (folder which contains the figures generated by the examples)
|   |   |   +--get_started
|   |   |   |   +--find_importance_variable_1.png
|   |   |   |   +--...
|   |   |   +--models
|   |   |   |   +--ada_svr_1.png (illustrate the data of the toy_dataset)
|   |   |   |   +--ada_svr_2.png (illustrate the results on the toy_dataset)
|   |   |   |   +--...
|   |   |   +--estimators
|   |   |   |   +--dnn_learner_1.png
|   |   |   |   +--...
|   |   |   +--comparison_models
|   |   |   |   +--geometric_2D_1.png
|   |   |   |   +--benchmarks_1D_dataset_1.png
|   |   |   |   +--...
|   |   +--external (folder which contains figures to illustrate the examples)
|   |   |   +--get_started
|   |   |   |   +--...
|   |   |   +--models
|   |   |   |   +--ada_svr_1.py (figure 1 of Gaonkar et al. 2012)
|   |   |   |   +--...
|   |   |   +--comparison_models
|   |   |   |   +--...
|   +--_utils (folder with the functions for the examples)
|   |   +--__init__.py
|   |   +--plot_function.py (functions for plotting datasets or results)
|   |   +--....
|   +--get_started (folder for tutorials or basic examples of how to use Hidimstat)
|   |   +--find_importance_variable.py (apply one generic method to a dataset)
|   |   +--....
|   +--models (folder which shows how to use the models on a toy dataset, followed by a description of the methods and of their advantages, disadvantages and assumptions)
|   |   +--ada_svr.py
|   |   +--permutation_test.py
|   |   +--.....
|   +--estimator (folder which shows how to use an estimator on a toy dataset, followed by a description of the estimator and of its advantages, disadvantages and assumptions)
|   |   +--dnn_learner.py
|   |   +--.....
|   +--comparison_models (folder where the models are compared on toy datasets with a specific characteristic (linear/non-linear, geometric, ...) and benchmark)
|   |   +--geometric_2D.py
|   |   +--....
|   +--benchmarks (study the speed of each model)
|   |   +--1D_dataset.py
|   |   +--scalability.py
|   |   +--performance p~n.py
|   |   +--....
+--hidimstat (folder with code)
|   +--__init__.py
|   +--models (folder with models for estimating variable importance)
|   |   +--__init__.py
|   |   +--_utils (folder with functions shared between functions of the sub-packages)
|   |   |   +--__init__.py
|   |   |   +--scikit-learn_estimator.py (generic function for using the estimator API)
|   |   |   +--....
|   |   +--tests (folder for testing the models)
|   |   |   +--__init__.py
|   |   |   +--_utils (folder with functions shared between tests)
|   |   |   |   +--__init__.py
|   |   |   |   +--....
|   |   |   +--test_ada_svr.py
|   |   |   +--test_permutation_test.py
|   |   |   +--....
|   |   +--ada_svr.py
|   |   +--permutation_test.py
|   |   +--....
|   +--estimator (folder with specific estimators)
|   |   +--__init__.py
|   |   +--_utils (folder with functions shared between functions of the sub-packages)
|   |   |   +--__init__.py
|   |   |   +--....
|   |   +--tests (folder for testing the estimators)
|   |   |   +--__init__.py
|   |   |   +--_utils (folder with functions shared between tests)
|   |   |   |   +--__init__.py
|   |   |   |   +--....
|   |   |   +--test_dnn_learner.py
|   |   |   +--....
|   |   +--dnn_learner.py
|   |   +--....
|   +--extra (folder for generation of toy_dataset and statistics methods)
|   |   +--__init__.py
|   |   +--toy_data
|   |   |   +--__init__.py
|   |   |   +--_utils (folder with functions shared between functions of the subsub-packages)
|   |   |   |   +--__init__.py
|   |   |   |   +--....
|   |   |   +--tests (folder for testing the generated function)
|   |   |   |   +--__init__.py
|   |   |   |   +--_utils (folder with functions shared between tests)
|   |   |   |   |   +--__init__.py
|   |   |   |   |   +--....
|   |   |   |   +--test_1d_dataset.py
|   |   |   |   +--....
|   |   |   +--1d_dataset.py
|   |   |   +--....
|   |   +--stat_tools (the folder contains methods for calculating pvalue and calculating error rate)
|   |   |   +--__init__.py
|   |   |   +--_utils (folder with functions shared between functions of the subsub-packages)
|   |   |   |   +--__init__.py
|   |   |   |   +--....
|   |   |   +--tests (folder for testing the generated function)
|   |   |   |   +--__init__.py
|   |   |   |   +--_utils (folder with functions shared between tests)
|   |   |   |   |   +--__init__.py
|   |   |   |   |   +--....
|   |   |   |   +--test_pval.py
|   |   |   |   +--....
|   |   |   +--pval.py
|   |   |   +--....
|
It will be impossible to talk about this very detailed vision.
|
The doc_conf is composed of:
For the moment, there is no specific organisation of the non-autogenerated documentation. |
The examples folder is composed of 5 folders:
|
hidimstat is composed of 3 sub-packages:
Each sub-package is composed of a _utils folder (functions shared within the sub-package), a tests folder (for the tests) and the functions. |
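To make the proposed sub-package layout concrete, here is a minimal sketch that scaffolds it in a temporary directory. The helper `make_subpackage` is hypothetical (it is not part of hidimstat); the folder and file names are taken from the tree above.

```python
from pathlib import Path
import tempfile


def make_subpackage(root: Path, name: str, modules: list[str]) -> Path:
    """Create <root>/<name> with _utils, tests, tests/_utils and one file per module.

    Hypothetical helper, only meant to illustrate the proposed layout.
    """
    package = root / name
    folders = (
        package,
        package / "_utils",           # functions shared within the sub-package
        package / "tests",            # tests of the sub-package
        package / "tests" / "_utils"  # functions shared between tests
    )
    for folder in folders:
        folder.mkdir(parents=True, exist_ok=True)
        (folder / "__init__.py").touch()
    for module in modules:
        (package / f"{module}.py").touch()
        (package / "tests" / f"test_{module}.py").touch()
    return package


# Build the "models" sub-package in a throwaway directory and list its content.
with tempfile.TemporaryDirectory() as tmp:
    package = make_subpackage(Path(tmp), "models", ["ada_svr", "permutation_test"])
    created = sorted(p.relative_to(tmp).as_posix() for p in package.rglob("*"))
    print("\n".join(created))
```

The same call with `"estimator"` and `["dnn_learner"]` would give the second sub-package of the proposal.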
@bthirion @Remi-Gau @jpaillard @man-shu |
quick things:
|
I have a few suggestions:
|
The toy_dataset won't contain data; it will contain only functions for generating data, in my view. |
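As an illustration of "functions for generating data" rather than stored data, a generator could look roughly like the sketch below. The name `make_toy_regression`, its parameters and the linear model are assumptions made for this example, not existing hidimstat code; it only uses the standard library.

```python
import random


def make_toy_regression(n_samples=100, n_features=10, n_informative=3,
                        noise=0.1, seed=0):
    """Generate a linear toy dataset where only the first features matter.

    Hypothetical function: returns the samples X, the targets y and the
    true coefficients, so that an importance method can be checked against them.
    """
    rng = random.Random(seed)  # local generator, so results are reproducible
    # The first n_informative features have weight 1, the others weight 0.
    coef = [1.0] * n_informative + [0.0] * (n_features - n_informative)
    X = [[rng.gauss(0.0, 1.0) for _ in range(n_features)]
         for _ in range(n_samples)]
    # y is the weighted sum of the features plus Gaussian noise.
    y = [sum(c * x for c, x in zip(coef, row)) + rng.gauss(0.0, noise)
         for row in X]
    return X, y, coef


X, y, coef = make_toy_regression(n_samples=5, n_features=4, n_informative=2)
print(len(X), len(X[0]), coef)
```

Nothing is stored on disk: the same seed gives the same dataset again.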
My suggestions:
|
except for the _utils folder, which makes sense when you want to keep things private to a subpackage.
2 tools I have used in other projects:
|
I think I'd like to start with something simpler while we have very few examples, and reorganize a posteriori depending on the examples we have. |
I would not have too many levels. extra/toy_data and extra/stat_tools should rather be something like a |
Why ? |
Thx for bringing up this discussion ! |
At the moment, we don't have any specific dataset whose data need to be stored. |
I didn't add them to _utils because these functions are required by the examples and the tests. They shouldn't be private functions. We need to distinguish between side functions that are public and side functions that are private. |
In my opinion, examples are missing; that's why I want to add them. The examples will be there to answer 2 questions for users:
|
random thought (feel free to ignore): it may be easier to have some rules or guidelines that the project should follow regarding folder structure, and try to slowly implement them, rather than trying to find the 'right' structure. Obviously easier said than done. |
I don't plan a brutal refactoring of the project. It's more about having a direction to move in. |
But I'd like to reuse, as much as possible, public datasets, because they are known to users. Generating data means that you "invent" (at least come up with) the problem together with the solution, which is not great. I'd really like to confine generated data to situations where there is no other possibility. |
We should start with https://christophm.github.io/interpretable-ml-book/ and https://shap.readthedocs.io |
I have split the different discussions into separate issues to go into more detail:
If I missed a point or you have a new one, you can open an issue or add a comment here. |
Based on issue #93, there won't be a separate folder for "side functions". |
Following PR #73, we need to have a vision for the organization of the library.