Model validation #371
Comments
POC
Features:
Inputs:
Outputs:
Functionalities:
Perspectives:
Testing
Test
The following example files are part of the nf-test that can be executed as declared below. They were obtained from the pipeline's

    nf-test test modules/local/validatemodel/tests/main.nf.test --debug --profile docker

Example YML

    models:
      - formula: "~ treatment"
        contrasts:
          - id: "treatment_mCherry_hND6"
            comparison: ["treatment", "mCherry", "hND6"]
          - id: "treatment_mCherry_hND6_sample_number"
            comparison: ["treatment", "mCherry", "hND6"]
            blocking_factors: ["sample_number"]
          - id: "treatment234"
            comparison: ["treatment", "mCherry", "hND6"]

Note: I added the "formula" field, which is not part of @nschcolnicov's POC for the YML validation. The script iterates over it and adds the blocking factors when required, but this can be adjusted if we want to remove it. If we decide to keep the

Example sample sheet
Run script

    validate_model.R \
      --yml path/to/yml \
      --samplesheet path/to/samplesheet \
      --sample_id_col 'sample'
Blocking factors and formula are mutually exclusive. Ultimately, we want to get rid of the blocking factors and always specify the formula explicitly. Keeping the blocking factors was meant as an intermediate step to keep the changes to the pipeline atomic.
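As a rough illustration of the mutual-exclusivity rule, here is a minimal Python sketch (not the R script discussed in this thread). It assumes a contrast entry may carry its own `formula` key next to `blocking_factors`; that placement, the function name `check_exclusive`, and the `both_set` example entry are hypothetical and only serve the illustration.

```python
def check_exclusive(contrast):
    """Return an error message if a contrast sets both keys, otherwise None.

    Assumption: 'formula' and 'blocking_factors' live on the same contrast entry.
    """
    has_formula = bool(contrast.get("formula"))
    has_blocking = bool(contrast.get("blocking_factors"))
    if has_formula and has_blocking:
        return (
            f"contrast '{contrast.get('id')}': 'formula' and "
            "'blocking_factors' are mutually exclusive"
        )
    return None


# Hypothetical contrast entries, loosely modelled on the examples in this thread.
contrasts = [
    {
        "id": "treatment_mCherry_hND6",
        "comparison": ["treatment", "mCherry", "hND6"],
        "formula": "~ treatment",
    },
    {
        "id": "treatment_mCherry_hND6_sample_number",
        "comparison": ["treatment", "mCherry", "hND6"],
        "blocking_factors": ["sample_number"],
    },
    {
        "id": "both_set",  # invalid on purpose: both keys are present
        "comparison": ["treatment", "mCherry", "hND6"],
        "formula": "~ treatment + sample_number",
        "blocking_factors": ["sample_number"],
    },
]

errors = [msg for msg in map(check_exclusive, contrasts) if msg]
for msg in errors:
    print(msg)
```

Collecting messages instead of failing on the first one would let a user fix every offending contrast in a single pass.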
@grst @nschcolnicov The code is already updated for the simpler yml format:

    contrasts:
      - id: "treatment_mCherry_hND6"
        comparison: ["treatment", "mCherry", "hND6"]
      - id: "treatment_mCherry_hND6_sample_number"
        comparison: ["treatment", "mCherry", "hND6"]
        blocking_factors: ["sample_number"]
      - id: "treatment234"
        comparison: ["treatment", "mCherry", "hND6"]

However, I noticed now that it will always evaluate simple linear models (
Description of feature
This issue is to follow up on the request in #362 to implement model validation. The goal is to catch any errors in the model definition/contrast specification as early as possible for a fast feedback loop.
List of things to check (feel free to add items):
PR/CR as a factor level fails downstream in clusterProfiler because of the /.
Implementation
Probably convenient to do it in R... but could we even do this in Groovy directly? Then errors would be instant and we wouldn't need to wait for a process to be fired up.
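To make the kind of early check discussed here concrete, below is a language-neutral sketch in Python (not R or Groovy, and not the actual `validate_model.R`). The specific checks are assumptions for illustration: the contrast variable and blocking factors must be samplesheet columns, both compared levels must occur in that column, and levels must match a hypothetical "safe" pattern that rejects characters such as `/`. The samplesheet contents, the function name `validate_contrasts`, and the `unknown_blocking` and `unsafe_level` entries are invented for the example.

```python
import csv
import io
import re

# Assumption: only letters, digits, '_' and '.' count as "safe" in a factor level.
SAFE_LEVEL = re.compile(r"[A-Za-z0-9_.]+")


def validate_contrasts(contrasts, rows):
    """Collect human-readable errors for contrasts checked against samplesheet rows."""
    if not rows:
        return ["samplesheet is empty"]
    errors = []
    columns = set(rows[0].keys())
    for contrast in contrasts:
        cid = contrast.get("id")
        comparison = contrast.get("comparison", [])
        if len(comparison) != 3:
            errors.append(f"{cid}: 'comparison' needs exactly 3 entries")
            continue
        variable, level_a, level_b = comparison
        if variable not in columns:
            errors.append(f"{cid}: column '{variable}' not in samplesheet")
            continue
        levels = {row[variable] for row in rows}
        for level in (level_a, level_b):
            if level not in levels:
                errors.append(f"{cid}: level '{level}' not found in '{variable}'")
            elif not SAFE_LEVEL.fullmatch(level):
                errors.append(f"{cid}: level '{level}' contains unsafe characters")
        for factor in contrast.get("blocking_factors", []):
            if factor not in columns:
                errors.append(f"{cid}: blocking factor '{factor}' not in samplesheet")
    return errors


# Hypothetical samplesheet; the real example sheet is not shown in this thread.
SAMPLESHEET = """sample,treatment,sample_number
s1,mCherry,1
s2,hND6,1
s3,mCherry,2
s4,PR/CR,2
"""
rows = list(csv.DictReader(io.StringIO(SAMPLESHEET)))

contrasts = [
    {"id": "treatment_mCherry_hND6",
     "comparison": ["treatment", "mCherry", "hND6"]},
    {"id": "treatment_mCherry_hND6_sample_number",
     "comparison": ["treatment", "mCherry", "hND6"],
     "blocking_factors": ["sample_number"]},
    {"id": "unknown_blocking",
     "comparison": ["treatment", "mCherry", "hND6"],
     "blocking_factors": ["batch"]},
    {"id": "unsafe_level",
     "comparison": ["treatment", "mCherry", "PR/CR"]},
]

errors = validate_contrasts(contrasts, rows)
for msg in errors:
    print(msg)
```

None of these checks needs a model fit, so the same logic could in principle run before any process is launched, which is the appeal of doing it outside a process.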
CC @apeltzer @tschwarzl @atrigila @alanmmobbs93 @nschcolnicov