-
Notifications
You must be signed in to change notification settings - Fork 41
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
YAML-based contrast definition #370
Comments
Taking one step back, a first PR could replace contrasts:
- id: treatment_A_vs_B
comparison: ["treatment", "A", "B"]
blocking_factors: ["response"] It would be a good first step as it already provides the modules (e.g json validation) for furture additions. |
@grst Created this POC for just making the transition from contrasts.csv to contrasts.yaml: #382 Tested it using the test_maxquant profile, I provided the following yaml file:
Running it with This worked ok, I even ran the nf-test for this profile using this PR, passing it the new contrasts yaml file, and it passed without having to update the snaps: Some caveats that I see: |
Thanks @nschcolnicov! I think shinyngs and differentialabundance are deeply interweaved and both under control from @pinin4fjords. So ultimately, the best way forward seems to update shinyngs rather than duplicating code into the As a next step, could you please create a json schema for |
@grst Waiting on a fix to nf-schema plugin to get the yaml validation to work: nextflow-io/nf-schema#79 |
Thanks all! I'd like to avoid bin scripts and keep things associated with the modules. Happy for anyone to make contributions to shinyngs though! |
@grst @pinin4fjords
I also created an issue in the shinyngs package to include these changes: pinin4fjords/shinyngs#67 Before proceeding with merging any of these PRs we should align on what exactly is the format that we would lile the .yaml to have. @alanmmobbs93 proposed this format in this ticket #371 (comment):
I created this bin/ script to be able to test any changes to the validate_fom_components.R script from the shinyngs package: https://github.com/nf-core/differentialabundance/pull/382/files#diff-48cc6b0867b0868e90e7d5cd3e5b52ce4931590fe464e0f1314f9ba5eb972a5d Once we have aligned on exactly how we want the yaml to look like, we can update the script in the PR, test the pipeline, and once that is done, we can proceed to update the shinyngs tool. |
I think we should first focus on the version without explicit model specification. The model is defined implicitly based on the The format proposed by @alanmmobbs93 will be the next iteration: switching to an explicit model definition. But this will require quite some changes also downstream in the pipeline, so in the interest of making the review process by @pinin4fjords smoother, I suggest to separate these two steps. EDIT: how are we doing with respect to the nf-schema issue? |
Yep, agreed, always good to separate things that way |
|
I'd prefer we just updated it in shinyngs. I can do the release legwork etc. |
Hi @pinin4fjords!
Keep in mind that we are looking into adding more changes to the script in the near future, so we will likely need to repeat this process multiple times. |
Yeah, a couple of reasons:
Appreciate it's a pain development-wise, but it's nice for production. I've resisted allowing others to create new local components for the same reasons, so it wouldn't be fair for me to not object here as well. Hopefully we can bundle the changes on the shinyngs side so there aren't too many cycles of this. Is your PR ready for review? I was watching it, but it's marked as draft currently. |
@pinin4fjords I see, ok makes sense then! Let me review it, I just converted it into a PR and I already see an error in the CI tests. I'll address this issue and tag you once its ready for review |
Thanks! I was OOO yesterday, but will take a look ASAP |
Description of feature
In #362, we agreed that a yaml-based contrast sheet is more flexible than a csv-based sheet and is better suited to cover future developments of the pipeline. The aim of this issue is to come up with a specification of this format that shall then be defined as a json-schema for validation.
A very minimal version for feature parity with the current
contrasts.csv
could look something likeHow exactly to specify contrasts will be the topic of a separate issue.
Beyond the minimal information, we could consider including additional parameters here:
The question is really what information do want to allow setting at a method/contrast level, and what's fixed globally for the entire pipeline run.
Specifying information locally increases flexibility, but potentially leads to redundant information (need to specify parameters over and over for different models) and increases pipeline complexity (need to keep track of more information from
meta
rather than accessing global parameters)CC @apeltzer @tschwarzl @atrigila @nschcolnicov @alanmmobbs93 @suzannejin
The text was updated successfully, but these errors were encountered: