Outsource Entrez Gene logic to cognoma/genes #32
`0.genes-download.ipynb` is a notebook to download datasets from `cognoma/genes`. Update `2.TCGA-process.ipynb` to use the gene mapping guidelines in cognoma/genes#1. Remove `mapping/PANCAN-mutation/` since this mapping is now done in `2.TCGA-process.ipynb`. Closes cognoma#23. Closes cognoma#30 by exporting gene info files in `2.TCGA-process.ipynb`.
Will update repo and commit info in |
Read both the new genes-download code and the changes to TCGA-process. Both seem reasonable in light of the new logic in the genes repository.
@cgreene could you take a look at the additional changes?
OK, reviewed commit d15f4c1. Do you want to check that the genes are correlated before averaging? They probably are, but I know the Troyanskaya Lab used to have some logic to only calculate the mean for genes that agreed after symbol mapping. I think Matt Hibbs first wrote it, if I recall correctly.
Reviewed commit eeba83d and it LGTM 👍
My feelings are: full PR LGTM 👍 - one question. I don't think you need to address it now. You may want to consider it (maybe it's worth an issue)?
I agree that correlation is a good sanity check here. Given that only 39 genes had multiple expression measurements (see this diff), I'm not too worried about any issues. If anyone is worried, just reply here or open an issue and I'll be happy to look into it further.
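For reference, the correlation check discussed above could look roughly like the sketch below. This is not the code in this PR, and it is not the Troyanskaya Lab logic either. The function name `average_concordant_genes`, the `gene_map` dictionary, and the `min_corr` cutoff of 0.5 are all hypothetical choices made for illustration. It assumes a pandas DataFrame of samples by original gene identifiers, and it averages duplicate measurements only when every pair of them is sufficiently correlated.

```python
import pandas as pd


def average_concordant_genes(expr, gene_map, min_corr=0.5):
    """Collapse columns that map to the same gene, averaging only when they agree.

    expr: DataFrame of samples (rows) by original gene identifiers (columns).
    gene_map: dict from original identifier to mapped gene ID (hypothetical input).
    min_corr: hypothetical Pearson cutoff; duplicates below it are not averaged.
    Returns (collapsed DataFrame, list of gene IDs skipped as discordant).
    """
    # Keep only the columns that have a mapping
    mapped = pd.Series({col: gene_map[col] for col in expr.columns if col in gene_map})
    collapsed = {}
    discordant = []
    for gene_id, columns in mapped.groupby(mapped).groups.items():
        block = expr[list(columns)]
        if block.shape[1] > 1:
            # Lowest pairwise correlation among the duplicate measurements
            lowest = block.corr().min().min()
            if lowest < min_corr:
                discordant.append(gene_id)
                continue
        # A single measurement passes through unchanged; duplicates are averaged
        collapsed[gene_id] = block.mean(axis=1)
    return pd.DataFrame(collapsed), discordant
```

The second return value lists the genes whose duplicate measurements disagreed, so they can be inspected by hand. With only 39 affected genes, that list should be short enough to review manually.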
Rerun with gene data created by cognoma#32. Should result in all genes having a symbol.