Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Outsource Entrez Gene logic to cognoma/genes #32

Merged
merged 3 commits into from
Oct 10, 2016
Merged

Conversation

dhimmel
Copy link
Member

@dhimmel dhimmel commented Oct 7, 2016

0.genes-download.ipynb is a notebook to download datasets from cognoma/genes. Update 2.TCGA-process.ipynb to use the gene mapping guidelines in cognoma/genes#1. Remove mapping/PANCAN-mutation/ since this mapping is now done in 2.TCGA-process.ipynb.

Closes #23. Closes #30 by exporting gene info files in 2.TCGA-process.ipynb.

`0.genes-download.ipynb` is a notebook to download datasets from
`cognoma/genes`. Update `2.TCGA-process.ipynb` to use the gene mapping
guidelines in cognoma/genes#1. Remove `mapping/PANCAN-mutation/` since this
mapping is now done in `2.TCGA-process.ipynb`.

Closes cognoma#23. Closes cognoma#30 by exporting gene info files in `2.TCGA-process.ipynb`
@dhimmel
Copy link
Member Author

dhimmel commented Oct 7, 2016

Will update repo and commit info in 0.genes-download.ipynb once cognoma/genes#1 is merged.

Copy link
Member

@cgreene cgreene left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Read both the new genes-download code and the changes to TCGA-process. Both seem reasonable in light of the new logic in the genes repository.

@dhimmel
Copy link
Member Author

dhimmel commented Oct 9, 2016

@cgreene could you take a look at the additional changes.

@cgreene
Copy link
Member

cgreene commented Oct 10, 2016

Ok - reviewed commit d15f4c1 - do you want to check that the genes are correlated before averaging? They probably are... but I know the Troyanskaya Lab used to have some logic to only calculate the mean for genes that agreed after symbol mapping. I think Matt Hibbs first wrote it if I recall correctly.

@cgreene
Copy link
Member

cgreene commented Oct 10, 2016

Reviewed commit eeba83d and it LGTM 👍

@cgreene
Copy link
Member

cgreene commented Oct 10, 2016

My feelings are: full PR LGTM 👍 - one question. I don't think you need to address it now. You may want to consider it (maybe it's worth an issue)?

@dhimmel
Copy link
Member Author

dhimmel commented Oct 10, 2016

do you want to check that the genes are correlated before averaging?

I agree that correlation is a good sanity check here. Given that only 39 genes had multiple expression measurements (see this diff), I'm not too worried about any issues.

If anyone is worried just reply here or open and issue and I'll be happy to look into it further.

@dhimmel dhimmel merged commit 781eff0 into cognoma:master Oct 10, 2016
@dhimmel dhimmel deleted the genes branch October 10, 2016 14:51
dhimmel added a commit to dhimmel/cancer-data that referenced this pull request Oct 10, 2016
Rerun with gene data created by cognoma#32.
Should result in all genes having a symbol.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants