Export a gene information table #23

Closed
dhimmel opened this issue Sep 7, 2016 · 1 comment

@dhimmel
Member

dhimmel commented Sep 7, 2016

Similar to how we have sample information in samples.tsv, it would be nice to create a table with gene information. The primary identifier is entrez_gene_id. Additional columns could be:

  • symbol
  • name
  • chromosome
  • n_mutations - number of mutated samples
  • median_expression - median gene expression
  • mad_expression - median absolute deviation of gene expression

I'm leaning towards a combined dataset for mutation and expression genes. But I could be convinced that splitting the datasets would be better.
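To make the proposal concrete, here is a minimal sketch of how such a combined table could be assembled with pandas. It is not the code that ended up in the repository. It assumes a sample-by-gene expression matrix, a sample-by-gene binary mutation matrix (1 = mutated), and a gene annotation table indexed by `entrez_gene_id`. The function name `build_gene_table` and all toy values below are hypothetical.

```python
import pandas as pd


def build_gene_table(expression, mutation, gene_info):
    """Combine per-gene summaries into one table (hypothetical helper).

    expression: samples x genes DataFrame of expression values
    mutation:   samples x genes DataFrame of 0/1 mutation calls
    gene_info:  DataFrame indexed by entrez_gene_id with the columns
                symbol, name, chromosome
    """
    # Median expression per gene (column-wise over samples).
    median = expression.median(axis=0)
    # Median absolute deviation: median of |x - median(x)| per gene.
    mad = (expression - median).abs().median(axis=0)
    expr_summary = pd.DataFrame(
        {"median_expression": median, "mad_expression": mad}
    )

    # Number of mutated samples per gene.
    mut_summary = mutation.sum(axis=0).rename("n_mutations").to_frame()

    # Outer join keeps genes that appear in either dataset (the "combined"
    # option); genes missing from one dataset get NaN in its columns.
    summary = mut_summary.join(expr_summary, how="outer")

    # Attach annotation columns, matching on the entrez_gene_id index.
    table = summary.join(gene_info, how="left")
    table.index.name = "entrez_gene_id"
    columns = [
        "symbol", "name", "chromosome",
        "n_mutations", "median_expression", "mad_expression",
    ]
    return table[columns]


# Toy inputs (made-up identifiers and values, for illustration only).
samples = ["s1", "s2", "s3"]
expression = pd.DataFrame(
    {1: [1.0, 2.0, 4.0], 2: [0.0, 0.0, 3.0]}, index=samples
)
mutation = pd.DataFrame({1: [1, 0, 1], 3: [0, 0, 1]}, index=samples)
gene_info = pd.DataFrame(
    {
        "symbol": ["GENE1", "GENE2", "GENE3"],
        "name": ["toy gene 1", "toy gene 2", "toy gene 3"],
        "chromosome": ["1", "2", "X"],
    },
    index=pd.Index([1, 2, 3], name="entrez_gene_id"),
)

table = build_gene_table(expression, mutation, gene_info)
print(table)
```

With the outer join, a gene measured in only one dataset still gets a row, which is the main practical difference from splitting into two tables. Writing the result with `table.to_csv(path, sep="\t")` would match the tab-separated format of samples.tsv.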

We should probably get this information from Entrez Gene, as @clairemcleod did in #12.

Labeling this issue a task awaiting a claimer.

@dhimmel dhimmel added the task label Sep 7, 2016
@cgreene
Member

cgreene commented Sep 8, 2016

@dhimmel median_expression and mad_expression explanations might need a quick edit.

dhimmel added a commit to dhimmel/cancer-data that referenced this issue Oct 5, 2016
dhimmel added a commit to dhimmel/cancer-data that referenced this issue Oct 5, 2016
dhimmel added a commit to dhimmel/cancer-data that referenced this issue Oct 7, 2016
`0.genes-download.ipynb` is a notebook to download datasets from
`cognoma/genes`. Update `2.TCGA-process.ipynb` to use the gene mapping
guidelines in cognoma/genes#1. Remove `mapping/PANCAN-mutation/` since this
mapping is now done in `2.TCGA-process.ipynb`.

Closes cognoma#23. Closes cognoma#30 by exporting gene info files in `2.TCGA-process.ipynb`
dhimmel added a commit that referenced this issue Oct 10, 2016
* Outsource Entrez Gene logic to cognoma/genes

`0.genes-download.ipynb` is a notebook to download datasets from
`cognoma/genes`. Update `2.TCGA-process.ipynb` to use the gene mapping
guidelines in cognoma/genes#1. Remove `mapping/PANCAN-mutation/` since this
mapping is now done in `2.TCGA-process.ipynb`.

Closes #23. Closes #30 by exporting gene info files in `2.TCGA-process.ipynb`

* Average expression values for the same gene

* Update cognoma/genes download location
dhimmel added a commit to dhimmel/cancer-data that referenced this issue Oct 10, 2016