Purple, Cobalt, Amber: Paired Tumour-Normal Whole Exome Sequencing (targeted mode) #648

Open
SeanNesdoly opened this issue Dec 4, 2024 · 2 comments

@SeanNesdoly

Dear Hartwig Medical Foundation,

Many thanks for creating and sharing a comprehensive suite of genome analysis tools.

For the 'targeted' mode of your Purple purity, ploidy, & Copy Number Variant caller pipeline, I understand that it was designed to analyze tumour-only panel sequencing data for Illumina's TruSight Oncology 500 assay.

In my case, I have paired tumour-normal whole exome sequencing data that I would like to analyze with Cobalt, Amber, and Purple. Specifically, I am trying to run Cobalt's NormalisationFileBuilder. In your code, it calculates a relativeEnrichment factor for each on-target 1 kbp-wide window in the genome using the depth information across all specified tumour samples (and optionally, WGS tumour samples). This is used to adjust the raw/unnormalised Cobalt depths (notably, for both tumour & normal samples). This normalisation step also effectively masks the genome to restrict Purple's analysis to the on-target windows.
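
To make my reading of that adjustment concrete, here is a minimal Python sketch. It is not the Cobalt implementation: the helper name `apply_relative_enrichment`, the dictionary layout keyed by window start position, and the assumption that the adjustment is a plain division by the factor are all mine, used only for illustration.

```python
from typing import Dict, Optional


def apply_relative_enrichment(
    depth_by_window: Dict[int, float],
    enrichment_by_window: Dict[int, float],
) -> Dict[int, Optional[float]]:
    """Hypothetical helper, not part of Cobalt.

    Divides each window's depth ratio by its relative enrichment factor
    (ASSUMPTION: the adjustment is a simple division). Windows without a
    usable factor are treated as off-target and masked with None.
    """
    adjusted: Dict[int, Optional[float]] = {}
    for window_start, depth in depth_by_window.items():
        factor = enrichment_by_window.get(window_start)
        if factor is None or factor <= 0:
            adjusted[window_start] = None  # off-target window: masked
        else:
            adjusted[window_start] = depth / factor
    return adjusted


# Toy values only: two on-target windows (1001, 3001), one off-target (2001).
enrichment = {1001: 2.0, 3001: 0.5}
tumour = {1001: 3.0, 2001: 0.1, 3001: 0.5}
normal = {1001: 2.0, 2001: 0.1, 3001: 0.5}

# The same per-window factors are applied to both tumour and normal depths.
tumour_adj = apply_relative_enrichment(tumour, enrichment)
normal_adj = apply_relative_enrichment(normal, enrichment)
```

In this toy example the off-target window 2001 is masked in both samples, which is the masking effect I describe above.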

Do you know if it is possible to run Cobalt in the 'targeted' mode with paired tumour-normal whole exome sequencing data? I'd be happy to discuss more specifics with reference to your codebase, if you have the time.

Best regards,
Sean

Sean Nesdoly
Bioinformatician
morrissylab.ucalgary.ca
Riddell Centre for Cancer Immunotherapy
Charbonneau Cancer Institute, Cumming School of Medicine
University of Calgary

@p-priestley
Contributor

Both exome and tumor-normal targeted mode are fully supported.

You are on the right track with running the Cobalt normalisation file builder. This creates the main resource file you need to normalise the copy number biases of the panel.

In case you didn't find it, the full overview for how to generate resources specific to your targeted panel or exome is here:
https://github.com/hartwigmedical/hmftools/blob/master/pipeline/README_TARGETED.md

This is the same procedure we followed to generate the TSO500 resources.

We are happy to answer questions if anything is unclear.

@SeanNesdoly
Author

SeanNesdoly commented Dec 5, 2024

Thanks for your quick reply and the offer of help.

When calculating relativeEnrichment(w) for an on-target window $w$ in the
genome, a median GC-adjusted depth is calculated across all samples $s \in S_{tum}$
using SampleRegionData.adjustedGcRatio() values. Here, $S_{tum}$ is
the set of tumour samples used as input to Cobalt's NormalisationFileBuilder
for which Cobalt tumour depths are available (unnormalised, run in
tumour-normal mode without the -target_region parameter). My questions:

  1. There is an option to specify WGS tumour samples that match the
    targeted/panel samples in $S_{tum}$. If used, their window depths serve as
    the divisor in the relativeEnrichment(w) median calculation:

    $\mathrm{median}\left( \frac{\mathrm{adjustedGcRatio}_{s.panel}(w)}{\mathrm{adjustedGcRatio}_{s.wgs}(w)} \right)$
    $\forall (s.panel, s.wgs) \in S_{tum}$

    Presumably, this would mean that the targeted tumour samples have matching
    WGS tumour samples? Or, have I misinterpreted this?

    • calcRelativeEnrichment in cobalt/norm/Normaliser.java
    • addCobaltSampleData (line 135) in cobalt/norm/DataLoader.java
      • cobaltWgsRatio.tumorGCRatio()
  2. The relativeEnrichment(w) calculation, which uses depth information from
    all tumour samples, is then used to normalise both tumour and normal sample
    depths for each window $w$. Does this bias comparisons between tumour and
    normal window depths during CNV calling in Purple, given that the
    normal/germline samples were excluded in the relativeEnrichment(w)
    calculations?

    • Alternatively, how does the normalisation, or usage, of normal/germline
      depths differ from that of tumour depths?
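
To show how I am reading the median calculation in question 1, here is a minimal Python sketch for a single window $w$. It is my interpretation only and not taken from Normaliser.java: the function name `relative_enrichment` and the list-based inputs are hypothetical, and skipping non-positive WGS ratios is an assumption made to avoid dividing by zero.

```python
from statistics import median
from typing import List, Optional


def relative_enrichment(
    panel_ratios: List[float],
    wgs_ratios: Optional[List[float]] = None,
) -> Optional[float]:
    """Hypothetical sketch of the per-window median, not Cobalt's code.

    panel_ratios: GC-adjusted ratios for window w, one per targeted tumour sample.
    wgs_ratios:   optional ratios for the matching WGS samples, in the same order.
    """
    if wgs_ratios is None:
        # No WGS data: median of the panel ratios themselves.
        values = list(panel_ratios)
    else:
        if len(wgs_ratios) != len(panel_ratios):
            raise ValueError("each panel sample needs a matching WGS sample")
        # With WGS data: median of per-sample panel/WGS ratios.
        # ASSUMPTION: pairs with a non-positive WGS ratio are skipped.
        values = [p / w for p, w in zip(panel_ratios, wgs_ratios) if w > 0]
    return median(values) if values else None


# Toy values for three paired samples at one window.
panel = [2.0, 4.0, 3.0]
wgs = [1.0, 1.0, 1.5]

enrichment_panel_only = relative_enrichment(panel)     # median of 2.0, 4.0, 3.0
enrichment_with_wgs = relative_enrichment(panel, wgs)  # median of 2.0, 4.0, 2.0
```

In this sketch the paired form only makes sense if every targeted sample has a matching WGS sample, which is the point I would like to confirm.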

Any insight is greatly appreciated.

Thanks,
Sean
