feat: preparation for somatic cnv #590

Open: 13 commits to be merged into main

ericblanc20
Contributor

Adds several (fairly simple & simple-minded) steps required for proper CNV calling:

  • guess_sex: simple inference of sex from autosome & sex-chromosome coverage
  • germline_snvs: simple identification of well-supported germline SNPs. Unfortunately, the variant_calling step cannot be used for this task, as it is designed for trios.
  • somatic_variants_for_cnv: creates input for CNV tools that use B-allele fractions to improve/verify CNV calls based on coverage alone. The somatic_variant_calling step cannot be used, as the somatic variants from mutect2 differ greatly depending on whether germline variants are included.
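
To make the idea behind guess_sex concrete for reviewers, here is a minimal sketch of coverage-based sex inference. It is not the code in this PR: the function name, the contig names, and both thresholds are assumptions chosen for illustration. It compares mean coverage on X and Y against the median autosomal coverage.

```python
from statistics import median


def guess_sex_from_coverage(mean_cov, x_name="chrX", y_name="chrY",
                            x_threshold=0.75, y_threshold=0.25):
    """Guess sex from per-contig mean coverage (hypothetical helper).

    mean_cov maps contig name -> mean coverage. The thresholds are
    illustrative assumptions, not values taken from the pipeline.
    """
    # Use the median autosomal coverage as the diploid baseline.
    autosomes = [cov for name, cov in mean_cov.items()
                 if name not in (x_name, y_name)]
    if not autosomes:
        raise ValueError("no autosomal coverage provided")
    baseline = median(autosomes)
    if baseline <= 0:
        raise ValueError("autosomal coverage must be positive")

    x_ratio = mean_cov.get(x_name, 0.0) / baseline
    y_ratio = mean_cov.get(y_name, 0.0) / baseline

    # One X copy plus Y signal suggests male; two X copies and no Y, female.
    if x_ratio < x_threshold and y_ratio >= y_threshold:
        return "male"
    if x_ratio >= x_threshold and y_ratio < y_threshold:
        return "female"
    # Conflicting evidence (e.g. aneuploidy or low coverage): do not guess.
    return "unknown"
```

Returning "unknown" when the X and Y evidence conflict keeps a doubtful guess from silently shifting the expected copy number of the sex chromosomes in downstream CNV calling.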

The current code is OK, but can certainly be improved:

  • Pipes could be extended, but that would require smarter choices in germline_snvs/__init__.py
  • The problems with model serialization & validation should be understood & fixed: at the moment, automatic alias generation in bcftools models seems to work in isolation, when only the step model is considered, but validation fails during registration of sub-steps.
  • A better design of the snappy_wrapper is probably possible. Also, the derived BcftoolsWrapper is a first attempt at streamlining UNIX-like tools (such as bcftools, bedtools, bedops, samtools, rnaqc, ...). Its design should be critically reviewed before similar wrappers are built.
  • The treatment of ignored_chroms should also be seen as a first attempt, to be critically reviewed. The code in genome_windows is exercised in the ignored_chroms wrapper, which is called from the germline_snvs & somatic_variants_for_cnv snakefiles.
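
As a reference point for reviewing the ignored_chroms treatment, one simple approach is to interpret the setting as a list of glob patterns matched against contig names. The sketch below assumes exactly that; it is not the code in genome_windows, and the function name and example patterns are hypothetical.

```python
from fnmatch import fnmatchcase


def keep_contigs(contigs, ignored_patterns):
    """Return the contigs that match none of the ignore patterns.

    Hypothetical helper: patterns are shell-style globs, matched
    case-sensitively, and input order is preserved.
    """
    return [
        contig
        for contig in contigs
        if not any(fnmatchcase(contig, pattern) for pattern in ignored_patterns)
    ]


if __name__ == "__main__":
    contigs = ["chr1", "chrX", "chrUn_KI270302v1", "chrM"]
    # Example patterns only; real configurations may differ.
    print(keep_contigs(contigs, ["chrUn_*", "chrM"]))
```

Centralizing the filter in one function would let both snakefiles share identical behaviour, which may make the design easier to review.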

ericblanc20 linked an issue on Jan 10, 2025 that may be closed by this pull request
ericblanc20 requested a review from tedil on Jan 10, 2025, 16:51
ericblanc20 changed the title from "587 preparation for somatic cnv" to "feat: preparation for somatic cnv" on Jan 10, 2025

  • Please format your Python code with ruff: make fmt
  • Please check your Python code with ruff: make check
  • Please format your Snakemake code with snakefmt: make snakefmt

You can trigger all lints locally by running make lint

Successfully merging this pull request may close these issues.

Preparation for somatic CNV