Skip to content

Latest commit

 

History

History
76 lines (55 loc) · 3.58 KB

README.md

File metadata and controls

76 lines (55 loc) · 3.58 KB

CYP2D6 Parser

The CYP2D6 parser utilizies the recommendations by PharmVar to organize CYP2D6 haplotypes and genotypes. The parsing and sorting of these haplotypes appears simple but many edge cases exist. This tool helps to facillitate the genotype reporting process by providing a standardized approach for reporting haplotypes.

Current tool outputs supported:

  1. Aldy
  2. Cyrius
  3. PyPGx
  4. Stargazer - In Progress
  5. StellarPGx

Installation

Installation can be acheived using pip or any other package manager of your choosing

pip install cyp2d6_parser 

Parsing genotypes

API

import cyp2d6_parser

# Provide a genotype. If a caller is known, you can provide it to aid in some parsing
# otherwise just a genotype is fine too, but a warning will be reported
results = cyp2d6_parser.parse_genotype(genotype="*1/*10+*36")

# Provide a file. If supplying a file or directory, must identify the type of caller used
results = cyp2d6_parser.parse_genotype(file='aldy_results.tsv', caller='aldy')

# Can also provide a directory of files using glob patterns
results = cyp2d6_parser.parse_genotype(dir_='aldy_results/*.tsv', caller='aldy', output_file='results.tsv')


# parse_genotype above will report activity score and phenotype
result = cyp2d6_parser.get_activity_score(genotype='*1/*10+*36')

result = cyp2d6_parser.get_phenotype(genotype='*1/*10+*36')
result = cyp2d6_parser.get_phenotype(activity_score=2)

CLI

python3 -m cyp2d6_parser genotype --genotype_call "*1/*10+*36"

python3 -m cyp2d6_parser --caller aldy genotype --file aldy_results.tsv

python3 -m cyp2d6_parser --caller aldy genotype --dir aldy_results/*.tsv -o results.tsv

python3 -m cyp2d6_parser activity --genotype_call "*1/*10+*36"

python3 -m cyp2d6_parser phenotype --genotype_call "*1/*10+*36"
python3 -m cyp2d6_parser phenotype --activity_score 2

Interpreting output from parse_genotype

  1. sample_id - File name or left blank
  2. genotype_raw - Genotype(s) used as input
  3. genotype - Parsed and organized genotype according to PharmVar reccomendations. Retired alleles are converted to the new allele name (Ex: *57 -> *36). If you would like to report retired alleles, an option is available. Suballeles names are removed (Ex: *4.001 -> *4). If a single genotype cannot be resolved, Indeterminate/Indeterminate is reported.
  4. activity_score - Total activity score for the reported genotype. Activity score may contain >= values if the genotype contains allele(s) with unknown activity.
    • genotype: *1/*106
    • activity_score: CYP2D6 Activity Score >= 1.0
  5. phenotype - Phenotype based on activity score
  6. copy_number - Total number of copies in genotype only if the genotype is not Indeterminate. The only allele which does not count towards total copy number is *5 (gene deletion).
  7. possible_genotype - If the input genotypes cannot be resolved to a single genotype, all possible genotypes are reported in this field as semicolon delimited.
  8. possible_activity_score - If all possible genotypes resolve to a single activity score, then an activity score is reported
  9. possible_phenotype - If all possible activity scores resolve to a single phenotype, then the phenotype is reported. The activity score field can be blank with a possible phenotype. Ex:
    • possible_genotypes: *1/*1;*1/*10
    • possible_activity_score: not reported because possible values are 2 and 1.25
    • possible_phenotype: CYP2D6 Normal Metabolizer

Bugs and desired features

Please report any bugs and/or desired features to either the github issues page or to [email protected]