ID_MAPPING.csv
1 | URI | Recommended name | Registry identifier | Alternative name(s) | Description | Identifier pattern | Type |
---|---|---|---|---|---|---|---|
2 | http://identifiers.org/clinicaltrials/ | clinicaltrial | MIR:00000137 | NCT | ClinicalTrials.gov provides free access to information on clinical studies for a wide range of diseases and conditions. Studies listed in the database are conducted in 175 countries | ^NCT\d{8}$ | Entity |
3 | http://identifiers.org/hgnc/ | hgnc | MIR:00000080 | HUGO Gene Nomenclature Committee | The HGNC (HUGO Gene Nomenclature Committee) provides an approved gene name and symbol (short-form abbreviation) for each known human gene. All approved symbols are stored in the HGNC database, and each symbol is unique. HGNC identifiers refer to records in the HGNC symbol database. | ^((HGNC|hgnc):)?\d{1,5}$ | Entity |
4 | http://identifiers.org/kegg.drug/ | kegg.drug | MIR:00000025 | KEGG | KEGG DRUG contains chemical structures of drugs and additional information such as therapeutic categories and target molecules. | ^D\d+$ | Entity |
5 | http://identifiers.org/ensembl.protein/ | ensembl.protein | | | | | Entity |
6 | http://identifiers.org/refseq/ | refseq | MIR:00000039 | | The Reference Sequence (RefSeq) collection aims to provide a comprehensive, integrated, non-redundant set of sequences, including genomic DNA, transcript (RNA), and protein products. | ^((AC|AP|NC|NG|NM|NP|NR|NT|NW|XM|XP|XR|YP|ZP)_\d+|(NZ\_[A-Z]{4}\d+))(\.\d+)?$ | Entity |
7 | http://identifiers.org/drugbank/ | drugbank | MIR:00000102 | | The DrugBank database is a bioinformatics and chemoinformatics resource that combines detailed drug (i.e. chemical, pharmacological and pharmaceutical) data with comprehensive drug target (i.e. sequence, structure, and pathway) information. This collection references drug information. | ^DB\d{5}$ | Entity |
8 | http://identifiers.org/inchi/ | inchi | MIR:00000383 | IUPAC International Chemical Identifier | The IUPAC International Chemical Identifier (InChI) is a non-proprietary identifier for chemical substances that can be used in printed and electronic data sources. It is derived solely from a structural representation of that substance, such that a single compound always yields the same identifier. | ^InChI\=1S?\/[A-Za-z0-9\.]+(\+[0-9]+)?(\/[cnpqbtmsih][A-Za-z0-9\-\+\(\)\,\/\?\;\.]+)*$ | Entity |
9 | http://identifiers.org/kegg.pathway/ | kegg.pathway | MIR:00000012 | KEGG | KEGG PATHWAY is a collection of manually drawn pathway maps representing our knowledge on the molecular interaction and reaction networks. | ^\w{2,4}\d{5}$ | Entity |
10 | http://identifiers.org/hp/ | hp | MIR:00000571 | hp | The Human Phenotype Ontology (HPO) aims to provide a standardized vocabulary of phenotypic abnormalities encountered in human disease. Each term in the HPO describes a phenotypic abnormality, such as atrial septal defect. The HPO is currently being developed using the medical literature, Orphanet, DECIPHER, and OMIM. | ^HP:\d{7}$ | Entity |
11 | http://identifiers.org/dbsnp/ | dbsnp | MIR:00000161 | | The dbSNP database is a repository for both single base nucleotide substitutions and short deletion and insertion polymorphisms. | ^rs\d+$ | Entity |
12 | http://identifiers.org/reactome/ | Reactome | MIR:00000018 | Reactome Stable ID | The Reactome project is a collaboration to develop a curated resource of core pathways and reactions in human biology. | (^R-[A-Z]{3}-\d+(-\d+)?(\.\d+)?$)|(^REACT_\d+(\.\d+)?$) | Entity |
13 | http://identifiers.org/pubmed/ | pubmed | MIR:00000015 | | PubMed is a service of the U.S. National Library of Medicine that includes citations from MEDLINE and other life science journals for biomedical articles back to the 1950s. | ^\d+$ | Entity |
14 | http://identifiers.org/ensembl/ | ensembl | MIR:00000003 | | Ensembl is a joint project between EMBL-EBI and the Sanger Institute to develop a software system which produces and maintains automatic annotation on selected eukaryotic genomes. This collection also references outgroup organisms. | ^((ENS[A-Z]*[FPTG]\d{11}(\.\d+)?)|(FB\w{2}\d{7})|(Y[A-Z]{2}\d{3}[a-zA-Z](\-[A-Z])?)|([A-Z_a-z0-9]+(\.)?(t)?(\d+)?([a-z])?))$ | Entity |
15 | http://identifiers.org/uniprot/ | uniprot | MIR:00000005 | UniProtKB UniProt Protein Knowledgebase UniProt-TrEMBL UniProt/TrEMBL UniProtKB/Swiss-Prot | The UniProt Knowledgebase (UniProtKB) is a comprehensive resource for protein sequence and functional information with extensive cross-references to more than 120 external databases. Besides amino acid sequence and a description, it also provides taxonomic data and citation information. | ^([A-N,R-Z][0-9]([A-Z][A-Z, 0-9][A-Z, 0-9][0-9]){1,2})|([O,P,Q][0-9][A-Z, 0-9][A-Z, 0-9][A-Z, 0-9][0-9])(\.\d+)?$ | Entity |
16 | http://identifiers.org/unigene/ | unigene | MIR:00000346 | | A UniGene entry is a set of transcript sequences that appear to come from the same transcription locus (gene or expressed pseudogene), together with information on protein similarities, gene expression, cDNA clone reagents, and genomic location. | ^\d+$ | Entity |
17 | http://identifiers.org/ensembl.transcript/ | ensembl.transcript | | | | | Entity |
18 | http://identifiers.org/hgnc.symbol/ | hgnc.symbol | MIR:00000362 | HUGO Gene Nomenclature Committee Symbol | The HGNC (HUGO Gene Nomenclature Committee) provides an approved gene name and symbol (short-form abbreviation) for each known human gene. All approved symbols are stored in the HGNC database, and each symbol is unique. This collection refers to records using the HGNC symbol. | ^[A-Za-z-0-9_]+(\@)?$ | Entity |
19 | http://identifiers.org/ensembl.gene/ | ensembl.gene | | | | | Entity |
20 | http://identifiers.org/clinicalsignificance/ | clinicalsignificance | | | | | Entity |
21 | http://identifiers.org/ensembl.translation/ | ensembl.translation | | | | | Entity |
22 | http://identifiers.org/mgi/ | mgi | MIR:00000037 | MGD MGI Mouse Genome Informatics | The Mouse Genome Database (MGD) project includes data on gene characterization, nomenclature, mapping, gene homologies among mammals, sequence links, phenotypes, allelic variants and mutants, and strain data. | ^MGI:\d+$ | Entity |
23 | http://identifiers.org/pdb/ | pdb | MIR:00000020 | PDB | The Protein Data Bank is the single worldwide archive of structural data of biological macromolecules. | ^[0-9][A-Za-z0-9]{3}$ | Entity |
24 | http://identifiers.org/hgvs/ | hgvs | | | | | Entity |
25 | http://identifiers.org/ccds/ | ccds | MIR:00000375 | CCDS | The Consensus CDS (CCDS) project is a collaborative effort to identify a core set of human and mouse protein coding regions that are consistently annotated and of high quality. The CCDS set is calculated following coordinated whole genome annotation updates carried out by the NCBI, WTSI, and Ensembl. The long term goal is to support convergence towards a standard set of gene annotations. | ^CCDS\d+\.\d+$ | Entity |
26 | http://identifiers.org/kegg.compound/ | kegg.compound | MIR:00000013 | KEGG | KEGG compound contains our knowledge on the universe of chemical substances that are relevant to life. | ^C\d+$ | Entity |
27 | http://identifiers.org/biocarta.pathway/ | biocarta.pathway | MIR:00000421 | | BioCarta is a supplier and distributor of characterized reagents and assays for biopharmaceutical and academic research. It catalogs community produced online maps depicting molecular relationships from areas of active research, generating classical pathways as well as suggestions for new pathways. This collection references pathway maps. | ^([hm]\_)?\w+Pathway$ | Entity |
28 | http://identifiers.org/clinvar/ | clinvar | MIR:00000596 | | ClinVar archives reports of relationships among medically important variants and phenotypes. It records human variation, interpretations of the relationship of specific variations to human health, and supporting evidence for each interpretation. Each ClinVar record (RCV identifier) represents an aggregated view of interpretations of the same variation and condition from one or more submitters. Submissions for individual variation/phenotype combinations (SCV identifier) are also collected and made available separately. This collection references the Variant identifier. | ^\d+$ | Entity |
29 | http://identifiers.org/wikipathways/ | wikipathways | MIR:00000076 | | WikiPathways is a resource providing an open and public collection of pathway maps created and curated by the community in a Wiki like style. All content is under the Creative Commons Attribution 3.0 Unported license. | WP\d{1,5}(\_r\d+)?$ | Entity |
30 | http://identifiers.org/rxcui/ | rxnorm | | | | | Entity |
31 | http://identifiers.org/efo/ | efo | MIR:00000391 | EFO | The Experimental Factor Ontology (EFO) provides a systematic description of many experimental variables available in EBI databases. It combines parts of several biological ontologies, such as anatomy, disease and chemical compounds. The scope of EFO is to support the annotation, analysis and visualization of data handled by the EBI Functional Genomics Team. | ^\d{7}$ | Entity |
32 | http://identifiers.org/pubchem.compound/ | pubchem.compound | MIR:00000034 | PubChem Compound PubChem CID | PubChem provides information on the biological activities of small molecules. It is a component of NIH's Molecular Libraries Roadmap Initiative. PubChem Compound archives chemical structures and records. | ^\d+$ | Entity |
33 | http://identifiers.org/unii/ | unii | MIR:00000531 | Substance Registration System Unique ingredient identifier | The purpose of the joint FDA/USP Substance Registration System (SRS) is to support health information technology initiatives by generating unique ingredient identifiers (UNIIs) for substances in drugs, biologics, foods, and devices. The UNII is a non-proprietary, free, unique, unambiguous, non semantic, alphanumeric identifier based on a substance's molecular structure and/or descriptive information. | ^[A-Z0-9]+$ | Entity |
34 | http://identifiers.org/chembl/ | chembl | | | | | Entity |
35 | http://identifiers.org/iuphar.ligand/ | iuphar.ligand | MIR:00000457 | | The IUPHAR Compendium details the molecular, biophysical and pharmacological properties of identified mammalian sodium, calcium and potassium channels, as well as the related cyclic nucleotide-modulated ion channels and the recently described transient receptor potential channels. It includes information on nomenclature systems, and on inter and intra-species molecular structure variation. This collection references ligands. | ^\d+$ | Entity |
36 | http://identifiers.org/ncbigene/ | ncbigene | MIR:00000069 | Entrez Gene | Entrez Gene is the NCBI's database for gene-specific information, focusing on completely sequenced genomes, those with an active research community to contribute gene-specific information, or those that are scheduled for intense sequence analysis. | ^\d+$ | Entity |
37 | http://identifiers.org/chebi/ | chebi | | | | | Entity |
38 | http://identifiers.org/zfin/ | zfin | | | | | Entity |
39 | http://identifiers.org/omim/ | omim | | | Online Mendelian Inheritance in Man is a catalog of human genes and genetic disorders. | ^[*#+%^]?\d{6}$ | Entity |
40 | http://identifiers.org/inchikey/ | inchikey | MIR:00000387 | hashed InChI | The IUPAC International Chemical Identifier (InChI, see MIR:00000383) is an identifier for chemical substances, and is derived solely from a structural representation of that substance. Since these can be quite unwieldy, particularly for web use, the InChIKey was developed. These are of a fixed length (25 character) and were created as a condensed, more web friendly, digital representation of the InChI. | ^[A-Z]{14}\-[A-Z]{10}(\-[A-Z])? | Entity |
41 | http://identifiers.org/doid/ | do | MIR:00000233 | DO | The Disease Ontology has been developed as a standardized ontology for human disease with the purpose of providing the biomedical community with consistent, reusable and sustainable descriptions of human disease terms, phenotype characteristics and related medical vocabulary disease concepts. | ^DOID\:\d+$ | Entity |
42 | http://identifiers.org/taxonomy/ | taxonomy | MIR:00000006 | NEWT NCBI taxonomy | The taxonomy contains the relationships between all living forms for which nucleic acid or protein sequence have been determined. | ^\d+$ | Entity |
43 | http://identifiers.org/orphanet/ | orphanet | MIR:00000220 | Orpha | Orphanet is a reference portal for information on rare diseases and orphan drugs. Its aim is to help improve the diagnosis, care and treatment of patients with rare diseases. | ^\d+$ | Entity |
44 | http://identifiers.org/pharmgkb.pathways/ | pharmgkb.pathway | MIR:00000089 | | The PharmGKB database is a central repository for genetic, genomic, molecular and cellular phenotype data and clinical information about people who have participated in pharmacogenomics research studies. The data includes, but is not limited to, clinical and basic pharmacokinetic and pharmacogenomic research in the cardiovascular, pulmonary, cancer, pathways, metabolic and transporter domains. PharmGKB Pathways are drug centric, gene based, interactive pathways which focus on candidate genes and gene groups and associated genotype and phenotype data of relevance for pharmacogenetic and pharmacogenomic studies. | ^PA\d+$ | Entity |
45 | http://biothings.io/terms/drugname/ | drugname | | | | | Entity |
46 | http://identifiers.org/mesh/ | mesh | MIR:00000560 | Medical Subject Headings | MeSH (Medical Subject Headings) is the National Library of Medicine's controlled vocabulary thesaurus. It consists of sets of terms naming descriptors in a hierarchical structure that permits searching at various levels of specificity. This thesaurus is used by NLM for indexing articles from biomedical journals, cataloguing of books, documents, etc. | ^(C|D)\d{6}$ | Entity |
47 | http://identifiers.org/ncit/ | nci | MIR:00000139 | NCI thesaurus | NCI Thesaurus (NCIt) provides reference terminology covering vocabulary for clinical care, translational and basic research, and public information and administrative activities, providing a stable and unique identification code. | ^C\d+$ | Entity |
48 | http://identifiers.org/snomedct/ | snomed | MIR:00000269 | | SNOMED CT (Systematized Nomenclature of Medicine -- Clinical Terms) is a systematically organized computer processable collection of medical terminology covering most areas of clinical information such as diseases, findings, procedures, microorganisms, pharmaceuticals, etc. | ^(\w+)?\d+$ | Entity |
49 | http://identifiers.org/umls/ | umls | MIR:00000559 | Unified Medical Language System | The Unified Medical Language System is a repository of biomedical vocabularies. Vocabularies integrated in the UMLS Metathesaurus include the NCBI taxonomy, Gene Ontology, the Medical Subject Headings (MeSH), OMIM and the Digital Anatomist Symbolic Knowledge Base. UMLS concepts are not only inter-related, but may also be linked to external resources such as GenBank. | ^C\d+$ | Entity |
50 | http://biothings.io/concepts/genotypes/ | Genotype Object | | | | | Object |
51 | http://biothings.io/concepts/phenotypes/ | Phenotype Object | | | | | Object |
52 | http://biothings.io/concepts/ptm/ | PTM Object | | | | | Object |
53 | http://biothings.io/concepts/molecular_processing/ | Molecular Processing Object | | | | | Object |
54 | http://biothings.io/concepts/topology/ | TOPLOGY Object | | | | | Object |
55 | http://biothings.io/concepts/clinical_evidence/ | Clinical Evidence Object | | | | | Object |
56 | http://biothings.io/concepts/disease_name/ | Disease Name | | | | | Entity |
57 | http://biothings.io/concepts/drug_interaction/ | Drug Interaction Object | | | | | Object |
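
The "Identifier pattern" column holds regular expressions, and the "URI" column holds a prefix to which an identifier can be appended. The following Python sketch shows one way these two columns might be used together. It copies five rows from the table by hand; the `REGISTRY` dictionary and the `is_valid` and `to_uri` helpers are hypothetical names chosen for illustration, and building a full URI by simple concatenation is an assumption rather than something the CSV itself specifies.

```python
import re

# (URI prefix, identifier pattern) pairs copied from the table, keyed by the
# "Recommended name" column. The dict layout is an illustrative assumption.
REGISTRY = {
    "clinicaltrial": ("http://identifiers.org/clinicaltrials/", r"^NCT\d{8}$"),
    "drugbank": ("http://identifiers.org/drugbank/", r"^DB\d{5}$"),
    "dbsnp": ("http://identifiers.org/dbsnp/", r"^rs\d+$"),
    "hp": ("http://identifiers.org/hp/", r"^HP:\d{7}$"),
    "mesh": ("http://identifiers.org/mesh/", r"^(C|D)\d{6}$"),
}


def is_valid(name: str, identifier: str) -> bool:
    """Return True if the identifier matches the pattern listed for `name`."""
    _, pattern = REGISTRY[name]
    # fullmatch requires the whole string to match, which also guards
    # against patterns that are missing a leading or trailing anchor.
    return re.fullmatch(pattern, identifier) is not None


def to_uri(name: str, identifier: str) -> str:
    """Append a validated identifier to the URI prefix listed for `name`."""
    if not is_valid(name, identifier):
        raise ValueError(f"{identifier!r} does not match the pattern for {name!r}")
    uri_prefix, _ = REGISTRY[name]
    return uri_prefix + identifier


if __name__ == "__main__":
    print(is_valid("drugbank", "DB00945"))
    print(is_valid("drugbank", "DB945"))
    print(to_uri("hp", "HP:0001631"))
```

Rows without a pattern (for example `chembl` or `hgvs`) cannot be validated this way and would need to be skipped or handled separately.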