Legacy server down #70

Open
nursatranscriptomine opened this issue Jan 10, 2023 · 4 comments

@nursatranscriptomine

Hi

We use the legacy Monarch server for data download features that are not available (that we know of) in the new version.

We get the error below when attempting to access legacy.monarchinitiative.org.

Is the legacy server still available, or has it been retired? If the latter, can you point us to any download functionality in the new version?

Thanks

Neil McKenna

Error: Server Error
The server encountered a temporary error and could not complete your request.
Please try again in 30 seconds.

@kevinschaper
Member

Hi Neil,

I restarted the service and it appears to have come back.

We don't have a shutdown date yet, but we do plan on shutting down the legacy service. We're currently developing what is essentially v3 of Monarch, where v1 is the legacy service and v2 is what's running now. Can you let me know your use case, so that we can look at supporting it in the new API?

@nursatranscriptomine
Author

nursatranscriptomine commented Jan 10, 2023 via email

@kevinschaper
Member

kevinschaper commented Jan 12, 2023

Interestingly, following that tsv link essentially brings you from the legacy v1 architecture to the public Solr server that's part of the v2 architecture:

https://solr.monarchinitiative.org/solr/golr/select?defType=edismax&qt=standard&indent=on&wt=csv&rows=100000&start=0&fl=subject,subject_label,subject_taxon,subject_taxon_label,object,object_label,relation,relation_label,evidence,evidence_label,source,is_defined_by,qualifier&facet=true&facet.mincount=1&facet.sort=count&json.nl=arrarr&facet.limit=25&facet.method=enum&csv.encapsulator=%22&csv.separator=%09&csv.header=true&csv.mv.separator=%7C&fq=subject_category:%22gene%22&fq=object_closure:%22MP:0008782%22&facet.field=subject_taxon_label&q=*:*

It's probably a terrible long-term idea to replace what you have with this URL (substituting different values for MP:0008782), but in a pinch it should work for the time being, even if the legacy server freezes again.
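If you do lean on the Solr URL as a stopgap, a small helper keeps the phenotype substitution in one place. This is only a sketch: `gene_tsv_url` and the constant names are made up for illustration, while the endpoint and every parameter are copied from the URL above. The encoded output won't be character-for-character identical to that URL (for example, `:` becomes `%3A`), but Solr should decode it to the same values.

```python
from urllib.parse import urlencode

# Endpoint and parameters copied from the Solr URL quoted above.
SOLR_SELECT = "https://solr.monarchinitiative.org/solr/golr/select"

FIELDS = (
    "subject,subject_label,subject_taxon,subject_taxon_label,"
    "object,object_label,relation,relation_label,"
    "evidence,evidence_label,source,is_defined_by,qualifier"
)


def gene_tsv_url(phenotype_id: str, rows: int = 100000) -> str:
    """Build the tab-separated gene download URL for one phenotype ID.

    Hypothetical helper: only the phenotype ID and row count vary.
    """
    params = [
        ("defType", "edismax"),
        ("qt", "standard"),
        ("indent", "on"),
        ("wt", "csv"),
        ("rows", str(rows)),
        ("start", "0"),
        ("fl", FIELDS),
        ("facet", "true"),
        ("facet.mincount", "1"),
        ("facet.sort", "count"),
        ("json.nl", "arrarr"),
        ("facet.limit", "25"),
        ("facet.method", "enum"),
        ("csv.encapsulator", '"'),
        ("csv.separator", "\t"),  # tab, so the "csv" output is really TSV
        ("csv.header", "true"),
        ("csv.mv.separator", "|"),
        ("fq", 'subject_category:"gene"'),
        ("fq", f'object_closure:"{phenotype_id}"'),
        ("facet.field", "subject_taxon_label"),
        ("q", "*:*"),
    ]
    return f"{SOLR_SELECT}?{urlencode(params)}"


print(gene_tsv_url("MP:0008782"))
```

The printed URL can be handed to curl or a browser just like the hand-written one.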

From our existing production API, you can get JSON that contains the gene labels.

If you happen to have curl and jq installed, the command below extracts the gene symbols as what is essentially a single-column, quoted TSV. The URL also works on its own, of course, but it returns JSON. Is TSV output a critical need for you?

curl -sX GET "https://api.monarchinitiative.org/api/bioentity/phenotype/MP:0008782/genes?rows=10000" -H "accept: application/json" | jq '.associations[].object.label'
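If curl and jq aren't handy, roughly the same thing can be done with the Python standard library. A minimal sketch, assuming the response has the shape implied by the jq filter (`associations[].object.label`); the function names and the trimmed sample payload are hypothetical and only there to show the extraction step.

```python
import json
import urllib.request

# Endpoint taken from the curl command above.
API_BASE = "https://api.monarchinitiative.org/api/bioentity/phenotype"


def fetch_payload(phenotype_id: str, rows: int = 10000) -> dict:
    """Fetch the gene associations for a phenotype as parsed JSON."""
    request = urllib.request.Request(
        f"{API_BASE}/{phenotype_id}/genes?rows={rows}",
        headers={"accept": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def gene_labels(payload: dict) -> list[str]:
    """Equivalent of the jq filter `.associations[].object.label`."""
    return [assoc["object"]["label"] for assoc in payload.get("associations", [])]


# Hypothetical, heavily trimmed payload used only to exercise gene_labels.
sample = json.loads(
    '{"associations": [{"object": {"label": "Bcl2"}},'
    ' {"object": {"label": "Cd22"}}]}'
)
print("\n".join(gene_labels(sample)))
```

Against the live service this would be `gene_labels(fetch_payload("MP:0008782"))`. Unlike the jq output, the labels come back unquoted, one per line.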

We still have a little bit of work to do before a beta of our v3 API is up and running, but the data artifacts are available (though still subject to change!) and might come in handy.

http://data.monarchinitiative.org/monarch-kg-dev/latest/monarch-kg-denormalized-edges.tsv.gz is the file that we'll be using to populate our new Solr instance. If I download it, unzip it, and query it with q like so:

q "select subject_label from monarch-kg-denormalized-edges.tsv where object ='MP:0008782'"
myeloid cell leukemia sequence 1
nuclear factor of activated T cells, cytoplasmic, calcineurin dependent 1
signal transducer and activator of transcription 6
POU domain, class 2, associating factor 1
B cell CLL/lymphoma 11A (zinc finger protein)
telomerase RNA component
phosphatase and tensin homolog 
...
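The same filter can be run without q, and without unzipping first, by streaming the gzipped file. A rough sketch: `subject_labels` is a made-up name, and the only things it assumes about the file are the `object` and `subject_label` columns used in the q query above.

```python
import csv
import gzip
from typing import Iterator


def subject_labels(path: str, object_id: str) -> Iterator[str]:
    """Yield subject_label for every edge whose object equals object_id.

    Reads the .tsv.gz directly, one row at a time, so the whole file
    never has to be decompressed to disk or held in memory.
    """
    with gzip.open(path, "rt", encoding="utf-8", newline="") as handle:
        reader = csv.DictReader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
        for row in reader:
            if row["object"] == object_id:
                yield row["subject_label"]
```

Usage would look like `for label in subject_labels("monarch-kg-denormalized-edges.tsv.gz", "MP:0008782"): print(label)`. Since the artifact is still subject to change, the column names may need adjusting.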

It looks like it's returning gene names rather than symbols, so I think it's not ready for prime time just yet (I'd like that field to be populated with gene symbols). An alternative that also uses just file artifacts would be our SQLite database artifact, which wraps the node and edge TSV tables up in a single database file: http://data.monarchinitiative.org/monarch-kg-dev/latest/monarch-kg.db.gz

sqlite3 monarch-kg.db "select distinct nodes.symbol from nodes, edges where nodes.id = edges.subject and edges.object = 'MP:0008782' and edges.predicate = 'biolink:has_phenotype'"
Bcl2
Cd22
Ebf1
Ets1
Grb2
Blnk
Myc
Prkcd
Plcg2
Mcl1
Nfatc1
Stat6
Pou2af1
Bcl11a
Pten
Terc
Spib
Smad7
Smarcc1
Ep300
Ikbkb
Faim
Tlr2
Sh3bp2
Adgrg3
Sfn
Huwe1
Peli1
Parp14
Fam72a
Ube2n
Micu1
Fnip1
Pdap1
Mir150
Atmin
Gm614
Nfkbid
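For scripted use, the same query can be run from Python's built-in `sqlite3` module with the phenotype ID passed as a parameter. This is a sketch under the assumption that the tables and columns are exactly those named in the command above (`nodes.id`, `nodes.symbol`, `edges.subject`, `edges.object`, `edges.predicate`); `gene_symbols` is a hypothetical helper name.

```python
import sqlite3

# Same join as the sqlite3 command above, with the phenotype ID as a
# bound parameter instead of being pasted into the SQL string.
QUERY = """
SELECT DISTINCT nodes.symbol
FROM nodes
JOIN edges ON nodes.id = edges.subject
WHERE edges.object = ?
  AND edges.predicate = 'biolink:has_phenotype'
"""


def gene_symbols(conn: sqlite3.Connection, phenotype_id: str) -> list[str]:
    """Return the distinct gene symbols annotated to phenotype_id."""
    return [row[0] for row in conn.execute(QUERY, (phenotype_id,))]
```

After gunzipping the artifact, usage would be along the lines of `gene_symbols(sqlite3.connect("monarch-kg.db"), "MP:0008782")`. The row order isn't guaranteed, so sort the result if you need stable output.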

The monarch-kg-dev artifacts aren't my suggested solution right now, but they're available for a look.

Is your ideal to continue having an API endpoint that takes the phenotypic feature ID and returns the results in TSV format?

Also, we're working on a Solr Docker container with pre-populated data that anyone can run in their own stack. Is that appealing to you at all?

@nursatranscriptomine
Author

nursatranscriptomine commented Jan 12, 2023 via email
