WikiPathways is a biological pathway database that describes the interactions between biochemical entities in biological processes [Q21092742,Q28090976,Q24082733,Q42896569]. It can be downloaded and used in various formats, one of which is the Resource Description Framework (RDF) [Q26261238].
The WikiPathways SPARQL endpoint can be found at http://sparql.wikipathways.org/. SPARQL allows you to query much of the WikiPathways content in a machine-readable way, which has been used, for example, in the Open PHACTS project [Q27061937,Q54404976].
This book discusses how SPARQL can be used to extract information, using numerous example queries.
The following query provides some information about what is currently loaded in the public SPARQL endpoint at http://sparql.wikipathways.org:
metadata
Which gives as output:
metadata
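Queries like this one can also be sent to the endpoint from a script. The following is a minimal Python sketch, not the book's own tooling. It assumes that the query service answers at the `/sparql` path of the host named above, that it returns the standard SPARQL JSON results format, and that the loaded data describes itself with VoID and Dublin Core terms. The names `ENDPOINT`, `METADATA_QUERY`, `build_request`, `rows_from_json`, and `run_query` are all made up for this illustration.

```python
import json
import urllib.parse
import urllib.request

# Assumption: the query service lives at /sparql under the host from the text.
ENDPOINT = "http://sparql.wikipathways.org/sparql"

# Assumption: datasets are described with void:Dataset and dcterms:title.
METADATA_QUERY = """
PREFIX void: <http://rdfs.org/ns/void#>
PREFIX dcterms: <http://purl.org/dc/terms/>
SELECT DISTINCT ?dataset ?title WHERE {
  ?dataset a void:Dataset ;
           dcterms:title ?title .
}
"""


def build_request(query, endpoint=ENDPOINT):
    """Build a SPARQL protocol GET request that asks for JSON results."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    return urllib.request.Request(
        url, headers={"Accept": "application/sparql-results+json"}
    )


def rows_from_json(document):
    """Flatten SPARQL JSON results into a list of {variable: value} dicts."""
    return [
        {name: cell["value"] for name, cell in binding.items()}
        for binding in document["results"]["bindings"]
    ]


def run_query(query, endpoint=ENDPOINT):
    """Send the query over the network and return the flattened rows."""
    request = build_request(query, endpoint)
    with urllib.request.urlopen(request, timeout=30) as response:
        return rows_from_json(json.load(response))
```

Calling `run_query(METADATA_QUERY)` should then return one dictionary per result row, provided the assumptions above hold for the public endpoint.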
To give some idea of the content of the SPARQL endpoint, this section gives some overall statistics.
We can list the number of pathways for each species available in WikiPathways with this query:
pathwayCountBySpecies
It shows us that there is a strong bias towards human pathways:
pathwayCountBySpecies
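One way such a per-species count could be phrased, and the bias quantified, is sketched below in Python. The query text is an approximation rather than the book's exact query: it assumes the `wp:` namespace shown, that pathways are typed as `wp:Pathway`, and that the species name is stored in `wp:organismName`. The helper `species_shares` is hypothetical and expects rows that have already been flattened to plain dictionaries.

```python
# Assumptions: the wp: namespace below, the wp:Pathway type, and the
# wp:organismName property are taken to match the loaded RDF.
PATHWAY_COUNT_QUERY = """
PREFIX wp: <http://vocabularies.wikipathways.org/wp#>
SELECT ?species (COUNT(DISTINCT ?pathway) AS ?pathwayCount) WHERE {
  ?pathway a wp:Pathway ;
           wp:organismName ?species .
}
GROUP BY ?species
ORDER BY DESC(?pathwayCount)
"""


def species_shares(rows):
    """Return {species: fraction of all pathways} from flattened result rows.

    Each row is a dict with 'species' and 'pathwayCount' keys. Counts arrive
    as strings in SPARQL JSON results, so they are converted to int here.
    """
    counts = {row["species"]: int(row["pathwayCount"]) for row in rows}
    total = sum(counts.values())
    if total == 0:
        return {}
    return {species: count / total for species, count in counts.items()}
```

Expressing the counts as fractions makes it easy to see how much of the database a single species accounts for.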
Counting metabolites is tricky, as metabolites that are biologically the same (e.g. different charge states) can have different identifiers. A further complication is that not all metabolites in WikiPathways have stereochemistry defined, for example because it is biologically obvious, as for amino acids. But we can count the number of distinct Wikidata identifiers to get a reasonable estimate:
metaboliteCountBySpecies
This tells us:
metaboliteCountBySpecies
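The idea of counting distinct Wikidata identifiers rather than metabolite nodes can be sketched as follows. This is an illustration under assumptions, not the book's query: it assumes metabolites are typed as `wp:Metabolite`, linked to their pathway with `dcterms:isPartOf`, and mapped to Wikidata through a `wp:bdbWikidata` property. The Python helper `distinct_counts` is hypothetical and shows the same deduplication done locally.

```python
from collections import defaultdict

# Assumptions: wp:Metabolite, wp:bdbWikidata, dcterms:isPartOf and
# wp:organismName are taken to match the vocabulary of the loaded RDF.
METABOLITE_COUNT_QUERY = """
PREFIX wp: <http://vocabularies.wikipathways.org/wp#>
PREFIX dcterms: <http://purl.org/dc/terms/>
SELECT ?species (COUNT(DISTINCT ?wikidata) AS ?metaboliteCount) WHERE {
  ?metabolite a wp:Metabolite ;
              wp:bdbWikidata ?wikidata ;
              dcterms:isPartOf ?pathway .
  ?pathway a wp:Pathway ;
           wp:organismName ?species .
}
GROUP BY ?species
ORDER BY DESC(?metaboliteCount)
"""


def distinct_counts(pairs):
    """Count distinct identifiers per species from (species, identifier) pairs.

    A metabolite drawn in many pathways yields many pairs with the same
    identifier; the set keeps each identifier once per species.
    """
    seen = defaultdict(set)
    for species, identifier in pairs:
        seen[species].add(identifier)
    return {species: len(identifiers) for species, identifiers in seen.items()}
```

The `COUNT(DISTINCT ...)` in the query plays the same role as the set in `distinct_counts`: a compound that occurs in several pathways of one species is counted once.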