add new functions for querying ChEMBL API, and add Vignette
jjjermiah committed Mar 15, 2024
1 parent 01e9b08 commit 48b2035
Showing 9 changed files with 422 additions and 6 deletions.
7 changes: 5 additions & 2 deletions DESCRIPTION
@@ -1,7 +1,7 @@
Package: AnnotationGx
Title: AnnotationGx: A package for building, updating and querying an
annotation database for pharmaco-genomic data
Version: 0.0.0.9080
Version: 0.0.0.9081
Authors@R: c(
person("Jermiah", "Joseph", role = c("aut", "cre"),
email = "[email protected]"),
@@ -32,7 +32,10 @@ Suggests:
covr,
readxl,
knitr,
rmarkdown
rmarkdown,
BiocStyle,
RefManageR,
sessioninfo
Config/testthat/edition: 3
Config/testthat/parallel: true
Config/testthat/start-first: watcher, parallel*
3 changes: 3 additions & 0 deletions NAMESPACE
@@ -1,8 +1,10 @@
# Generated by roxygen2: do not edit by hand

export(annotatePubchemCompound)
export(getChemblFilterTypes)
export(getChemblMechanism)
export(getChemblResourceFields)
export(getChemblResources)
export(getOncotreeMainTypes)
export(getOncotreeTumorTypes)
export(getOncotreeVersions)
@@ -14,6 +16,7 @@ export(getUnichemSources)
export(mapCID2Properties)
export(mapCell2Accession)
export(mapCompound2CID)
export(queryChemblAPI)
export(queryUnichem)
export(standardize_names)
import(BiocParallel)
53 changes: 52 additions & 1 deletion R/chembl.R
@@ -78,16 +78,39 @@

# Add the query parameters
query <- list()
query[["format"]] <- format
fld <- paste0(field, "__", filter_type)
query[[fld]] <- value
query[["format"]] <- format
url$query <- query

final_url <- httr2::url_build(url)
final_url |> .build_request()
}


#' Query the ChEMBL API
#'
#' This function builds a request to the ChEMBL API from the specified parameters, performs it, and returns the parsed JSON response.
#'
#' @param resource The resource to query in the ChEMBL API.
#' @param field The field to filter on in the ChEMBL API.
#' @param filter_type The type of filter to apply in the ChEMBL API.
#' @param value The value to filter on in the ChEMBL API.
#' @param format The format of the response (default is "json").
#'
#' @return The parsed JSON response from the ChEMBL API, as a list.
#'
#' @examples
#' queryChemblAPI("mechanism", "molecule_chembl_id", "in", "CHEMBL1413")
#'
#' @export
queryChemblAPI <- function(resource, field, filter_type, value, format = "json") {
  .build_chembl_request(resource, field, filter_type, value, format) |>
    .perform_request() |>
    .parse_resp_json()
}


#' Get ChEMBL Mechanism
#'
#' This function retrieves information about the mechanism of action for a given ChEMBL ID.
@@ -162,3 +185,31 @@ getChemblResourceFields <- function(resource) {
checkmate::assert_choice(resource, .chembl_resources())
.chembl_resource_schema(resource)[["fields"]] |> names()
}

#' Get the ChEMBL resources
#'
#' This function retrieves the names of the ChEMBL resources that can be queried.
#'
#' @return A character vector of ChEMBL resource names.
#'
#' @examples
#' getChemblResources()
#'
#' @export
getChemblResources <- function() {
  .chembl_resources()
}

#' Get the ChEMBL filter types
#'
#' This function retrieves the filter types that can be used in a ChEMBL query.
#'
#' @return A character vector of ChEMBL filter types.
#'
#' @examples
#' getChemblFilterTypes()
#'
#' @export
getChemblFilterTypes <- function() {
  .chembl_filter_types()
}
18 changes: 18 additions & 0 deletions man/getChemblFilterTypes.Rd


18 changes: 18 additions & 0 deletions man/getChemblResources.Rd


29 changes: 29 additions & 0 deletions man/queryChemblAPI.Rd


44 changes: 41 additions & 3 deletions tests/testthat/test_chembl.R
@@ -11,10 +11,10 @@ test_that("build_chembl_request constructs the correct URL", {
format <- "json"

# Call the function
url <- .build_chembl_request(resource, field, filter_type, value, format)
url <- AnnotationGx:::.build_chembl_request(resource, field, filter_type, value, format)

# Check the constructed URL
expected_url <- "https://www.ebi.ac.uk/chembl/api/data/target?format=json&target_chembl_id__exact=CHEMBL2144069"
expected_url <- "https://www.ebi.ac.uk/chembl/api/data/target?target_chembl_id__exact=CHEMBL2144069&format=json"
expect_equal(url$url, expected_url)
})

@@ -35,7 +35,7 @@ test_that("getChemblMechanism works", {

url <- getChemblMechanism(chembl_id, returnURL = TRUE)
expect_list(url)
expect_equal(url[[1]], "https://www.ebi.ac.uk/chembl/api/data/mechanism?format=json&molecule_chembl_id__in=CHEMBL1413")
expect_equal(url[[1]], "https://www.ebi.ac.uk/chembl/api/data/mechanism?molecule_chembl_id__in=CHEMBL1413&format=json")
})


@@ -55,3 +55,41 @@ test_that("getChemblResourceFields works", {
"site_id", "target_chembl_id", "variant_sequence"
))
})

test_that("queryChemblAPI constructs the correct URL and returns parsed JSON response", {
# Set up test data
resource <- "mechanism"
field <- "mechanism_of_action"
filter_type <- "icontains"
value <- "Muscarinic acetylcholine receptor"
format <- "json"
expected_url <- "https://www.ebi.ac.uk/chembl/api/data/mechanism?mechanism_of_action__icontains=Muscarinic%20acetylcholine%20receptor&format=json"

request <- AnnotationGx:::.build_chembl_request(resource, field, filter_type, value, format)
expect_equal(request$url, expected_url)

# Call the function
response <- queryChemblAPI(resource, field, filter_type, value, format)

expect_class(response, "list")

expect_length(response, 2)
})

test_that("getChemblFilterTypes works", {
result <- getChemblFilterTypes()

expect_class(result, "character")
expect_length(result, 19)

expect_true("in" %in% result)
})

test_that("getChemblResources works", {
result <- getChemblResources()

expect_class(result, "character")
expect_length(result, 32)

expect_true("activity" %in% result)
})
134 changes: 134 additions & 0 deletions vignettes/ChEMBL.Rmd
@@ -0,0 +1,134 @@
---
title: "Querying ChEMBL Database"
author:
  - name: Jermiah Joseph, Shahzada Muhammad Shameel Farooq, and Christopher Eeles
output:
  BiocStyle::html_document:
    self_contained: yes
    toc: true
    toc_float: true
    toc_depth: 2
    code_folding: show
date: "`r doc_date()`"
package: "`r pkg_ver('AnnotationGx')`"
vignette: >
  %\VignetteIndexEntry{Querying ChEMBL Database}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---


```{r setup, include = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>",
crop = NULL ## Related to https://stat.ethz.ch/pipermail/bioc-devel/2020-April/016656.html
)
```


# Introduction to ChEMBL API

**WARNING: This vignette is a work in progress. If you have questions or would
like to see more features, please open an issue at
[bhklab/AnnotationGx](https://github.com/bhklab/AnnotationGx)**

The ChEMBL database contains information on bioactive drug-like small molecules.
This includes 2-D structures, calculated properties (e.g. logP, molecular
weight, Lipinski parameters), and abstracted bioactivities (e.g. binding
constants and ADMET data). The data is curated from the primary scientific literature.
The ChEMBL API makes this data available for programmatic retrieval. We can use
the API to look up the ChEMBL ID of a compound, retrieve all mechanisms of action
for a molecule, and query the `compound_record` and `molecule` resources of the
ChEMBL database.


## Setup

```{r setup agx}
library(AnnotationGx)
```



## Retrieve molecule mechanisms of action from ChEMBL {#chembl-mechanisms}

Given a ChEMBL ID, we can retrieve the molecule mechanisms of action from the
ChEMBL database using the `getChemblMechanism()` function.

**NOTE: This is a specialized function that queries the API for the *mechanism* resource only.
To query other resources, please see the [Custom Queries](#custom-queries) section.**



``` {r run one query}
mechs <- getChemblMechanism("CHEMBL1413")
mechs
```

In the above example, multiple mechanisms of action are returned.


## Custom Queries {#custom-queries}

The ChEMBL API allows for a wide range of queries. So far we provide one specialized
function, `getChemblMechanism()`, but are open to adding more. If you have a use case that
would benefit from a specialized function, please open an issue at [bhklab/AnnotationGx](https://github.com/bhklab/AnnotationGx).

A query to the API has the following format:
```
https://www.ebi.ac.uk/chembl/api/data/[resource]?[field]__[filter_type]=[value]&format=[format]
```
More information can be found in the [API Documentation](https://chembl.gitbook.io/chembl-interface-documentation/web-services/chembl-data-web-services).

In summary, the requirements for a query are:

1. The `resource` to be queried
2. The resource `field` to be queried
3. The `filter_type` to be used
4. The `value` to be used for the filter
5. (optional) The `format` of the returned data (default is JSON)

For example, the query for the example in [the above section](#chembl-mechanisms) would be
`https://www.ebi.ac.uk/chembl/api/data/mechanism?molecule_chembl_id__in=CHEMBL1413&format=json`
where:

- `resource` is "mechanism"
- `field` is "molecule_chembl_id"
- `filter_type` is "in"
- `value` is "CHEMBL1413"
- `format` is "json"
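
To make the mapping from the five components to a URL concrete, here is a minimal sketch in base R. The helper `build_chembl_url()` is hypothetical and is not part of AnnotationGx (the package builds its requests internally with `httr2`); it only illustrates how `field` and `filter_type` are joined with a double underscore and how the value is percent-encoded.

```r
# Hypothetical helper (NOT part of AnnotationGx): assemble a ChEMBL query URL
# from its five components using base R only.
build_chembl_url <- function(resource, field, filter_type, value, format = "json") {
  base <- "https://www.ebi.ac.uk/chembl/api/data"
  # The filter key joins the field and the filter type with a double underscore
  key <- paste0(field, "__", filter_type)
  # Percent-encode the value so that spaces and reserved characters are URL-safe
  encoded_value <- utils::URLencode(value, reserved = TRUE)
  sprintf("%s/%s?%s=%s&format=%s", base, resource, key, encoded_value, format)
}

# Reproduces the mechanism query described above
build_chembl_url("mechanism", "molecule_chembl_id", "in", "CHEMBL1413")

# A value containing spaces is encoded with %20
build_chembl_url("mechanism", "mechanism_of_action", "icontains",
                 "Muscarinic acetylcholine receptor")
```

The block uses a plain `r` fence rather than an evaluated chunk, so it is shown but not run when the vignette is built.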

These parameters can be used in the `queryChemblAPI(resource, field, filter_type, value, format = "json")`
function to query the ChEMBL API.

**NOTE: Unlike `getChemblMechanism()`, which returns a `data.table`, `queryChemblAPI()`
returns the raw, unformatted data.**

``` {r run custom query}
queryChemblAPI("mechanism", "molecule_chembl_id", "in", "CHEMBL1413")
```
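
Because the result is a raw nested list, some reshaping is usually needed before analysis. The sketch below shows one possible way to do this in base R on a mock object, so it runs without network access. The element names `mechanisms` and `page_meta` and all values in the mock are assumptions made for illustration; inspect the real return value with `str()` before relying on any of them.

```r
# Mock standing in for the parsed list returned by queryChemblAPI().
# ASSUMPTION: the element names ("mechanisms", "page_meta") and all values
# below are placeholders, not real ChEMBL output.
raw <- list(
  mechanisms = list(
    list(molecule_chembl_id = "CHEMBL1413", mechanism_of_action = "placeholder A"),
    list(molecule_chembl_id = "CHEMBL1413", mechanism_of_action = "placeholder B")
  ),
  page_meta = list(total_count = 2L)
)

# Look at the top-level structure without printing every record
str(raw, max.level = 1)

# Turn each record (a named list) into a one-row data.frame and stack the rows.
# Real records may contain NULL or nested fields, which need extra handling.
records <- raw[["mechanisms"]]
mech_df <- do.call(
  rbind,
  lapply(records, function(rec) as.data.frame(rec, stringsAsFactors = FALSE))
)
mech_df
```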

The `getChemblResources()` function returns a list of possible
resources that can be queried:

``` {r query resources}
getChemblResources()
```


The `getChemblResourceFields(resource)` function returns a list of possible
fields that can be queried for a given resource:
``` {r query resource fields}
getChemblResourceFields("mechanism")
```


The `getChemblFilterTypes()` function returns a list of possible filter types.

``` {r query filter types}
getChemblFilterTypes()
```
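
The three helper functions above can be combined to check a query before it is sent. The following is a minimal sketch under stated assumptions: `check_choice()` is a hypothetical helper that is not part of AnnotationGx, and the three vectors are small hand-written subsets standing in for the output of `getChemblResources()`, `getChemblResourceFields("mechanism")`, and `getChemblFilterTypes()`, so that the example runs offline.

```r
# Hypothetical helper (NOT part of AnnotationGx): stop with a readable message
# if `x` is not a single string found in `choices`.
check_choice <- function(x, choices, what) {
  if (!is.character(x) || length(x) != 1L || !(x %in% choices)) {
    stop(sprintf("'%s' is not a valid %s", paste(x, collapse = ", "), what),
         call. = FALSE)
  }
  invisible(TRUE)
}

# ASSUMPTION: illustrative subsets only; in practice use the package functions.
resources    <- c("activity", "mechanism", "target")
fields       <- c("mechanism_of_action", "molecule_chembl_id", "target_chembl_id")
filter_types <- c("exact", "icontains", "in")

# A valid combination passes silently
check_choice("mechanism", resources, "resource")
check_choice("molecule_chembl_id", fields, "field")
check_choice("in", filter_types, "filter type")

# An invalid filter type is caught before any request is made
msg <- tryCatch(
  check_choice("contains_any", filter_types, "filter type"),
  error = function(e) conditionMessage(e)
)
msg
```

Validating locally in this way gives a clearer error than a failed HTTP request would.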