RxRx19b is the second component of the RxRx19 dataset series released by Recursion, which shares data from a high-dimensional human cellular assay for COVID-19-associated disease. RxRx19b models the COVID-19-associated cytokine storm. For more information about RxRx19b, please visit RxRx.ai and see the associated preprint, "Functional immune mapping with deep-learning enabled phenomics applied to immunomodulatory and COVID-19 drug discovery".
RxRx19b is part of a larger set of Recursion datasets that can be found at RxRx.ai and on GitHub. For questions about this dataset and others please email [email protected].
The metadata can be found in `metadata.csv` and downloaded from here. The schema of the metadata is as follows:
Attribute | Description |
---|---|
site_id | Unique identifier of a given site |
well_id | Unique identifier of a given well |
cell_type | Cell type tested |
experiment | Experiment identifier |
plate | Plate number within the experiment |
well | Location on the plate |
site | Indication of the location in the well where image was taken (always 1 in RxRx19b) |
disease_condition | The disease condition tested in the well (`healthy`: healthy cytokine cocktail; `storm-severe`: severe cytokine storm cocktail; or blank: no cytokines) |
treatment | Compound tested in the well (if any) |
treatment_conc | Compound concentration tested (in uM) |
SMILES | Formula of tested compound (as CXSMILES/ChemAxon Extended SMILES) |
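
As an illustration of working with the schema above, here is a minimal sketch that reads the metadata and counts rows per `disease_condition`. It uses only the Python standard library. The function names `read_metadata` and `count_by_condition` are hypothetical, and the sketch assumes that `metadata.csv` has a header row using the attribute names in the table. Because the table does not say whether the no-cytokine condition is stored as an empty cell or as the literal word, empty values are reported as `blank`.

```python
import csv
from collections import Counter


def read_metadata(handle):
    """Read metadata rows from an open CSV file handle into a list of dicts.

    Assumes the first row is a header with the attribute names from the schema.
    """
    return list(csv.DictReader(handle))


def count_by_condition(rows):
    """Count rows per disease_condition.

    Assumption: an empty cell means "no cytokines", so it is counted as "blank".
    """
    return Counter(row["disease_condition"] or "blank" for row in rows)


# Example usage, assuming metadata.csv is in the working directory:
#
#     with open("metadata.csv", newline="") as handle:
#         rows = read_metadata(handle)
#     print(count_by_condition(rows))
```

Since `site` is always 1 in RxRx19b, each metadata row corresponds to one imaged well, so these counts can also be read as wells per condition.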
The images are found in `images/*` and can be downloaded from here (n.b. this is 409GB).
The image data are 2048x2048 8-bit `png` files. The image paths, such as `HUVEC-1/Plate1/AA02_s1_w3.png`, can be read as:

- Experiment Name: Cell type and experiment number (HUVEC experiment 1)
- Plate Number (1)
- Well location on plate (column AA, row 2)
- Site (1)
- Channel (3)

All six channels (`w1` - `w6`) make up a single image of a given site.
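
The path convention above can be sketched as a small parser. This is an illustrative sketch, not an official loader: the regular expression is inferred from the single example path `HUVEC-1/Plate1/AA02_s1_w3.png`, and the names `ImagePath`, `parse_image_path`, and `channel_paths` are hypothetical.

```python
import re
from typing import NamedTuple


class ImagePath(NamedTuple):
    cell_type: str
    experiment_number: int
    plate: int
    well: str
    site: int
    channel: int


# Pattern inferred from the example path HUVEC-1/Plate1/AA02_s1_w3.png;
# other paths in the dataset are assumed to follow the same layout.
_PATH_RE = re.compile(
    r"(?P<cell_type>[^/]+)-(?P<experiment>\d+)/Plate(?P<plate>\d+)/"
    r"(?P<well>[A-Z]+\d+)_s(?P<site>\d+)_w(?P<channel>\d+)\.png$"
)


def parse_image_path(path: str) -> ImagePath:
    """Split an image path into experiment, plate, well, site, and channel."""
    match = _PATH_RE.search(path)
    if match is None:
        raise ValueError(f"unrecognised image path: {path!r}")
    return ImagePath(
        cell_type=match["cell_type"],
        experiment_number=int(match["experiment"]),
        plate=int(match["plate"]),
        well=match["well"],
        site=int(match["site"]),
        channel=int(match["channel"]),
    )


def channel_paths(path: str) -> list[str]:
    """Return the paths of all six channel files (w1 to w6) for the same site."""
    return [re.sub(r"_w\d+\.png$", f"_w{c}.png", path) for c in range(1, 7)]
```

Given any one channel file, `channel_paths` lists the six files that together form the full image of that site.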
Physical resolution: 0.65 micron/pixel.
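
To relate pixel measurements to physical size, the stated resolution can be applied directly. The helper names below are hypothetical; the two constants come from the figures given above.

```python
MICRONS_PER_PIXEL = 0.65  # physical resolution stated above
IMAGE_SIDE_PIXELS = 2048  # image width and height stated above


def pixels_to_microns(pixels: float) -> float:
    """Convert a length in pixels to microns."""
    return pixels * MICRONS_PER_PIXEL


def microns_to_pixels(microns: float) -> float:
    """Convert a length in microns to pixels."""
    return microns / MICRONS_PER_PIXEL
```

For example, one full image side spans 2048 x 0.65 = 1331.2 microns, or roughly 1.33 mm.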
The deep learning embeddings can be found in `embeddings.csv` and downloaded from here (n.b. this is 41MB).
Each row in the csv has a `site_id` as described in the metadata schema. The remaining 128 columns are the embedding for that respective site.
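
Because the embeddings and the metadata share `site_id`, the two files can be joined on that key. The following is a minimal standard-library sketch. It assumes `embeddings.csv` has a header row whose key column is named `site_id`; the names of the 128 feature columns are not specified here, so the code treats every other column as a feature. The function names `read_embeddings` and `attach_metadata` are hypothetical.

```python
import csv


def read_embeddings(handle):
    """Map each site_id to its embedding as a list of floats.

    Assumes a header row with a "site_id" column; every other column
    is treated as one embedding dimension, in file order.
    """
    reader = csv.DictReader(handle)
    feature_columns = [name for name in reader.fieldnames if name != "site_id"]
    return {
        row["site_id"]: [float(row[name]) for name in feature_columns]
        for row in reader
    }


def attach_metadata(embeddings, metadata_rows):
    """Pair each embedding with its metadata row, matching on site_id.

    Sites without a metadata row are skipped.
    """
    by_site = {row["site_id"]: row for row in metadata_rows}
    return {
        site_id: (by_site[site_id], vector)
        for site_id, vector in embeddings.items()
        if site_id in by_site
    }
```

With the real file, each vector returned by `read_embeddings` should have length 128, which is a quick sanity check after loading.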
- August 2020: initial release
This work is licensed under the Creative Commons Attribution 4.0 International License. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/ or send a letter to Creative Commons, PO Box 1866, Mountain View, CA 94042, USA.