Merge pull request #10 from ncihtan/cds-access
Update CDS Access Instructions (DRS Manifest Method)
alexeliotlash authored Apr 15, 2024
2 parents 0bf6adb + acc8a98 commit ed65007
Showing 13 changed files with 67 additions and 44 deletions.
31 changes: 29 additions & 2 deletions access_controlled/CDS_access.md
@@ -10,7 +10,17 @@ order: 997

The [CDS Portal](https://dataservice.datacommons.cancer.gov/), within NCI's Cancer Research Data Commons (CRDC), provides an interface to filter and select data from a variety of NCI programs, including controlled-access, primary sequence data from the Human Tumor Atlas Network (HTAN).

In order to access these HTAN data within the [CDS Portal](https://dataservice.datacommons.cancer.gov/), navigate to the portal in a web browser and click on the **Explore CDS Data** button on the landing page.
# DRS Manifest Files

To access data via CDS, first generate a CDS Data Repository Service (DRS) manifest containing the files you would like to obtain. DRS manifests are CSV files and require at minimum the **name** and **drs_uri** of each file of interest. For HTAN data, DRS manifests can be generated in one of three ways:

1. CDS Portal
2. HTAN Data Portal
3. Google BigQuery (Coming Soon!)
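
The column requirement above can be checked before a manifest is used anywhere else. The following is a minimal sketch, assuming only what the text states: the manifest is a CSV file whose header must include `name` and `drs_uri`. The function name `missing_manifest_columns` is hypothetical and not part of any CDS or SB-CGC tooling.

```python
import csv

# Columns the text above describes as the minimum for a DRS manifest.
REQUIRED_COLUMNS = {"name", "drs_uri"}


def missing_manifest_columns(path):
    """Return the set of required columns absent from the manifest header."""
    # utf-8-sig tolerates a byte-order mark left by spreadsheet exports
    # and reads plain UTF-8 files unchanged.
    with open(path, newline="", encoding="utf-8-sig") as handle:
        reader = csv.DictReader(handle)
        header = set(reader.fieldnames or [])
    return REQUIRED_COLUMNS - header
```

An empty set means both required columns are present; any other result names the columns to add before importing.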

## 1. Generating a Manifest File from the CDS Portal

In order to access HTAN imaging data within the [CDS Portal](https://dataservice.datacommons.cancer.gov/), navigate to the portal in a web browser and click on the **Explore CDS Data** button on the landing page.

<p align="center"><img width="364" alt="1" src="https://github.com/ncihtan/htan_missing_manual/assets/123744798/40aff1af-a58f-49dc-9253-6ee5e67ef419">
</p>
@@ -47,5 +57,22 @@ Clicking on the cart icon, will bring up a list of the selected files. Click on

&nbsp;

Once this file manifest is downloaded, upload it to your [Seven Bridges Cancer Genomics Cloud](https://www.cancergenomicscloud.org/) account to download, or otherwise compute on, these data.

## 2. Generating a Manifest File from the HTAN Data Portal

From the [HTAN Data Portal](https://humantumoratlas.org/), click **CDS/SB-CGC (dbGaP)** under the **Data Access** filter.

![HTAN Portal: Accessing Genomic Data in CDS](../img/cds_genomics1.png)

Navigate to the **Files** tab, check the box next to **Filename** in the upper left, and then click **Download selected files**.
![HTAN Portal: Selecting Genomic Files](../img/cds_genomics2.png)

Click **Download Manifest**, which will download a local file called `cds_manifest.csv`.
![HTAN Portal: Download DRS Manifest](../img/cds_genomics3.png)
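
If only some of the files in the downloaded manifest are needed, the manifest can be trimmed before import. The sketch below is an illustration under stated assumptions, not an HTAN-provided tool: it assumes the manifest has a `name` column (as described above) and keeps rows whose name ends with a given suffix. The function name `subset_manifest` and the output filename are hypothetical.

```python
import csv


def subset_manifest(src_path, dest_path, suffix):
    """Copy rows whose `name` ends with `suffix` to a new manifest.

    Returns the number of rows kept. All original columns are preserved.
    """
    with open(src_path, newline="", encoding="utf-8-sig") as src:
        reader = csv.DictReader(src)
        fieldnames = reader.fieldnames or []
        rows = [row for row in reader if (row.get("name") or "").endswith(suffix)]

    with open(dest_path, "w", newline="", encoding="utf-8") as dest:
        # extrasaction="ignore" skips stray cells beyond the header width.
        writer = csv.DictWriter(dest, fieldnames=fieldnames, extrasaction="ignore")
        writer.writeheader()
        writer.writerows(rows)
    return len(rows)
```

For example, `subset_manifest("cds_manifest.csv", "bam_only.csv", ".bam")` would write a manifest containing only the BAM entries, assuming such entries exist in the download.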


## 3. Generating a Manifest File from Google BigQuery (Coming Soon!)


# Accessing Data
Once you have your manifest, follow the instructions on SB-CGC's [Import from a DRS server](https://docs.cancergenomicscloud.org/docs/import-from-a-drs-server#import-from-a-manifest-file) documentation page to import data from a manifest file.
23 changes: 0 additions & 23 deletions access_controlled/SB-CGC_access.md

This file was deleted.

Binary file added img/cds_genomics1.png
Binary file added img/cds_genomics2.png
Binary file added img/cds_genomics3.png
Binary file modified img/cds_img1.png
Binary file modified img/cds_img2.png
Binary file modified img/cds_img3.png
Binary file modified img/cds_img4.png
Binary file modified img/cds_img5.png
Binary file modified img/cds_img6.png
Binary file modified img/cds_img7.png
57 changes: 38 additions & 19 deletions open_access/cds_imaging.md
@@ -6,40 +6,59 @@ order: 995

HTAN Imaging Level 2 data is now available through the [NCI SB-CGC Cancer Data Service (CDS)](https://datacommons.cancer.gov/repository/cancer-data-service).

!!!
**NOTE**: dbGaP approval for HTAN study [phs002371](https://www.ncbi.nlm.nih.gov/projects/gap/cgi-bin/study.cgi?study_id=phs002371.v3.p1) is required in order to access HTAN lower-level genomics data, such as RNAseq FASTQ and BAM files.
!!!

Data access via Seven Bridges Cancer Genomics Cloud (SB-CGC) requires a CGC account [[register here](https://docs.cancergenomicscloud.org/docs/sign-up-for-the-cgc)]. For further information on using SB-CGC resources including programmatic access options, you can explore their [online documentation](https://docs.cancergenomicscloud.org/docs).

# DRS Manifest Files

To access data via CDS, first generate a CDS Data Repository Service (DRS) manifest containing the files you would like to obtain. DRS manifests are CSV files and require at minimum the **name** and **drs_uri** of each file of interest. For HTAN data, DRS manifests can be generated in one of three ways:

1. CDS Portal
2. HTAN Data Portal
3. Google BigQuery (Coming Soon!)
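
A manifest with only the two required columns can also be assembled by hand. The following sketch assumes the `name` and `drs_uri` columns described above and that DRS URIs use the `drs://` scheme; the function name `write_manifest` is hypothetical, and any URIs passed to it must come from a real source such as one of the portals listed here.

```python
import csv


def write_manifest(entries, path):
    """Write (name, drs_uri) pairs to a CSV with the two required columns."""
    entries = list(entries)
    # Validate everything first so a bad entry never leaves a partial file.
    for _name, drs_uri in entries:
        if not drs_uri.startswith("drs://"):
            raise ValueError(f"not a DRS URI: {drs_uri!r}")

    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(["name", "drs_uri"])
        writer.writerows(entries)
```

The scheme check is deliberately shallow: it catches pasted HTTP links or empty cells, but it does not confirm that a URI resolves.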

## 1. Generating a Manifest File from the CDS Portal

In order to access HTAN imaging data within the [CDS Portal](https://dataservice.datacommons.cancer.gov/), navigate to the portal in a web browser and click on the **Explore CDS Data** button on the landing page.

<p align="center"><img width="364" alt="1" src="https://github.com/ncihtan/htan_missing_manual/assets/123744798/40aff1af-a58f-49dc-9253-6ee5e67ef419">
</p>

&nbsp;

On the Data Explorer page, expand the STUDY section on the left sidebar, scroll down, and check the box next to **Human Tumor Atlas (HTAN) imaging data**.

![CDS Portal: Accessing HTAN Imaging Data](../img/cds_img4.png)

This action will change the summary panel to reflect selecting HTAN data only.

## Filtering for HTAN Images in the CDS File Repository
Scroll down, or click the **Collapse View** tab in the upper right, just below the query summary line, to see the tabulated view of all participants, samples, or files in HTAN.

From the [SB-CGC dashboard](https://cgc.sbgenomics.com/home/), click **Cancer Data Service Explorer** under the **Data** tab.
![CDS Portal: Accessing HTAN Imaging Data](../img/cds_img7.png)

![CDS: Accessing the CDS file explorer](../img/cds_img1.png)
Click on the **Add All Files** button, or select the check boxes next to all Participants, Samples or Files for a subselection and then click on the **Add Selected** button. This action will update your cart icon in the upper right corner.

Select **Explore files**
![CDS Portal: Accessing HTAN Imaging Data](../img/cds_img5.png)

![CDS: Accessing the CDS file explorer](../img/cds_img2.png)
Clicking on the cart icon will bring up a list of the selected files. Click on the **Download Manifest** button in the upper right to download a CSV-formatted (Excel-compatible) file of this file list.

From the sidebar, filter by **Dataset**: HTAN and **Experimental Strategy**: ImagingLevel2
![CDS Portal: Adding Data to Cart](../img/cds_img6.png)

![CDS: Filter by HTAN study](../img/cds_img3.png) ![CDS: Filter for imaging data](../img/cds_img4.png)

This provides a listing of all HTAN Imaging Level 2 data that is currently available through CDS.
## 2. Generating a Manifest File from the HTAN Data Portal

![CDS: HTAN Imaging Data on CDS](../img/cds_img5.png)
From the [HTAN Data Portal](https://humantumoratlas.org/), click **CDS/SB-CGC (Open Access)** under the **Data Access** filter.

![HTAN Portal: Accessing Imaging Data in CDS](../img/cds_img1.png)

## Download Images from CDS
Navigate to the **Files** tab, check the box next to **Filename** in the upper left, and then click **Download selected files**.
![HTAN Portal: Selecting Imaging Files](../img/cds_img2.png)

Additional filters are available for further selection including Data format, Site, etc., as well as text search fields to search files by Filename, case ID (HTAN Participant ID), and sample ID (HTAN Biospecimen ID).
Click **Download Manifest**, which will download a local file called `cds_manifest.csv`.
![HTAN Portal: Download DRS Manifest](../img/cds_img3.png)
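
Before importing, it can help to confirm that the downloaded manifest contains the kinds of files you expected. This sketch tallies entries by the last file extension in the `name` column; it assumes only that column exists, as described above, and the function name `count_by_extension` is hypothetical.

```python
import csv
from collections import Counter
from pathlib import PurePosixPath


def count_by_extension(path):
    """Count manifest rows by the final extension of their `name` value."""
    with open(path, newline="", encoding="utf-8-sig") as handle:
        names = [row.get("name") or "" for row in csv.DictReader(handle)]
    # Names without an extension are grouped under "(none)".
    return Counter(PurePosixPath(n).suffix or "(none)" for n in names)
```

Calling `count_by_extension("cds_manifest.csv")` returns a `Counter` keyed by extension, which makes an unexpectedly empty or mixed selection easy to spot.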

Once you have filtered to your files of interest, click **Copy to project** to add the selected files to the SB-CGC project of your choosing (create a new project if you do not have one set up).

![CDS: Add selected files to project](../img/cds_img6.png)
## 3. Generating a Manifest File from Google BigQuery (Coming Soon!)

You will be automatically re-directed to the **Files** tab of your SB-CGC project. From here, check the boxes of the files you would like to save. Clicking **Download** will download the selected images to your local machine.

![CDS: Download selected imaging files](../img/cds_img7.png)
# Accessing Data
Once you have your manifest, follow the instructions on SB-CGC's [Import from a DRS server](https://docs.cancergenomicscloud.org/docs/import-from-a-drs-server#import-from-a-manifest-file) documentation page to import data from a manifest file.
