From 666e7fbbe55734068dea6b7f942a9dca26a835a5 Mon Sep 17 00:00:00 2001 From: Will Hannon Date: Fri, 1 Mar 2024 13:54:03 -0800 Subject: [PATCH] Fix spelling and grammar --- .vscode/settings.json | 8 +++ introduction/getting-started/index.md | 20 +++---- introduction/what-is-dms-viz/index.md | 12 ++-- preparing-data/command-line-api/index.md | 36 ++++++------ preparing-data/data-requirements/index.md | 22 ++++---- project-info/contributing-guide/index.md | 44 +++++++-------- visualizing-data/vignettes/index.md | 68 +++++++++++------------ visualizing-data/web-tool-api/index.md | 32 +++++------ 8 files changed, 124 insertions(+), 118 deletions(-) create mode 100644 .vscode/settings.json diff --git a/.vscode/settings.json b/.vscode/settings.json new file mode 100644 index 0000000..50ca08b --- /dev/null +++ b/.vscode/settings.json @@ -0,0 +1,8 @@ +{ + "cSpell.words": [ + "epitope", + "epitopes", + "psuedotyping", + "wildtype" + ] +} \ No newline at end of file diff --git a/introduction/getting-started/index.md b/introduction/getting-started/index.md index 115b07a..0393627 100644 --- a/introduction/getting-started/index.md +++ b/introduction/getting-started/index.md @@ -2,17 +2,16 @@ ## Overview -Using **`dms-viz`** involves two steps. First, using a command line tool called [`configure-dms-viz`](https://pypi.org/project/configure-dms-viz/), you specifcy some information about your dataset to generate a `.json` format specification file. Second, you open up the [web-based tool](https://dms-viz.github.io/) and upload your specification file to generate an interactive visualization. Below are some quickstart instructions to get you oriented. +Using **`dms-viz`** involves two steps. First, using a command line tool called [`configure-dms-viz`](https://pypi.org/project/configure-dms-viz/), you specify some information about your dataset to generate a `.json` format specification file. 
Second, you open up the [web-based tool](https://dms-viz.github.io/) and upload your specification file to generate an interactive visualization. Below are some quickstart instructions to get you oriented. ::: tip Want to Skip Ahead? -If you're insterested in the detailed command line API, check out the reference [here](/preparing-data/command-line-api/). If you've already formatted your data and you're ready to start visualizing it, check out the instructions for that [here](/visualizing-data/web-tool-api/). +If you're interested in the detailed command line API, check out the reference [here](/preparing-data/command-line-api/). If you've already formatted your data and you're ready to start visualizing it, check out the instructions for that [here](/visualizing-data/web-tool-api/). ::: - -## Prerequsites +Prerequisites To start using **`dms-viz`** with your own data, you'll need to install the command line tool [`configure-dms-viz`](https://pypi.org/project/configure-dms-viz/). To use `configure-dms-viz`, you must ensure that you have the correct version of Python (3.9 or later) installed on your system. -If you are unsure whether you have the correct version of Python installed, open a terminal window (Command Prompt in Windows, Terminal in macOS, or a terminal emulator in Linux) and type the following command and press Enter: +If you are unsure whether you have the correct version of Python installed, open a terminal window (Command Prompt in Windows, Terminal in macOS, or a terminal emulator in Linux) type the following command and press Enter: ```bash python --version @@ -36,7 +35,7 @@ Currently, `configure-dms-viz` is distributed on [PyPI](https://pypi.org/), allo pip install configure-dms-viz ``` -Now, `configure-dms-viz` should have been installed and you shouldn't see any error messages. 
You can double-check that the installation worked correctly typing the following into the terminal: +Now, `configure-dms-viz` should have been installed and you shouldn't see any error messages. You can double-check that the installation worked correctly by typing the following into the terminal: ```bash configure-dms-viz --help @@ -49,7 +48,7 @@ You should see the help message for the tool printed to the terminal. `configure_dms_viz` is a command-line tool designed to create a `JSON` format specification file for **`dms-viz`**. You provide the data that you'd like to visualize along with additional information to customize the analysis. The resulting specification file can be uploaded to [**`dms-viz`**](https://dms-viz.github.io/) for interactive visualization of your data. Below is an overview of the process of using `configure_dms_viz`. ::: tip Looking for more details? -For a detailed explaination of the features of `configure_dms_viz` check out the reference [here](/preparing-data/command-line-api/). +For a detailed explanation of the features of `configure_dms_viz` check out the reference [here](/preparing-data/command-line-api/). ::: `configure-dms-viz` has two commands, `format` and `join`. To format a single dataset for **`dms-viz`**, you execute the `configure-dms-viz format` command with the required and optional arguments as needed: @@ -98,7 +97,7 @@ configure-dms-viz format \ --tooltip-cols "{'times_seen': '# Obsv', 'effect': 'Func Eff.'}" ``` -Here, we've specified that we want the dataset to be called `LyCoV-1404` and we've pointed to file location of the [input data](https://github.com/dms-viz/configure_dms_viz/blob/main/tests/sars2/escape/LyCoV-1404_avg.csv) and [sitemap](https://github.com/dms-viz/configure_dms_viz/blob/main/tests/sars2/site_numbering_map.csv). 
In addition, we've specified that we want to use the protein structure `6xr8` from the [RSCB PDB](https://www.rcsb.org/) and that we want to visualize the `escape_mean` column of the input dataset. We've also specified some _optional_ arguments including [additional data](/preparing-data/command-line-api/#join-data), [filters](/preparing-data/command-line-api/#filter-cols), [tooltips](/preparing-data/command-line-api/#tooltip-cols), and the [name](/preparing-data/command-line-api/#name) we want to show up for the metric we're visualizing. +Here, we've specified that we want the dataset to be called `LyCoV-1404` and we've pointed to the file location of the [input data](https://github.com/dms-viz/configure_dms_viz/blob/main/tests/sars2/escape/LyCoV-1404_avg.csv) and [sitemap](https://github.com/dms-viz/configure_dms_viz/blob/main/tests/sars2/site_numbering_map.csv). In addition, we've specified that we want to use the protein structure `6xr8` from the [RSCB PDB](https://www.rcsb.org/) and that we want to visualize the `escape_mean` column of the input dataset. We've also specified some _optional_ arguments including [additional data](/preparing-data/command-line-api/#join-data), [filters](/preparing-data/command-line-api/#filter-cols), [tooltips](/preparing-data/command-line-api/#tooltip-cols), and the [name](/preparing-data/command-line-api/#name) we want to show up for the metric we're visualizing. The result of this command should be a message printed to the terminal that looks like this: @@ -123,7 +122,7 @@ Success! The visualization json was written to 'tests/sars2/output/LyCoV-1404.js This message provides some information about the `configure-dms-viz format` run on your dataset. In addition to this message, there should be a `.json` file located where you specified the output path ([`tests/sars2/output/LyCoV-1404.json`](https://github.com/dms-viz/configure_dms_viz/blob/main/tests/sars2/output/LyCoV-1404.json)). 
-This is how you can use `configure-dms-viz` to format a single dataset. You can optionally combine multiple datasets into a single `.json` specification file using the `configure-dms-viz join` command. this command takes a list of `.json` files as an arguments along with an optional description of the datasets. For more details on combining datasets, check out the [API](/preparing-data/command-line-api/). +This is how you can use `configure-dms-viz` to format a single dataset. You can optionally combine multiple datasets into a single `.json` specification file using the `configure-dms-viz join` command. this command takes a list of `.json` files as arguments along with an optional description of the datasets. For more details on combining datasets, check out the [API](/preparing-data/command-line-api/). For now, since we're only visualizing a single dataset, we can skip this step. In the next section, you'll take this `.json` visualization file and visualize your data with [**`dms-viz`**](https://dms-viz.github.io/). @@ -142,8 +141,7 @@ To upload a local file, you simply click on the `Upload Data` section and choose Since the `.json` file created above should now be stored locally on your machine, you can upload this file using this approach. ### Remote - -Alternativley, if your raw `.json` file is hosted somewhere online – like on GitHub, for example – you can provide the link to this file by clicking on the `Remote` button under the `Upload Data` section. +Alternatively, if your raw `.json` file is hosted somewhere online – like on GitHub, for example – you can provide the link to this file by clicking on the `Remote` button under the `Upload Data` section.
Remote Upload diff --git a/introduction/what-is-dms-viz/index.md b/introduction/what-is-dms-viz/index.md index e920c9c..eb0548a 100644 --- a/introduction/what-is-dms-viz/index.md +++ b/introduction/what-is-dms-viz/index.md @@ -1,26 +1,26 @@ # What is dms-viz? -Hi there 👋, if you've got some mutation-level data that you want to view on an interactive 3D protein structure, you're in the right place! **`dms-viz`** is a tool that helps you take quantitative data associated with mutations to a protein and analyze that data using intutive visual summaries in the context of an interactive 3D protein structure. Visualizations created with **`dms-viz`** are intended to be _flexible_, _customizable_, and _shareable_. +Hi there 👋, if you've got some mutation-level data that you want to view on an interactive 3D protein structure, you're in the right place! **`dms-viz`** is a tool that helps you take quantitative data associated with mutations to a protein and analyze that data using intuitive visual summaries in the context of an interactive 3D protein structure. Visualizations created with **`dms-viz`** are intended to be _flexible_, _customizable_, and _shareable_. ::: tip Ready to use the tool? -You can skip to the [Quickstart](/introduction/getting-started/) to learn how to prepare your own data, or you can see what the visualization tool looks like [here](https://dms-viz.github.io/). +You can skip to the [Quickstart](/introduction/getting-started/) to learn how to prepare your data, or you can see what the visualization tool looks like [here](https://dms-viz.github.io/). ::: ## Purpose Understanding how mutations impact a protein's functions is valuable for many types of biological questions. High-throughput techniques such as deep-mutational scanning (DMS) have greatly expanded the number of mutation-function datasets. 
For instance, DMS has been used to determine how mutations to viral proteins affect antibody escape, receptor affinity, and essential functions such as viral genome transcription and replication. -The mutation-based data generated by these approaches is often best understood in the context of a protein’s 3D structure; for instance, to assess questions like how mutations that affect antibody escape relate to the physical antibody binding epitope on the protein. However, current approaches for visualizing mutation data in the context of a protein’s structure are often cumbersome and require multiple steps and softwares. To streamline the visualization of mutation-associated data in the context of a protein structure, we developed a web-based tool, **`dms-viz`**. With **`dms-viz`**, users can straightforwardly visualize mutation-based data such as those from DMS experiments in the context of a 3D protein model in an interactive format. +The mutation-based data generated by these approaches is often best understood in the context of a protein’s 3D structure; for instance, to assess questions like how mutations that affect antibody escape relate to the physical antibody binding epitope on the protein. However, current approaches for visualizing mutation data in the context of a protein’s structure are often cumbersome and require multiple steps and software. To streamline the visualization of mutation-associated data in the context of a protein structure, we developed a web-based tool, **`dms-viz`**. With **`dms-viz`**, users can straightforwardly visualize mutation-based data such as those from DMS experiments in the context of a 3D protein model in an interactive format. ## Why use dms-viz? - **Flexible Inputs** - Our command-line tool, `configure-dms-viz`, helps streamline data formatting by facillitating the integration of data from different sources into a singular, universal `JSON` specification file. 
Moreover, `configure-dms-viz` helps you define custom filters and tooltips, and identify common errors. + Our command-line tool, `configure-dms-viz`, helps streamline data formatting by facilitating the integration of data from different sources into a singular, universal `JSON` specification file. Moreover, `configure-dms-viz` helps you define custom filters and tooltips, and identify common errors. - **Customizable Visualizations** - We've designed **`dms-viz`** with customization in mind. You can tailor the appearance of the protein structure to fit your needs. Futhermore, you can extend the functionality of the tool with custom filters, tooltips, colors, and more. + We've designed **`dms-viz`** with customization in mind. You can tailor the appearance of the protein structure to fit your needs. Furthermore, you can extend the functionality of the tool with custom filters, tooltips, colors, and more. - **Shareable URLs** @@ -30,7 +30,7 @@ The mutation-based data generated by these approaches is often best understood i **`dms-viz`** has two components: -1. A command line interface (CLI) for formating data that was written in `Python` using the [click](https://click.palletsprojects.com/en/8.1.x/) API. +1. A command line interface (CLI) for formatting data that was written in `Python` using the [click](https://click.palletsprojects.com/en/8.1.x/) API. 2. A web-based visualization tool written in 'vanilla' `Javascript` using primarily the libraries [D3.js](https://d3js.org/) for making the visualizations and [NGL.js](https://nglviewer.org/#page-top) for creating interactive molecular structures. If you're interested in contributing, check out the [Contributing Guide](/project-info/contributing-guide/) for details. 
diff --git a/preparing-data/command-line-api/index.md b/preparing-data/command-line-api/index.md index 4c7881c..4ee9d0f 100644 --- a/preparing-data/command-line-api/index.md +++ b/preparing-data/command-line-api/index.md @@ -17,7 +17,7 @@ configure-dms-viz format \ [optional_arguments] ``` -This creates a single dataset that can be loaded into **`dms-viz`**. However, in some cases, you might want to visualize multiple datasets simultaneously. To do this, you use the `configure-dms-viz join` command. The `join` command takes a list of formatted `.json` files and combines them into a single `.json` specification file containing each dataset. Optionally, you can also provide a description of the file by specifying the path to a `.md` file with your desired description: +This creates a single dataset that can be loaded into **`dms-viz`**. However, in some cases, you might want to visualize multiple datasets simultaneously. To do this, you use the `configure-dms-viz join` command. The `join` command takes a list of formatted `.json` files and combines them into a single `.json` specification file containing each dataset. Optionally, you can also describe the file by specifying the path to a `.md` file with your desired description: ```bash configure-dms-viz join \ @@ -28,13 +28,13 @@ configure-dms-viz join \ ## `configure-dms-viz format` -_This subcommand formats your data for **`dms-viz`**. Below is a description of each arguement._ +_This subcommand formats your data for **`dms-viz`**. Below is a description of each argument._ - ### `--input` `` - Path to a `.csv` file with site- and mutation-level data to visualize on a protein structure. [See details here](/preparing-data/data-requirements/) for required columns and format. + Path to a `.csv` file with site- and mutation-level data to visualize with a protein structure. [See details [here](/preparing-data/data-requirements/) for the required columns and format. 
- ### `--name` @@ -46,13 +46,13 @@ _This subcommand formats your data for **`dms-viz`**. Below is a description of `` - Path to a `.csv` file containing a map between reference sites in the experiment and sequential sites. [See details here](/preparing-data/data-requirements/) for required columns and format. + Path to a `.csv` file containing a map between reference sites in the experiment and sequential sites. [See details here](/preparing-data/data-requirements/) for the required columns and format. - ### `--metric` `` - Name of the column that contains the value to visualize on the protein structure. This tells the tool which column you want to visualize on a protein strucutre. + Name of the column that contains the value to visualize with the protein structure. This tells the tool which column you want to visualize on a protein structure. - ### `--structure` @@ -70,25 +70,25 @@ _This subcommand formats your data for **`dms-viz`**. Below is a description of `` - If there are multiple measurements per mutation, the name of the column that contains that condition distinguishing these measurements. + If there are multiple measurements for each mutation, the name of the column that contains the condition distinguishing these measurements. - ### `--metric-name` `` - The name that will show up for your metric in the plot. This let's you customize the names of your columns in your visualization. For example, if your metric column is called `escape_mean` you can rename it to `Escape` for the visualization. + The name that will show up for your metric in the plot. This lets you customize the names of your columns in your visualization. For example, if your metric column is called `escape_mean` you can rename it to `Escape` for the visualization. -- ### `--conditon_name` +- ### `--condition_name` `` - The name that will show up for your condition column in the title of the plot legend. 
For example, if your condition column is 'epitope', you might rename it to be capilized as 'Epitope' in the legend title. + The name that will show up for your condition column in the title of the plot legend. For example, if your condition column is 'epitope', you might rename it to be capitalized as 'Epitope' in the legend title. - ### `--join-data` `` - A comma separated list of `.csv` file with data to join to the visualization data. This data can then be used in the visualization tooltips or filters. [See details here](/preparing-data/data-requirements/) for formatting requirements. + A comma-separated list of `.csv` files with data to join to the visualization data. This data can then be used in the visualization tooltips or filters. [See details here](/preparing-data/data-requirements/) for formatting requirements. - ### `--tooltip-cols` @@ -112,7 +112,7 @@ _This subcommand formats your data for **`dms-viz`**. Below is a description of `` - A space-delimited string of chain names that correspond to the chains in your PDB structure that correspond to the reference sites in your data (i.e., `'C F M G J P'`). This is only necesary if your PDB structure contains chains that you do not have site- and mutation-level measurements for. + A space-delimited string of chain names that correspond to the chains in your PDB structure that correspond to the reference sites in your data (i.e., `'C F M G J P'`). This is only necessary if your PDB structure contains chains that lack site- and mutation-level measurements. - ### `--excluded-chains` @@ -130,31 +130,31 @@ _This subcommand formats your data for **`dms-viz`**. Below is a description of `` - A comma separated list (with no spaces) of HEX format colors for representing different conditions, i.e. `"#0072B2,#CC79A7,#4C3549,#009E73"`. + A comma-separated list (with no spaces) of HEX format colors for representing different conditions, i.e. `"#0072B2,#CC79A7,#4C3549,#009E73"`. 
- ### `--negative-colors` `` - A comma separated list (with no spaces) of HEX format colors for representing the negative end of the scale for different conditions, i.e. `"#0072B2,#CC79A7,#4C3549,#009E73"`. If not provided, the inverse of each color is automatically calculated. + A comma-separated list (with no spaces) of HEX format colors for representing the negative end of the scale for different conditions, i.e. `"#0072B2,#CC79A7,#4C3549,#009E73"`. If not provided, the inverse of each color is automatically calculated. - ### `--check-pdb` `` - Whether to perform checks on the provided pdb structure including checking if the 'included chains' are present, what % of data sites are missing, and what % of wildtype residues in the data match at corresponding sites in the structure. + Whether to perform checks on the provided PDB structure including checking if the 'included chains' are present, what % of data sites are missing, and what % of wildtype residues in the data match at corresponding sites in the structure. - ### `--exclude-amino-acids` `` - A comma separated list of amino acids that shouldn't be used to calculate the summary statistics (i.e. `"\*, -"`) + A comma-separated list of amino acids that shouldn't be used to calculate the summary statistics (i.e. `"\*, -"`) - ### `--description` `` - A short description of the dataset that will show up in the tool if the user clicks a button for more information. + A short description of the dataset that shows up in the tool if the user clicks a button for more information. - ### `--title` @@ -164,7 +164,7 @@ _This subcommand formats your data for **`dms-viz`**. Below is a description of ## `configure-dms-viz join` -_This subcommand joins multiple formatted `.json` datasets into one that you can then visualize with **`dms-viz`**. Below is a description of each arguement._ +_This subcommand joins multiple formatted `.json` datasets into one that you can then visualize with **`dms-viz`**. 
Below is a description of each argument._ :::warning Make sure that you're joining files with unique values for the dataset [name](/preparing-data/command-line-api/#name). @@ -174,7 +174,7 @@ Make sure that you're joining files with unique values for the dataset [name](/p `` - A comma separated list of paths to the `.json` visualization files created by `configure-dms-viz format`. I.e. `--input path/to/my/specification_1.json, path/to/my/specification_2.json, path/to/my/specification_3.json` + A comma-separated list of paths to the `.json` visualization files created by `configure-dms-viz format`. I.e. `--input path/to/my/specification_1.json, path/to/my/specification_2.json, path/to/my/specification_3.json` - ### `--output` diff --git a/preparing-data/data-requirements/index.md b/preparing-data/data-requirements/index.md index 2989ab1..451e454 100644 --- a/preparing-data/data-requirements/index.md +++ b/preparing-data/data-requirements/index.md @@ -1,10 +1,10 @@ # Data Requirements -To use **`dms-viz`**, you'll need two files. First, you'll need some [input data](#input-data) that contains the mutation-based data you'd like to visualize. Second, you'll need [a map](#sitemap) of the sites mutagenized in your dataset to the sites in the reference and protein structure. +To use **`dms-viz`**, you'll need two files. First, you'll need some [input data](#input-data) that contains the mutation-based data you'd like to visualize. Second, you'll need [a map](#sitemap) of the sites that are mutated in your dataset to the sites in the reference and protein structure. _optionally_, if you have [additional data files](#join-data), you can join these with your input data. -Below are the detailed requirements for each datafile along with example datasets. +Below are the detailed requirements for each data file along with example datasets. 
## Input Data @@ -24,15 +24,15 @@ The input data must contain the following columns with **exactly** these names: - ### `mutant` - This column should contain the identity of the **mutation** that each measurement is associated with. These mutations should be represented using the [IUPAC single letter codes](https://www.bioinformatics.org/sms/iupac.html) along with symbols for stop codons and gaps (i.e., `R, M, P, *, -`). If you need to extend or shrink this alphabet, you can do so using the [`--alphabet`](/preparing-data/command-line-api/#alphabet) flag of `configure-dms-viz`. + This column should contain the identity of the **mutation** that each measurement is associated with. These mutations should be represented using the [IUPAC single-letter codes](https://www.bioinformatics.org/sms/iupac.html) along with symbols for stop codons and gaps (i.e., `R, M, P, *, -`). If you need to extend or shrink this alphabet, you can do so using the [`--alphabet`](/preparing-data/command-line-api/#alphabet) flag of `configure-dms-viz`. - ### `wildtype` - This column should contain the **wildtype** identity of residues at a given site in the protien. For example, if a Proline (`P`) was mutatated to an Alanine (`A`) at position 120 in the protein (`P120A`), there should be a `P` in the wildtype column for every row where the value of the site column is 120. This column will also be used to check how well the protein structure you provided matches the wildtype sites in your data. Significant discepencies can indicate that you're `reference`, `sequential`, and `protein` sites are misaligned. + This column should contain the **wildtype** identity of residues at a given site in the protein. For example, if a Proline (`P`) was mutated to an Alanine (`A`) at position 120 in the protein (`P120A`), there should be a `P` in the wildtype column for every row where the value of the site column is 120. 
This column will also be used to check how well the protein structure you provided matches the wildtype sites in your data. Significant discrepancies can indicate that you're `reference`, `sequential`, and `protein` sites are misaligned. --- -In addition to these three mandatory columns, you will also need to specify a `metric` column. The identity of this column is specified with [`--metric`](/preparing-data/command-line-api/#metric) flag of `configure-dms-viz`, and it can have any name: +In addition to these three mandatory columns, you will also need to specify a `metric` column. The identity of this column is specified with the `--metric` flag of `configure-dms-viz`, and it can have any name: - ### `` @@ -44,11 +44,11 @@ _Optionally_, depending on the design of your experiment, you can also include a - ### `condition` - This column should only be included if there are multiple measurements in the [``](/preparing-data/command-line-api/#metric) column for the same `site`/`mutation` combinations. An example of this would be if you have a measurement like antibody escape for multiple 'epitopes' in an antigen. This column contains a unique identifier that's used to deliniate between these measurements for each mutation. This 'identifier' will show up in an interactive legend next to the visualization. + This column should only be included if there are multiple measurements in the [``](/preparing-data/command-line-api/#metric) column for the same `site`/`mutation` combinations. An example of this would be if you have a measurement like an antibody's escape for multiple 'epitopes' in an antigen. This column contains a unique identifier that's used to delineate between these measurements for each mutation. This 'identifier' will show up in an interactive legend next to the visualization. 
## Sitemap -The **Sitemap** is a tabular `.csv` dataframe that specifies the order of the [`site`](/preparing-data/data-requirements/#site-or-reference-site) (`reference_site`) column in your input data and, _optionally_, how the [`site`](/preparing-data/data-requirements/#site-or-reference-site) column corresponds to the numbering in the [protein structure](/preparing-data/command-line-api/#structure) you provide. +The **Sitemap** is a tabular `.csv` file that specifies the order of the [`site`](/preparing-data/data-requirements/#site-or-reference-site) (`reference_site`) column in your input data and, _optionally_, how the [`site`](/preparing-data/data-requirements/#site-or-reference-site) column corresponds to the numbering in the [protein structure](/preparing-data/command-line-api/#structure) you provide. ::: warning Important! The sitemap must be in `.csv` format. If your data is tabular but in another format, please convert it to `.csv`. @@ -58,7 +58,7 @@ The sitemap must be in `.csv` format. If your data is tabular but in another for This column **must** correspond to the `site` or `reference_site` column in your [input data](#input-data). If the [`protein_site`](/preparing-data/data-requirements/#protein-site) isn't provided, this column is also assumed to correspond to the identity of the sites in the [protein structure](/preparing-data/command-line-api/#structure) - The `reference_site` refers to the identity of the sites that are mutagenized in your dataset. These sites will ultimatley label the x-axis of the visualization. These '_reference_' sites can sometimes differ from the `sequential_site` ([described below](/preparing-data/data-requirements/#sequential-site)); for example, the current SARS-CoV-2 Spike protein variants have insertaion and deletions that cause the widely used Wuhan-Hu-1 'reference' numbering to differ from the sequential, numeric order of the data. 
+ The `reference_site` refers to the identity of the sites that are mutated in your dataset. These sites will ultimately label the x-axis of the visualization. These '_reference_' sites can sometimes differ from the `sequential_site` ([described below](/preparing-data/data-requirements/#sequential-site)); for example, the current SARS-CoV-2 Spike protein variants have insertions and deletions that cause the widely used Wuhan-Hu-1 'reference' numbering to differ from the sequential, numeric order of the data. - ### `sequential_site` @@ -66,15 +66,15 @@ The sitemap must be in `.csv` format. If your data is tabular but in another for - ### `protein_site` - _Optionally_, this column is only necessary if the `reference_site` sites are different from the sites (residue numbering) in your provided protein strucutre. If they are different, this column is the position in the protein structure that corresponds to the `reference_site` values in your data. + _Optionally_, this column is only necessary if the `reference_site` sites are different from the sites (residue numbering) in your provided protein structure. If they are different, this column is the position in the protein structure that corresponds to the `reference_site` values in your data. - ### `chains` -_Optionally_, this column is only necessary if you've provided the `protein_site` column and there are multiple `reference_site` sites for the same value of `protein_site`. This might be the case if your data corresponds to _discontinuous chains_ in the protein structure. For example, if your data is measured over two separate chains with overlapping numbering schemes. For example, Influenza HA protein structures usually have separate chains with overlapping numbering for the stalk and the head. So the reference sites 102 and 30(HA) might both correspond to the residue number 102 in the PDB file. In that case, the only way to distinguish between them on the structure is with the identity of the chain (i.e. A vs. B). 
This column should have chains in the same format as the chains provided to [`--included-chains`](/preparing-data/command-line-api/#included-chains) (i.e. a space separated string of chains: "A B C D"). +_Optionally_, this column is only necessary if you've provided the `protein_site` column and there are multiple `reference_site` sites for the same value of `protein_site`. This might be the case if your data corresponds to _discontinuous chains_ in the protein structure, for example, if your data is measured over two separate chains with overlapping numbering schemes. Influenza HA protein structures, for instance, usually have separate chains with overlapping numbering for the stalk and the head. So the reference sites 102 and 30(HA) might both correspond to the residue number 102 in the PDB file. In that case, the only way to distinguish between them on the structure is with the identity of the chain (i.e. A vs. B). This column should have chains in the same format as the chains provided to [`--included-chains`](/preparing-data/command-line-api/#included-chains) (i.e. a space-separated string of chains: "A B C D"). ## Join Data -_Optionally_, you might have some additional data that you want to combine with your [Input Data](#input-data). You do this so you can include columnns from this **Join Data** in the [filters](/preparing-data/command-line-api/#filter-cols) or [tooltips](/preparing-data/command-line-api/#tooltip-cols) of your visualization. This option helps streamline that workflow. +_Optionally_, you might have some additional data that you want to combine with your [Input Data](#input-data). You do this so you can include columns from this **Join Data** in the [filters](/preparing-data/command-line-api/#filter-cols) or [tooltips](/preparing-data/command-line-api/#tooltip-cols) of your visualization. This option helps streamline that workflow. ::: warning Important! The join data must be in `.csv` format.
If your data is tabular but in another format, please convert it to `.csv`. diff --git a/project-info/contributing-guide/index.md b/project-info/contributing-guide/index.md index fc1336e..c8a4dd9 100644 --- a/project-info/contributing-guide/index.md +++ b/project-info/contributing-guide/index.md @@ -2,31 +2,31 @@ Welcome to the `dms-viz` project! [dms-viz](https://dms-viz.github.io/) is an interactive tool for visualizing mutation-level data in the context of a 3D protein structure. The tool consists of two parts: -1. A [Command Line Interface (CLI)](https://github.com/dms-viz/configure_dms_viz) written in Python used to format data into a `.json` file that can be uploaded +1. A [Command Line Interface (CLI)](https://github.com/dms-viz/configure_dms_viz) written in Python is used to format data into a `.json` file that can be uploaded to 2. An [interactive web-based visualization](https://github.com/dms-viz/dms-viz.github.io) tool written with Javascript, [D3.js](https://d3js.org/), and [NGL.js](https://nglviewer.org/). -Because this project is built in two separate components, each in their own repositories, this contributing/developing guide is split into two parts. +Because this project is built in two separate components, each in its own repository, this contributing/developing guide is split into two parts. -If you're intersted in contributing to this project, please reach out on [GitHub](https://github.com/dms-viz)! +If you're interested in contributing to this project, please reach out on [GitHub](https://github.com/dms-viz)! ## Contributing to `configure_dms_viz` -Thank you for your interest in contributing to `configure_dms_viz`! Here is guide on how to develop this package as well as some guidelines for contributing. +Thank you for your interest in contributing to `configure_dms_viz`! Here is a guide on how to develop this package as well as some guidelines for contributing. ### Developing
-#### 1. Set Up Your Environment: +#### 1. Set Up Your Environment We use [`Poetry`](https://python-poetry.org/) for dependency management and packaging. If you don't have it installed, get it [here](https://python-poetry.org/docs/#installation). -#### 2. Fork the Repository: +#### 2. Fork the Repository Before you start making changes, fork the repository to your own GitHub account. -#### 3. Clone Your Fork: +#### 3. Clone Your Fork Clone your forked repository to your local machine. @@ -35,7 +35,7 @@ git clone https://github.com/dms-viz/configure_dms_viz.git cd configure_dms_viz ``` -#### 4. Install Dependencies: +#### 4. Install Dependencies With `Poetry`, setting up the project environment and installing dependencies is easy: @@ -43,11 +43,11 @@ With `Poetry`, setting up the project environment and installing dependencies is poetry install ``` -### Contributing Guidelines: +### Contributing Guidelines
-#### 1. Work on a New Branch: +#### 1. Work on a New Branch Don't work directly on the main branch. Create a new branch for your feature or bug fix. @@ -55,29 +55,29 @@ Don't work directly on the main branch. Create a new branch for your feature or git checkout -b your-new-feature-or-fix ``` -#### 2. Document Your Changes: +#### 2. Document Your Changes -Make sure to comment your code appropriately. If you're introducing a new feature or making significant changes, update the README.md file as necessary. +Make sure to add comments to your code appropriately. If you're introducing a new feature or making significant changes, update the README.md file as necessary. -#### 3. Commit Your Changes: +#### 3. Commit Your Changes Make granular commits with meaningful commit messages. This makes it easier to review your contributions. -#### 4. Versioning: +#### 4. Versioning Versioning follows semantic versioning (i.e. `X.Y.Z.`) where each component represents: -1. Major version (`X`): This number is incremented when there are breaking changes that require updates to the web tool API. +1. Major version (`X`): This number is incremented when there are breaking changes that require updates to the web tool API. -2. Minor version (`Y`): This number is incremented when new features are added in a backward-compatible manner. +2. Minor version (`Y`): This number is incremented when new features are added in a backward-compatible manner. -3. Patch version (`Z`): This number is incremented when backward-compatible bug fixes are introduced. +3. Patch version (`Z`): This number is incremented when backward-compatible bug fixes are introduced. ::: warning Important! Make sure that the version is incremented in the `pyproject.toml`, otherwise publishing to PyPI will fail. Also, make sure to update the `CHANGELOG` to document your changes. ::: -#### 4. Push to Your Fork: +#### 4. Push to Your Fork Push the changes to your forked repository. 
@@ -85,7 +85,7 @@ Push the changes to your forked repository. git push origin your-new-feature-or-fix ``` -#### 5. Create a Pull Request: +#### 5. Create a Pull Request Once you're done with your changes and you think it's ready for review, create a pull request from your forked repository to the original repository. @@ -95,7 +95,7 @@ The code is formatted using `Black`, which will be installed as a development de ## Contributing to `dms-viz.github.io` -Thanks for your interest into contributing to the visualization component of **`dms-viz`**! Below is a quick guide for developing the website along with some guidelines for contributing. +Thanks for your interest in contributing to the visualization component of **`dms-viz`**! Below is a quick guide for developing the website along with some guidelines for contributing. ### Developing @@ -143,11 +143,11 @@ Remember to fetch the latest changes from the main repository before you start w ### Code Guidelines -We aim for clean and consistent code across the entire project. To this end, we use `ESLint` for linting and `Prettier` for code formatting. Make sure to install these extensions to your code editor. Before making a Pull Request, ensure your code adheres to these formatting guidelines. +We aim for clean and consistent code across the entire project. To this end, we use `ESLint` for linting and `Prettier` for code formatting. Make sure to install these extensions in your code editor. Before making a Pull Request, ensure your code adheres to these formatting guidelines. ### Versioning -To ensure backward compatibility with older verions of specifications generated by `configure_dms-viz`, our web-based visualization tool employs a systematic versioning strategy. Whenever there are major updates or modifications to the tool, which might affect the existing JSON specifications or the overall behavior of the visualization, we introduce a new version. 
Here's how the versioning system works: +To ensure backward compatibility with older versions of specifications generated by `configure-dms-viz`, our web-based visualization tool employs a systematic versioning strategy. Whenever there are major updates or modifications to the tool that might affect the existing JSON specifications or the overall behavior of the visualization, we introduce a new version. Here's how the versioning system works: #### Version Routes diff --git a/visualizing-data/vignettes/index.md b/visualizing-data/vignettes/index.md index 3a30643..c33328c 100644 --- a/visualizing-data/vignettes/index.md +++ b/visualizing-data/vignettes/index.md @@ -2,9 +2,9 @@ ## 1. Mapping the neutralization profile of antibodies and sera against HIV envelope -The [Bloom lab](https://research.fredhutch.org/bloom/en.html) has developed a [psuedotyping-based deep mutational scanning platform](https://doi.org/10.1016%2Fj.cell.2023.02.001) that makes it possible to assess the effects of thousands of mutations on properties like antibody neutalization for a diverse array of viral glycoproteins. [Radford et. al.,](https://doi.org/10.1016/j.chom.2023.05.025) used this platform to map the neutralization proflies of polyclonal serum samples that are able to neutralize diverse strains of HIV. By characterizing the specificity of these clinically important serum samples agains the HIV envelope (Env), they provide a great resource for assessing anti-HIV immune responses and informing prevention strategies. +The [Bloom lab](https://research.fredhutch.org/bloom/en.html) has developed a pseudotyping-based [deep mutational scanning platform](https://doi.org/10.1016%2Fj.cell.2023.02.001) that makes it possible to assess the effects of thousands of mutations on properties like antibody neutralization for a diverse array of viral glycoproteins. [Radford et. 
al](https://doi.org/10.1016/j.chom.2023.05.025) used this platform to map the neutralization profiles of polyclonal serum samples that are able to neutralize diverse strains of HIV. By characterizing the specificity of these clinically important serum samples against the HIV envelope (Env), they provide a great resource for assessing anti-HIV immune responses and informing prevention strategies. -Structural context is an important component of interpreting this kind of data. By mapping the neutralization profile of various serum samples or antibodies onto a structure of HIV Env, it's possible to visualize a 3D footprint of antibody binding. This type of visualize can help determine if multiple antibodies or serum samples are targeting the same structural epitopes on HIV Env, and therefore can help identifiy regions of importance for eliciting a broadly neutralizing immune response. Additionaly, [Radford et. al.,](https://doi.org/10.1016/j.chom.2023.05.025) was able to deconvolute the contribution of multiple epitopes to neutralization by individual polyclonal sera. +Structural context is an important component of interpreting this kind of data. By mapping the neutralization profile of various serum samples or antibodies onto a structure of HIV Env, it's possible to visualize a 3D footprint of antibody binding. This type of visualization can help determine if multiple antibodies or serum samples are targeting the same structural epitopes on HIV Env, and therefore can help identify regions of importance for eliciting a broadly neutralizing immune response. Additionally, [Radford et. al](https://doi.org/10.1016/j.chom.2023.05.025) was able to deconvolve the contribution of multiple epitopes to neutralization by individual polyclonal sera. **`dms-viz`** is a great tool for analyzing this type of experiment. 
It integrates the ability to explore the totality of your data through summary metrics and detailed plots, while also showing a representation of this data on the structure of HIV Env. It's trivial to visualize the contribution of multiple epitopes to neutralization using the [`--condition`](/preparing-data/command-line-api/#condition) feature of **`dms-viz`**. @@ -17,21 +17,21 @@ You can find the original antibody escape data for this study [here](https://git [`configure-dms-viz`](/preparing-data/command-line-api/) is designed to prepare a single dataset at a time. For each of the 7 datasets in this study, the values for each of the [command line arguments](/preparing-data/command-line-api/) is described in this [`datasets.csv`](https://github.com/dms-viz/configure_dms_viz/blob/main/tests/HIV-Envelope-BF520-DMS/datasets.csv) file. Here is an example of a single command for the serum sample `IDC508`: ```bash -configure-dms-viz format - --input tests/HIV-Envelope-BF520-DMS/input/IDC508_avg.csv - --sitemap tests/HIV-Envelope-BF520-DMS/sitemap/sitemap.csv - --output tests/HIV-Envelope-BF520-DMS/output/IDC508.json - --name "IDC508" - --metric "escape_mean" - --metric-name "Escape" - --condition "epitope" - --condition-name "Epitope" - --join-data tests/HIV-Envelope-BF520-DMS/join-data/functional_effects.csv - --structure "6UDJ" - --included-chains "C F M G J P" - --excluded-chains "B L R A Q K" - --tooltip-cols "{'times_seen': '# Obsv', 'effect': 'Func Eff.'}" - --filter-cols "{'effect': 'Functional Effect', 'times_seen': 'Times Seen'}" +configure-dms-viz format \ + --input tests/HIV-Envelope-BF520-DMS/input/IDC508_avg.csv \ + --sitemap tests/HIV-Envelope-BF520-DMS/sitemap/sitemap.csv \ + --output tests/HIV-Envelope-BF520-DMS/output/IDC508.json \ + --name "IDC508" \ + --metric "escape_mean" \ + --metric-name "Escape" \ + --condition "epitope" \ + --condition-name "Epitope" \ + --join-data tests/HIV-Envelope-BF520-DMS/join-data/functional_effects.csv \ + --structure 
"6UDJ" \ + --included-chains "C F M G J P" \ + --excluded-chains "B L R A Q K" \ + --tooltip-cols "{'times_seen': '# Obsv', 'effect': 'Func Eff.'}" \ + --filter-cols "{'effect': 'Functional Effect', 'times_seen': 'Times Seen'}" \ --title "IDC508" ``` @@ -59,9 +59,9 @@ Which results in the `.json` specification located [here](https://github.com/dms ## 2. Inferring the fitness landscape of the SARS-CoV-2 proteome from pyhlogenetic data -The scale of genomic sequencing surveillance of SARS-CoV-2 has led to the public avaliability of millions of SARS-CoV-2 sequences. [Bloom and Neher](https://doi.org/10.1101/2023.01.30.526314) developed an approach that leverages this massive amount of sequencing data to estimate the fitness effects of mutations in all SARS-CoV-2 proteins. Their approach works by computing the expected count of each mutation under neutral selection and comparing this count to the observed count of mutations along the [phylogeny](https://genome.ucsc.edu/cgi-bin/hgPhyloPlace). The result is an estimate of fitness that is very helpful for understanding the evolutionary contraint on the SARS-CoV-2 proteome. This kind of data is particularly useful for assessing the constraint on possible therapuetic targets that are untractable targets for deep mutational scanning. +The scale of genomic sequencing surveillance of SARS-CoV-2 has led to the public availability of millions of SARS-CoV-2 sequences. [Bloom and Neher](https://doi.org/10.1101/2023.01.30.526314) developed an approach that leverages this massive amount of sequencing data to estimate the fitness effects of mutations in all SARS-CoV-2 proteins. Their approach works by computing the expected count of each mutation under neutral selection and comparing this count to the observed count of mutations along the [phylogeny](https://genome.ucsc.edu/cgi-bin/hgPhyloPlace). The result is an estimate of fitness that is very helpful for understanding the evolutionary constraint on the SARS-CoV-2 proteome. 
This kind of data is particularly useful for assessing the constraint on possible therapeutic targets that are intractable targets for deep mutational scanning. -Structure-guided design of anti-viral therapuetics is a promising approach to developing effective drugs against SARS-CoV-2. It's a major goal of consortiums like the [ASAP Discovery Consortium](https://asapdiscovery.org/) to incorporate evolutionary contraint into the design of therapeutic ligands. The data from [Bloom and Neher](https://doi.org/10.1101/2023.01.30.526314) can be combined with a structural representation of each viral targets to show the propensity of the virus to escape in the binding pockets being targeted by medical chemists doing structure-aided design. **`dms-viz`** offers a convient way to visually assess this for a wide range of structure-ligand pairs. +Structure-guided design of anti-viral therapeutics is a promising approach to developing effective drugs against SARS-CoV-2. It's a major goal of consortiums like the [ASAP Discovery Consortium](https://asapdiscovery.org/) to incorporate evolutionary constraints into the design of therapeutic ligands. The data from [Bloom and Neher](https://doi.org/10.1101/2023.01.30.526314) can be combined with a structural representation of each viral target to show the propensity of the virus to escape in the binding pockets being targeted by medicinal chemists doing structure-aided design. **`dms-viz`** offers a simple way to visually assess this for a wide range of structure-ligand pairs. To check out the data and code for this study, [click here](https://github.com/jbloomlab/SARS2-mut-fitness). To see how to prepare this kind of data and explore the results of [Bloom and Neher](https://doi.org/10.1101/2023.01.30.526314) yourself, check out the tutorial below. 
@@ -69,7 +69,7 @@ To check out the data and code for this study, [click here](https://github.com/j You can find the original mutation fitness data for this study [here](https://github.com/jbloomlab/SARS2-mut-fitness). I've organized this data if you want to follow along [here](https://github.com/dms-viz/configure_dms_viz/tree/main/tests/SARS2-Mutation-Fitness). -[`configure-dms-viz`](/preparing-data/command-line-api/) is designed to prepare a single dataset at a time. For each of the 23 SARS-CoV-2 proteins in this study, the values for each of the [command line arguments](/preparing-data/command-line-api/) is described in this [`datasets.csv`](https://github.com/dms-viz/configure_dms_viz/blob/main/tests/SARS2-Mutation-Fitness/datasets.csv) file. Here is an example of a single command for the Spike protein: +[`configure-dms-viz`](/preparing-data/command-line-api/) is designed to prepare a single dataset at a time. For each of the 23 SARS-CoV-2 proteins in this study, the values for each of the [command line arguments](/preparing-data/command-line-api/) are described in this [`datasets.csv`](https://github.com/dms-viz/configure_dms_viz/blob/main/tests/SARS2-Mutation-Fitness/datasets.csv) file. Here is an example of a single command for the Spike protein: ```bash configure-dms-viz format @@ -90,7 +90,7 @@ configure-dms-viz format --description "The Spike Glycoprotein. The Structure is has one RBD in the up position. [Structure: 6VYB]" ``` -This results in an output `.json` file that can be visualized in the **`dms-viz`** right away. However, if you want to visualize all 23 experiments together, it's possible to combine them together into a single `.json` file using the `configure-dms-viz join` command described in the [example above](#_1-mapping-the-neutralization-profile-of-antibodies-and-sera-against-hiv-envelope). 
This results in the `.json` specification located [here](https://github.com/dms-viz/configure_dms_viz/blob/main/tests/SARS2-Mutation-Fitness/output/SARS2-Mutation-Fitness.json). You can visualize this with **`dms-viz`** below, or you can [click here](https://dms-viz.github.io/v0/?data=https%3A%2F%2Fraw.githubusercontent.com%2Fdms-viz%2Fconfigure_dms_viz%2Fmain%2Ftests%2FSARS2-Mutation-Fitness%2Foutput%2FSARS2-Mutation-Fitness.json&s=mean&fi=%257B%2522expected_count%2522%253A10.060741%257D) to see the visualization on a separate page. +This results in an output `.json` file that can be visualized in the **`dms-viz`** right away. However, if you want to visualize all 23 experiments together, it's possible to combine them into a single `.json` file using the `configure-dms-viz join` command described in the [example above](#_1-mapping-the-neutralization-profile-of-antibodies-and-sera-against-hiv-envelope). This results in the `.json` specification located [here](https://github.com/dms-viz/configure_dms_viz/blob/main/tests/SARS2-Mutation-Fitness/output/SARS2-Mutation-Fitness.json). You can visualize this with **`dms-viz`** below, or you can [click here](https://dms-viz.github.io/v0/?data=https%3A%2F%2Fraw.githubusercontent.com%2Fdms-viz%2Fconfigure_dms_viz%2Fmain%2Ftests%2FSARS2-Mutation-Fitness%2Foutput%2FSARS2-Mutation-Fitness.json&s=mean&fi=%257B%2522expected_count%2522%253A10.060741%257D) to see the visualization on a separate page.
@@ -107,13 +107,13 @@ This results in an output `.json` file that can be visualized in the **`dms-viz` The influenza RNA-dependent RNA polymerase (RdRp) is a key determinant of zoonosis for novel influenza viruses. However, little is known about the evolutionary potential and effect of mutations on influenza RdRp function. [Li et. al.,](https://doi.org/10.1101/2023.08.27.554986) set out to change this by measuring the effect of thousands of mutations on the replicative fitness of influenza RdRp by performing deep mutational scanning on the PB1 subunit of the A/WSN/1933(H1N1) strain. [Li et. al.,](https://doi.org/10.1101/2023.08.27.554986) provide a comprehensive map of PB1 mutation fitness that serves as a helpful resource for those interested in understanding influenza replication. -**`dms-viz`** provides a great platform to share deep mutational scanning data as a resource. It offers stable links that contains information about the parameters selected in the visualization, making it possible to highlight and share specific findings. Also, since the influenza RdRp is a heterotrimer of which PB1 is only a single subunit, **`dms-viz`** provides a flexible way to represent a highlight specific subunits of the structure. +**`dms-viz`** provides a great platform to share deep mutational scanning data as a resource. It offers stable links that contain information about the parameters selected in the visualization, making it possible to highlight and share specific findings. Also, since the influenza RdRp is a heterotrimer of which PB1 is only a single subunit, **`dms-viz`** provides a flexible way to represent and highlight specific subunits of the structure. To see how to prepare this kind of data and explore the results of [Li et. al.,](https://doi.org/10.1101/2023.08.27.554986) yourself, check out the tutorial below. 
### Using **`dms-viz`** -You can find the original mutation fitness data for this study [here](https://www.biorxiv.org/content/biorxiv/early/2023/08/27/2023.08.27.554986/DC8/embed/media-8.csv?download=true). I've organized this data if you want to follow along [here](https://github.com/dms-viz/configure_dms_viz/blob/main/tests/IAV-PB1-DMS/input/pb1_fitness.csv). I did a little bit of pre-processing on this data in python to meet the [data requirements](/preparing-data/data-requirements/): +You can find the original mutation fitness data for this study [here](https://www.biorxiv.org/content/biorxiv/early/2023/08/27/2023.08.27.554986/DC8/embed/media-8.csv?download=true). I've organized this data if you want to follow along [here](https://github.com/dms-viz/configure_dms_viz/blob/main/tests/IAV-PB1-DMS/input/pb1_fitness.csv). I did a little bit of pre-processing on this data in Python to meet the [data requirements](/preparing-data/data-requirements/): ```python # Import and format the data from the supplement @@ -129,19 +129,19 @@ fitness_df.rename(columns={'substitution': 'mutant'}, inplace=True) fitness_df.to_csv("../data/fitness.csv", index=False) ``` -The values for each of the [command line arguments](/preparing-data/command-line-api/) is described in this [`datasets.csv`](https://github.com/dms-viz/configure_dms_viz/blob/main/tests/SARS2-Mutation-Fitness/datasets.csv) file. Here is the resulting command: +The values for each of the [command line arguments](/preparing-data/command-line-api/) are described in this [`datasets.csv`](https://github.com/dms-viz/configure_dms_viz/blob/main/tests/SARS2-Mutation-Fitness/datasets.csv) file. 
Here is the resulting command: ```bash -configure-dms-viz format - --input tests/IAV-PB1-DMS/input/pb1_fitness.csv - --sitemap tests/IAV-PB1-DMS/sitemap/sitemap.csv - --output tests/IAV-PB1-DMS/sitemap/pb1.json - --name "IAV PB1" - --metric "fitness" - --metric-name "Replicative Fitness" - --structure "7NHX" - --included-chains "B" - --title "IAV PB1 Deep Mutational Scan" +configure-dms-viz format \ + --input tests/IAV-PB1-DMS/input/pb1_fitness.csv \ + --sitemap tests/IAV-PB1-DMS/sitemap/sitemap.csv \ + --output tests/IAV-PB1-DMS/sitemap/pb1.json \ + --name "IAV PB1" \ + --metric "fitness" \ + --metric-name "Replicative Fitness" \ + --structure "7NHX" \ + --included-chains "B" \ + --title "IAV PB1 Deep Mutational Scan" \ --description "Deep mutational scan of influenza virus A/WSN/1933(H1N1) PB1 RdRp subunit" ``` diff --git a/visualizing-data/web-tool-api/index.md b/visualizing-data/web-tool-api/index.md index b3f7b92..45e1d5c 100644 --- a/visualizing-data/web-tool-api/index.md +++ b/visualizing-data/web-tool-api/index.md @@ -2,7 +2,7 @@ ## Uploading Data -There are two ways to upload data into **`dms-viz`**. You can either upload a **local** specification file from your computer, or you can provide a link to a **remote** specification file hosted somewhere like [GitHub](https://github.com/). You'll find the options for uploading data under the `Upload Data` section of the sidemenu. +There are two ways to upload data into **`dms-viz`**. You can either upload a **local** specification file from your computer, or you can provide a link to a **remote** specification file hosted somewhere like [GitHub](https://github.com/). You'll find the options for uploading data under the `Upload Data` section of the side menu. 
### Local @@ -14,19 +14,19 @@ To upload a local file, you simply click on the `Upload Data` section and choose ### Remote -Alternativley, if your raw `.json` file is hosted somewhere online – like on GitHub, for example – you can provide the link to this file by clicking on the `Remote` button under the `Upload Data` section. +Alternatively, if your raw `.json` file is hosted somewhere online – like on GitHub, for example – you can provide the link to this file by clicking on the `Remote` button under the `Upload Data` section.
Remote Upload
-You can try yourself by pasting the following link into the URL text box: +You can try it yourself by pasting the following link into the URL text box: ```md https://raw.githubusercontent.com/dms-viz/configure_dms_viz/main/tests/sars2/output/sars2.json ``` -This approach has some advantages. For example, after providing a link to your data, this link is saved in the URL, allowing you to share a view of **`dms-viz`** with the data pre-loaded and ready to view. Also, this approach allows you to proivde a markdown description (also hosted remotely) of the datasets. +This approach has some advantages. For example, after providing a link to your data, this link is saved in the URL, allowing you to share a view of **`dms-viz`** with the data pre-loaded and ready to view. Also, this approach allows you to provide a markdown description (also hosted remotely) of the datasets. #### Providing A Description @@ -40,7 +40,7 @@ Just like the remote `.json` specification, the link to the markdown is saved in ## Chart Configuration -**`dms-viz`** provides a handful of ways to navigate and customize the visualization. You can find these options under the `Chart Options` tab in the sidemenu. +**`dms-viz`** provides a handful of ways to navigate and customize the visualization. You can find these options under the `Chart Options` tab in the side menu.
Chart Options @@ -48,15 +48,15 @@ Just like the remote `.json` specification, the link to the markdown is saved in - ### `Dataset` - Although `configure-dms-viz` will return only a single `.json` specification, it is possible to combine multiple `.json` specification in a single file to visualize. If there are multiple datasets in the `.json` file, you can navigate between these using the `Dataset` dropdown menu. The name that appears in the dropdown for each dataset depends on the [`--name`](/preparing-data/command-line-api/#name) flag. + Although `configure-dms-viz` will return only a single `.json` specification, it is possible to combine multiple `.json` specifications in a single file to visualize. If there are multiple datasets in the `.json` file, you can navigate between these using the `Dataset` dropdown menu. The name that appears in the dropdown for each dataset depends on the [`--name`](/preparing-data/command-line-api/#name) flag. Additionally, next to the `Dataset` dropdown menu there is an information icon ⓘ. By clicking on this icon, a short description of the dataset appears above the top plot. The description can be specified using the [`--description`](/preparing-data/command-line-api/#description) flag. - ### `Condition` - If your input data has multiple measurements per mutation/site combination that are distinguished by some _condition_ (specified by the [`--condition`](/preparing-data/command-line-api/#condition) flag), an interactive legend appears under `Chart Options`. + If your input data has multiple measurements for each mutation/site combination that are distinguished by some _condition_ (specified by the [`--condition`](/preparing-data/command-line-api/#condition) flag), an interactive legend appears under `Chart Options`. - Although the default label that appears above the legend is `Condition`, you can specify the label using the [`--conditon_name`](/preparing-data/command-line-api/#condition_name) flag. 
+ Although the default label that appears above the legend is `Condition`, you can specify the label using the [`--condition-name`](/preparing-data/command-line-api/#condition_name) flag. - ### `Summary Metric` @@ -72,7 +72,7 @@ Just like the remote `.json` specification, the link to the markdown is saved in ## Protein Configuration -**`dms-viz`** provides a handful of ways to navigate and customize the 3D protein structure. You can find these options under the `Protein Configuration` tab in the sidemenu. +**`dms-viz`** provides a handful of ways to navigate and customize the 3D protein structure. You can find these options under the `Protein Configuration` tab in the side menu.
Protein Options @@ -87,7 +87,7 @@ There are _four_ components of the protein structure whose appearances can be co | `Peripheral` | The chains that are aren't excluded (by the [`--exclude-chains`](/preparing-data/command-line-api/#exclude-chains) flag) but don't have corresponding data (aren't included in the [`--included-chainss`](/preparing-data/command-line-api/#included-chains)). | | `Ligand` | If `Show Ligands` is checked, the ligands (i.e. glycans, small molecules, etc...) in the structure. | -For each of these separate compnents, there are options to change the following: +For each of these separate components, there are options to change the following: - ### `Representation` @@ -95,11 +95,11 @@ For each of these separate compnents, there are options to change the following: - ### `Color` - The color of the molecules. All of the molecules for a given component will be the same color with the exeption ligands which can be colored by their element. + The color of the molecules. All of the molecules for a given component will be the same color, except ligands, which can be colored by element. - ### `Opacity` - The opacity of the component. This can be helpful for illustrating the molecular structure with the `metric` superimposed on the surface of the protein. + The opacity of the component. This can help illustrate the molecular structure with the `metric` superimposed on the surface of the protein. ## Interaction @@ -113,7 +113,7 @@ You can **zoom** in and out of regions of your data by **brushing** (_click and -You can **mouseover** sites on the line/point and mutations on the heatmap to see details in a pop-up **tooltip** and you can **select sites** to see in the **heatmap** by **clicking** on points in the line/point plot. +You can **mouse over** sites on the line/point and mutations on the heatmap to see details in a pop-up **tooltip** and you can **select sites** to see in the **heatmap** by **clicking** on points in the line/point plot.
-If there is more than one condition in your data, an interactive legend will appear in the `Chart Options` You can **select a condition** to color the protein structure with by **clicking** on an condition in the legend. +If there is more than one condition in your data, an interactive legend will appear in the `Chart Options`. You can **select a condition** to color the protein structure with by **clicking** on a condition in the legend.