pyNBS Command Line Manual

Justin Huang edited this page Feb 3, 2018 · 11 revisions

To run the NBS algorithm from the command line, we provide a Python script, run_pyNBS.py, that wraps the pyNBS package into a single command. We also provide a template bash shell script that users can edit to generate a custom pyNBS command. The documentation for editing the shell script can be found within the script itself, and it can also be viewed in its entirety here.

The Python script requires Python 2.7, pyNBS, and all of its dependencies to be installed already. To see the help manual and parameters for run_pyNBS.py in the terminal, use the command:
python run_pyNBS.py -h

Notes about run_pyNBS.py:

  • Parameters passed to the Python script on the command line override the corresponding default values and any values set by the parameter file passed to -params.
  • pyNBS requires a network file and will construct both a network regularizer and a network propagation kernel from it.
  • This script performs network propagation using the network kernel constructed and will use the network regularizer in netNMF.

We also describe the parameters for run_pyNBS.py here:

Required parameters

  • sm_data_file Path to binary mutation matrix file. May be a delimited matrix (with headers) or a delimited 2-column list where each line is a sample and the gene mutated separated by a common delimiter. See the somatic mutation data file documentation for additional details on the file format of sm_data_file.
  • network_file Path to molecular network file. File must be a delimited table with at least 2 columns where each line is a gene interaction separated by a common delimiter and the first 2 columns represent interacting proteins. See the network file documentation for additional details on the file format of network_file.
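
As a rough illustration of the two input layouts described above, the sketch below turns a 2-column sample/gene list into a binary mutation matrix and reads the first two columns of a network table as interactions. It is not how pyNBS loads files internally: the function names, the tab delimiter, the absence of a header row, and the sample and gene names are all assumptions made for the example.

```python
def binary_matrix_from_pairs(lines, delimiter="\t"):
    """Build {sample: {gene: 0 or 1}} from 'sample<delimiter>gene' lines."""
    mutated = {}
    genes = set()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        sample, gene = line.split(delimiter)[:2]
        mutated.setdefault(sample, set()).add(gene)
        genes.add(gene)
    gene_order = sorted(genes)
    # 1 if the gene is mutated in the sample, 0 otherwise.
    return {s: {g: int(g in gs) for g in gene_order} for s, gs in mutated.items()}


def edges_from_table(lines, delimiter="\t"):
    """Keep the first two columns of each row as one gene interaction."""
    edges = []
    for line in lines:
        fields = line.strip().split(delimiter)
        if len(fields) >= 2:
            edges.append((fields[0], fields[1]))  # extra columns are ignored
    return edges


# Hypothetical example data (assumed tab-delimited, no header row).
pairs = ["sample_1\tTP53", "sample_1\tKRAS", "sample_2\tTP53"]
matrix = binary_matrix_from_pairs(pairs)

network_rows = ["TP53\tMDM2\t0.9", "KRAS\tBRAF\t0.8"]
edges = edges_from_table(network_rows)
```

See the linked file format documentation for the delimiters and headers that pyNBS actually accepts.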

Optional parameters

  • -h, --help Show the Python help manual for this function
  • -params, --params_file Optional path to a pyNBS parameters file. If no file is given, default values are used for all internal pyNBS parameters. If a file is given, the default values of the parameters listed in the file are overwritten by the values in the file. The exception is any parameter value given to run_pyNBS.py on the command line: command line parameters supersede both the parameter file and the default values. See the pyNBS parameters file documentation for additional parameter details and file format.
  • -o, --outdir Path to an output directory for saving pyNBS results. This value will be passed into **saveargs['outdir'] for the functions called by pyNBS that are expected to save some output. Each function already has a default for whether or not that step saves its output to file. Which pyNBS files are saved can be changed in the pyNBS parameters file. run_pyNBS.py will attempt to create the directory at the outdir path if it does not already exist. If this parameter is not given and outdir is not pre-set by a parameter file, the default is the current working directory.
  • -j, --job_name The filename prefix used to tag the files saved by a particular run of pyNBS. This value will be passed into **saveargs['job_name'] for functions being passed the **save_args dictionary.
  • -a, --alpha Propagation constant to use in the propagation of mutations over the molecular network. The value should be in the range 0.0-1.0 inclusive, and the default value is 0.7. For additional information on this parameter, please see the documentation for the network_propagation function. For supplemental analysis on this parameter, please see the pyNBS parameter benchmarking supplemental notebook.
  • -k, --K This is the number of clusters to stratify the patient samples into in the NBS algorithm. This parameter will be used as the k parameter in the mixed_netNMF, NBS_single and consensus_hclust_hard functions. For supplemental analysis on this parameter, please see the pyNBS parameter benchmarking supplemental notebook.
  • -n, --niter The number of iterations to perform the core pyNBS steps (sub-sampling, propagation and network-regularized NMF) for consensus clustering. This is the number of times NBS_single will be called before the consensus_hclust_hard function is called.
  • -surv, --survival_data Optional path to patient clinical data. If given (either by command line or params file), pyNBS will attempt to perform survival analysis and plot a Kaplan-Meier plot. Otherwise, no survival analysis will be performed. The file must be a 5-column delimited file and is described in detail in the survival data file format page.
  • -t, --threads The number of threads to be used by the pyNBS process. The default number of threads is set to 2 (not to be confused with the number of cores used). Certain processes will execute more quickly if more threads are used.
  • -nv, --no_verbose Verbosity flag for suppressing reporting of pyNBS algorithm progress. Not using this flag will use verbose reporting (default behavior).
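
To show how the options above fit together, here is a small sketch that assembles a run_pyNBS.py call as an argument list. It is a convenience helper written for this page, not part of pyNBS: the function name, the file names, and the order of the two required arguments (mutation file first, network file second, as listed above) are assumptions. Check `python run_pyNBS.py -h` for the authoritative usage.

```python
def build_pynbs_command(sm_data_file, network_file, params_file=None,
                        outdir=None, job_name=None, alpha=None, k=None,
                        niter=None, survival_data=None, threads=None,
                        no_verbose=False):
    """Return an argv-style list for run_pyNBS.py using the flags documented above."""
    # The manual states alpha should lie in 0.0-1.0 inclusive.
    if alpha is not None and not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0.0 and 1.0 inclusive")

    # Assumed positional order: mutation data file, then network file.
    cmd = ["python", "run_pyNBS.py", sm_data_file, network_file]

    optional = [
        ("-params", params_file),
        ("-o", outdir),
        ("-j", job_name),
        ("-a", alpha),
        ("-k", k),
        ("-n", niter),
        ("-surv", survival_data),
        ("-t", threads),
    ]
    for flag, value in optional:
        if value is not None:  # omit flags the user did not set
            cmd.extend([flag, str(value)])

    if no_verbose:
        cmd.append("-nv")  # flag takes no value
    return cmd


# Hypothetical file names, used only for illustration.
command = build_pynbs_command("mutations.txt", "network.txt",
                              outdir="results", alpha=0.7, k=4, niter=100)
```

The resulting list can be joined with spaces for display or passed to subprocess.call to launch the run.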