Home
Wang Yunfei edited this page Feb 8, 2017
- ngslib is a Python-based package for Next-Generation Sequencing analysis.
- ngslib is used for manipulating genome annotation and sequence files, such as Fasta, Bed, GenePred, BAM, Wiggle and BigWig formats.
- ngslib uses the "lib" and "inc" directories from Jim Kent's toolkit. Users should read the README file inside those directories and comply with its terms.
- All files are copyrighted, but license is hereby granted for personal, academic and non-profit use. Commercial users should contact Yunfei Wang for details.
#Prerequisites:
Python packages: (will be installed automatically while installing ngslib)
- numpy, a package for scientific computing with Python
- pysam for SAM/BAM file manipulation
- argparse, a parser for command-line options (Python 2.6 only)
Other requirements in some rare cases: (need to be installed separately)
- python-dev for the Python.h include file. (Ubuntu only, sudo apt-get install python-dev)
- libpng for compiling Kentlib. (Ubuntu only, sudo apt-get install libpng-dev)
> easy_install --prefix=install_path ngslib
This package requires Python 2.6 or 2.7. It has been tested on CentOS 6.4, Fedora 17, RedHat 5.5 and Ubuntu 12.04. Other platforms may not work well.
Download source file:
> easy_install --editable --build-directory download_path ngslib
General installation instructions:
- Set the PYTHONPATH environment variable.
- Specify the install path with "--prefix=install_path". In general, set "--prefix=$HOME/local".
> cd ngslib
> python setup.py build
> python setup.py install --prefix=install_path
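As a rough illustration of the two notes above, the sketch below computes the directory that a "--prefix" install conventionally populates on Linux, which is the path PYTHONPATH would need to include. The helper name `site_packages_for_prefix` is hypothetical and not part of ngslib, and the `lib/pythonX.Y/site-packages` layout is an assumption (some distributions use `lib64` instead).

```python
import posixpath
import sys


def site_packages_for_prefix(prefix, version=None):
    """Guess the site-packages directory created by `setup.py install --prefix`.

    Assumes the conventional POSIX layout <prefix>/lib/pythonX.Y/site-packages;
    this helper is illustrative and not part of ngslib.
    """
    major, minor = (version or sys.version_info)[:2]
    return posixpath.join(prefix, "lib", "python%d.%d" % (major, minor), "site-packages")


# Example: the path to add to PYTHONPATH for --prefix=/home/user/local on Python 2.7.
print(site_packages_for_prefix("/home/user/local", (2, 7)))
# -> /home/user/local/lib/python2.7/site-packages
```

setuptools typically expects this directory to exist and to be listed in PYTHONPATH before the install command is run, so it is worth creating it and exporting the variable first.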
Data Structure
- Seq(Seq): sequence format.
- Fasta(Fasta): Fasta format.
- Bed: Genome interval format.
- GeneBed: Gene annotation format.
- BedList: List of Bed and its derived formats.
- BedMap: Arranges Bed or GeneBed records using a bin index for fast overlap searches.
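To make the bin-index idea behind BedMap more concrete, here is a minimal sketch of the standard UCSC binning scheme from Jim Kent's toolkit. It is an assumption that BedMap follows this exact scheme. The names `bin_from_range`, `overlapping_bins` and `IntervalIndex` are hypothetical and are not the ngslib API.

```python
from collections import defaultdict

# Standard UCSC binning constants: smallest bins are 128 kb (2**17),
# and each higher level is 8 times larger (shift by 3 bits).
BIN_OFFSETS = [512 + 64 + 8 + 1, 64 + 8 + 1, 8 + 1, 1, 0]
FIRST_SHIFT = 17
NEXT_SHIFT = 3


def bin_from_range(start, end):
    """Return the smallest bin that fully contains the half-open range [start, end)."""
    start_bin = start >> FIRST_SHIFT
    end_bin = (end - 1) >> FIRST_SHIFT
    for offset in BIN_OFFSETS:
        if start_bin == end_bin:
            return offset + start_bin
        start_bin >>= NEXT_SHIFT
        end_bin >>= NEXT_SHIFT
    raise ValueError("range does not fit the 512 Mb binning scheme")


def overlapping_bins(start, end):
    """List every bin, at every level, that can hold an interval overlapping [start, end)."""
    start_bin = start >> FIRST_SHIFT
    end_bin = (end - 1) >> FIRST_SHIFT
    bins = []
    for offset in BIN_OFFSETS:
        bins.extend(range(offset + start_bin, offset + end_bin + 1))
        start_bin >>= NEXT_SHIFT
        end_bin >>= NEXT_SHIFT
    return bins


class IntervalIndex(object):
    """Toy interval store keyed by (chrom, bin); illustrative only."""

    def __init__(self):
        self._bins = defaultdict(list)

    def add(self, chrom, start, end, name):
        self._bins[(chrom, bin_from_range(start, end))].append((start, end, name))

    def find_overlaps(self, chrom, start, end):
        hits = []
        for b in overlapping_bins(start, end):
            for s, e, name in self._bins.get((chrom, b), ()):
                if s < end and start < e:  # half-open overlap test
                    hits.append(name)
        return hits


index = IntervalIndex()
index.add("chr1", 100, 200, "a")
index.add("chr1", 150000, 300000, "b")
index.add("chr2", 100, 200, "c")
print(index.find_overlaps("chr1", 150, 160))
```

A query only inspects the handful of bins that can contain an overlapping interval, rather than scanning every record on the chromosome, which is where the speed-up comes from.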
Modules
- BioReader: A general parser for Bed, Wiggle, Peak, GeneBed and other formats.
- StringFile: read string as a file.
- DB: Builds indexes for fast access to biological files.
- BigWigFile: Fast access of BigWig file.
- FastaFile: Fast sequence retrieval from large genomes in Fasta format.
- TwoBitFile: Fast sequence retrieval from large genomes in TwoBit format.
- wRNA: RNA structure prediction and visualization.
- Pipeline: Build pipeline using python wrapped shell commands and tools.
- Utils: Utilities
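The following sketch illustrates, using only the Python standard library, the ideas behind a general Bed parser and reading a string as a file. It does not use or reproduce the ngslib classes: `BedRecord` and `read_bed` are hypothetical names, `io.StringIO` stands in for the StringFile concept, and the default values for missing columns are assumptions.

```python
import io
from collections import namedtuple

# Hypothetical record type covering the first six BED columns.
BedRecord = namedtuple("BedRecord", "chrom start end name score strand")


def read_bed(handle):
    """Yield BedRecord objects from any file-like object containing BED lines."""
    for line in handle:
        line = line.strip()
        # Skip blank lines, comments and browser/track header lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split()  # BED columns are whitespace-delimited
        chrom, start, end = fields[0], int(fields[1]), int(fields[2])
        # Optional columns fall back to placeholder defaults (an assumption).
        name = fields[3] if len(fields) > 3 else "."
        score = float(fields[4]) if len(fields) > 4 else 0.0
        strand = fields[5] if len(fields) > 5 else "."
        yield BedRecord(chrom, start, end, name, score, strand)


# Treat an in-memory string as a file, so the same parser works for both.
bed_text = u"chr1\t100\t200\tpeak1\t5\t+\n# comment\nchr2\t50\t80\n"
records = list(read_bed(io.StringIO(bed_text)))
for record in records:
    print(record)
```

Because the parser only relies on iterating over lines, the same function accepts an open file, a pipe, or a string wrapped in a file-like object.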
Scripts
- wBedToFasta.py
- wBedExtend.py
- wBamToWig.py
- Fast.
- Uniform coding style and universal interface.
- Simplified.
- Clarified.
- Please cite https://pypi.python.org/pypi/ngslib.