This is a modified version of Pysam, a Python module for reading and manipulating SAM/BAM/VCF/BCF files. This fork adds support for reading BGZF files encrypted block by block with a symmetric key.


Pysam with support for reading encrypted BGZF files

Pysam is a Python module for reading and manipulating files in the SAM/BAM format. The SAM/BAM format efficiently stores large numbers of alignments (Li 2009), such as those routinely produced by next-generation sequencing methods.

Pysam is a lightweight wrapper of the samtools C-API. Pysam also includes an interface for tabix.

Installation:

$ git clone https://github.com/Munchic/pysam/
$ cd pysam
$ python setup.py install --user

Usage (adapted from https://samtools.github.io/bcftools/bgzf-aes-encryption.pdf):

# Generate a random private key and its hash (digest)
$ KEY=`dd if=/dev/urandom bs=1 count=32 2>/dev/null | xxd -ps -c32`
$ HASH=`echo $KEY | openssl sha256 | cut -f2 -d ' '`
$ echo -e "$HASH\t$KEY" > hts-keys.txt

# Configure environment variables, compress + encrypt, and index with built-in crypto htslib
$ export HTS_KEYS=hts-keys.txt
$ HTS_ENC=${KEY} htslib/bgzip -c in.vcf > enc.vcf.gz
$ HTS_ENC=${KEY} htslib/tabix enc.vcf.gz
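
The key file that HTS_KEYS points to can also be produced from Python rather than the shell. The following is a minimal sketch that mirrors the dd/openssl commands above using only the standard library. The function names `make_key_entry` and `write_key_file` are hypothetical and are not part of pysam or htslib. It assumes the digest is taken over the hex text of the key followed by a newline, which is what `echo $KEY | openssl sha256` hashes.

```python
import hashlib
import secrets


def make_key_entry():
    """Return (digest, key) as hex strings, mirroring the shell commands."""
    # 32 random bytes rendered as 64 hex characters, like `xxd -ps -c32`.
    key = secrets.token_hex(32)
    # `echo $KEY` appends a newline, so the newline is hashed as well.
    digest = hashlib.sha256((key + "\n").encode("ascii")).hexdigest()
    return digest, key


def write_key_file(path="hts-keys.txt"):
    """Write one tab-separated "digest<TAB>key" line to `path`."""
    digest, key = make_key_entry()
    with open(path, "w") as handle:
        handle.write(f"{digest}\t{key}\n")
    return digest, key
```

Calling `write_key_file()` writes `hts-keys.txt` in the current directory and returns the digest and key so they can be exported to the environment before running bgzip and tabix.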

# Read encrypted tabix-indexed file in Python with Pysam
import pysam
# HTS_KEYS must be set in the environment so the file can be decrypted
tbx = pysam.TabixFile("enc.vcf.gz")
for row in tbx.fetch('chr1'):
    print(row)
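
When no parser is passed to `TabixFile`, each item yielded by `fetch` is a plain tab-delimited string. As a sketch of how such a row might be split into the fixed leading VCF columns, the helper below (a hypothetical name, not part of pysam) uses only the standard library and assumes the rows are VCF data lines.

```python
def parse_vcf_row(row):
    """Split one tab-delimited VCF data line into its first five columns."""
    fields = row.rstrip("\n").split("\t")
    return {
        "chrom": fields[0],
        "pos": int(fields[1]),        # 1-based position
        "id": fields[2],
        "ref": fields[3],
        "alt": fields[4].split(","),  # ALT may list several alleles
    }
```

Inside the loop above, `parse_vcf_row(row)` could replace the `print` call when structured access to the columns is needed.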

The latest version of pysam is available through PyPI. To install, simply type:

pip install pysam

If you are using the conda package manager (e.g. miniconda or anaconda), you can install pysam from the bioconda channel:

conda config --add channels r

conda config --add channels bioconda

conda install pysam

Pysam documentation is hosted on Read the Docs (https://readthedocs.org/).

Questions and comments are very welcome and should be sent to the pysam user group.
