
Releases: rwk-unil/xSqueezeIt

v4.0.0 release

11 Nov 14:45

xSqueezeIt and the XSI file format - version 4.0.0

Publication: XSI - A genotype compression tool for compressive genomics in large biobanks

Please cite this article if you use XSI in your research.

Wertenbroek, R., Rubinacci, S., Xenarios, I., Thoma, Y., & Delaneau, O. (2022). XSI—a genotype compression tool for compressive genomics in large biobanks. Bioinformatics, 38(15), 3778-3784.
@article{wertenbroek2022xsi,
  title={XSI—a genotype compression tool for compressive genomics in large biobanks},
  author={Wertenbroek, Rick and Rubinacci, Simone and Xenarios, Ioannis and Thoma, Yann and Delaneau, Olivier},
  journal={Bioinformatics},
  volume={38},
  number={15},
  pages={3778--3784},
  year={2022},
  publisher={Oxford University Press}
}

xSqueezeIt is a VCF / BCF genotype data compressor. Rare variants are stored in a sparse representation; common variants are encoded with the positional Burrows-Wheeler transform (PBWT) followed by 16-bit Word Aligned Hybrid (WAH) encoding. The minor allele frequency (MAF) threshold that separates rare from common variants is selectable. XSI provides fast random access, is extensible, and allows for "compressive acceleration" to speed up research.
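The encoding strategy described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not XSI's actual implementation or on-disk layout: the function names (`pbwt_step`, `wah16_encode`, `wah16_decode`, `encode_variant`) are hypothetical, variants are assumed biallelic with 0/1 alleles per haplotype, the 16-bit word layout (1 flag bit, 1 fill bit, 14-bit run counter, or 15 literal bits) is a generic WAH-style choice, and leaving the PBWT order untouched for rare variants is also an assumption.

```python
from typing import List, Tuple

WAH_BITS = 15            # payload bits per 16-bit word (assumed layout)
MAX_RUN = (1 << 14) - 1  # the run counter of a fill word has 14 bits


def pbwt_step(order: List[int], column: List[int]) -> Tuple[List[int], List[int]]:
    """Apply one PBWT step to a biallelic variant column.

    Returns the column read in the current haplotype order, and the next
    order: a stable partition with allele-0 haplotypes first.
    """
    permuted = [column[h] for h in order]
    zeros = [h for h in order if column[h] == 0]
    ones = [h for h in order if column[h] == 1]
    return permuted, zeros + ones


def wah16_encode(bits: List[int]) -> List[int]:
    """Encode bits into 16-bit words: fill words for runs, literals otherwise."""
    words: List[int] = []
    for i in range(0, len(bits), WAH_BITS):
        group = bits[i:i + WAH_BITS]
        group = group + [0] * (WAH_BITS - len(group))  # zero-pad the last group
        if all(b == group[0] for b in group):
            fill_bit = group[0]
            prev = words[-1] if words else 0
            same_fill = (prev & 0x8000) and ((prev >> 14) & 1) == fill_bit
            if same_fill and (prev & 0x3FFF) < MAX_RUN:
                words[-1] = prev + 1  # extend the previous run by one group
            else:
                words.append(0x8000 | (fill_bit << 14) | 1)
        else:
            literal = 0
            for b in group:
                literal = (literal << 1) | b
            words.append(literal)  # top bit stays 0 for literal words
    return words


def wah16_decode(words: List[int], n_bits: int) -> List[int]:
    """Inverse of wah16_encode; n_bits trims the padding of the last group."""
    bits: List[int] = []
    for w in words:
        if w & 0x8000:
            bits.extend([(w >> 14) & 1] * (WAH_BITS * (w & 0x3FFF)))
        else:
            bits.extend((w >> (WAH_BITS - 1 - k)) & 1 for k in range(WAH_BITS))
    return bits[:n_bits]


def encode_variant(order: List[int], column: List[int], maf_threshold: float):
    """Pick sparse or PBWT+WAH encoding for one variant from its MAF."""
    n = len(column)
    alt = sum(column)
    minor = 1 if alt * 2 <= n else 0
    if min(alt, n - alt) / n < maf_threshold:
        # Rare variant: store only the haplotypes carrying the minor allele.
        # Assumption: rare variants leave the PBWT order unchanged.
        carriers = [h for h, a in enumerate(column) if a == minor]
        return ("sparse", minor, carriers), order
    permuted, next_order = pbwt_step(order, column)
    return ("wah", wah16_encode(permuted)), next_order
```

The intuition behind this split is that a rare variant has few carriers, so a short list of indices is already compact, while PBWT reordering tends to group identical alleles of common variants into long runs that the fill words of a WAH-style encoder collapse.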

Variant information is left in BCF format to remain compatible with HTSlib / BCFtools; genotype data is custom encoded as described above. The encoded genotype data can then optionally be further compressed with zstd (https://github.com/facebook/zstd/).

Build

Dockerfile

# Build
docker build -f Dockerfile -t xsi:xsi .
# Run
docker run xsi:xsi xsqueezeit # arguments ...

Building the xSqueezeIt command line tool

This build requires GCC 8+ because modern C++17 features are used.

# Clone
git clone https://github.com/rwk-unil/xSqueezeIt.git --branch v4.0.0 --single-branch
cd xSqueezeIt

# Clone and build htslib (if you already have htslib, set the Makefile accordingly and skip this step)
git submodule update --init --recursive htslib
cd htslib
autoheader
autoconf
automake --add-missing 2>/dev/null
./configure
make
sudo make install
sudo ldconfig
cd ..

# Clone and build zstd (if you already have zstd, set the Makefile accordingly and skip this step)
git clone https://github.com/facebook/zstd.git
cd zstd
make
cd ..

# Build application
make