Evaluating SV callers
Luca Santuari edited this page Apr 6, 2018 · 4 revisions
To quantify the ability of each caller to detect SVs, we use multiple ground truth datasets. These are collections of SVs that have been validated with multiple sequencing technologies and through comparison with independent datasets. Here is a tentative list:
- The pilot genome NA12878 from the Genome in a Bottle (GiaB) Consortium
  - 30x downsampled paired-end BAM file RMNISTHS_30xdownsample.bam
  - Deletion and insertion sets from the analysis with svclassify (source: svclassify Supplemental Data):
    - Deletions: 'Personalis_1000_Genomes_deduplicated_deletions.bed'
    - Insertions: 'Spiral_Genetics_insertions.bed'
  - Gold standard deletion and insertion set available as Supplemental material of Mills et al., 2011
- The COLO829 dataset of the Hartwig Medical Foundation
  - Samples from Craig et al., 2016
- The OC26 dataset (Kloosterman group)
  - Internal dataset for Tumor and Normal BAM files (17% clipped reads).
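As an illustration of how a truth set such as the deletion BED file above might be used, here is a minimal Python sketch that scores a caller's deletions against truth intervals. The matching rule (50% reciprocal overlap), the function names, and the `min_ro` parameter are assumptions made for this example and are not prescribed by this page. The sketch only suits interval-like SVs such as deletions; insertions would need a breakpoint-distance criterion instead.

```python
from typing import Iterable, List, Tuple

# (chrom, start, end) with BED-style 0-based, half-open coordinates
Interval = Tuple[str, int, int]


def parse_bed(lines: Iterable[str]) -> List[Interval]:
    """Read the first three BED columns, skipping comments and header lines."""
    intervals = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end = line.split()[:3]
        intervals.append((chrom, int(start), int(end)))
    return intervals


def reciprocal_overlap(a: Interval, b: Interval) -> float:
    """Return the smaller of the two overlap fractions (0.0 if disjoint)."""
    if a[0] != b[0]:
        return 0.0
    overlap = min(a[2], b[2]) - max(a[1], b[1])
    if overlap <= 0:
        return 0.0
    return min(overlap / (a[2] - a[1]), overlap / (b[2] - b[1]))


def evaluate(calls: List[Interval], truth: List[Interval],
             min_ro: float = 0.5) -> Tuple[float, float]:
    """Return (precision, recall) under the assumed reciprocal-overlap rule."""
    matched_truth = set()
    true_positive_calls = 0
    for call in calls:
        hits = [i for i, t in enumerate(truth)
                if reciprocal_overlap(call, t) >= min_ro]
        if hits:
            true_positive_calls += 1
            matched_truth.update(hits)
    precision = true_positive_calls / len(calls) if calls else 0.0
    recall = len(matched_truth) / len(truth) if truth else 0.0
    return precision, recall
```

In practice the truth intervals would come from something like `parse_bed(open(path))`, and the calls from the END and POS fields of the caller's VCF records. The quadratic loop is fine for a few thousand SVs; an interval tree would be the natural replacement for larger sets.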
mergeVCF can be used to merge VCF files. The output is a VCF file interleaved with information on the records.
SURVIVOR also has a merge tool that can be used for merging VCF files.
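For SURVIVOR, the sketch below builds the positional arguments of `SURVIVOR merge` from Python. It assumes the argument order documented by SURVIVOR (list file of VCF paths, maximum breakpoint distance, minimum number of supporting callers, type flag, strand flag, a disabled legacy flag, minimum SV size, output VCF). The default values and the function names are illustrative assumptions, not settings recommended by this page; check them against the help text of your installed SURVIVOR version.

```python
import subprocess
from typing import List


def survivor_merge_command(vcf_list_file: str, output_vcf: str,
                           max_dist: int = 1000, min_support: int = 1,
                           use_type: bool = True, use_strand: bool = True,
                           min_size: int = 50) -> List[str]:
    """Build the argument list for `SURVIVOR merge` (defaults are illustrative)."""
    return [
        "SURVIVOR", "merge",
        vcf_list_file,               # text file with one VCF path per line
        str(max_dist),               # max distance between breakpoints (bp)
        str(min_support),            # min number of callers supporting an SV
        "1" if use_type else "0",    # require the same SV type
        "1" if use_strand else "0",  # require the same strands
        "0",                         # legacy size-based distance flag (disabled)
        str(min_size),               # min SV size to keep
        output_vcf,
    ]


def run_survivor_merge(vcf_list_file: str, output_vcf: str, **kwargs) -> None:
    """Run the merge; requires the SURVIVOR binary on PATH."""
    subprocess.run(survivor_merge_command(vcf_list_file, output_vcf, **kwargs),
                   check=True)
```

A call such as `run_survivor_merge("callers.txt", "merged.vcf", min_support=2)` would keep only SVs reported by at least two callers; the file names here are hypothetical.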