Merge SVs called by different tools

Yuwei Bao, May 5, 2023

SURVIVOR [1]

SURVIVOR can be used to build a consensus call set from several SV callers or to compare SVs across samples.

Installation

git clone https://github.com/fritzsedlazeck/SURVIVOR.git
cd SURVIVOR/Debug
make

Usage

It is common practice to run multiple callers on one sample [2]. First, we list the paths of all VCF files that we want to merge in a single text file. You might want to sort the VCF files before merging.
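If a caller's output is not already sorted, one way to sort it with standard Unix tools is sketched below. The function name sort_vcf is made up for this note and is not part of SURVIVOR. The sketch assumes an uncompressed VCF and orders chromosomes lexically (chr1, chr10, chr2, ...), so a dedicated VCF sorter may be preferable when a specific contig order matters.

```shell
# Hypothetical helper (not part of SURVIVOR): sort an uncompressed VCF by
# chromosome and position, keeping all header lines at the top.
sort_vcf() {
    tab=$(printf '\t')
    {
        # Header lines (starting with '#') are copied unchanged
        grep '^#' "$1"
        # Records are sorted by column 1 (text) and then column 2 (numeric)
        grep -v '^#' "$1" | LC_ALL=C sort -t "$tab" -k1,1 -k2,2n
    } > "$2"
}

# Usage (file names are placeholders):
# sort_vcf caller1_output.vcf caller1_output.sorted.vcf
```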

Prepare input files either by

ls *vcf > sample_files

or by creating a list file by hand, for example vcf_files_list.txt (pass it to SURVIVOR in place of sample_files), with one VCF path per line:

/path/to/caller1_output.vcf
/path/to/caller2_output.vcf
/path/to/caller3_output.vcf
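A small sketch of building such a list from a script, with a check that every file exists and is non-empty. The name make_vcf_list is hypothetical. Writing absolute paths is only a convenience so the list can be used from any working directory; it is not something SURVIVOR is known to require.

```shell
# Hypothetical helper: write one absolute VCF path per line into a list file.
# Arguments: <list_file> <vcf> [<vcf> ...]
make_vcf_list() {
    list_file="$1"
    shift
    : > "$list_file"                      # start with an empty list
    for vcf in "$@"; do
        if [ ! -s "$vcf" ]; then          # -s: file exists and is not empty
            echo "missing or empty VCF: $vcf" >&2
            return 1
        fi
        # Resolve the directory to an absolute path and append the file name
        printf '%s/%s\n' "$(cd "$(dirname "$vcf")" && pwd)" "$(basename "$vcf")" >> "$list_file"
    done
}

# Usage (paths are placeholders):
# make_vcf_list vcf_files_list.txt caller1_output.vcf caller2_output.vcf caller3_output.vcf
```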

Then use SURVIVOR to obtain a merged call set. The example below uses the parameters reported in this paper [3]:

Consensus somatic SVs from multiple somatic SV callsets were generated using “merge” function of SURVIVOR (version: 1.0.7) with the parameters “max distance between breakpoints = 1000,” “Minimum number of supporting caller = 2,” and “Minimum size of SVs to be taken into account = 50.”

# ./SURVIVOR merge <file_list> <max_dist_between_breakpoints> <min_supporting_callers> <use_type> <use_strand> <disabled> <min_sv_size> <output.vcf>
./SURVIVOR merge sample_files 1000 2 1 1 0 50 sample_merged.vcf

Parameters

File with VCF names and paths
Maximum distance between breakpoints (a value between 0 and 1 is read as a fraction of the SV length; a value above 1 as a number of bp)
Minimum number of supporting caller
Take the type into account (1==yes, else no)
Take the strands of SVs into account (1==yes, else no)
Disabled (no longer used, but a value is still given in this position; 0 in the command above).
Minimum size of SVs to be taken into account.
Output VCF filename

This will merge all the VCF files listed in sample_files using a maximum allowed distance of 1 kb, measured pairwise between breakpoints (begin1 vs begin2, end1 vs end2). Furthermore, we ask SURVIVOR to report only calls supported by at least 2 callers, and the callers have to agree on the type (1) and on the strand (1) of the SV. You can relax either requirement by changing the corresponding 1 to, e.g., 0. In addition, we told SURVIVOR to consider only SVs that are at least 50 bp long and to write the output to sample_merged.vcf. [2]
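To get a quick overview of the merged call set, the records can be tallied by SV type and by number of supporting callers. The sketch below assumes the INFO column (column 8) carries SVTYPE= and SUPP= keys; check the header of your own sample_merged.vcf before relying on it. The function name summarize_merged is hypothetical.

```shell
# Hypothetical helper: count merged records per (SVTYPE, SUPP) pair.
# Assumes INFO (column 8) contains SVTYPE=<type> and SUPP=<number of callers>.
summarize_merged() {
    grep -v '^#' "$1" | awk -F'\t' '
    {
        type = "NA"; supp = "NA"
        n = split($8, kv, ";")
        for (i = 1; i <= n; i++) {
            if (kv[i] ~ /^SVTYPE=/) type = substr(kv[i], 8)   # text after "SVTYPE="
            if (kv[i] ~ /^SUPP=/)   supp = substr(kv[i], 6)   # text after "SUPP="
        }
        count[type "\t" supp]++
    }
    END { for (k in count) print k "\t" count[k] }' | LC_ALL=C sort
}

# Usage: prints tab-separated lines of <SVTYPE> <SUPP> <number of records>
# summarize_merged sample_merged.vcf
```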


  1. https://github.com/fritzsedlazeck/SURVIVOR

  2. https://github.com/fritzsedlazeck/SURVIVOR/wiki

  3. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9648002/#Sec11