Filter bam using samtools
March 21, 2023
All the following information is from samtools (v1.15.1)
1. fixmate: Fill in mate coordinates, ISIZE and mate-related flags from a name-sorted alignment.
samtools fixmate [-rpcm] [-O format] in.nameSrt.bam out.bam
Parameters
-r: Remove secondary and unmapped reads.
-m: Add ms (mate score) tags. These are used by markdup to select the best reads to keep.
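Since fixmate expects name-sorted input, a typical first step is to sort by read name and then run fixmate with -m. The sketch below is one way to wire that up; the file names (sample.bam and the derived outputs), the run helper, and the fixmate_step function are hypothetical placeholders, not part of samtools. By default the script only prints the commands; set DRY_RUN=0 to actually execute them.

```shell
#!/bin/sh
# Print the command when DRY_RUN=1 (default); execute it when DRY_RUN=0.
run() { if [ "${DRY_RUN:-1}" = "1" ]; then echo "$*"; else "$@"; fi; }

# fixmate_step INPUT_BAM PREFIX  (hypothetical helper)
fixmate_step() {
    src=$1
    prefix=$2
    # fixmate needs name-sorted input, so sort by read name first (-n).
    run samtools sort -n -o "${prefix}.namesort.bam" "$src"
    # -m adds the ms (mate score) tag that markdup uses later.
    run samtools fixmate -m "${prefix}.namesort.bam" "${prefix}.fixmate.bam"
}

# Hypothetical input file and output prefix.
fixmate_step sample.bam sample
```

Add -r to the fixmate call if secondary and unmapped reads should be dropped at this stage.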
2. markdup: Mark duplicate alignments from a coordinate-sorted file that has been run through samtools fixmate with the -m option. This program relies on the MC and ms tags that fixmate provides.
samtools markdup [-l length] [-r] [-s] [-T] [-S] in.algsort.bam out.bam
Parameters
-r: Remove duplicate reads.
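Because markdup wants coordinate-sorted input that already carries the fixmate tags, the fixmate output has to be re-sorted by position before duplicates are removed. A minimal sketch, assuming a fixmate -m output named PREFIX.fixmate.bam already exists; the file names, the run helper, and the dedup_step function are hypothetical. The commands are only printed unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Print the command when DRY_RUN=1 (default); execute it when DRY_RUN=0.
run() { if [ "${DRY_RUN:-1}" = "1" ]; then echo "$*"; else "$@"; fi; }

# dedup_step PREFIX  (hypothetical helper)
# Expects PREFIX.fixmate.bam, produced by "samtools fixmate -m".
dedup_step() {
    prefix=$1
    # markdup needs coordinate order, which is the default for samtools sort.
    run samtools sort -o "${prefix}.possort.bam" "${prefix}.fixmate.bam"
    # -r removes the duplicates instead of only marking them.
    run samtools markdup -r "${prefix}.possort.bam" "${prefix}.dedup.bam"
}

# Hypothetical prefix.
dedup_step sample
```

Leaving out -r keeps the duplicate reads in the output with the duplicate flag set, which is useful when you only want to count them.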
3. sort: Sort alignments by leftmost coordinates, or by read name when -n is used. An appropriate @HD-SO sort order header tag will be added or an existing one updated if necessary.
Usage
samtools sort [-l level] [-u] [-m maxMem] [-o out.bam] [-O format] [-M] [-K kmerLen] [-n] [-t tag] [-T tmpprefix] [-@ threads] [in.sam|in.bam|in.cram]
Parameters
-l INT
Set the desired compression level for the final output file, ranging from 0 (uncompressed) or 1 (fastest but minimal compression) to 9 (best compression but slowest to write), similarly to gzip(1)'s compression level setting.
-@ INT
Set number of sorting and compression threads. By default, operation is single-threaded.
-o FILE
Write the final sorted output to FILE, rather than to standard output.
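The three options above are commonly combined in one call. The following is a small sketch of a coordinate sort with an explicit thread count and compression level; the file names, the run helper, and the sort_bam function are hypothetical, and the values 4 and 9 are arbitrary examples. Commands are printed by default and only executed with DRY_RUN=0.

```shell
#!/bin/sh
# Print the command when DRY_RUN=1 (default); execute it when DRY_RUN=0.
run() { if [ "${DRY_RUN:-1}" = "1" ]; then echo "$*"; else "$@"; fi; }

# sort_bam INPUT OUTPUT THREADS LEVEL  (hypothetical helper)
sort_bam() {
    src=$1
    dst=$2
    threads=$3
    level=$4
    # -l: compression level, -@: sorting/compression threads, -o: output file.
    run samtools sort -l "$level" -@ "$threads" -o "$dst" "$src"
}

# Hypothetical files; 4 threads, maximum compression.
sort_bam sample.bam sample.sorted.bam 4 9
```

A lower level such as 1 trades file size for speed, which can make sense for intermediate files that are deleted after the next step.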
4. view: View and convert SAM/BAM/CRAM files
Usage
samtools view [options] <in.bam>|<in.sam>|<in.cram> [region ...]
Parameters
-S Ignored (input format is auto-detected)
-b, --bam Output BAM
-@, --threads INT
Number of additional threads to use [0]
-o, --output FILE Write output to FILE [standard output]
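As a rough illustration of these options, the sketch below converts an alignment file to BAM and optionally restricts the output to one or more regions. The file names, the region chr1, the run helper, and the view_to_bam function are hypothetical. Note that region queries need a coordinate-sorted, indexed input (for example one indexed with samtools index). Commands are printed by default and only executed with DRY_RUN=0.

```shell
#!/bin/sh
# Print the command when DRY_RUN=1 (default); execute it when DRY_RUN=0.
run() { if [ "${DRY_RUN:-1}" = "1" ]; then echo "$*"; else "$@"; fi; }

# view_to_bam INPUT OUTPUT THREADS [REGION ...]  (hypothetical helper)
view_to_bam() {
    src=$1
    dst=$2
    threads=$3
    shift 3
    # -b: write BAM, -@: additional threads, -o: output file.
    # Any remaining arguments are passed through as regions.
    run samtools view -b -@ "$threads" -o "$dst" "$src" "$@"
}

# Convert a SAM file to BAM (hypothetical file names).
view_to_bam sample.sam sample.bam 2

# Extract one region from a sorted, indexed BAM (hypothetical region name).
view_to_bam sample.sorted.bam sample.chr1.bam 2 chr1
```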
Other people's related blogs
- Dave Tang: Learning the BAM format
- Felix Yanhui Fan: bam file format and samtools usage