Filter BAM using GATK

Yuwei Bao, March 21, 2023

All of the following information is taken from the GATK (v4.2.6.1) tool documentation [1].

1. MarkDuplicates (Picard): Identifies duplicate reads [2]

java -jar picard.jar MarkDuplicates \
      -I input.bam \
      -O marked_duplicates.bam \
      -M marked_dup_metrics.txt \
      --REMOVE_DUPLICATES TRUE

Parameters

  • -M: File to which the duplication metrics are written (marked_dup_metrics.txt in the command above).
  • --REMOVE_DUPLICATES: Optional, default false. If true, duplicates are not written to the output file; otherwise they are written with the appropriate flags set.
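After MarkDuplicates finishes, the metrics file is usually the first thing to inspect. The sketch below pulls the PERCENT_DUPLICATION value for each library out of that file. It assumes the usual Picard metrics layout (comment lines starting with "#", a tab-separated header row beginning with LIBRARY, then one row per library, then a blank line); the function name percent_duplication is made up for this example.

```shell
# Print "<library><TAB><PERCENT_DUPLICATION>" for each library in a Picard
# duplication metrics file. The column is found by name in the header row
# (the line whose first field is "LIBRARY"), not by position.
percent_duplication() {
    awk -F '\t' '
        # Header row of the metrics table: remember the column index.
        $1 == "LIBRARY" {
            for (i = 1; i <= NF; i++) if ($i == "PERCENT_DUPLICATION") col = i
            in_table = 1
            next
        }
        # A blank line ends the metrics table (a histogram may follow).
        in_table && NF == 0 { in_table = 0 }
        # Data rows of the metrics table.
        in_table && col { print $1 "\t" $col }
    ' "$1"
}

# Usage (file name taken from the MarkDuplicates command above):
#   percent_duplication marked_dup_metrics.txt
```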

2. AddOrReplaceReadGroups (Picard): Assigns all the reads in a file to a single new read-group. [3]

java -jar picard.jar AddOrReplaceReadGroups \
       I=input.bam \
       O=output.bam \
       RGID=4 \
       RGLB=lib1 \
       RGPL=ILLUMINA \
       RGPU=unit1 \
       RGSM=read_group_sample_name
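To confirm that the read group was written, the BAM header can be inspected for @RG lines. The following is a minimal sketch that reads SAM header text on standard input and lists the ID and SM of each read group. The helper name list_read_groups is hypothetical, and using samtools to dump the header is an assumption, since samtools is not part of the workflow described here.

```shell
# Read SAM header text on stdin and print "<ID><TAB><SM>" for every @RG line.
# Exits with status 1 when the header contains no read group at all.
list_read_groups() {
    awk -F '\t' '
        $1 == "@RG" {
            id = ""; sm = ""
            for (i = 2; i <= NF; i++) {
                if ($i ~ /^ID:/) id = substr($i, 4)
                if ($i ~ /^SM:/) sm = substr($i, 4)
            }
            print id "\t" sm
            found = 1
        }
        END { exit (found ? 0 : 1) }
    '
}

# Usage, assuming samtools is installed:
#   samtools view -H output.bam | list_read_groups
```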

3. BaseRecalibrator: Generates a recalibration table for Base Quality Score Recalibration (BQSR) [4]

gatk BaseRecalibrator \
   -I my_reads.bam \
   -R reference.fasta \
   --known-sites sites_of_variation.vcf \
   --known-sites another/optional/setOfSitesToMask.vcf \
   -O recal_data.table

Parameters

  • --known-sites: One or more databases of known polymorphic sites used to exclude regions around known polymorphisms from analysis.
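BaseRecalibrator expects the reference and the known-sites files to come with their companion files: a FASTA index (.fai) and a sequence dictionary (.dict) for the reference, and an index for each VCF. The sketch below only checks that these files exist before the tool is run. The function name check_bqsr_inputs is made up, and the naming rules it applies (reference.dict next to reference.fasta, .idx for plain VCFs, .tbi for bgzipped ones) are the common defaults rather than something stated in this note. Missing files are typically created with samtools faidx, gatk CreateSequenceDictionary and gatk IndexFeatureFile.

```shell
# Report missing companion files for a BaseRecalibrator run.
# Usage: check_bqsr_inputs reference.fasta sites1.vcf [sites2.vcf.gz ...]
# Prints one line per missing file and returns 1 if anything is missing.
check_bqsr_inputs() {
    ref=$1
    shift
    missing=0
    # The reference needs a FASTA index and a sequence dictionary next to it.
    for f in "$ref" "$ref.fai" "${ref%.*}.dict"; do
        [ -e "$f" ] || { echo "missing: $f"; missing=1; }
    done
    # Each known-sites file needs an index: .tbi if bgzipped, .idx otherwise.
    for vcf in "$@"; do
        case $vcf in
            *.gz) idx=$vcf.tbi ;;
            *)    idx=$vcf.idx ;;
        esac
        for f in "$vcf" "$idx"; do
            [ -e "$f" ] || { echo "missing: $f"; missing=1; }
        done
    done
    return "$missing"
}

# Usage (file names taken from the BaseRecalibrator command above):
#   check_bqsr_inputs reference.fasta sites_of_variation.vcf
```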

4. ApplyBQSR: Applies base quality score recalibration [5]

gatk ApplyBQSR \
   -R reference.fasta \
   -I input.bam \
   --bqsr-recal-file recalibration.table \
   -O output.bam

5. BuildBamIndex (Picard): Generates a BAM index ".bai" file [6]

java -jar picard.jar BuildBamIndex \
      I=input.bam
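The five commands above each use their own placeholder file names. One way to chain them so that every step consumes the previous step's output is sketched below. The intermediate file names (.dedup.bam, .rg.bam, .recal.table, .bqsr.bam), the run and filter_bam helpers, and the DRY_RUN switch are all choices made for this example; the read-group values are the ones from step 2. With DRY_RUN=1 the commands are only printed, which makes it easy to review them before running anything.

```shell
# With DRY_RUN=1 (the default here) each command is printed instead of run.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Usage: filter_bam input.bam reference.fasta known_sites.vcf sample_name
# Each step runs only if the previous one succeeded.
filter_bam() {
    bam=$1; ref=$2; sites=$3; sample=$4
    base=${bam%.bam}

    run java -jar picard.jar MarkDuplicates \
        -I "$bam" -O "$base.dedup.bam" -M "$base.dup_metrics.txt" \
        --REMOVE_DUPLICATES TRUE &&
    run java -jar picard.jar AddOrReplaceReadGroups \
        I="$base.dedup.bam" O="$base.rg.bam" \
        RGID=4 RGLB=lib1 RGPL=ILLUMINA RGPU=unit1 RGSM="$sample" &&
    run gatk BaseRecalibrator \
        -I "$base.rg.bam" -R "$ref" --known-sites "$sites" \
        -O "$base.recal.table" &&
    run gatk ApplyBQSR \
        -R "$ref" -I "$base.rg.bam" \
        --bqsr-recal-file "$base.recal.table" -O "$base.bqsr.bam" &&
    run java -jar picard.jar BuildBamIndex I="$base.bqsr.bam"
}

# Usage:
#   filter_bam sample.bam reference.fasta sites_of_variation.vcf sample1
#   DRY_RUN=0 must be set before the call to actually execute the tools.
```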

  1. https://gatk.broadinstitute.org/hc/en-us/articles/5358824293659--Tool-Documentation-Index

  2. https://gatk.broadinstitute.org/hc/en-us/articles/5358880192027-MarkDuplicates-Picard-

  3. https://gatk.broadinstitute.org/hc/en-us/articles/5358911906459-AddOrReplaceReadGroups-Picard-

  4. https://gatk.broadinstitute.org/hc/en-us/articles/5358896138011-BaseRecalibrator

  5. https://gatk.broadinstitute.org/hc/en-us/articles/5358826654875-ApplyBQSR

  6. https://gatk.broadinstitute.org/hc/en-us/articles/5358886012443-BuildBamIndex-Picard-