GATK Data Pre-processing

Yuwei Bao, October 2, 2022

GATK (the Genome Analysis Toolkit) is a toolkit from the Broad Institute for variant discovery in sequencing data. Here are some notes on it:

GATK data pre-processing

Reference: GATK data pre-processing

Here is a workflow from GATK:

[Figure: GATK_preprocessing.png]

  1. Raw Mapped Reads (BAM) -> MarkDuplicates. Check and compare the results:
     samtools view BEFORE_MARKDUPLICATES.bam | wc -l
     samtools view AFTER_MARKDUPLICATES.bam | wc -l
  2. -> BaseRecalibrator + ApplyBQSR -> Analysis-Ready Reads (BAM)
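
The two steps above can be sketched as a small shell script. This is only a rough outline under assumptions, not the official GATK pipeline: every file name except the two BAM names from the check above (`reference.fasta`, `known_sites.vcf.gz`, `markdup_metrics.txt`, `recal.table`, `analysis_ready.bam`) is a hypothetical placeholder, and the `run` and `preprocess` helpers exist only for this illustration. One thing worth knowing: by default MarkDuplicates flags duplicate reads instead of removing them, so the two `wc -l` counts will match unless removal was requested. Counting reads with the duplicate flag (1024) is a more direct check.

```shell
#!/usr/bin/env bash
set -eu

# File names: the two BAM names come from the notes above,
# the rest are hypothetical placeholders to replace with your own.
RAW_BAM="BEFORE_MARKDUPLICATES.bam"
MARKED_BAM="AFTER_MARKDUPLICATES.bam"
REF="reference.fasta"
KNOWN_SITES="known_sites.vcf.gz"

# With DRY_RUN=1 (the default here) commands are printed, not executed.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: print the command in dry-run mode, otherwise run it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

preprocess() {
  # Step 1: flag duplicate reads and write a metrics file.
  run gatk MarkDuplicates -I "$RAW_BAM" -O "$MARKED_BAM" -M markdup_metrics.txt

  # Check: count reads that carry the duplicate flag (0x400 = 1024).
  run samtools view -c -f 1024 "$MARKED_BAM"

  # Step 2a: build the base quality recalibration table from known variant sites.
  run gatk BaseRecalibrator -I "$MARKED_BAM" -R "$REF" \
    --known-sites "$KNOWN_SITES" -O recal.table

  # Step 2b: apply the recalibration to produce the analysis-ready BAM.
  run gatk ApplyBQSR -I "$MARKED_BAM" -R "$REF" \
    --bqsr-recal-file recal.table -O analysis_ready.bam
}

preprocess
```

Running the script as is only prints the four commands so they can be reviewed. Setting `DRY_RUN=0` executes them, which assumes `gatk` and `samtools` are on the `PATH` and that the reference and known-sites files are indexed.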