ContEst
Overview
ContEst (Contamination Estimation) is a tool developed at the Broad Institute for estimating cross-sample contamination in tumor/normal sequencing pairs. It models the fraction of reads derived from a contaminating sample using SNP genotype likelihoods. Samples exceeding a contamination threshold are excluded from downstream somatic variant calling.
Used by
- Applied as a quality-control step in the TCGA MC3 pipeline; samples with ContEst contamination > 4% were excluded (12 rules applied total); run by Broad Firehose alongside MuTect and Indelocator PMID:29596782
- Applied in the prad_p1000 prostate cancer WES pipeline (ContEst < 5% threshold; mean contamination 0.6%) PMID:29610475
Notes
- Contamination estimate > 4–5% is the typical exclusion threshold used in TCGA-affiliated pipelines.
- Run at Broad Firehose as part of the MC3 Broad pipeline arm.
Sources
This page was processed by crosslinker on 2026-05-15.