Genomic and Epigenomic Landscapes of Adult De Novo Acute Myeloid Leukemia

Author

The Cancer Genome Atlas Research Network

DOI

PMID: 23634996 · DOI: 10.1056/NEJMoa1301689 · Journal: N Engl J Med (2013)

TL;DR

The Cancer Genome Atlas Research Network sequenced 200 clinically annotated adult de novo AML cases (50 by whole-genome sequencing and 150 by whole-exome sequencing), complemented by RNA-seq, microRNA-seq, DNA-methylation profiling, and SNP arrays. Adult AML genomes are mutation-sparse (mean 13 coding mutations per sample, ~5 in recurrently mutated genes), yet nearly every sample (199/200) harbored at least one nonsynonymous mutation in one of nine functional categories: transcription-factor fusions (18%), NPM1 (27%), tumor suppressors (16%), DNA-methylation genes (44%), activated signaling (59%), chromatin modifiers (30%), myeloid transcription factors (22%), cohesin-complex (13%), and spliceosome (14%). Samples with concurrent NPM1, DNMT3A, and FLT3 mutations clustered together in mRNA, miRNA, and DNA-methylation data, suggesting a distinct AML subtype.

Cohort & data

  • 200 adult de novo AML cases drawn from a single-institution Washington University tissue-banking protocol (samples banked November 2001 to March 2010), selected to reflect a real-world distribution of FAB and cytogenetic subtypes.
  • Dataset: laml_tcga_pub (TCGA AML, NEJM 2013).
  • Assays:
    • whole-genome-seq on 50 tumor/normal-skin pairs (mean coverage 30.54×).
    • whole-exome-seq on the remaining 150 tumor/normal-skin pairs (mean coverage 167.50×).
    • rna-seq on 179 samples; Affymetrix U133 Plus 2 expression on 197 samples.
    • microRNA sequencing on 194 samples.
    • 450k-methylation-array (Illumina Infinium HumanMethylation450 BeadChip) on 192 samples.
    • affymetrix-snp6 on tumor and matched normal skin for all 200 patients.
  • All candidate somatic variants validated with hybridization-capture deep digital sequencing.
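
The assay counts above can be tabulated to see how much of the 200-case cohort each platform covers. The following is a minimal Python sketch that uses only the sample counts listed in this section. The dictionary keys and the helper name `coverage_fraction` are illustrative labels chosen here, not identifiers from the TCGA data release.

```python
# Samples profiled per assay, as listed in the "Cohort & data" section.
COHORT_SIZE = 200

# Keys are illustrative labels (assumption), values are sample counts from the text.
ASSAY_SAMPLES = {
    "whole-genome-seq": 50,
    "whole-exome-seq": 150,
    "rna-seq": 179,
    "affymetrix-u133-plus-2": 197,
    "microrna-seq": 194,
    "450k-methylation-array": 192,
    "affymetrix-snp6": 200,
}


def coverage_fraction(assay: str) -> float:
    """Fraction of the cohort profiled by the given assay."""
    return ASSAY_SAMPLES[assay] / COHORT_SIZE


# WGS and WES were applied to disjoint sets of cases, so their counts add up.
sequenced = ASSAY_SAMPLES["whole-genome-seq"] + ASSAY_SAMPLES["whole-exome-seq"]

for assay, n in ASSAY_SAMPLES.items():
    print(f"{assay:26s} {n:3d}/{COHORT_SIZE} ({coverage_fraction(assay):.1%})")
print("DNA-sequenced cases:", sequenced)
```

Because not every sample has every data type, any multi-platform analysis (for example, joint mRNA and methylation clustering) is limited to the intersection of the profiled sample sets, which the counts alone do not determine.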

Key findings

  • Mutation burden. 2,315 somatic SNVs and 270 small indels in coding regions across 200 samples (mean 13 tier-1 mutations/sample, range 0–51); 1,539 of the SNVs (66%) were missense and 191 of the 270 indels (71%) were frameshift (PMID: 23634996).
  • Significantly mutated genes. 23 genes reached FDR <0.05 in the MuSiC SMG test, including DNMT3A, FLT3, NPM1, IDH1, IDH2, CEBPA, plus newly implicated U2AF1, EZH2, SMC1A, and SMC3; an additional 237 genes were mutated in ≥2 samples.
  • Cohort stratification by fusion / risk. Of the 200 samples, 11 had KMT2A (MLL) fusions and carried the fewest recurrent tier-1 mutations (mean 2.09 vs 5.24 for all samples, P=0.002); 20 PML–RARA samples had a mean of 3.25 (P=0.001); 7 RUNX1–RUNX1T1 samples had a mean of 7.85 (P=0.04); 13 unfavorable-risk samples with TP53 mutations had a mean of 7.00 (P=0.049).
  • Co-mutation. NPM1 co-occurred with DNMT3A (P<6.3×10⁻⁷) and with FLT3 (P<1.9×10⁻⁶). Concurrent NPM1 + DNMT3A + FLT3 samples clustered together in mRNA, miRNA, and DNA-methylation space, defining a putative novel AML subtype.
  • Mutual exclusivity (Dendrix++). Three sets emerged: (A) transcription-factor fusions, NPM1, RUNX1, TP53, CEBPA; (B) FLT3 vs other tyrosine kinases, serine–threonine kinases, phosphatases, and RAS-family proteins; (C) ASXL1 vs cohesin-complex genes, other myeloid transcription factors, and other epigenetic modifiers. PML–RARA, MYH11–CBFB, and MLL-fusion genes were mutually exclusive of NPM1 and DNMT3A (P<0.007, P<0.04, P<0.04). RUNX1 and TP53 mutations were mutually exclusive of FLT3 and NPM1.
  • Signaling. FLT3 mutated in 56/200 samples; 62 additional samples had mutations in other kinases, phosphatases, or RAS-family proteins (KIT, KRAS, NRAS, PTPN11 dominated). 59% of samples had a signaling-pathway mutation.
  • Gene fusions. De novo RNA-seq assembly across 179 samples identified 118 fusions in 80 samples (mean 1.5; range 0–8; 71 distinct events). 74 in-frame fusions included PML–RARA, MYH11–CBFB, RUNX1–RUNX1T1, BCR–ABL1, PICALM–MLLT10 (AF10), NUP98–NSD1, and multiple KMT2A (MLL) partners; 15 new in-frame events were identified. One out-of-frame fusion (GAS6–FAM70B) recurred in 3 samples.
  • miRNA variants. Somatic SNVs in miRNA genes in 7/200 (4%); 4 of these targeted the seed region of mature miR-142-3p, expected to alter mRNA target specificity.
  • Allelic expression bias. RNA-seq showed increased or exclusive expression of mutant DNMT3A, RUNX1, PHF6, and TP53 alleles in several cases, frequently explained by LOH or partial uniparental disomy.
  • Clonal architecture. Deep digital sequencing of WGS samples enabled VAF-based clustering; >50% of tumors had both a founding clone and ≥1 subclone, with up to three independent subclones detectable in a single tumor.
  • DNA methylation. IDH1/IDH2-mutant samples showed extensive methylation gains versus normal CD34+CD38− cells, while KMT2A fusions and concurrent NPM1+DNMT3A+FLT3 samples showed extensive methylation loss. Across the cohort, 160,519 CpG loci (42% of those tested) were significantly differentially methylated (67% gain, 33% loss). Strongest signatures occurred in CpG-sparse regions rather than CpG islands. Triple-mutant samples lost methylation at 328/382 differentially methylated regions >1 kb (86%).
  • Expression clusters. NMF consensus clustering identified 7 RNA-seq groups and 5 miRNA-seq groups. Groups mapped onto FAB subtypes (group 4↔︎M1, group 3↔︎M3, group 5↔︎M4, group 7↔︎M5). miRNA group 3 was strongly associated with NPM1, DNMT3A, FLT3, and cohesin-complex mutations and showed high miR-10a and low miR-424 expression.
  • Genome stability. Median 1 somatic copy-number variant and <1 fusion event per genome; no chromothripsis observed; strong correlation (Pearson r=0.78) between coding and noncoding mutation counts, consistent with most mutations being pre-leukemic background events in hematopoietic stem cells.
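
Several of the percentages and means quoted in the findings above follow directly from the raw counts, and can be re-derived as a consistency check. The short Python sketch below does this using only numbers stated in this section; the helper `pct` and the variable names are introduced here for illustration.

```python
def pct(part: int, whole: int) -> float:
    """Percentage of `whole` represented by `part`."""
    return 100.0 * part / whole


missense_pct = pct(1539, 2315)      # missense SNVs among coding SNVs
frameshift_pct = pct(191, 270)      # frameshift indels among coding indels
signaling_pct = pct(56 + 62, 200)   # FLT3-mutated plus other signaling-mutated samples
fusions_per_positive = 118 / 80     # fusions per fusion-positive sample
triple_loss_pct = pct(328, 382)     # DMRs >1 kb that lost methylation in triple mutants

print(f"missense SNVs:             {missense_pct:.1f}%")
print(f"frameshift indels:         {frameshift_pct:.1f}%")
print(f"signaling-pathway samples: {signaling_pct:.1f}%")
print(f"fusions per positive case: {fusions_per_positive:.2f}")
print(f"triple-mutant DMR loss:    {triple_loss_pct:.1f}%")
```

The check also shows that the reported mean of 1.5 fusions is taken over the 80 fusion-positive samples rather than over all 179 RNA-sequenced samples, and that the 59% signaling prevalence is the sum of the 56 FLT3-mutated samples and the 62 additional samples.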

Genes & alterations

  • NPM1 — mutated in 27% of cases; central node co-occurring with DNMT3A and FLT3 and mutually exclusive of transcription-factor fusions.
  • FLT3 — mutated in 56/200 (28%); part of the activated-signaling category; mutually exclusive of mutations in other tyrosine kinases / RAS-family genes.
  • DNMT3A — among the most significantly mutated DNA-methylation genes; allelic expression bias observed; co-occurs with NPM1 and FLT3.
  • IDH1, IDH2 — recurrent; mutant samples show distinctive CpG methylation gains versus normal CD34+CD38− cells.
  • TP53 — strongly associated with unfavorable / complex cytogenetics; mutually exclusive of FLT3 and NPM1; allelic expression bias observed.
  • RUNX1 — recurrent driver; mutually exclusive of FLT3 and NPM1; allelic expression bias observed.
  • CEBPA — recurrent; partner in the mutual-exclusivity set with transcription-factor fusions, NPM1, RUNX1, TP53.
  • TET2 — DNA-methylation-related driver category.
  • KIT, KRAS, NRAS, PTPN11 — non-FLT3 activated signaling genes contributing to the 59% signaling-pathway prevalence.
  • ASXL1 — anchor of the third mutual-exclusivity set (vs cohesin and other epigenetic modifiers).
  • EZH2, KDM6A, KMT2A, KMT2C — chromatin-modifying-gene category.
  • U2AF1, SF3B1, SRSF2 — spliceosome-complex genes (14% prevalence).
  • SMC1A, SMC3, RAD21, STAG2 — cohesin-complex genes (13% prevalence).
  • PHF6, WT1 — additional recurrent drivers; PHF6 showed allelic expression bias.
  • PML–RARA, RUNX1–RUNX1T1, MYH11–CBFB — favorable-risk transcription-factor fusions; mutually exclusive with NPM1 and DNMT3A.
  • KMT2A (MLL) fusion partners observed: MLLT3 (AF9), MLLT10 (AF10), and others; MLL-fused samples carried the fewest cooperating mutations.
  • NUP98–NSD1, PICALM–MLLT10, BCR–ABL1 — additional recurrent in-frame fusions detected by RNA-seq.
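
The co-occurrence and mutual-exclusivity relationships listed above are statements about 2×2 contingency tables of samples (mutated or not in gene A, mutated or not in gene B). One common way to test such a table is a one-sided Fisher's exact test; the sketch below implements it with the Python standard library. This is an assumption for illustration, not necessarily the test the authors used, and the counts in the usage example are a toy table, not data from the paper.

```python
from math import comb


def fisher_cooccurrence_p(both: int, only_a: int, only_b: int, neither: int) -> float:
    """One-sided Fisher's exact P value for co-occurrence of two mutations.

    Returns the hypergeometric probability of observing `both` or more
    double-mutant samples, given fixed totals for each mutation.
    """
    n = both + only_a + only_b + neither
    a_total = both + only_a   # samples with mutation A
    b_total = both + only_b   # samples with mutation B
    denom = comb(n, b_total)  # ways to place mutation B among n samples
    p = 0.0
    for k in range(both, min(a_total, b_total) + 1):
        # k double mutants: choose k of the A samples and the rest from non-A samples.
        p += comb(a_total, k) * comb(n - a_total, b_total - k) / denom
    return p


# Toy table (hypothetical counts): 3 double mutants, 1 A-only, 1 B-only, 3 wild-type.
print(fisher_cooccurrence_p(3, 1, 1, 3))
```

Summing the opposite tail (from zero up to the observed number of double mutants) would give the corresponding test for mutual exclusivity.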

Clinical implications

  • The concurrent NPM1 + DNMT3A + FLT3 genotype, combined with distinct mRNA, miRNA, and CpG-sparse methylation signatures, may define a novel intermediate-risk AML subtype warranting separate classification (PMID: 23634996).
  • The authors argue current classification schemes (which already incorporate FLT3, NPM1, CEBPA, and KIT) are insufficient and that newer markers such as DNMT3A, IDH1/IDH2, and TET2 add prognostic information for intermediate-risk patients.
  • The 23 significantly mutated genes and the nine functional categories provide a foundation for refined molecular classification of AML.
  • The full dataset was released through the TCGA data portal as a public resource for risk stratification and pathogenesis studies (laml_tcga_pub).

Limitations & open questions

  • Whole-genome sequencing mean coverage was 30.54×, limiting detection of subclones with VAF <10%; exome at 167.50× modestly improved sub-clonal detection but did not significantly increase tier-1 mutation yield (14.5 WGS vs 12.7 exome per sample, P=0.17).
  • Many of the 260 genes mutated in ≥2 samples have unclear pathophysiologic relevance and require functional validation, especially fusion partners observed in only one sample.
  • The biological significance of strong methylation differences in CpG-sparse genomic regions (rather than CpG islands) is not yet understood.
  • Cohort size limits some subgroup comparisons (e.g., MLL-fusion vs PML–RARA mutation-count differences) that the authors note will need confirmation in larger series.
  • Only 4% of samples had miRNA mutations, and the pathogenic impact of mutated miR-142 seed sequences remains to be functionally proven.
  • Tier 2/3 non-coding and mitochondrial variants were catalogued but their pathogenic relevance is unclear.
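
The coverage limitation in the first bullet can be made concrete with a simple read-sampling model. The sketch below assumes that variant-supporting reads follow a binomial distribution with the given depth and variant allele fraction (VAF), and that a variant is called when at least `min_reads` supporting reads are seen. The threshold of 3 reads is a hypothetical choice for illustration, not the paper's variant-calling criterion, and the model ignores sequencing error and coverage variability.

```python
from math import comb


def detection_probability(depth: int, vaf: float, min_reads: int = 3) -> float:
    """P(at least `min_reads` variant-supporting reads) under Binomial(depth, vaf)."""
    p_below = sum(
        comb(depth, k) * vaf**k * (1.0 - vaf) ** (depth - k)
        for k in range(min_reads)
    )
    return 1.0 - p_below


# Depths approximate the reported mean coverages (30.54x WGS, 167.50x exome).
for label, depth in [("WGS ~30x", 30), ("exome ~167x", 167)]:
    for vaf in (0.05, 0.10, 0.25):
        p = detection_probability(depth, vaf)
        print(f"{label:12s} VAF={vaf:.2f}  P(detect)={p:.3f}")
```

Under these assumptions, a variant at 5% VAF is usually missed at 30× depth but is almost always sampled at 167×, which is in line with the statement that deeper exome coverage improves subclonal detection.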

Citations from this paper used in the wiki

  • “We analyzed the genomes of 200 clinically annotated adult cases of de novo AML, using either whole-genome sequencing (50 cases) or whole-exome sequencing (150 cases), along with RNA and microRNA sequencing and DNA-methylation analysis.” (Abstract)
  • “A total of 23 genes were significantly mutated, and another 237 were mutated in two or more samples.” (Abstract)
  • “Nearly all samples had at least 1 nonsynonymous mutation in one of nine categories of genes… transcription-factor fusions (18%), NPM1 (27%), tumor-suppressor genes (16%), DNA-methylation–related genes (44%), signaling genes (59%), chromatin-modifying genes (30%), myeloid transcription-factor genes (22%), cohesin-complex genes (13%), and spliceosome-complex genes (14%).” (Abstract / Results)
  • “Eleven samples had MLL fusions; this group had the fewest recurrent tier 1 mutations, with a mean of 2.09, as compared with a mean of 5.24 for all 200 samples (P=0.002).” (Results, p.3)
  • “The likelihood that these mutations occurred together by chance is extremely small (P<6.3×10⁻⁷ for NPM1 and DNMT3A and P<1.9×10⁻⁶ for NPM1 and FLT3).” (Results, p.5–6)
  • “Significant changes in DNA methylation were identified across AML samples at 160,519 CpG loci (42% of sites tested), with 67% resulting in a gain of methylation and 33% resulting in a loss.” (Results, p.7)
  • “De novo assembly of RNA-sequencing data for 179 AML samples identified 118 gene fusions in 80 samples (mean, 1.5 per sample), of which 71 were distinct events.” (Results, p.5)
