Distinct patterns of somatic genome alterations in lung adenocarcinomas and squamous cell carcinomas
PMID: 27158780 · DOI: 10.1038/ng.3564 · Journal: Nature Genetics (2016)
TL;DR
Campbell et al. compared exome sequences and SNP-array copy-number profiles from 660 lung adenocarcinoma (ADC) and 484 lung squamous cell carcinoma (SqCC) tumor/normal pairs (1,144 NSCLCs total) to identify novel drivers and to contrast the molecular landscapes of the two histologies. Recurrently altered genes in lung SqCC resembled those of head & neck squamous and bladder carcinomas more closely than those of lung ADC. The authors nominated novel significantly mutated genes (PPP3CA, DOT1L, CMTR2/FTSJD1 in ADC; RASA1 in SqCC; KLF5, EP300, CREBBP in both); novel amplification peaks (MIR21 in ADC, MIR205 and YES1 in SqCC, MAPK1 in both); and additional Ras/Raf/RTK pathway candidates (SOS1, VAV1, RASA1, ARHGAP35) in oncogene-negative lung ADCs. Overall, 47% of lung ADC and 53% of lung SqCC tumors had ≥5 predicted neoepitopes, supporting broad immunotherapy applicability across both subtypes (PMID:27158780).
Cohort & data
- Lung ADC: 660 tumor/normal exome pairs — 274 previously unpublished, 227 from prior TCGA (luad_tcga_pub), 159 from the Imielinski et al cohort (luad_broad) (PMID:27158780).
- Lung SqCC: 484 tumor/normal exome pairs — 308 previously unpublished, 176 previously described in TCGA (lusc_tcga_pub) (PMID:27158780).
- Cancer types: LUAD and LUSC (combined cohort referred to as “Pan-Lung” / NSCLC).
- Cohort/study: nsclc_tcga_broad_2016.
- Assays: Exome capture with Agilent SureSelect Human All Exon 50MB followed by Illumina paired-end sequencing (whole-exome-seq); SNP-array copy number on Affymetrix SNP 6.0 (affymetrix-snp6); RNA-seq for 495 ADCs and 476 SqCCs (rna-seq); mutation calling with mutect and indelocator; driver discovery with MutSig2CV (mutsig); recurrent SCNAs with gistic (GISTIC2.0); purity/ploidy with absolute; mutational signatures via non-negative matrix factorization (bayesian-nmf, mutational-signatures); fusion calls via prada (PMID:27158780).
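The cohort composition above can be checked with a few lines of arithmetic. The sketch below only re-adds the counts quoted in this section and computes RNA-seq coverage per histology; the variable and function names are illustrative and do not come from the paper.

```python
# Cohort sizes exactly as quoted in the "Cohort & data" bullets above.
adc_sources = {"previously unpublished": 274, "luad_tcga_pub": 227, "luad_broad": 159}
sqcc_sources = {"previously unpublished": 308, "lusc_tcga_pub": 176}

adc_total = sum(adc_sources.values())
sqcc_total = sum(sqcc_sources.values())
pan_lung_total = adc_total + sqcc_total


def coverage(n_with_assay, n_total):
    """Fraction of a cohort for which an assay is available."""
    return n_with_assay / n_total


# RNA-seq was reported for 495 ADCs and 476 SqCCs.
adc_rnaseq_fraction = coverage(495, adc_total)
sqcc_rnaseq_fraction = coverage(476, sqcc_total)

print(f"ADC pairs: {adc_total}, SqCC pairs: {sqcc_total}, Pan-Lung: {pan_lung_total}")
print(f"RNA-seq coverage: ADC {adc_rnaseq_fraction:.1%}, SqCC {sqcc_rnaseq_fraction:.1%}")
```

The lower RNA-seq coverage in ADC is relevant to fusion calling, which depends on matched RNA-seq.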
Key findings
- Median somatic mutation rate was 8.7/Mb in lung ADC and 9.7/Mb in lung SqCC; MutSig2CV identified 38 significantly mutated genes in ADC and 20 in SqCC (q < 0.1) (PMID:27158780).
- Only six genes — TP53, RB1, ARID1A, CDKN2A, PIK3CA, NF1 — were significantly mutated in both histologies; TP53, CDKN2A, and PIK3CA were significantly more frequently mutated in SqCC than ADC (p < 0.01, Fisher’s exact) (PMID:27158780).
- Lung SqCC most closely resembled HNSC and BLCA (>25% overlap of significantly mutated genes), while lung ADC most resembled GBM and CRC; the two NSCLC subtypes overlapped only ~12% with each other (p = 0.105) (PMID:27158780).
- NMF identified six mutational signatures, mapping to COSMIC SI4 (smoking, C>A transversions), SI7 (UV), SI13/SI2 (APOBEC), SI15/SI6 (MMR), and SI5 (clock-like). The smoking signature SI4 separated never-smokers from ever-smokers far better in ADC (AUC = 0.87) than in SqCC (AUC = 0.62), which suggests that the smoking annotation is inaccurate for some SqCC cases (PMID:27158780).
- 87% of lung ADCs from never-smokers were transversion-low (TV-L; ≤0.696 SI4 mutations/Mb; p = 8.5 × 10⁻³⁷), but only 45% of TV-L lung ADCs came from never-smokers (PMID:27158780).
- Three lung SqCCs (~1%) showed UV-pattern mutations (SI7), and one of them (TCGA-18-3409) had a documented prior basal-cell carcinoma; these tumors may be cutaneous metastases misclassified as lung primaries (PMID:27158780).
- Seven tumors (4 ADC, 3 SqCC) had MMR-like signature (SI15/SI6) with elevated SSNVs and indels (p < 0.001) and reduced MLH1 expression (p = 0.011) (PMID:27158780).
- Pan-Lung joint analysis revealed 14 additional significantly mutated genes including KLF5 (zinc-finger hotspot), the EP300/CREBBP HAT-domain hotspot region, FBXW7, and B2M (FDR q = 0.006) (PMID:27158780).
- Novel focal amplifications: CCND3 and MIR21/TUBD1 in ADC; YES1 and MIR205 in SqCC; MAPK1 in the Pan-Lung analysis (PMID:27158780).
- Novel focal deletions: SMARCA4 and ARID2 in ADC; ZMYND11, CREBBP, ROBO1, USP22, KDM6A in SqCC; B2M and TRAF3 recurrently lost in both (PMID:27158780).
- Sex-specific enrichment: EGFR mutations in females and SMARCA4 mutations in males in ADC; RB1 mutations in females and PASK mutations exclusively in males in SqCC (FDR q < 0.1) (PMID:27158780).
- Oncogene-negative lung ADC: 242/660 ADCs lacked a known activating RTK/Ras/Raf alteration. Among them, SOS1 (recurrent autoinhibitory-domain p.N233Y, n=4), VAV1 (p.S67Y near CH/Ac/PH interface), RASA1, and ARHGAP35 were significantly enriched (q < 0.1), as were amplifications around FGFR1/WHSC1L1, PDGFRA/KIT/KDR, and MAPK1 (q < 0.25) (PMID:27158780).
- After incorporating these new candidates, 499/660 (76%) lung ADCs harbored an alteration in a known or putative RTK/Ras/Raf driver; in the 227-tumor expert-pathology-reviewed subset with RNA-seq, the rate rose to 193/227 (85%) (PMID:27158780).
- Co-occurrence: STK11 mutations significantly co-occurred with activating KRAS (p = 1.1 × 10⁻⁶), and 28 oncogene-negative ADCs additionally carried STK11 mutations — suggesting an unrecognized KRAS-related event complementary to STK11 loss (PMID:27158780).
- High EGFR amplification overlapped with activating EGFR mutations (p = 1.9 × 10⁻⁸); MET amplifications co-occurred with NF1 mutations (p = 0.019) (PMID:27158780).
- Neoantigens: Recurrent predicted neoepitopes included PIK3CA p.E542K, NFE2L2 p.E79Q, BRAF p.G466V, EGFR p.G719A, TP53 p.V157F/p.G154V/p.R175G/p.P278A, and a previously unappreciated recurrent MB21D2 (C3orf59) p.Q311E. 47% of lung ADC and 53% of lung SqCC tumors had ≥5 predicted neoepitopes (PMID:27158780).
- Neoepitope and nonsynonymous mutation counts were significantly lower in never-smoker ADCs vs ever-smoker ADCs (p < 0.001, Wilcoxon rank-sum), but did not differ between ever-smoker ADCs and SqCCs (PMID:27158780).
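The RTK/Ras/Raf driver fractions quoted in the findings above follow from simple counts. The sketch below recomputes them; the "known driver", "newly explained", and "still unexplained" figures are derived here by subtraction, under the assumption that the 499 tumors include every tumor with a previously known activating alteration. They are not numbers stated in the paper, and all names are illustrative.

```python
# Counts quoted in the key findings above.
ADC_TOTAL = 660
ADC_ONCOGENE_NEGATIVE = 242   # lacked a known activating RTK/Ras/Raf alteration
ADC_WITH_ANY_DRIVER = 499     # known or putative driver after adding new candidates
REVIEWED_TOTAL = 227          # expert-pathology-reviewed subset with RNA-seq
REVIEWED_WITH_DRIVER = 193


def pct(part, whole):
    """Percentage of `whole` represented by `part`."""
    return 100.0 * part / whole


# Derived values (assumption: the 499 are a superset of the known-driver tumors).
known_driver = ADC_TOTAL - ADC_ONCOGENE_NEGATIVE
newly_explained = ADC_WITH_ANY_DRIVER - known_driver
still_unexplained = ADC_TOTAL - ADC_WITH_ANY_DRIVER

print(f"Known driver:        {known_driver} ({pct(known_driver, ADC_TOTAL):.1f}%)")
print(f"Newly explained:     {newly_explained} ({pct(newly_explained, ADC_TOTAL):.1f}%)")
print(f"Known or putative:   {ADC_WITH_ANY_DRIVER} ({pct(ADC_WITH_ANY_DRIVER, ADC_TOTAL):.1f}%)")
print(f"Still unexplained:   {still_unexplained} ({pct(still_unexplained, ADC_TOTAL):.1f}%)")
print(f"Reviewed subset:     {pct(REVIEWED_WITH_DRIVER, REVIEWED_TOTAL):.1f}%")
```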
Genes & alterations
- TP53, CDKN2A, PIK3CA — significantly more frequently mutated in lung SqCC than ADC (p < 0.01) (PMID:27158780).
- STK11, RBM10, KEAP1, RAF1, RIT1, MET: significantly mutated in lung ADC but in no other TCGA tumor type compared (q < 0.1) (PMID:27158780).
- NFE2L2, KDM6A, RASA1, NOTCH1, HRAS: significantly mutated in lung SqCC but in no other cancer type once HNSC and BLCA are set aside (PMID:27158780).
- AKT1 — recurrent p.E17K mutation in lung ADC; CDK4 — recurrent p.R24L; DNMT3A — modest q-value in ADC (PMID:27158780).
- PPP3CA (calcineurin catalytic subunit) — novel SMG in lung ADC; missense mutations cluster in the autoinhibitory C-terminal domain (suggesting gain of function) and co-occur with activating KRAS (p = 0.033) (PMID:27158780).
- DOT1L (H3K79 methyltransferase) — novel SMG in 3% of lung ADCs, enriched for truncating mutations (PMID:27158780).
- KMT2C (MLL3) and SETD2 — significantly mutated methyltransferases in lung ADC (PMID:27158780).
- CMTR2 / FTSJD1 (cap methyltransferase): novel SMG in lung ADC, enriched for frameshifts; recurrent mutations were also noted in SF3B1 and SNRPD3 (PMID:27158780).
- RASA1 (p120GAP) and CUL3 (KEAP1 partner) — novel SMGs in lung SqCC enriched for frameshift mutations (p < 0.001 for RASA1 frameshift enrichment) (PMID:27158780).
- EGFR — kinase-domain duplication in lung ADC sample TCGA-49-4512; complex indels in EGFR or MET in 11 tumors (PMID:27158780).
- MAP2K1 — recurrent in-frame insertion in lung ADC (PMID:27158780).
- MET fusions: a novel fusion of MET with its neighboring gene CAPZA2 (MET–CAPZA2) and the previously reported KIF5B–MET (PMID:27158780).
- NTRK2 fusions: TRIM24–NTRK2 in ADC and a novel NTRK2–TP63 fusion in lung SqCC (PMID:27158780).
- MET and ERBB2 high-level amplifications — enriched in tumors lacking other RTK/Ras/Raf activating events (p < 0.01) (PMID:27158780).
- Novel amplification target genes: KAT6A, ZNF217, MYCL (ADC); IGF1R, KDM5A, PTP4A1/PHF3, MYCL (SqCC) (PMID:27158780).
- B2M — focally deleted in both ADC and SqCC, enriched for loss-of-function mutations (p < 0.01), Pan-Lung FDR q = 0.006 — implicated in MHC-I antigen presentation loss (PMID:27158780).
- MB21D2 (C3orf59) — recurrent p.Q311E with predicted neoepitope properties; not previously implicated in lung cancer (PMID:27158780).
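Several entries above rest on Fisher's exact test, for example the comparison of mutation frequency between SqCC and ADC. The following is a minimal standard-library sketch of a two-sided Fisher's exact test on a 2×2 table. The per-gene counts in the usage example are hypothetical placeholders, since this page does not report mutated-sample counts per gene; only the cohort sizes (660 and 484) come from the summary. It illustrates the test and makes no claim to reproduce the authors' pipeline.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more likely than the observed table.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(k):
        # Probability that the top-left cell equals k, with margins fixed.
        return comb(row1, k) * comb(row2, col1 - k) / denom

    p_obs = prob(a)
    k_min = max(0, col1 - row2)
    k_max = min(row1, col1)
    total = sum(
        p for p in (prob(k) for k in range(k_min, k_max + 1))
        if p <= p_obs * (1 + 1e-9)  # small tolerance for float comparison
    )
    return min(1.0, total)


# HYPOTHETICAL counts for an unnamed gene; only the cohort sizes are real.
sqcc_mutated, sqcc_total = 90, 484
adc_mutated, adc_total = 60, 660
p_value = fisher_exact_two_sided(
    sqcc_mutated, sqcc_total - sqcc_mutated,
    adc_mutated, adc_total - adc_mutated,
)
print(f"two-sided Fisher's exact p = {p_value:.3g}")
```

For real analyses, an established implementation such as `scipy.stats.fisher_exact` would be the usual choice; the hand-rolled version is kept here only so the example has no dependencies.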
Clinical implications
- Targeted therapy is largely subtype-specific. The subtype-distinct landscapes of recurrently mutated drivers and SCNAs imply that approved RTK-directed therapies (e.g., EGFR-TKI for EGFR mutant ADC, ALK and ROS1 inhibitors for ALK/ROS1 fusion ADC) are expected to apply almost exclusively to lung ADC, while lung SqCC has few directly actionable kinase drivers (PMID:27158780).
- Expanded druggable pool in oncogene-negative lung ADC. Adding SOS1, VAV1, RASA1, ARHGAP35, and amplifications at FGFR1/WHSC1L1, PDGFRA/KIT/KDR, and MAPK1 raises the proportion of lung ADCs with a candidate Ras/Raf/RTK pathway driver to 76% (overall) and 85% (in expert-reviewed subset) — narrowing the unexplained fraction (PMID:27158780).
- The predicted EGFR kinase-domain duplication (KDD) in TCGA-49-4512 belongs to a class of alteration separately reported as afatinib-responsive in lung cancer (citation 31 in the paper) (PMID:27158780).
- Immunotherapy applies broadly across NSCLC histologies. With ≥5 predicted neoepitopes in 47% of ADCs and 53% of SqCCs, the authors argue that checkpoint immunotherapy is expected to benefit both subtypes (in contrast to histology-restricted targeted therapies) (PMID:27158780).
- Smoking history correlates strongly with neoantigen load in ADC. Never-smoker ADCs have significantly fewer neoepitopes than ever-smoker ADCs (p < 0.001), which is relevant for biomarker stratification of immunotherapy candidates (PMID:27158780).
- Recurrent neoepitope hotspots — PIK3CA p.E542K, NFE2L2 p.E79Q, BRAF p.G466V, EGFR p.G719A, multiple TP53 hotspots, and MB21D2 p.Q311E — are candidate shared neoantigens for off-the-shelf vaccine design (PMID:27158780).
- No mutation-status association with patient survival or tumor stage survived multiple-hypothesis correction (with or without controlling for stage) (PMID:27158780).
Limitations & open questions
- Whole-exome only. The study could not detect mutations in non-coding regions or regulatory elements; the authors flag this as a target for future whole-genome lung cancer studies (PMID:27158780).
- Single sample per patient. Intra-tumoral heterogeneity could not be assessed (in contrast to multi-region sequencing studies) (PMID:27158780).
- Underpowered for rare RTK/Ras/Raf events. With 15–25% of lung ADCs still lacking a detectable activating alteration, additional rare recurrent drivers in known and novel pathway members likely remain undiscovered (PMID:27158780).
- Incomplete RNA-seq coverage — fusion calling and MET exon-14 skipping may be underestimated because matched RNA-seq was not available for every tumor (PMID:27158780).
- Smoking annotation inaccuracy — SI4-based classification predicted ever-smoker status well in ADC (AUC 0.87) but poorly in SqCC (AUC 0.62), suggesting some clinically annotated SqCC “never-smokers” are misclassified (PMID:27158780).
- Possible cutaneous metastases — three lung SqCCs with UV signature SI7 may represent skin SCC metastatic to the lung, not true lung primaries (PMID:27158780).
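The AUC values cited for smoking annotation measure how well SI4 activity ranks ever-smokers above never-smokers. As a rough illustration, the sketch below computes AUC as the fraction of (ever-smoker, never-smoker) pairs in which the ever-smoker has the higher SI4 load, counting ties as half. The per-tumor SI4 values are invented for the example and are not data from the paper; the function name is also illustrative.

```python
def auc(positive_scores, negative_scores):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive scores higher
    than a randomly chosen negative, with ties counted as 0.5.
    """
    wins = 0.0
    for pos in positive_scores:
        for neg in negative_scores:
            if pos > neg:
                wins += 1.0
            elif pos == neg:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))


# HYPOTHETICAL SI4 mutation loads (mutations/Mb), for illustration only.
ever_smoker_si4 = [5.2, 3.1, 0.4, 8.0]
never_smoker_si4 = [0.2, 0.5, 0.1]

print(f"AUC = {auc(ever_smoker_si4, never_smoker_si4):.3f}")
```

An AUC near 0.5 means the signature barely separates the two annotated groups, while an AUC near 1.0 means almost every ever-smoker outranks every never-smoker, so a low value in SqCC is compatible with mislabeled smoking histories.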
Citations from this paper used in the wiki
- “we examined exome sequences and copy number profiles of 660 lung ADC and 484 lung SqCC tumor/normal pairs” (Abstract).
- “Novel significantly mutated genes included PPP3CA, DOT1L, and FTSJD1 in lung ADC, RASA1 in lung SqCC, and KLF5, EP300, and CREBBP in both tumor types.” (Abstract).
- “Novel amplification peaks encompassed MIR21 in lung ADC, MIR205 in lung SqCC, and MAPK1 in both.” (Abstract).
- “Lung ADCs lacking receptor tyrosine kinase/Ras/Raf alterations revealed mutations in SOS1, VAV1, RASA1, and ARHGAP35.” (Abstract).
- “47% of the lung ADC and 53% of the lung SqCC tumors had at least 5 predicted neoepitopes.” (Abstract).
- “Only 6 genes, TP53, RB1, ARID1A, CDKN2A, PIK3CA, and NF1, were significantly mutated in both tumor types.” (Results — Comparison of somatically altered genes).
- “Recurrently mutated and amplified genes in lung SqCC most closely resembled the alterations in head and neck squamous cell carcinoma (HNSC) and bladder cancer (BLCA).” (Results — Comparison of somatically altered genes).
- “87% of lung ADCs from never smokers were categorized as transversion-low (TV-L; ≤0.696 of SI4 per Mb; p = 8.5 × 10⁻³⁷).” (Results — Mutational signatures).
- “499 (76%) lung ADCs displayed an alteration in known or putative Ras/Raf/RTK driver genes” and “193 out of 227 (85%) lung ADCs that previously underwent secondary expert pathological review and had RNA-seq data available for fusion analysis contained a predicted activating alteration in the RTK/Ras/Raf pathway.” (Results — Identifying Ras/Raf/RTK drivers).
- “STK11 mutations significantly overlapped with activating KRAS mutations (p = 1.1 × 10⁻⁶)” (Results — Identifying Ras/Raf/RTK drivers).
- “Both nonsynonymous mutation and neoepitope counts … were significantly lower in lung ADCs from never smokers compared to lung ADCs from ever smokers (p < 0.001; Wilcoxon rank-sum test).” (Results — Assessment of neoantigen load).