Genomic Correlates of Immune-Cell Infiltrates in Colorectal Carcinoma

Authors

Marios Giannakis

Xinmeng Jasmine Mu

Sachet A. Shukla

Zhi Rong Qian

Ofir Cohen

Reiko Nishihara

Samira Bahl

Yin Cao

Ali Amin-Mansour

Mai Yamauchi

Yasutaka Sukawa

Chip Stewart

Mara Rosenberg

Kosuke Mima

Kentaro Inamura

Katsuhiko Nosho

Jonathan A. Nowak

Michael S. Lawrence

Edward L. Giovannucci

Andrew T. Chan

Kimmie Ng

Jeffrey A. Meyerhardt

Eliezer M. Van Allen

Gad Getz

Stacey B. Gabriel

Eric S. Lander

Catherine J. Wu

Charles S. Fuchs

Shuji Ogino

Levi A. Garraway

DOI

PMID: 27149842 · DOI: 10.1016/j.celrep.2016.03.075 · Journal: Cell Reports (2016)

TL;DR

Giannakis et al. performed whole-exome sequencing on 619 colorectal carcinomas drawn from two long-running U.S. prospective cohorts (Nurses’ Health Study and Health Professionals Follow-up Study), then integrated mutation and predicted-neoantigen data with H&E-based immune scoring and decades of clinical follow-up. They expand the CRC driver-gene list (e.g., BCL9L, RBM10, CTCF, KLF5, TGIF1), show that neoantigen load correlates with overall lymphocytic infiltration, TILs, CD45RO+ memory T cells, and improved CRC-specific survival — even within microsatellite-stable, POLE-wild-type tumors — and demonstrate positive selection for HLA class I and antigen-processing-machinery mutations in TIL-rich tumors as a candidate immune-escape mechanism.

Cohort & data

  • 619 incident CRC cases with matched FFPE tumor-normal pairs from the Nurses’ Health Study (NHS, est. 1976) and Health Professionals Follow-up Study (HPFS, est. 1986); cancer type COADREAD; cBioPortal study coadread_dfci_2016.
  • Assay: whole-exome sequencing on Illumina HiSeq 2000 with SureSelect v.2 capture; mean coverage 90x, 87% of bases at >=20x (PMID:27149842).
  • Mutation calling: MuTect for SNVs; concordant calls from Indelocator and Strelka for indels; BWA-MEM realignment; FFPE C>T deamination artifacts filtered by single-strand bias (PMID:27149842).
  • Significantly mutated genes inferred via the MutSigCV suite (Lawrence et al., 2014) plus manual curation; hypermutator threshold = 12 mutations/Mb; reference build hg19.
  • HLA typing and class I mutation calling: POLYSOLVER (Shukla et al., 2015). Neoantigen prediction used NetMHCpan v2.4 on 9- and 10-mer mutant peptides bound to personal HLA alleles, with default affinity cutoff <500 nM (alternate cutoffs of 150 and 50 nM also tested).
  • Pathology: H&E review by S.O. (Shuji Ogino) graded four lymphocytic-reaction components (Crohn’s-like, peritumoral, intratumoral periglandular, TILs) on 0–3 scales; overall lymphocytic score 0–12 (Ogino et al., 2009). Immunohistochemistry on TMAs quantified CD3+, CD8+, CD45RO+, FOXP3+ T-cell densities in 299 samples (Nosho et al., 2010).
  • Microsatellite status assayed with a 10-marker PCR panel (D2S123, D5S346, D17S250, BAT25, BAT26, BAT40, D18S55, D18S56, D18S67, D18S487); MSS = non-MSI-high. Four MSS samples carrying POLE exonuclease-domain hotspots (p.Pro286Arg, p.Val411Leu, p.Ser459Phe) were flagged separately.
  • Survival data on 597 individuals (median follow-up 9.4 years, IQR 5.8–13.1); raw sequence files deposited under dbGAP:phs000722.
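
The neoantigen-counting step listed above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the helper names (`mutant_kmers`, `count_neoantigens`), the toy protein sequence, and the affinity values are all hypothetical, and the affinities stand in for NetMHCpan output rather than calling the tool. The sketch assumes a peptide counts as a neoantigen when its best predicted affinity across the patient's HLA alleles is strictly below the cutoff.

```python
def mutant_kmers(protein, pos, lengths=(9, 10)):
    """Return every k-mer of the mutant protein that spans index `pos` (0-based)."""
    peptides = []
    for k in lengths:
        first = max(0, pos - k + 1)        # leftmost window still covering pos
        last = min(pos, len(protein) - k)  # rightmost window that fits in the protein
        for start in range(first, last + 1):
            peptides.append(protein[start:start + k])
    return peptides


def count_neoantigens(best_affinity_nm, cutoff_nm=500.0):
    """Count peptides whose best predicted affinity (nM) is strictly below the cutoff."""
    return sum(1 for affinity in best_affinity_nm.values() if affinity < cutoff_nm)


# Hypothetical 25-residue mutant protein with the substituted residue at index 12.
toy_protein = "MKTAYIAKQRQISFVKSHFSRQLEE"
peptides = mutant_kmers(toy_protein, pos=12)

# Hypothetical best-allele affinities (nM) for four peptides; not real predictions.
toy_affinities = {"pep1": 30.0, "pep2": 120.0, "pep3": 400.0, "pep4": 900.0}

# Neoantigen counts at the default cutoff and the two stricter alternates.
counts = {cutoff: count_neoantigens(toy_affinities, cutoff) for cutoff in (500, 150, 50)}
```

A single substitution away from the protein ends yields nine 9-mers and ten 10-mers, and tightening the cutoff can only shrink the count, which is why the correlations are reported at all three thresholds.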

Key findings

  • In 488 non-hypermutated CRCs, MutSigCV identified 90 significantly mutated genes; 73 of these were new statistical drivers for CRC relative to TCGA (Cancer Genome Atlas Network, 2012), Seshagiri et al. 2012, and Lawrence et al. 2014 (PMID:27149842).
  • Newly nominated CRC drivers span WNT signaling (BCL9L, TCF7), RAS signaling (PRKCQ, MAP2K1, MAP2K7), TGF-beta signaling (TGIF1), RNA processing (RBM10, RBM12), chromatin/transcription (CTCF, KLF5), and tumor suppressors recurrent in other cancers (PTEN, RB1).
  • Frameshift indels generated a disproportionately high share of predicted neoantigens relative to SNVs: in the breakdown shown in Figure 3, indels account for 8.1% of mutations but 22.6% of neoantigens (PMID:27149842).
  • Recurrent neopeptides arose from known CRC drivers including RNF43, KRAS, and NRAS (Table S6).
  • Neoantigen load was significantly correlated with the overall lymphocytic-reaction score (Spearman rho = 0.29, p = 2.6e-11; cutoff <500 nM). Correlation persisted at stricter affinity cutoffs of 150 nM (rho = 0.30, p = 1.5e-12) and 50 nM (rho = 0.32, p = 9.3e-14) (PMID:27149842).
  • Among the four lymphocytic components, the strongest associations with neoantigen load were TILs (rho = 0.36, p = 2.0e-19) and the Crohn’s-like reaction (rho = 0.27, p = 6.1e-10).
  • Neoantigen load correlated significantly with CD45RO+ memory T cell density (p = 0.05) but not with CD8+, CD3+, or FOXP3+ densities (Figure 4C; n = 299).
  • The neoantigen-TIL correlation held even within MSS, POLE-wild-type tumors (p = 0.035 for TIL2+ vs. TIL1; p = 0.015 for TIL2+ vs. TIL0; Wilcoxon rank-sum), arguing that neoantigen load is a genomic determinant of immune infiltration independent of MSI status.
  • High neoantigen load (>450 predicted neoantigens; n = 149) was associated with improved CRC-specific survival vs. medium-low (n = 448): log-rank p = 0.004; multivariate HR = 0.57 (95% CI 0.35–0.93), p = 0.03, adjusted for stage, age, gender, tumor location, differentiation. Overall survival difference was non-significant (p = 0.057), attributed to high non-cancer mortality in this older cohort (Figure 5; Tables S8–S9).
  • HLA class I mutations were detected by POLYSOLVER in 66 of 619 samples (11%), totaling 96 HLA mutations. 18 of 66 (27%) carried multiple HLA-allele hits; all 18 were hypermutated, vs. 71% of single-mutation cases (34/48).
  • Putative loss-of-function HLA events (nonsense, splice, frameshift) were comparable between hypermutated (56%) and non-hypermutated (57%) groups. Mutations were enriched in exon 4 (TCR-binding domain) even after exon-length normalization (Figure S4).
  • Of 37 mutations in the peptide-binding domains (39% of HLA mutations), 17 (46%) hit residues in direct peptide contact, predicted to disrupt peptide presentation.
  • HLA mutations were significantly more frequent in TIL-high tumors (chi-squared p = 1.2e-22 across all samples; p = 1.6e-08 restricted to MSS POLE-wild-type), and the association survived adjustment for overall mutation rate in a logistic regression.
  • Mutated HLA alleles harbored more neoantigens than non-mutated alleles after normalizing for total coding mutations per patient (Wilcoxon p = 0.006), consistent with positive selection on alleles disproportionately presenting tumor neoantigens.
  • Antigen-processing-machinery (APM) mutations, in aggregate, were enriched in TIL-rich tumors. Counts below are given as mutated in non-infiltrated/infiltrated tumors. The pathway includes MHC class I components (B2M 2/17; HLA class I 3/29), folding chaperones (CANX 0/5; HSPA5 2/3), and the ER peptide-loading complex (TAP1 1/7; TAP2 1/5; TAPBP 0/6; CALR 0/3; PDIA3 1/2) (Figure 6C; Table S10).
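
The rank correlations reported above (Spearman rho between neoantigen load and lymphocytic score) can be illustrated with a small sketch. Because the overall lymphocytic score takes only integer values from 0 to 12, ties are common, so the sketch assigns average ranks to tied values and then computes a Pearson correlation on the ranks. The function names and the toy data are hypothetical; this is not the authors' code, and it does not compute a p-value.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman rho: Pearson correlation of the tie-averaged ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("rho is undefined when one variable is constant")
    return cov / (var_x * var_y) ** 0.5


# Hypothetical per-tumor values: neoantigen load and overall lymphocytic score (0-12).
toy_loads = [120, 300, 80, 950, 410, 60]
toy_scores = [2, 5, 2, 9, 5, 0]
toy_rho = spearman_rho(toy_loads, toy_scores)
```

Working on ranks rather than raw counts keeps a handful of hypermutated tumors with very large neoantigen loads from dominating the estimate.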

Genes & alterations

  • BCL9L — newly significant CRC driver; WNT co-factor; overexpression promotes intestinal tumor progression in mouse models (Brembeck et al. 2011).
  • TCF7 — newly significant CRC driver in the WNT pathway.
  • CTCF — recurrently mutated transcriptional regulator; CTCF/cohesin-binding sites previously shown to be hotspots in MSS CRC (Katainen et al. 2015).
  • KLF5 — Zn-finger transcription factor regulating intestinal stem-cell niche; deletion suppresses CRC oncogenesis (Nakaya et al. 2014).
  • TGIF1 — TGF-beta signaling co-repressor; new CRC driver.
  • RBM10 and RBM12 — RNA splicing factors; RBM10 mutations previously linked to lung adenocarcinoma (TCGA 2014) and pancreatic adenocarcinoma (Witkiewicz et al. 2015).
  • PRKCQ, MAP2K1, MAP2K7 — RAS-pathway kinases newly nominated as CRC drivers.
  • PTEN, RB1 — pan-cancer tumor suppressors that reach statistical significance in CRC in this cohort, a result attributed to its size.
  • RNF43, KRAS, NRAS — known CRC drivers identified as sources of recurrent neopeptides (Table S6).
  • POLE — exonuclease-domain hotspots p.Pro286Arg, p.Val411Leu, p.Ser459Phe found in 4 MSS samples; carriers had elevated neoantigen load and were excluded from MSS-restricted TIL analyses.
  • B2M, CALR, CANX, HSPA5, PDIA3, TAP1, TAP2, TAPBP — components of the MHC class I antigen-processing machinery; collectively enriched for somatic mutations in TIL-rich tumors, consistent with immune-escape selection (Figure 6C; Table S10).
  • HLA class I (HLA-A/B/C) — 96 somatic mutations in 11% of cases; biased toward exon 4 (TCR-binding domain) and to peptide-contact residues in exons 2–3; mutated alleles carry more neoantigens than non-mutated alleles (Wilcoxon p = 0.006).
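
As a quick consistency check, the HLA class I percentages quoted under Key findings can be recomputed from the raw counts given there. The variable names are illustrative only; the counts are taken from this page and nothing else is assumed.

```python
def pct(numerator, denominator):
    """Percentage rounded to the nearest whole number."""
    return round(100 * numerator / denominator)


samples_total = 619
samples_with_hla_mutation = 66
samples_with_multiple_hits = 18
hla_mutations_total = 96
peptide_binding_domain_mutations = 37
peptide_contact_mutations = 17
single_hit_hypermutated = 34

# Samples with exactly one mutated HLA allele.
single_hit_samples = samples_with_hla_mutation - samples_with_multiple_hits

summary = {
    "samples with HLA mutation": pct(samples_with_hla_mutation, samples_total),
    "multiple-hit share": pct(samples_with_multiple_hits, samples_with_hla_mutation),
    "single-hit hypermutated": pct(single_hit_hypermutated, single_hit_samples),
    "peptide-binding-domain share": pct(peptide_binding_domain_mutations, hla_mutations_total),
    "direct peptide contact": pct(peptide_contact_mutations, peptide_binding_domain_mutations),
}
```

The recomputed values (11, 27, 71, 39, and 46 percent) match the figures quoted on this page.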

Clinical implications

  • High neoantigen load is an independent, favorable prognostic marker for CRC-specific survival (multivariate HR 0.57, 95% CI 0.35–0.93, p = 0.03), supporting a biological role for tumor immunogenicity in disease outcome (PMID:27149842).
  • The persistence of the neoantigen–TIL correlation within MSS, POLE-wild-type tumors implies a subset of MSS CRCs may be biologically positioned to respond to immune-checkpoint inhibition, even though Le et al. (2015, NEJM) reported responses to PD-1 blockade only in mismatch-repair-deficient CRC. The authors propose this MSS-high-neoantigen subset as a candidate population for future checkpoint-inhibitor trials.
  • POLE-mutated CRCs were too few (n = 4) for survival statistics here, but the authors speculate these tumors should be responsive to checkpoint blockade based on neoantigen load alone.
  • Positive selection on HLA class I and broader APM mutations in TIL-rich tumors identifies a candidate adaptive resistance mechanism to immune attack, with implications for resistance to checkpoint blockade — though whether HLA/APM-mutant tumors are actually less responsive to anti-PD-1/PD-L1 was left as an open question.
  • Memory T cell (CD45RO+) infiltration was the specific T-cell subset most tightly linked to neoantigen load, reinforcing CD45RO+ density as a biologically grounded prognostic marker (Pages et al. 2005; Nosho et al. 2010).

Limitations & open questions

  • Neoantigen prediction relied on MHC class I binding affinity (NetMHCpan, <500 nM cutoff) and did not assess actual T-cell recognition, peptide processing efficiency, or expression of the source transcript.
  • HLA class I mutation calling depends on POLYSOLVER’s accuracy at calling somatic events in highly polymorphic loci; the authors do not address class II HLA loci.
  • The neoantigen “high” cutoff (>450) was derived from a sensitivity analysis (200–4,000 in steps of 50) on the same cohort; external validation of this threshold is not provided.
  • Overall survival did not reach statistical significance (p = 0.057), attributed to high competing-cause mortality in an older NHS/HPFS population (median follow-up 9.4 years).
  • Statistical power to test associations in POLE-mutated CRCs was limited (n = 4); the proposed responsiveness to checkpoint blockade in this subset is hypothesis-generating only.
  • Whether HLA/APM-mutant tumors actually exhibit primary or acquired resistance to immune-checkpoint inhibitors was not tested in this study and is explicitly flagged as an open question.
  • FFPE artifacts were filtered (C>T single-strand bias), but residual FFPE-related noise may persist; tumor purity was handled through standard mutation calling rather than purity-aware tools.
  • The study is restricted to U.S. health professionals and is predominantly Caucasian; generalizability of HLA-allele frequencies and antigen-processing dynamics to other populations is untested.
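
The cutoff sensitivity analysis mentioned above can be pictured with a short sketch. It only enumerates the candidate thresholds and splits a cohort at one of them; it does not reproduce the survival test applied at each threshold. The function names and toy loads are hypothetical, and treating the 200–4,000 range as inclusive at both ends is an assumption.

```python
def cutoff_grid(start=200, stop=4000, step=50):
    """Candidate neoantigen-load cutoffs, assuming both endpoints are included."""
    return list(range(start, stop + 1, step))


def split_by_cutoff(neoantigen_loads, cutoff):
    """Split loads into 'high' (strictly above the cutoff) and 'medium-low' groups."""
    high = [load for load in neoantigen_loads if load > cutoff]
    medium_low = [load for load in neoantigen_loads if load <= cutoff]
    return high, medium_low


grid = cutoff_grid()

# Hypothetical per-tumor neoantigen loads, split at the reported cutoff of 450.
toy_loads = [100, 450, 451, 2000]
toy_high, toy_medium_low = split_by_cutoff(toy_loads, cutoff=450)
```

Under these assumptions the grid holds 77 candidate thresholds. Selecting one of them on the same cohort used to report the survival association is the reason external validation of the threshold matters.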

Citations from this paper used in the wiki

  • “We performed WES on 619 cases with formalin-fixed paraffin-embedded (FFPE) tumor and matched normal tissue pairs from the NHS and the HPFS … The average sequencing coverage across all samples was 90x, with an average of 87% of bases covered at 20x.”
  • “In 488 non-hypermutated tumors, we found 90 significantly mutated genes that include most genes observed in the Cancer Genome Atlas Network (2012) analysis of CRC … Among these 90 genes, 73 genes were new to CRC in terms of statistical designation as driver genes.”
  • “Integrating these pathologic and neoantigen data revealed that higher neoantigen load was associated with increased overall lymphocytic score in CRC (Spearman’s rank correlation coefficient = 0.29, p value = 2.6 x 10^-11).”
  • “When we restricted our analysis to MSS POLE-wild-type cases, we found a significant association between high neoantigen load and high degree of TILs (p value = 0.035 and p value = 0.015 for the comparison of TIL 2+ to TIL 1 and TIL 0 samples, respectively; Wilcoxon rank-sum test).”
  • “An elevated neoantigen load was associated with improved CRC-specific survival (log rank test, p value = 0.004; multivariate hazard ratio = 0.57 [95% confidence interval, 0.35–0.93], p value = 0.03).”
  • “We identified HLA class I mutations using a computational algorithm that we recently developed (Shukla et al., 2015) and found a total of 96 HLA mutations in 66 of 619 samples (11%).”
  • “We found significant enrichment of these [APM] mutations in TIL-rich tumors (Table S10). The APM pathway includes proteins involved in major histocompatibility complex class I (MHC class I) folding (CANX and HSPA5), the MHC class I complex (HLA class I and B2M), and the endoplasmic reticulum (ER) peptide-loading complex (TAP, TAPBP, CALR, and PDIA3).”
  • “Mutated HLA alleles across all samples were found to have more neoantigens than non-mutated alleles after normalizing for number of coding mutations in the patient (Wilcoxon rank-sum test, p value = 0.006).”
