Genomic Correlates of Immune-Cell Infiltrates in Colorectal Carcinoma
PMID: 27149842 · DOI: 10.1016/j.celrep.2016.03.075 · Journal: Cell Reports (2016)
TL;DR
Giannakis et al. performed whole-exome sequencing on 619 colorectal carcinomas drawn from two long-running U.S. prospective cohorts (Nurses’ Health Study and Health Professionals Follow-up Study), then integrated mutation and predicted-neoantigen data with H&E-based immune scoring and decades of clinical follow-up. They expand the CRC driver-gene list (e.g., BCL9L, RBM10, CTCF, KLF5, TGIF1), show that neoantigen load correlates with overall lymphocytic infiltration, TILs, CD45RO+ memory T cells, and improved CRC-specific survival — even within microsatellite-stable, POLE-wild-type tumors — and demonstrate positive selection for HLA class I and antigen-processing-machinery mutations in TIL-rich tumors as a candidate immune-escape mechanism.
Cohort & data
- 619 incident CRC cases with matched FFPE tumor-normal pairs from the Nurses’ Health Study (NHS, est. 1976) and Health Professionals Follow-up Study (HPFS, est. 1986); cancer type COADREAD; cBioPortal study coadread_dfci_2016.
- Assay: whole-exome sequencing on Illumina HiSeq 2000 with SureSelect v.2 capture; mean coverage 90x, 87% of bases at >=20x (PMID:27149842).
- Mutation calling: MuTect for SNVs; concordant calls from Indelocator and Strelka for indels; BWA-MEM realignment; FFPE C>T deamination artifacts filtered by single-strand bias (PMID:27149842).
- Significantly mutated genes inferred via the MutSigCV suite (Lawrence et al., 2014) plus manual curation; hypermutator threshold = 12 mutations/Mb; reference build hg19.
- HLA typing and class I mutation calling: POLYSOLVER (Shukla et al., 2015). Neoantigen prediction used NetMHCpan v2.4 on 9- and 10-mer mutant peptides scored against each patient's own HLA alleles, with a default predicted-affinity cutoff of <500 nM (stricter cutoffs of 150 and 50 nM were also tested).
- Pathology: H&E review by S.O. graded four lymphocytic-reaction components (Crohn’s-like, peritumoral, intratumoral periglandular, TILs) on 0–3 scales; overall lymphocytic score 0–12 (Ogino et al., 2009). Immunohistochemistry on TMAs quantified CD3+, CD8+, CD45RO+, FOXP3+ T cell densities in 299 samples (Nosho et al., 2010).
- Microsatellite status assayed with a 10-marker PCR panel (D2S123, D5S346, D17S250, BAT25, BAT26, BAT40, D18S55, D18S56, D18S67, D18S487); MSS = non-MSI-high. Four MSS samples carrying POLE exonuclease-domain hotspots (p.Pro286Arg, p.Val411Leu, p.Ser459Phe) were flagged separately.
- Survival data on 597 individuals (median follow-up 9.4 years, IQR 5.8–13.1); raw sequence files deposited in dbGaP under accession phs000722.
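The peptide step in the neoantigen bullet above can be illustrated with a minimal sketch. It enumerates the 9- and 10-mer windows that overlap a single substituted residue, then counts predicted binders under a chosen affinity cutoff. The function names and the `affinities_nm` input are hypothetical. In the study the affinities came from NetMHCpan, which is not called here, and frameshift-derived peptides are not handled by this sketch.

```python
from typing import Dict, List, Sequence


def mutant_peptides(protein: str, pos: int, alt: str,
                    lengths: Sequence[int] = (9, 10)) -> List[str]:
    """Return every k-mer (k in `lengths`) that overlaps a substituted residue.

    protein: wild-type amino-acid sequence
    pos:     0-based index of the substituted residue
    alt:     mutant amino acid placed at `pos`
    """
    mutant = protein[:pos] + alt + protein[pos + 1:]
    peptides = []
    for k in lengths:
        # A window starting at `start` covers `pos` when
        # start <= pos <= start + k - 1, clipped to the protein ends.
        first = max(0, pos - k + 1)
        last = min(pos, len(mutant) - k)
        for start in range(first, last + 1):
            peptides.append(mutant[start:start + k])
    return peptides


def count_binders(affinities_nm: Dict[str, float],
                  cutoff_nm: float = 500.0) -> int:
    """Count peptides whose predicted affinity is strictly below the cutoff.

    `affinities_nm` maps peptide -> predicted affinity in nM; the values are
    assumed to come from an external predictor (hypothetical input here).
    """
    return sum(1 for ic50 in affinities_nm.values() if ic50 < cutoff_nm)
```

For a substitution far from either end of the protein, this yields 9 windows of length 9 and 10 windows of length 10. Re-running `count_binders` with `cutoff_nm` set to 150 or 50 mirrors the stricter cutoffs listed above.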
Key findings
- In 488 non-hypermutated CRCs, MutSigCV identified 90 significantly mutated genes; 73 of these were new statistical drivers for CRC relative to TCGA (Cancer Genome Atlas Network, 2012), Seshagiri et al. 2012, and Lawrence et al. 2014 (PMID:27149842).
- Newly nominated CRC drivers span WNT signaling (BCL9L, TCF7), RAS signaling (PRKCQ, MAP2K1, MAP2K7), TGF-beta signaling (TGIF1), RNA processing (RBM10, RBM12), chromatin/transcription (CTCF, KLF5), and tumor suppressors recurrent in other cancers (PTEN, RB1).
- Frameshift indels generated a disproportionately high share of predicted neoantigens relative to SNVs: in the breakdown shown in Figure 3, indels account for 8.1% of mutations but 22.6% of neoantigens (PMID:27149842).
- Recurrent neopeptides arose from known CRC drivers including RNF43, KRAS, and NRAS (Table S6).
- Neoantigen load was significantly correlated with the overall lymphocytic-reaction score (Spearman rho = 0.29, p = 2.6e-11; cutoff <500 nM). Correlation persisted at stricter affinity cutoffs of 150 nM (rho = 0.30, p = 1.5e-12) and 50 nM (rho = 0.32, p = 9.3e-14) (PMID:27149842).
- Among the four lymphocytic components, the strongest associations with neoantigen load were TILs (rho = 0.36, p = 2.0e-19) and the Crohn’s-like reaction (rho = 0.27, p = 6.1e-10).
- Neoantigen load correlated significantly with CD45RO+ memory T cell density (p = 0.05) but not with CD8+, CD3+, or FOXP3+ densities (Figure 4C; n = 299).
- The neoantigen-TIL correlation held even within MSS, POLE-wild-type tumors (p = 0.035 for TIL2+ vs. TIL1; p = 0.015 for TIL2+ vs. TIL0; Wilcoxon rank-sum), arguing neoantigen load is a genomic determinant of immune infiltration independently of MSI status.
- High neoantigen load (>450 predicted neoantigens; n = 149) was associated with improved CRC-specific survival vs. medium-low (n = 448): log-rank p = 0.004; multivariate HR = 0.57 (95% CI 0.35–0.93), p = 0.03, adjusted for stage, age, gender, tumor location, differentiation. Overall survival difference was non-significant (p = 0.057), attributed to high non-cancer mortality in this older cohort (Figure 5; Tables S8–S9).
- HLA class I mutations were detected by POLYSOLVER in 66 of 619 samples (11%), totaling 96 HLA mutations. Of these 66 samples, 18 (27%) carried hits in multiple HLA alleles; all 18 were hypermutated, vs. 71% of single-mutation cases (34/48).
- Putative loss-of-function HLA events (nonsense, splice, frameshift) were comparable between hypermutated (56%) and non-hypermutated (57%) groups. Mutations were enriched in exon 4 even after exon-length normalization (Figure S4). This page labels exon 4 the TCR-binding domain; note that exon 4 of HLA class I encodes the α3 domain, which contacts the CD8 co-receptor.
- Of 37 mutations in the peptide-binding domains (39% of HLA mutations), 17 (46%) hit residues in direct peptide contact, predicted to disrupt peptide presentation.
- HLA mutations were significantly more frequent in TIL-high tumors (chi-squared p = 1.2e-22 across all samples; p = 1.6e-08 restricted to MSS POLE-wild-type), surviving adjustment for overall mutation rate via logistic regression.
- Mutated HLA alleles harbored more neoantigens than non-mutated alleles after normalizing for total coding mutations per patient (Wilcoxon p = 0.006), consistent with positive selection on alleles disproportionately presenting tumor neoantigens.
- Antigen-processing-machinery (APM) mutations, in aggregate, were enriched in TIL-rich tumors. Counts are given as mutated samples in non-infiltrated/infiltrated tumors. The pathway includes MHC class I components (B2M 2/17; HLA class I 3/29), folding chaperones (CANX 0/5, HSPA5 2/3), and the ER peptide-loading complex (TAP1 1/7, TAP2 1/5, TAPBP 0/6, CALR 0/3, PDIA3 1/2) (Figure 6C; Table S10).
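As a quick consistency check, the percentages quoted in the HLA bullets above can be re-derived from the stated counts. The sketch below uses only numbers that appear on this page; the dictionary labels and the `pct` helper are illustrative.

```python
def pct(numerator: int, denominator: int) -> int:
    """Percentage rounded to the nearest whole number."""
    return round(100 * numerator / denominator)


# (numerator, denominator, percentage stated on this page)
reported = {
    "samples with an HLA class I mutation": (66, 619, 11),
    "HLA-mutated samples with multiple hits": (18, 66, 27),
    "single-mutation cases that were hypermutated": (34, 48, 71),
    "HLA mutations in peptide-binding domains": (37, 96, 39),
    "peptide-binding-domain mutations at contact residues": (17, 37, 46),
}

for label, (num, den, stated) in reported.items():
    print(f"{label}: {num}/{den} = {pct(num, den)}% (stated {stated}%)")

# The single-mutation denominator follows from the other two counts.
single_mutation_cases = 66 - 18

# The two survival strata sum to the number of individuals with follow-up.
survival_total = 149 + 448
```

Each derived value agrees with the stated percentage, and the strata sizes (149 high, 448 medium-low) add up to the 597 individuals with survival data.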
Genes & alterations
- BCL9L — newly significant CRC driver; WNT co-factor; overexpression promotes intestinal tumor progression in mouse models (Brembeck et al. 2011).
- TCF7 — newly significant CRC driver in the WNT pathway.
- CTCF — recurrently mutated transcriptional regulator; CTCF/cohesin-binding sites previously shown to be hotspots in MSS CRC (Katainen et al. 2015).
- KLF5 — Zn-finger transcription factor regulating intestinal stem-cell niche; deletion suppresses CRC oncogenesis (Nakaya et al. 2014).
- TGIF1 — TGF-beta signaling co-repressor; new CRC driver.
- RBM10 and RBM12 — RNA splicing factors; RBM10 mutations previously linked to lung adenocarcinoma (TCGA 2014) and pancreatic adenocarcinoma (Witkiewicz et al. 2015).
- PRKCQ, MAP2K1, MAP2K7 — RAS-pathway kinases newly nominated as CRC drivers.
- PTEN, RB1 — pan-cancer tumor suppressors confirmed as significantly mutated in CRC by virtue of cohort size.
- RNF43, KRAS, NRAS — known CRC drivers identified as sources of recurrent neopeptides (Table S6).
- POLE — exonuclease-domain hotspots p.Pro286Arg, p.Val411Leu, p.Ser459Phe found in 4 MSS samples; carriers had elevated neoantigen load and were excluded from MSS-restricted TIL analyses.
- B2M, CALR, CANX, HSPA5, PDIA3, TAP1, TAP2, TAPBP — components of the MHC class I antigen-processing machinery; collectively enriched for somatic mutations in TIL-rich tumors, consistent with immune-escape selection (Figure 6C; Table S10).
- HLA class I (HLA-A/B/C) — 96 somatic mutations in 11% of cases; biased toward exon 4 (TCR-binding domain) and to peptide-contact residues in exons 2–3; mutated alleles carry more neoantigens than non-mutated alleles (Wilcoxon p = 0.006).
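The sample stratification used throughout (hypermutated vs. not, and the MSS POLE-wild-type subset) can be sketched as below. This is an illustration under assumptions: the `Sample` record and function names are hypothetical, and treating the 12 mutations/Mb threshold as a strict "greater than" is an assumption, since this page does not state how ties are handled.

```python
from dataclasses import dataclass
from typing import Optional

HYPERMUTATOR_THRESHOLD = 12.0  # mutations per Mb, as stated on this page
POLE_HOTSPOTS = {"p.Pro286Arg", "p.Val411Leu", "p.Ser459Phe"}


@dataclass
class Sample:
    """Hypothetical per-tumor record used only for this illustration."""
    mutations_per_mb: float
    msi_high: bool
    pole_change: Optional[str] = None  # protein-level POLE change, if any


def is_hypermutated(sample: Sample) -> bool:
    # Assumption: strictly above the threshold counts as hypermutated.
    return sample.mutations_per_mb > HYPERMUTATOR_THRESHOLD


def in_mss_pole_wt_subset(sample: Sample) -> bool:
    """True for MSS (non-MSI-high) tumors without a listed POLE hotspot."""
    return (not sample.msi_high) and sample.pole_change not in POLE_HOTSPOTS
```

Under this sketch, a tumor such as `Sample(150.0, False, "p.Pro286Arg")` is hypermutated but falls outside the MSS POLE-wild-type subset, matching the exclusion of POLE hotspot carriers from the MSS-restricted TIL analyses.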
Clinical implications
- High neoantigen load is an independent, favorable prognostic marker for CRC-specific survival (multivariate HR 0.57, 95% CI 0.35–0.93, p = 0.03), supporting a biological role for tumor immunogenicity in disease outcome (PMID:27149842).
- The persistence of the neoantigen–TIL correlation within MSS, POLE-wild-type tumors implies a subset of MSS CRCs may be biologically positioned to respond to immune-checkpoint inhibition, even though Le et al. (2015, NEJM) reported responses to PD-1 blockade only in mismatch-repair-deficient CRC. The authors propose this MSS-high-neoantigen subset as a candidate population for future checkpoint-inhibitor trials.
- POLE-mutated CRCs were too few (n = 4) for survival statistics here, but the authors speculate these tumors should be responsive to checkpoint blockade based on neoantigen load alone.
- Positive selection on HLA class I and broader APM mutations in TIL-rich tumors identifies a candidate adaptive resistance mechanism to immune attack, with implications for resistance to checkpoint blockade — though whether HLA/APM-mutant tumors are actually less responsive to anti-PD-1/PD-L1 was left as an open question.
- Memory T cell (CD45RO+) infiltration was the specific T-cell subset most tightly linked to neoantigen load, reinforcing CD45RO+ density as a biologically grounded prognostic marker (Pages et al. 2005; Nosho et al. 2010).
Limitations & open questions
- Neoantigen prediction relied on MHC class I binding affinity (NetMHCpan, <500 nM cutoff) and did not assess actual T-cell recognition, peptide processing efficiency, or expression of the source transcript.
- HLA class I mutation calling depends on POLYSOLVER’s accuracy at calling somatic events in highly polymorphic loci; the authors do not address class II HLA loci.
- The neoantigen “high” cutoff (>450) was derived from a sensitivity analysis (200–4,000 in steps of 50) on the same cohort; external validation of this threshold is not provided.
- Overall survival did not reach statistical significance (p = 0.057), attributed to high competing-cause mortality in an older NHS/HPFS population (median follow-up 9.4 years).
- Statistical power to test associations in POLE-mutated CRCs was limited (n = 4); the proposed responsiveness to checkpoint blockade in this subset is hypothesis-generating only.
- Whether HLA/APM-mutant tumors actually exhibit primary or acquired resistance to immune-checkpoint inhibitors was not tested in this study and is explicitly flagged as an open question.
- FFPE artifacts were filtered (C>T single-strand bias), but residual FFPE-related noise may persist; tumor purity is handled only through standard mutation calling rather than with purity-aware tools.
- The study is restricted to U.S. health professionals and is predominantly Caucasian; generalizability of HLA-allele frequencies and antigen-processing dynamics to other populations is untested.
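To make the threshold limitation concrete, the sketch below enumerates the candidate cutoffs described above (200 to 4,000 in steps of 50) and shows how a cohort is dichotomized at one of them. The function names and the toy neoantigen loads are hypothetical; no survival statistics are computed here.

```python
from typing import List, Sequence, Tuple


def candidate_cutoffs(low: int = 200, high: int = 4000,
                      step: int = 50) -> List[int]:
    """Grid of candidate neoantigen-load cutoffs, inclusive of both ends."""
    return list(range(low, high + 1, step))


def split_by_cutoff(loads: Sequence[int], cutoff: int) -> Tuple[int, int]:
    """Return (n_high, n_low), where 'high' means load strictly above cutoff."""
    n_high = sum(1 for load in loads if load > cutoff)
    return n_high, len(loads) - n_high


grid = candidate_cutoffs()
toy_loads = [120, 450, 451, 3000]  # made-up values, not study data
n_high, n_low = split_by_cutoff(toy_loads, 450)
```

The grid contains 77 candidate thresholds, including 450. Because the reported cutoff was chosen among these on the same cohort, the associated survival estimate may be optimistic, which is why external validation is listed as an open need.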
Citations from this paper used in the wiki
- “We performed WES on 619 cases with formalin-fixed paraffin-embedded (FFPE) tumor and matched normal tissue pairs from the NHS and the HPFS … The average sequencing coverage across all samples was 90x, with an average of 87% of bases covered at 20x.”
- “In 488 non-hypermutated tumors, we found 90 significantly mutated genes that include most genes observed in the Cancer Genome Atlas Network (2012) analysis of CRC … Among these 90 genes, 73 genes were new to CRC in terms of statistical designation as driver genes.”
- “Integrating these pathologic and neoantigen data revealed that higher neoantigen load was associated with increased overall lymphocytic score in CRC (Spearman’s rank correlation coefficient = 0.29, p value = 2.6 x 10^-11).”
- “When we restricted our analysis to MSS POLE-wild-type cases, we found a significant association between high neoantigen load and high degree of TILs (p value = 0.035 and p value = 0.015 for the comparison of TIL 2+ to TIL 1 and TIL 0 samples, respectively; Wilcoxon rank-sum test).”
- “An elevated neoantigen load was associated with improved CRC-specific survival (log rank test, p value = 0.004; multivariate hazard ratio = 0.57 [95% confidence interval, 0.35–0.93], p value = 0.03).”
- “We identified HLA class I mutations using a computational algorithm that we recently developed (Shukla et al., 2015) and found a total of 96 HLA mutations in 66 of 619 samples (11%).”
- “We found significant enrichment of these [APM] mutations in TIL-rich tumors (Table S10). The APM pathway includes proteins involved in major histocompatibility complex class I (MHC class I) folding (CANX and HSPA5), the MHC class I complex (HLA class I and B2M), and the endoplasmic reticulum (ER) peptide-loading complex (TAP, TAPBP, CALR, and PDIA3).”
- “Mutated HLA alleles across all samples were found to have more neoantigens than non-mutated alleles after normalizing for number of coding mutations in the patient (Wilcoxon rank-sum test, p value = 0.006).”