Validation of LymphGen classification on a 400-gene clinical next-generation sequencing panel in diffuse large B-cell lymphoma: real-world experience from a cancer center

Authors

Zhu ML

Drill E

Joffe E

Salles G

Rivas Delgado A

Zelenetz A

Palomba ML

Arcila M

Dogan A

DOI

PMID: 38497151 · DOI: 10.3324/haematol.2023.284565 · Journal: Haematologica (2024)

TL;DR

This study validated the LymphGen probabilistic classification algorithm – which assigns DLBCL, NOS cases to six molecularly distinct subtypes (MCD, BN2, EZB, ST2, A53, N1) – on MSK-IMPACT Heme, a clinically validated 400-gene targeted NGS panel. Using NCI cohort data filtered to the IMPACT gene set, the approach achieved 92% overall accuracy (86% sensitivity, 98% specificity), rising to 96% accuracy in the high-confidence “Core” group. When LymphGen was applied to an independent MSK clinical cohort of 396 DLBCL cases (mbn_msk_2024), 55% of cases were classified into one of the six subtypes, indicating that LymphGen can be translated to routine clinical practice with a targeted panel.

Cohort & data

  • NCI validation cohort: Published NCI DLBCL cohort data, filtered to the 400 genes on the MSK-IMPACT Heme panel to benchmark classification accuracy against the original comprehensive NCI panel (ground truth).
  • MSK clinical cohort: 396 DLBCL, NOS cases sequenced by MSK-IMPACT Heme; 320 of the 396 had both BCL2 and BCL6 clinical cytogenetic results available. Data are deposited at cBioPortal (study ID: mbn_msk_2024).
  • Assay: MSK-IMPACT Heme – hybridization capture-based 400-gene panel detecting SNVs, small INDELs, and CNAs in hematopoietic malignancies. Uses saliva or nail clippings as germline source.
  • Classification method: LymphGen probabilistic algorithm, applied with SNV, INDEL, CNA, and FISH translocation data for BCL2 and BCL6.
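
The gene-set filtering step described for the NCI validation cohort can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function name `filter_to_panel`, the record layout, the five-gene toy panel, and the placeholder `GENE_X` are all assumptions made for the example.

```python
# Hypothetical illustration: restrict alteration calls to genes covered by a panel.
# The panel, samples, and records below are invented for the example.

def filter_to_panel(alterations, panel_genes):
    """Keep only alteration records whose gene is on the panel."""
    panel = set(panel_genes)  # set lookup keeps the filter fast for long gene lists
    return [a for a in alterations if a["gene"] in panel]


# Toy panel; a real targeted panel would list roughly 400 genes.
panel_genes = ["MYD88", "CD79B", "TP53", "BCL2", "BCL6"]

# Toy calls from a broader assay; "GENE_X" stands in for an off-panel gene.
alterations = [
    {"sample": "S1", "gene": "MYD88", "type": "mutation"},
    {"sample": "S1", "gene": "GENE_X", "type": "mutation"},
    {"sample": "S2", "gene": "TP53", "type": "CNA"},
]

filtered = filter_to_panel(alterations, panel_genes)
print([a["gene"] for a in filtered])
```

The filtered calls would then be passed to the classifier, and the resulting subtype labels compared with those obtained from the unfiltered data.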

Key findings

  • NCI cohort (IMPACT gene filter): Overall accuracy 92%, sensitivity 86%, specificity 98% compared to the original comprehensive NCI panel classification. 58% of cases classified into one of the 6 subtypes (vs. 63% with comprehensive methods).
  • Core group accuracy: Restricting to high-confidence “Core” predictions (>90% probability) increased accuracy to 96%, with 86% sensitivity and 98% specificity.
  • Subtype distribution (NCI/IMPACT): MCD 11.7%, BN2 12.7%, EZB 13.7%, A53 9.2%, ST2 5.3%, N1 0.8%.
  • MSK cohort classification: 55% of 396 cases classified – MCD 10%, EZB 22%, BN2 10%, N1 3%, ST2 7%, A53 2%, composite 2%. Two-thirds of classified cases fell into the Core group, one-third into the Extended group (50-90% probability).
  • MSK Core group (n=148, 37.4% of total): MCD 18.2%, EZB 44%, BN2 22%, N1 7%, ST2 7%, A53 2%.
  • Cell-of-origin correlation: EZB cases were predominantly GCB; MCD and N1 cases were predominantly ABC/non-GCB; BN2, ST2, and A53 showed a mix of GCB and non-GCB, consistent with prior NCI findings.
  • CNA impact: Removing CNA data from LymphGen input reduced overall balanced accuracy to 81% (sensitivity 83%, specificity 89%) and eliminated A53 subtype classification, which relies primarily on copy number alterations.
  • BCL2 translocations: Enriched in EZB subtype (54/67 cases, 81%); only 2 cases with BCL2 translocation fell in the unclassified group, suggesting BCL2 translocation is a robust EZB classifier.
  • BCL6 translocations: Enriched in BN2 subtype (24/31 cases, 77%).
  • MYC translocations: More frequent in BN2 when co-existing with BCL6 translocation (6/39 cases, 15.3%); also seen in small proportions of EZB (7/86, 8.1%), MCD (2/40, 5.0%), and ST2 (2/27, 7.4%).
  • N1 subtype: 50% (4/8 cases) expressed CD5 and LEF1 by immunohistochemistry, suggesting some N1 cases may represent Richter transformation of CLL.
  • Sequencing success: Over 99% of DLBCL cases were successfully sequenced, attributed to detailed pre-analytical quality checks.
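
The confidence tiers and performance metrics reported above can be illustrated with a short sketch. It assumes the ">90%" (Core) and "50-90%" (Extended) probability bands quoted in the findings. The handling of exact boundary values, the one-vs-rest definition of sensitivity and specificity, and the toy labels are assumptions; the summary does not state how the paper computed its multi-class metrics.

```python
# Illustrative sketch only; function names and toy data are hypothetical.

def assign_tier(probability):
    """Map a subtype probability (0.0-1.0) to a confidence tier."""
    if probability > 0.90:
        return "Core"
    if probability >= 0.50:  # assumption: exactly 50% and exactly 90% count as Extended
        return "Extended"
    return "Unclassified"


def one_vs_rest_metrics(truth, predicted, subtype):
    """Sensitivity, specificity, and accuracy for one subtype versus all others."""
    tp = fp = tn = fn = 0
    for t, p in zip(truth, predicted):
        if t == subtype and p == subtype:
            tp += 1
        elif t != subtype and p == subtype:
            fp += 1
        elif t == subtype and p != subtype:
            fn += 1
        else:
            tn += 1
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    accuracy = (tp + tn) / len(truth) if truth else float("nan")
    return sensitivity, specificity, accuracy


# Toy labels (made up): reference calls versus panel-restricted calls.
truth = ["MCD", "EZB", "BN2", "Other", "EZB", "MCD"]
predicted = ["MCD", "EZB", "Other", "Other", "EZB", "EZB"]

tiers = [assign_tier(p) for p in (0.95, 0.90, 0.50, 0.49)]
ezb_metrics = one_vs_rest_metrics(truth, predicted, "EZB")

# Arithmetic check of a reported figure: 148 Core cases out of 396 total.
core_share = round(100 * 148 / 396, 1)

print(tiers)
print(ezb_metrics)
print(core_share)
```

The last value reproduces the 37.4% Core share reported for the MSK cohort. Composite cases, which meet the criteria for more than one subtype, are not modeled here.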

Genes & alterations

  • BCL2 – translocation: enriched in EZB subtype (81% of EZB cases); robust single-gene classifier for EZB.
  • BCL6 – translocation: enriched in BN2 subtype (77% of BN2 cases).
  • MYC – translocation: co-occurring with BCL6 translocation in BN2 (15.3%); also present at low frequency in EZB, MCD, and ST2.
  • TP53 – mutations/CNA: defining feature of A53 subtype; classification relies primarily on copy number alterations.
  • MYD88 – mutations: key classifier for MCD subtype (referenced via LymphGen algorithm definition).
  • CD79B – mutations: key classifier for MCD subtype (referenced via LymphGen algorithm definition).

Clinical implications

  • LymphGen molecular subclassification of DLBCL can be reliably performed using a targeted 400-gene clinical NGS panel (MSK-IMPACT Heme), enabling translation from research to routine clinical practice.
  • High-confidence “Core” group classifications (>90% probability) are highly accurate (96%) and may guide patient selection for risk-adapted and biomarker-selected clinical trials.
  • CNA data substantially improves classification, particularly for the A53 subtype; laboratories should optimize CNA detection in their pipelines.
  • Survival analyses within the MSK cohort showed trends consistent with previously reported prognostic differences among subtypes, but statistical significance was not achieved, likely due to cohort heterogeneity and short follow-up.

Limitations & open questions

  • Survival analysis did not reach statistical significance, likely because of cohort heterogeneity and limited follow-up duration.
  • The 400-gene panel classified 58% of NCI cohort cases (vs. 63% with comprehensive methods) and 55% of the MSK clinical cohort, leaving about 45% of clinical cases unclassified.
  • CNA calling accuracy can be compromised by poor sample quality or low tumor content, which may reduce classification performance in real-world settings.
  • Sequencing success rates (>99% in this study) may differ with other laboratory workflows and sample types.
  • The A53 subtype is essentially unclassifiable without CNA data, limiting its detection when CNA calling is unavailable.
  • Whether targeted-panel LymphGen classification can prospectively guide treatment selection remains to be demonstrated in clinical trials.

Citations from this paper used in the wiki

  • “Applying the LymphGen cluster allocation to the limited set of 400 genes on cases from the NCI cohort… demonstrated an overall accuracy of 92%, with 86% sensitivity and 98% specificity” (p. 2326).
  • “In total, 58% of cases were ultimately classified into one of the 6 subtypes… which was only marginally below the 63% reported with the use of comprehensive methods” (p. 2326).
  • “In the ‘Core’ group, classification accuracy by the IMPACT gene panel increased to 96%” (p. 2326).
  • “When the LymphGen algorithm was applied to the MSK DLBCL cohort, 55% of the cases were assigned into one of the subtypes (MCD: 10%, EZB: 22%, BN2: 10%, N1: 3%, ST2: 7%, A53: 2%, composite: 2%)” (p. 2329).
  • “BCL2 translocations were enriched in the EZB subtype (54 out of 67 cases, 81%), and BCL6 translocations were enriched in the BN2 subtype (24 out of 31 cases, 77%)” (p. 2329).
  • “50% (4 out of 8 cases) of N1 subtypes expressed CD5 and LEF1 by immunohistochemistry, indicating that some of the N1 subtype cases might represent Richter’s transformation of chronic lymphocytic leukemia” (p. 2329).
