Discriminant Analysis and Machine Learning Approach for Evaluating and Improving the Performance of Immunohistochemical Algorithms for COO Classification of DLBCL

dc.contributor.authorPerfecto-Avalos, Y.
dc.contributor.authorGarcia-Gonzalez, A.
dc.contributor.authorHernandez-Reynoso, Ana
dc.contributor.authorSánchez-Ante, G.
dc.contributor.authorOrtiz-Hidalgo, C.
dc.contributor.authorScott, S.-P.
dc.contributor.authorFuentes-Aguilar, R. Q.
dc.contributor.authorDiaz-Dominguez, R.
dc.contributor.authorLeón-Martínez, G.
dc.contributor.authorVelasco-Vales, V.
dc.contributor.authorCárdenas-Escudero, M. A.
dc.contributor.authorHernández-Hernández, J. A.
dc.contributor.authorSantos, A.
dc.contributor.authorBorbolla-Escoboza, J. R.
dc.contributor.authorVillela, L.
dc.contributor.utdAuthorHernandez-Reynoso, Ana
dc.description.abstractBackground: Diffuse large B-cell lymphoma (DLBCL) is classified into germinal center-like (GCB) and non-germinal center-like (non-GCB) cell-of-origin (COO) groups, entities driven by different oncogenic pathways and associated with different clinical outcomes. DLBCL classification by immunohistochemistry (IHC)-based decision-tree algorithms is simpler than gene expression profiling (GEP). However, IHC decision-tree algorithms show significant discrepancies when compared with GEP. Methods: To address these inconsistencies, we applied a machine learning approach using the same combinations of antibodies as the IHC decision-tree algorithms. Immunohistochemistry data from a public DLBCL database were used to compare the IHC decision-tree algorithms with machine learning models based on Bayesian, simple Bayesian, naïve Bayesian, artificial neural network, and support vector machine classifiers, in order to identify the best diagnostic model. We also applied linear discriminant analysis to the complete database, detecting a higher influence of the BCL6 antibody on GCB classification and of MUM1 on non-GCB classification. Results: The classifier with the highest metrics was the four-antibody Perfecto-Villela (PV) algorithm, with 0.94 accuracy, 0.93 specificity, and 0.95 sensitivity, and almost perfect agreement with GEP (κ = 0.88, P < 0.001). After training, data from a sample of 49 Mexican-mestizo DLBCL patients were classified by COO for the first time in a testing trial. Conclusions: By harnessing all the available immunohistochemical data without relying on the order of examination or on cut-off values, our PV machine learning algorithm outperforms Hans and other IHC decision-tree algorithms currently in use and represents an affordable and time-saving alternative for DLBCL cell-of-origin identification. © 2019 The Author(s).
dc.description.departmentErik Jonsson School of Engineering and Computer Science
dc.description.sponsorshipFondo Sectorial de Investigación en Salud y Seguridad Social SSA/IMSS/ISSSTE-CONACYT-2008-1-86825 and SSA/IMSS/ISSSTE-CONACYT-2012-C01-180096, México.
dc.identifier.bibliographicCitationPerfecto-Avalos, Y., A. Garcia-Gonzalez, A. Hernandez-Reynoso, G. Sánchez-Ante, et al. 2019. "Discriminant analysis and machine learning approach for evaluating and improving the performance of immunohistochemical algorithms for COO classification of DLBCL." Journal of Translational Medicine 17: art. 198, doi: 10.1186/s12967-019-1951-y
dc.publisherBiomed Central Ltd.
dc.rightsCC BY 4.0 (Attribution)
dc.rights©2019 The Authors
dc.source.journalJournal of Translational Medicine
dc.subjectDiscriminant analysis
dc.subjectMachine learning
dc.subjectGerminal centers
dc.subject.meshLymphoma, B-Cell
dc.titleDiscriminant Analysis and Machine Learning Approach for Evaluating and Improving the Performance of Immunohistochemical Algorithms for COO Classification of DLBCL
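
The abstract reports accuracy, specificity, sensitivity, and Cohen's κ against GEP. As a reading aid, the sketch below shows how these four figures are conventionally derived from a 2×2 table of classifier calls versus a reference standard. It is not the authors' code: the `Confusion` type, the `metrics` function, the choice of GCB as the positive class, and the example counts are all assumptions made for illustration, and the counts are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class Confusion:
    """2x2 table of classifier calls vs. a reference (hypothetical layout).

    GCB is treated as the positive class here; that choice is an assumption.
    """
    tp: int  # reference GCB, classifier GCB
    fn: int  # reference GCB, classifier non-GCB
    fp: int  # reference non-GCB, classifier GCB
    tn: int  # reference non-GCB, classifier non-GCB


def metrics(c: Confusion) -> dict:
    """Return accuracy, sensitivity, specificity, and Cohen's kappa."""
    n = c.tp + c.fn + c.fp + c.tn
    accuracy = (c.tp + c.tn) / n
    sensitivity = c.tp / (c.tp + c.fn)  # true-positive rate
    specificity = c.tn / (c.tn + c.fp)  # true-negative rate
    # Cohen's kappa: observed agreement corrected for chance agreement.
    p_observed = accuracy
    p_chance = (
        (c.tp + c.fn) * (c.tp + c.fp) + (c.fp + c.tn) * (c.fn + c.tn)
    ) / n**2
    kappa = (p_observed - p_chance) / (1 - p_chance)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "kappa": kappa,
    }


if __name__ == "__main__":
    # Illustrative counts only; NOT data from the study.
    example = Confusion(tp=45, fn=5, fp=4, tn=46)
    for name, value in metrics(example).items():
        print(f"{name}: {value:.2f}")
```

Under the commonly used Landis and Koch scale, κ values between 0.81 and 1.00 are described as "almost perfect" agreement, which is the band the reported κ = 0.88 falls into.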

