Development and validation of deep learning classifiers to detect Epstein-Barr virus and microsatellite instability status in gastric cancer: a retrospective multicentre cohort study

Muti, H. S.; Heij, L. R.; Keller, G.; Kohlruss, M.; Langer, R.; Dislich, B.; Cheong, J.-H.; Kim, Y.-W.; Kim, H.; Kook, M.-C.; Cunningham, D.; Allum, W. H.; Langley, R. E.; Nankivell, M. G.; Quirke, P.; Hayden, J. D.; West, N. P.; Irvine, A. J.; Yoshikawa, T.; Oshima, T.; Huss, R.; Grosser, B.; Roviello, F.; D'Ignazio, A.; Quaas, A.; Alakus, H.; Tan, X.; Pearson, A. T.; Luedde, T.; Ebert, M. P.; Jager, D.; Trautwein, C.; Gaisa, N. T.; Grabsch, H. I.; Kather, J. N.

doi:10.1016/S2589-7500(21)00133-3

Background: Response to immunotherapy in gastric cancer is associated with microsatellite instability (or mismatch repair deficiency) and Epstein-Barr virus (EBV) positivity. We therefore aimed to develop and validate deep learning-based classifiers to detect microsatellite instability and EBV status from routine histology slides.

Methods: In this retrospective, multicentre study, we collected tissue samples from ten cohorts of patients with gastric cancer from seven countries (South Korea, Switzerland, Japan, Italy, Germany, the UK and the USA). We trained a deep learning-based classifier to detect microsatellite instability and EBV positivity from digitised, haematoxylin and eosin-stained resection slides without annotating tumour-containing regions. The performance of the classifier was assessed by within-cohort cross-validation in all ten cohorts and by external validation, for which we split the cohorts into a five-cohort training dataset and a five-cohort test dataset. We measured the area under the receiver operating characteristic curve (AUROC) for detection of microsatellite instability and EBV status. Microsatellite instability and EBV status were determined to be detectable if the lower bound of the 95% CI for the AUROC was above 0·5.

Findings: Across the ten cohorts, our analysis included 2823 patients with known microsatellite instability status and 2685 patients with known EBV status. In the within-cohort cross-validation, the deep learning-based classifier could detect microsatellite instability status in nine of ten cohorts, with AUROCs ranging from 0·597 (95% CI 0·522–0·737) to 0·836 (0·795–0·880), and EBV status in five of eight cohorts, with AUROCs ranging from 0·819 (0·752–0·841) to 0·897 (0·513–0·966). Training a classifier on the pooled training dataset and testing it on the five remaining cohorts resulted in high classification performance, with AUROCs ranging from 0·723 (95% CI 0·676–0·794) to 0·863 (0·747–0·969) for detection of microsatellite instability and from 0·672 (0·403–0·989) to 0·859 (0·823–0·919) for detection of EBV status.

Interpretation: Classifiers became increasingly robust when trained on pooled cohorts. After prospective validation, this deep learning-based tissue classification system could be used as an inexpensive predictive biomarker for immunotherapy in gastric cancer.

Funding: German Cancer Aid and German Federal Ministry of Health.
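The detectability criterion stated in the Methods (lower bound of the 95% CI for the AUROC above 0·5) can be illustrated with a minimal sketch. The abstract does not say how the confidence interval was computed; the percentile bootstrap used here is an assumption, and the function names (`auroc`, `bootstrap_ci`, `is_detectable`), the number of resamples, and the per-patient score lists are hypothetical and not taken from the study.

```python
import random


def auroc(labels, scores):
    """AUROC as the fraction of (positive, negative) pairs in which the
    positive case scores higher; tied pairs count as half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def bootstrap_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the AUROC (assumed method, see lead-in)."""
    if len(set(labels)) < 2:
        raise ValueError("need both classes to estimate an AUROC")
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # skip resamples that contain only one class
        stats.append(auroc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lower, upper


def is_detectable(labels, scores, **kwargs):
    """Apply the abstract's rule: lower 95% CI bound of the AUROC above 0.5."""
    lower, _ = bootstrap_ci(labels, scores, **kwargs)
    return lower > 0.5
```

Here `labels` would hold the ground-truth status per patient (1 for microsatellite instability or EBV positivity, 0 otherwise) and `scores` the classifier outputs. Under this rule a cohort can show a high point estimate and still sit close to the threshold when its interval is wide, as with the EBV result of 0·897 (0·513–0·966) reported above.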

Muti, H.S., Heij, L.R., Keller, G., Kohlruss, M., Langer, R., Dislich, B., et al. (2021). Development and validation of deep learning classifiers to detect Epstein-Barr virus and microsatellite instability status in gastric cancer: a retrospective multicentre cohort study. The Lancet Digital Health, 3(10), e654–e664 [10.1016/S2589-7500(21)00133-3].