Swarm Learning for decentralized and confidential clinical machine learning

Item Type: Article
Creators Name: Warnat-Herresthal, S. and Schultze, H. and Shastry, K.L. and Manamohan, S. and Mukherjee, S. and Garg, V. and Sarveswara, R. and Händler, K. and Pickkers, P. and Aziz, N.A. and Ktena, S. and Tran, F. and Bitzer, M. and Ossowski, S. and Casadei, N. and Herr, C. and Petersheim, D. and Behrends, U. and Kern, F. and Fehlmann, T. and Schommers, P. and Lehmann, C. and Augustin, M. and Rybniker, J. and Altmüller, J. and Mishra, N. and Bernardes, J.P. and Krämer, B. and Bonaguro, L. and Schulte-Schrepping, J. and De Domenico, E. and Siever, C. and Kraut, M. and Desai, M. and Monnet, B. and Saridaki, M. and Siegel, C.M. and Drews, A. and Nuesch-Germano, M. and Theis, H. and Heyckendorf, J. and Schreiber, S. and Kim-Hellmuth, S. and Nattermann, J. and Skowasch, D. and Kurth, I. and Keller, A. and Bals, R. and Nürnberg, P. and Rieß, O. and Rosenstiel, P. and Netea, M.G. and Theis, F. and Mukherjee, S. and Backes, M. and Aschenbrenner, A.C. and Ulas, T. and Breteler, M.M.B. and Giamarellos-Bourboulis, E.J. and Kox, M. and Becker, M. and Cheran, S. and Woodacre, M.S. and Goh, E.L. and Schultze, J.L.
Abstract: Fast and reliable detection of patients with severe and heterogeneous illnesses is a major goal of precision medicine. Patients with leukaemia can be identified using machine learning on the basis of their blood transcriptomes. However, there is an increasing divide between what is technically possible and what is allowed, because of privacy legislation. Here, to facilitate the integration of any medical data from any data owner worldwide without violating privacy laws, we introduce Swarm Learning—a decentralized machine-learning approach that unites edge computing, blockchain-based peer-to-peer networking and coordination while maintaining confidentiality without the need for a central coordinator, thereby going beyond federated learning. To illustrate the feasibility of using Swarm Learning to develop disease classifiers using distributed data, we chose four use cases of heterogeneous diseases (COVID-19, tuberculosis, leukaemia and lung pathologies). With more than 16,400 blood transcriptomes derived from 127 clinical studies with non-uniform distributions of cases and controls and substantial study biases, as well as more than 95,000 chest X-ray images, we show that Swarm Learning classifiers outperform those developed at individual sites. In addition, Swarm Learning completely fulfils local confidentiality regulations by design. We believe that this approach will notably accelerate the introduction of precision medicine.
Keywords: Blockchain, COVID-19, Clinical Decision-Making, Confidentiality, Datasets as Topic, Disease Outbreaks, Leukemia, Leukocytes, Lung Diseases, Machine Learning, Precision Medicine, Software, Tuberculosis
Publisher: Nature Publishing Group
Page Range: 265-270
Date: 10 June 2021
Additional Information: Markus Landthaler and Nikolaus Rajewsky are members of the Deutsche COVID-19 Omics Initiative (DeCOI).
Official Publication: https://doi.org/10.1038/s41586-021-03583-3
