Researchers with the University of Florida’s academic health center — UF Health — announced that they have collaborated with NVIDIA researchers to create GatorTron™, an artificial intelligence transformer natural language processing model intended to accelerate research and medical decision-making by extracting insights from massive volumes of clinical data with unprecedented speed and clarity.
The new model will speed up researchers’ ability to identify relevant patients for lifesaving clinical trials and other studies. GatorTron™ is also expected to fast-track the development of medical applications with improved capabilities. The model will be used by doctors for clinical decision support, UF Health officials said.
The GatorTron™ Language Model is the first step forward in the $100 million artificial intelligence public-private partnership announced last year by UF and NVIDIA, which resulted in the university assembling the most powerful AI supercomputer in higher education.
Until now, much of the medical information that is valuable to researchers and physicians has been buried deep in the full-text notes of patient records. Accessing that information can be time-consuming and labor-intensive.
By training GatorTron™ to understand the language of these records and recognize complex medical terms, UF Health researchers and NVIDIA developers have created a way to unlock that information quickly and easily, said William Hogan, M.D., one of the project’s lead researchers. Hogan is the director of biomedical informatics and data science in the UF College of Medicine’s department of health outcomes and biomedical informatics and a member of the UF Health Cancer Center. He estimates that up to 80% of the information that is valuable to researchers and physicians is contained in the full-text clinical notes of patients’ medical records. Now, the speed and precision of GatorTron™’s language recognition make all of that quickly accessible, he said.
For the creation of GatorTron™, UF Health supplied 10 years of anonymized data from more than 2 million patients and 50 million patient interactions across an array of medical specialties, including oncology, internal medicine and critical care. Security controls to protect the privacy of patients’ data during the development of GatorTron™ were approved by the UF Institutional Review Board and UF Health Information Technology.
GatorTron™ was pre-trained on HiPerGator AI, UF’s own NVIDIA DGX SuperPOD AI supercomputer, in a mere seven days. It is the first natural language processing clinical model of its scale in the world, setting the stage for myriad downstream medical applications that were previously unachievable.
“GatorTron™ is an exceptional example of the discoveries that happen when experts in academia and industry collaborate using leading-edge artificial intelligence and world-class computing resources. Our partnership with NVIDIA is crucial to UF emerging as a destination for artificial intelligence expertise and development in health research,” said David R. Nelson, M.D., senior vice president for health affairs at UF and president of UF Health.
GatorTron™ was pioneered through the unique confluence of expertise, patient data and computing power that was brought together by the UF-NVIDIA partnership.
“The NVIDIA and UF partnership facilitated the creation of GatorTron™ through the combination of NVIDIA’s expertise and previous work on Megatron, the HiPerGator supercomputer, and the vast clinical data available at UF Health,” said Mona G. Flores, M.D., global head of medical AI at NVIDIA.
For researchers, the benefits are immediate: Before GatorTron™, creating a cohort (a specific group of patients to enroll in clinical trials or in predictive studies) could take months and involve many hours of labor to extract information from various databases.
“Now, it can be done in minutes,” Flores said.
A physician or researcher might also want to know how subsets of patients, such as different subgroups of COVID-19 patients, responded to various treatments. GatorTron™ can provide insights on these types of queries quickly and precisely, Flores said.
GatorTron™ is an ideal example of teaching computers to read medical language and mine data at a speed that humans can’t replicate, said Duane A. Mitchell, M.D., Ph.D., director of the UF Clinical and Translational Science Institute, assistant vice president for research at UF and co-leader of the UF Health Cancer Center’s Cancer Therapeutics & Host Response research program.
While other projects elsewhere have developed natural language processing models from smaller, more limited medical data sets, GatorTron™ is the first NLP model to be trained on such a large amount of clinical information, Mitchell said. GatorTron™ opens new doors: Finding and analyzing a physician’s dictated medical notes about patients is now much faster and simpler.
“One of GatorTron™’s strengths is that it is much more adept and able to read and retrieve medical information with uncommon speed and accuracy. This takes advantage of the computer power and rich medical data that UF has available,” Mitchell said.
More broadly, the ability to use natural language processing has significant positive implications for health care decision-making, Mitchell and Hogan said. Hogan envisions it being used to develop predictive models of which patients would benefit from a particular treatment. It might also be used to mitigate the risks associated with surgery. For example, the system could be trained to look for and recognize possible postoperative surgical complications in patients even before a procedure starts, giving physicians an opportunity to manage that risk earlier and proactively, Hogan said.
In addition to GatorTron™, UF has worked closely with NVIDIA to boost the capabilities of HiPerGator, the world’s fastest AI supercomputer in higher education. UF’s extensive collaboration also includes working with NVIDIA’s Deep Learning Institute to develop new curricula and coursework, as well as establishing UF’s Equitable AI program to develop tools and solutions that are cognizant of bias and of legal and moral issues.
“GatorTron™ leveraged over a decade of electronic medical records to develop a state-of-the-art model,” said UF Provost Joseph Glover, Ph.D. “A tool of this scale enables researchers in all fields to tackle and solve challenging real-world problems previously intractable with old technology. Our test results are preliminary and subject to independent verification, but we are very excited by what we’ve seen so far.”
Mitchell said he is truly encouraged about the capabilities that GatorTron™ brings to health care and research.
“We will have both exceptional computing capability and a natural language model that organizes important medical information and puts it into context for the benefit of patients, clinicians and researchers,” he said.