What defines a good tool? Let’s say a good tool is one that excels at its main purpose, while a mediocre tool breaks easily or performs poorly at its intended use. But multimodal artificial intelligence may change the way we think about improving the performance of AI tools.
Researchers at MIT have developed a multimodal framework for health care analytics called Holistic AI in Medicine (HAIM), recently described in Nature’s npj Digital Medicine. The framework leverages multiple data sources to more easily build predictive models in health care settings, from identifying chest pathologies such as lung lesions and edema to predicting 48-hour mortality and patient length of stay. In doing so, they created and tested over 14,000 AI models.
In AI, most tools are single-modality tools, meaning they synthesize one category of information to produce results. An example is feeding thousands of lung cancer CT scans to a machine learning model so that it learns how to accurately identify lung cancer from medical images.
Furthermore, most multimodality tools rely heavily on medical imaging and give less weight to other factors, even though doctors have several ways to determine whether someone has lung cancer or is at risk of developing it: a persistent cough, chest pain, loss of appetite, family history, genetics, and so on. If an AI tool were given a more complete picture of a patient’s other symptoms and health history, could it detect lung cancer or other diseases earlier and more accurately?
“This idea of using a single data set to make important clinical decisions didn’t make sense to us,” says Luis R. Soenksen, a postdoc at the Abdul Latif Jameel Clinic for Machine Learning in Health and lead co-author of the study. “Most of the world’s physicians work in fundamentally multimodal methods, and would never offer recommendations based on narrow single-modality interpretations of their patients’ conditions.”
Two years ago, the field of AI in health care exploded. Funding for AI-enabled digital health startups doubled from the previous year to $4.8 billion, and doubled again in 2021 to $10 billion.
At that point, Soenksen, Jameel Clinic executive director Ignacio Fuentes, and Dimitris Bertsimas, the Boeing Leaders for Global Operations Professor of Management at the MIT Sloan School of Management and Jameel Clinic faculty lead, decided to take a step back to see what was missing from the field.
“Things were coming out left and right, but it was also a time when people were becoming disillusioned because the promise of AI in health care was unfulfilled,” Soenksen recalls. “We originally realized that we wanted to bring something new to the table, but we needed to do it systematically and provide the necessary nuance for people to appreciate the benefits and downsides of multimodality in health care.”
The novel idea they cooked up was seemingly common sense: creating a framework for easily building machine learning models capable of processing different combinations of multiple data inputs, similar to the way a doctor takes a patient’s symptoms and health history into account before making a diagnosis. But there was a clear lack of multimodal frameworks in the health sector, with only a few papers published about them that were more conceptual than concrete. Furthermore, without a unified and scalable framework that can be applied consistently to train any multimodal model, single-modality models often outperform their multimodal counterparts.
Seeing this gap, they decided it was time to assemble a team of experienced AI researchers at MIT and started building HAIM.
“I was very fascinated by the whole idea of [HAIM] because of its potential to greatly leverage the infrastructure of our current health care system to bridge the gap between academia and industry,” says Yu Ma, a PhD student advised by Bertsimas and co-author of the paper. “When [Bertsimas] asked me if I would be interested in contributing, I was immediately on board.”
Although large amounts of data are generally seen as a boon in machine learning, the team realized that this does not always hold for multimodal systems; a more nuanced approach to data inputs and to the evaluation of modalities was needed.
“Many people do multimodal learning, but it is rare to study every possible combination of models, data sources, all hyperparameter combinations,” says Ma. “We were really trying to understand how multimodality works under different scenarios.”
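To give a rough sense of what studying every combination of data sources involves, here is a minimal Python sketch. It is not HAIM's code: the modality names and the `fuse` helper are hypothetical, chosen only to illustrate how quickly the number of candidate input sets grows and how features from a chosen set might be joined into one model input.

```python
from itertools import combinations

# Hypothetical modality names, for illustration only (not taken from HAIM).
MODALITIES = ["tabular", "time_series", "text", "image"]


def modality_subsets(modalities):
    """Return every non-empty combination of the given modalities."""
    subsets = []
    for size in range(1, len(modalities) + 1):
        subsets.extend(combinations(modalities, size))
    return subsets


def fuse(features_by_modality, subset):
    """Concatenate the per-modality feature lists for the chosen subset."""
    fused = []
    for name in subset:
        fused.extend(features_by_modality[name])
    return fused


subsets = modality_subsets(MODALITIES)
# Four modalities give 2**4 - 1 = 15 non-empty input combinations.
print(len(subsets))
```

Each combination would then be paired with every model type and hyperparameter setting under study, which is how a handful of data sources can lead to thousands of trained models.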
According to Fuentes, the framework “opens up an interesting avenue for future work, but we need to recognize that multimodal AI tools in clinical settings face many data challenges.”
Bertsimas’ plans for HAIM 2.0 are already in the works. Under consideration are the inclusion of more modalities (e.g., signal data from electrocardiograms and genomics data) and ways to assist medical professionals with decision-making, rather than simply estimating the likelihood of certain outcomes.
HAIM, an acronym coined by Bertsimas, also happens to be the Hebrew word for “life.”
This work was supported by the Abdul Latif Jameel Clinic for Machine Learning in Health and a National Science Foundation Graduate Research Fellowship.