- Healthcare organizations can improve the impact of AI applications by making proprietary clinical data more widely available.
- The data that support healthcare algorithms should also undergo more rigorous and transparent validation than is generally applied today.
- Privacy concerns should be considered when gathering and applying data, but they cannot always be the deciding factor.
Marzyeh Ghassemi, PhD, an expert in artificial intelligence (AI) applications for healthcare, often is asked why people in her field seem to spend so much time working specifically on models that predict ICU mortality.
Her response: At this point, there aren’t many other choices.
“We are looking where the light is shining. We either cannot get access to other data, or that data doesn't exist,” said Ghassemi, assistant professor in the Department of Electrical Engineering and Computer Science and the Institute for Medical Engineering and Science at MIT.
Ghassemi made that remark during a recent workshop hosted by the National Academy of Medicine on the potential for AI to advance health equity. Realizing that potential, and avoiding outcomes that exacerbate health disparities, requires a greater variety of data on patient populations to be made widely available, she said.
That, in turn, requires a new way of thinking by healthcare organizations, especially because large amounts of valuable data are effectively siloed.
“It's just in the hands of private entities or, honestly, public entities that have no requirement to share their data with other researchers,” she said. “And I think that's the biggest travesty in my mind. We have all of this data that exists in hospitals and healthcare systems in our country, and publicly funded researchers don't have access because people are keeping it locked up. They think they're sitting on gold, and so they won't give it to the greater good.”
A better model, she said, would be “a holistic healthcare system that feeds data into a publicly funded and guarded entity that allows all validated researchers to have access to that data as long as they pass an ethics review that is manned by a diverse set of stakeholders.”
The importance of better understanding algorithms
Accessibility is not the only data-related concern with respect to AI applications. Rigorous and transparent guidelines need to be developed for validating algorithms based on the data that feeds them, said John Halamka, MD, president of Mayo Clinic Platform.
Without such validation, he said, a health system theoretically could “take a Mayo Clinic algorithm developed on 1 million Scandinavian Lutherans, and we’re going to run it in Tennessee. Is it going to work? Is it going to be biased or fair? Is it fit for purpose? Does anyone know?”
Halamka proposed a rigorous national validation model built around something akin to laboratories, which would test algorithms and generate “labels” that could be electronically linked to each algorithm, much as labels are affixed to soup cans.
Halamka said the labels would identify information such as: “Who went into the production of this algorithm? How was it validated? Where is its fit for purpose?
“It's only when we get to this level of transparency in the algorithms that we publish about and use in practice that we can restore credibility in healthcare AI.”
Lebone Moses, MBA, founder and CEO of Chisara Ventures, an international business-consulting firm, said the next questions are how the validation process is conducted and who is included. The process can’t be considered complete unless it involves trusted community health organizations, she said. In turn, those organizations need to receive the validated information to use in their markets.
“The models are great, but they’re not efficacious without the partnerships,” Moses said.
Those partnerships should have financial stakes, she added.
“Community organizations need funding in order to continue to serve the people on the ground,” she said. “They need funding in order to continue to properly assess the populations that the medical community needs to include in their algorithms. And allowing for that to happen requires dollars.”
Addressing privacy concerns in data collection
The ideal data ecosystem would have adequate guardrails to make individuals feel confident that their data won’t be used for nefarious purposes, Ghassemi said. That would help strike a better balance between people’s desire for privacy and the need for equitable representation in the data sets used to build algorithms and support machine learning.
One worthwhile use of a person’s deidentified data might be research on medical conditions for which certain demographic groups are at higher risk, she said. Researchers in that setting would also be unlikely to try “joining” such data with other data sets in ways that could be used to market products.
“We need to condition our requirements for privacy on what the context and use of the data is,” she said.
Privacy questions also may arise when considering how insights from an algorithm are applied to a patient’s care, said Amy Salerno, MD, a practicing physician in geriatrics and palliative medicine with University of Virginia Health.
She used the example of including housing-related data in a patient’s record as an indicator of higher risk for comorbidities such as substance-use and mental health disorders.
“There's a lot of bias in healthcare against people who are experiencing homelessness,” Salerno said. “And so just because we have it documented in the medical record doesn't mean that people who are experiencing homelessness want that flagged for every single person who's interacting with their chart.
“How do we work together with the individuals who are experiencing homelessness and our homeless-service providers to make sure that what we're doing is actually working with them, rather than working on them so that we reduce our costs?”