AI and machine learning – an intelligent approach to healthcare fraud prevention

Financial crime, fraud, waste and abuse are surging in today’s economy, and healthcare is not immune from this contagion.

June 9, 2023 4:40 pm

The threat of fraud has only become more prevalent in healthcare as a result of three broad trends:

  • Continued growth in the population of healthcare consumers
  • The increase in care being delivered outside of traditional care settings, such as telehealth
  • Exponential development of resources offering health and wellness services

Moreover, as the baby boomer generation ages, the number of healthcare consumers will increase in the next few years, creating more opportunities for fraud. Overall, opportunities for parties bent on fraudulent acts, both outside of and even within healthcare networks, have become more common, making it harder to differentiate between good and bad actors. As never before, healthcare organizations require effective tools to detect and prevent healthcare fraud.

With financial, regulatory and reputational risks on the line, payers, providers, federal and state government agencies and drug manufacturers alike must be vigilant in maintaining effective fraud risk management practices. Fortunately, technology has evolved to provide new and better tools for this purpose, specifically artificial intelligence (AI) and machine learning (ML).a

(See also the sidebar “What is driving healthcare fraud?” at the end of this article.)

How AI and ML can play a role in fraud prevention

The ability of AI and ML to analyze vast amounts of data makes these systems extremely effective defenses in identifying fraud. Given that healthcare’s most widely used technology providers, such as Epic and Cerner, serve thousands of hospitals and payers and maintain healthcare data on hundreds of millions of patients, detecting potential fraud embedded in that data is essential.

AI and ML can be used to detect and prevent healthcare fraud in the following ways.

Analyzing large volumes of healthcare data to identify patterns of fraudulent activities, both intentional and unintentional. Such activities, for example, include billing for services not provided or submissions of duplicate claims. AI and ML algorithms can flag suspicious transactions for further investigation and help organizations detect fraud more quickly. 
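
As an illustration of the duplicate-claim case, the following minimal sketch groups claims that share the same patient, provider, procedure code and date of service. The records and field names (claim_id, patient_id and so on) are hypothetical, and real duplicates often differ in small details, so an exact-match rule like this one would be only a first pass before human review.

```python
from collections import defaultdict

# Hypothetical claim records; field names are illustrative assumptions.
claims = [
    {"claim_id": "C1", "patient_id": "P1", "provider_id": "D1",
     "procedure_code": "99213", "service_date": "2023-01-05"},
    {"claim_id": "C2", "patient_id": "P2", "provider_id": "D1",
     "procedure_code": "99213", "service_date": "2023-01-05"},
    {"claim_id": "C3", "patient_id": "P1", "provider_id": "D1",
     "procedure_code": "99213", "service_date": "2023-01-05"},
    {"claim_id": "C4", "patient_id": "P1", "provider_id": "D1",
     "procedure_code": "99214", "service_date": "2023-01-05"},
]


def find_duplicate_claims(claims):
    """Return groups of claim IDs that match on patient, provider, code and date."""
    groups = defaultdict(list)
    for claim in claims:
        key = (
            claim["patient_id"],
            claim["provider_id"],
            claim["procedure_code"],
            claim["service_date"],
        )
        groups[key].append(claim["claim_id"])
    # Keep only the keys that were submitted more than once.
    return [ids for ids in groups.values() if len(ids) > 1]


duplicates = find_duplicate_claims(claims)
print(duplicates)
```

A flagged group is a candidate for investigation, not proof of fraud, since a duplicate can also result from an unintentional resubmission.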

Developing predictive models for identifying potential fraudsters or at-risk claims. Such AI and ML models can analyze patterns in data to predict which claims are most likely to be fraudulent, allowing organizations to take proactive measures to detect fraud. An example is identifying physicians prescribing medications outside of the normal scope of practice for their specialty of care, location of service, and other variables.
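
The prescribing example can be sketched with a simple peer comparison. This is a rule-based stand-in rather than a trained predictive model: it measures what share of each physician's prescriptions falls into drug classes that are rare within the physician's own specialty. The data, the 10% rarity cutoff and the 50% flagging cutoff are all assumptions chosen for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical (physician, specialty, drug_class) prescription records.
prescriptions = (
    [("DR_A", "dermatology", "topical_steroid")] * 8
    + [("DR_A", "dermatology", "antibiotic")] * 2
    + [("DR_B", "dermatology", "topical_steroid")] * 9
    + [("DR_B", "dermatology", "antibiotic")] * 1
    + [("DR_C", "dermatology", "topical_steroid")] * 7
    + [("DR_C", "dermatology", "antibiotic")] * 3
    + [("DR_D", "dermatology", "topical_steroid")] * 1
    + [("DR_D", "dermatology", "opioid")] * 3
)


def out_of_scope_share(prescriptions, rare_share=0.10):
    """Share of each physician's prescriptions in classes rare for the specialty."""
    by_specialty = defaultdict(Counter)
    by_physician = defaultdict(Counter)
    specialty_of = {}
    for physician, specialty, drug_class in prescriptions:
        by_specialty[specialty][drug_class] += 1
        by_physician[physician][drug_class] += 1
        specialty_of[physician] = specialty

    shares = {}
    for physician, counts in by_physician.items():
        peers = by_specialty[specialty_of[physician]]
        peer_total = sum(peers.values())
        # Count prescriptions in classes below the assumed rarity cutoff.
        rare = sum(
            n for drug_class, n in counts.items()
            if peers[drug_class] / peer_total < rare_share
        )
        shares[physician] = rare / sum(counts.values())
    return shares


shares = out_of_scope_share(prescriptions)
flagged = [physician for physician, share in shares.items() if share > 0.5]
print(flagged)
```

In practice a score like this could serve as one input feature to a predictive model, alongside location of service and the other variables mentioned above.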

Identifying unusual patterns such as unexpected spikes in billing or unusual provider behavior. AI and ML algorithms can flag such anomalies for further investigation. Similarly, by analyzing claims data to identify discrepancies, errors and anomalies, the algorithms can flag suspect claims for review, thereby helping to detect and prevent fraud before it occurs. For example, the model can identify providers billing for claims outside the normal parameters based on their specialty or provider taxonomy.
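
One way to express "outside the normal parameters" for a specialty is a robust outlier score. The sketch below uses the modified z-score, which is based on the median and the median absolute deviation, with the conventional 0.6745 scaling constant and 3.5 cutoff. The provider IDs, specialties and billing amounts are invented for illustration, and this is one simple statistical technique among many rather than a description of any particular vendor's model.

```python
from collections import defaultdict
from statistics import median

# Hypothetical (provider_id, specialty, monthly billed amount) records.
monthly_billing = [
    ("PRV01", "cardiology", 100_000),
    ("PRV02", "cardiology", 110_000),
    ("PRV03", "cardiology", 95_000),
    ("PRV04", "cardiology", 105_000),
    ("PRV05", "cardiology", 102_000),
    ("PRV06", "cardiology", 400_000),
    ("PRV07", "family_medicine", 40_000),
    ("PRV08", "family_medicine", 42_000),
    ("PRV09", "family_medicine", 38_000),
    ("PRV10", "family_medicine", 41_000),
    ("PRV11", "family_medicine", 39_000),
]


def flag_billing_outliers(records, threshold=3.5):
    """Flag providers whose billing is far from their specialty's median."""
    by_specialty = defaultdict(list)
    for _, specialty, amount in records:
        by_specialty[specialty].append(amount)

    flagged = []
    for provider, specialty, amount in records:
        amounts = by_specialty[specialty]
        med = median(amounts)
        mad = median([abs(a - med) for a in amounts])
        if mad == 0:
            continue  # No spread among peers, so the score is undefined.
        score = 0.6745 * (amount - med) / mad
        if abs(score) > threshold:
            flagged.append((provider, score))
    return flagged


flagged_providers = flag_billing_outliers(monthly_billing)
print([provider for provider, _ in flagged_providers])
```

Comparing each provider only with peers in the same specialty matters here: a billing level that is routine in cardiology would look extreme next to family medicine, and vice versa.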

Helping to detect and prevent fraud involving payers. AI and ML can benefit payers by enabling them to mount a concerted effort to identify and prevent payer fraud, which requires a combination of data analysis, investigation and collaboration among providers, payers and law enforcement agencies. AI and ML can assist payer programs by analyzing large volumes of data to identify patterns and anomalies that may indicate fraudulent activities, including falsified or misrepresented medical services, services that are not medically necessary and overbilling for medical procedures.

Fraud exposure by healthcare industry segment

The healthcare segments where exposure to fraud is most prevalent are payers, providers and life sciences companies, including medical device manufacturers and biotechnology organizations.

Payers. This segment — which includes Medicare, Medicaid, commercial payers and health systems that run their own health plans — faces exposure from fraudulent claims for reimbursement, which are often perpetrated by bad actors who falsify or misrepresent information. This type of fraud can include medical identity theft, where the perpetrator uses another person’s medical card or information to obtain healthcare goods, services or funds.

Providers. Provider organizations may, whether intentionally or unintentionally, submit false information to collect reimbursement from payers, for example for treatment and services that were never delivered. Examples of such fraudulent actions include:

  • Billing for services that were not medically needed or were never provided
  • Duplicate billing
  • Unbundling
  • Misrepresenting dates of service and locations of service
  • Soliciting or offering kickbacks

Life sciences companies. These types of companies may be vulnerable to fraudulent billing by third-party suppliers. The forging of prescriptions and the illegal sale of prescription medications are examples of fraud affecting this sector.

In addition, both payers and providers may be subject to fraud in which perpetrators use another person’s health insurance. AI and ML can be used to detect such fraud by learning the patient’s typical health patterns, treatments and activities and flagging departures from them.

Use of synthetic data where real-world data is not available

To be effective, AI and ML require access to large volumes of high-quality, relevant data. Sometimes, however, access to real-world data is limited or restricted due to privacy concerns. In such instances organizations can use synthetic data — artificially generated data that mimics the characteristics and patterns of real-world data — to simulate, train and test models in a controlled environment.

For example, if available real-world data is limited or biased, synthetic data can augment the dataset and increase its size and diversity. Data privacy is a critical concern in healthcare, of course, and regulations such as HIPAA can restrict sharing of real-world patient data. Synthetic data can be used to generate datasets that mimic the characteristics of real-world data, allowing researchers and data scientists to build and test models without accessing sensitive information.

An organization can use synthetic data that is statistically similar to its real-world dataset to build an analytical model. Because even real-world data can carry bias (due to user error, for example), carefully generated synthetic data can help reduce bias in the resulting model. The organization can then run the model internally on its own dataset.

The key to using synthetic data effectively is to ensure that the data is representative of real-world data and accurately captures its characteristics and patterns. For this reason, careful data generation techniques and validation against real-world data are needed to ensure the synthetic data is high quality and useful for model development.
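
A minimal sketch of generating and then validating synthetic data follows. It assumes, purely for illustration, that claim amounts are roughly log-normal: it fits two parameters to a small set of stand-in "real" amounts, samples synthetic amounts from the fitted distribution, and checks that the synthetic sample reproduces the fitted statistics. The amounts, the distributional assumption and the 0.05 tolerance are all hypothetical; production-grade generators model many fields and their relationships, not a single column.

```python
import math
import random
from statistics import mean, stdev

# Stand-in for real claim amounts (hypothetical values).
real_amounts = [85, 120, 95, 240, 150, 60, 310, 130, 110, 75,
                180, 90, 400, 140, 105, 220, 70, 160, 125, 100]

# Fit a log-normal distribution: mean and standard deviation of the logs.
real_logs = [math.log(a) for a in real_amounts]
mu, sigma = mean(real_logs), stdev(real_logs)

# Generate synthetic amounts from the fitted distribution (seeded for repeatability).
rng = random.Random(7)
synthetic_amounts = [rng.lognormvariate(mu, sigma) for _ in range(5000)]

# Validate: the synthetic data should match the real data's log statistics.
synthetic_logs = [math.log(a) for a in synthetic_amounts]
mean_gap = abs(mean(synthetic_logs) - mu)
spread_gap = abs(stdev(synthetic_logs) - sigma)
is_representative = mean_gap < 0.05 and spread_gap < 0.05
print(f"mean gap={mean_gap:.3f}, spread gap={spread_gap:.3f}, ok={is_representative}")
```

Matching two summary statistics is a weak test on its own. A fuller validation would also compare distributions of categorical fields, correlations between fields and the performance of models trained on each dataset.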

Synthetic data offers the benefit of making it possible to simulate different scenarios and test the performance of models under varying conditions. For example, it allows an organization to simulate fraud schemes, such as upcoding or billing for services not rendered, to test the effectiveness of the AI and ML models for detecting and classifying such fraud.
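
The upcoding scenario can be simulated in a few lines. The sketch below creates synthetic providers whose visits are coded at complexity levels 1 through 5, injects a small group that systematically upcodes, and then measures how well a simple threshold detector recovers the injected group. The level mixes, provider counts, detector and threshold are all assumptions made for illustration, and the injected scheme is deliberately blatant, so real-world detection would be considerably harder.

```python
import random

LEVELS = [1, 2, 3, 4, 5]
HONEST_WEIGHTS = [5, 20, 45, 25, 5]    # Assumed typical mix of visit levels.
UPCODED_WEIGHTS = [1, 4, 15, 40, 40]   # Assumed mix for an upcoding provider.


def make_synthetic_providers(n_honest=45, n_fraud=5, visits=200, seed=42):
    """Generate synthetic providers, labeling the ones with injected upcoding."""
    rng = random.Random(seed)
    providers = []
    for i in range(n_honest + n_fraud):
        is_fraud = i >= n_honest
        weights = UPCODED_WEIGHTS if is_fraud else HONEST_WEIGHTS
        levels = rng.choices(LEVELS, weights=weights, k=visits)
        providers.append(
            {"provider_id": f"SYN{i:03d}", "levels": levels, "is_fraud": is_fraud}
        )
    return providers


def high_level_share(levels):
    """Fraction of visits coded at level 4 or 5."""
    return sum(1 for level in levels if level >= 4) / len(levels)


def evaluate(providers, threshold=0.5):
    """Return (precision, recall) of a simple high-level-share detector."""
    flagged = {p["provider_id"] for p in providers
               if high_level_share(p["levels"]) > threshold}
    actual = {p["provider_id"] for p in providers if p["is_fraud"]}
    true_positives = len(flagged & actual)
    precision = true_positives / len(flagged) if flagged else 0.0
    recall = true_positives / len(actual) if actual else 0.0
    return precision, recall


providers = make_synthetic_providers()
precision, recall = evaluate(providers)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Because the fraud labels are known by construction, the same harness can be rerun with subtler schemes (for example, an upcoded mix closer to the honest one) to see where a detector starts to miss cases or raise false alarms.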

Critical tools for effective fraud prevention — but not without human oversight

While AI and ML can be powerful tools in the fight against healthcare fraud, those hoping to rely on this technology to detect and prevent fraud should do so cautiously.

A major advantage of these technologies is their ability to comb through large volumes of data quickly and precisely to identify potential fraud, avoiding the hours employees would otherwise spend reviewing cases manually. The potential for cost savings from this capability is tremendous.

That said, these systems do not eliminate the need for human oversight. AI and ML free people to perform more sophisticated, analytical tasks, but the technology must be continually monitored to ensure it uses its enormous data mining capacity to lead to correct, actionable conclusions. Organizations therefore should not attempt to use these technologies without first acquiring the necessary knowledge base and resources to understand and effectively implement them, whether through internal efforts or through guidance from outside experts.

The availability of AI and ML to address healthcare fraud could not come at a more critical time. The potential for fraud continues to rise with the growing and aging population of healthcare consumers, the evolution of treatment beyond traditional settings and continued increases in the financial resources allocated to healthcare.


a. This article’s purpose is simply to explore the various ways in which AI and ML can be applied for detection and prevention of healthcare fraud. It is beyond the scope of this article to explain the technicalities of how AI and ML work to accomplish these objectives. For insight into such details, see, for example, Google Cloud, “Artificial intelligence (AI) vs. machine learning (ML),” page accessed June 2, 2023.

What is driving healthcare fraud?

The pandemic unleashed a torrent of fraudulent claims, driven in part by three of the nation’s responses to it:

  1. The huge sums of money allocated by the federal government for testing, treatment, and economic subsidies
  2. Changes in employment patterns, resulting in people holding multiple jobs
  3. The remote work trend, which led to less stringent security measures by home-based workers

These developments created fertile ground for exploitation by a more sophisticated crop of fraudsters and fraud schemes, leaving companies exposed to heretofore unknown and unforeseen risks.

Regardless of the nature of the fraud, or the element of the healthcare ecosystem in which it occurs, the impact is significant. The National Health Care Anti-Fraud Association estimated (on a conservative basis) that healthcare fraud costs the U.S. about $68 billion annually — about 3% of all healthcare spending in the country.a Other estimates range as high as 10% of annual healthcare expenditure, or $230 billion.
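
The two estimates above can be checked against each other with simple arithmetic. Using only the figures quoted in this sidebar, the sketch below derives the total annual spending implied by the 3% estimate and then applies the 10% rate to that total. The variable names are illustrative, and the implied total is a back-calculation, not a figure reported by the source.

```python
# Figures quoted in the sidebar.
low_estimate_dollars = 68e9   # About $68 billion annually.
low_estimate_share = 0.03     # About 3% of healthcare spending.
high_estimate_share = 0.10    # Upper-end estimates of about 10%.

# Total spending implied by the low estimate, then the high estimate in dollars.
implied_total_spending = low_estimate_dollars / low_estimate_share
high_estimate_dollars = high_estimate_share * implied_total_spending

print(f"Implied total spending: ${implied_total_spending / 1e12:.2f} trillion")
print(f"High estimate: ${high_estimate_dollars / 1e9:.0f} billion")
```

The result, roughly $227 billion, is consistent with the rounded $230 billion figure cited for the 10% estimate.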


a. NHCAA, “The challenge of health care fraud,” 2021.

