Machine Learning Can Make Lab Testing More Precise

An analysis of over 2 billion lab test results suggests a deep learning model can help create personalized reference ranges, which in turn would enable clinicians to monitor health and disease better.

Paul Cerrato, MA, senior research analyst and communications specialist, Mayo Clinic Platform and John Halamka, M.D., president, Mayo Clinic Platform, wrote this article.

Almost every patient has blood drawn to measure a variety of metabolic markers. Typically, test results come back as a numeric or text value accompanied by a reference range which represents normal values. If total serum cholesterol level is below 200 mg/dl or serum thyroid hormone level is 4.5 to 12.0 mcg/dl, clinicians and patients assume all is well. But suppose Helen’s safe zone varies significantly from Mary’s safe zone. If that were the case, it would suggest a one-size-fits-all reference range misrepresents an individual’s health status. That position is supported by studies that found the distribution of more than half of all lab test results, which rely on standard reference ranges, differ when personal characteristics are considered.1

With these concerns in mind, Israeli investigators from the Weismann Institute and Tel Aviv Sourasky Medical Center extracted data on 2.1 billion lab measurements from EHR records, taken from 2.8 million adults for 92 different lab tests. Their goal was to create “data-driven reference ranges that consider age, sex, ethnicity, disease status, and other relevant characteristics.”1  To accomplish that goal, they used machine learning and computational modeling to segment patients into different “bins'' based on health status, medication intake, and chronic disease.2. That in turn left the team with about half a billion lab results from the initial 2.8 million people, which they used to model a set of reference lab values that more precisely reflected the ranges of healthy persons. Those ranges could then be used to predict patients’ “future lab abnormalities and subsequent disease.”

Taking their investigation one step forward, Cohen et al. used their new algorithms to evaluate the risk of specific disorders amongst healthy individuals. When they looked at anemia cut offs like hemoglobin and mean corpuscular volume, a measurement of red blood cell size, their newly created risk calculators were able to separate anemic patients into groups at high risk for microcytic and macrocytic anemia from those with a risk no higher than the average nonanemic population. Similar benefits were observed when the researchers applied their models to prediabetes: “…using a personalized risk model, we can improve the classification of patients who are prediabetic and identify patients at risk 2 years earlier compared to classification based merely on current glucose levels.”

William Morice, M.D., Ph.D., chair of the Department of Laboratory Medicine and Pathology (DLMP) at Mayo Clinic and president of Mayo Clinic Laboratories, immediately saw the value of this type of data analysis: “In the ‘era of big data and analytics,’ it is almost unconscionable that we still use ‘normal reference ranges’ that lack contextual data, and possibly statistical power, to guide clinicians in the clinical interpretation of quantitative lab results. I was taught this by Dr. Piero Rinaldo, a medical geneticist in our department and a pioneer in this field, who focuses on its application to screening for inborn errors of metabolism. He has developed an elegant tool that is now used globally for this application, Collaborative Laboratory Integrated Reports (CLIR).”

During a recent conversation with Piero Rinaldo, M.D., Ph.D., he explained that Mayo Clinic has been using a more personalized approach to lab testing since 2015 and stated that “CLIR is a shovel-ready software for the creation of collaborative precision reference ranges.” The web-based application has been used to create several personalized data sets that can improve clinicians’ interpretation of lab test results. It has been deployed by Dr. Rinaldo and his associates to improve the screening of newborns for congenital hyperthyroidism.3. The software performs multivariate pattern recognition on lab values collected from 7 programs, including more than 1.9 million lab test results. CLIR is able to integrate covariate-adjusted results of different tests into a set of customized interpretive tools that physicians can use to better distinguish between false positive and true positive test results.


1. Tang A, Oskotsky T, Sirota M. Personalizing routine lab tests with machine Learning. Nature Medicine. 2021; 27:1510-1517.

2. Cohen N, Schwartzman O, Jaschek R et al. Personalized lab test models to quantify disease potentials in healthy individuals. Nature Medicine.2021; 27: 1582-1591.

3. Rowe AD, Stoway SD, Ahlman H et al. A Novel Approach to Improve Newborn Screening for Congenital Hypothyroidism by Integrating Covariate-Adjusted Results of Different Tests into CLIR Customized Interpretive Tools. Inter J Neonatal Screening. 2021. 7:23

Recent Posts