Demystifying Machine Learning
In this new series, we pull back the curtain on the Wizard of Oz to provide plain English explanations about machine learning, artificial neural networks, natural language processing, large language models, and related technology.

By Paul Cerrato, MA, senior research analyst and communications specialist, Mayo Clinic Platform, and John Halamka, M.D., Diercks President, Mayo Clinic Platform
In the classic movie, Dorothy and her sorry friends ask the Wizard for several favors: She wants to be transported back to Kansas and her companions want a brain, a heart, and more courage. As the Wizard terrifies them with his trickery, Toto pulls back the curtain to reveal the Professor manipulating levers and gears to perform his spectacle. Our goal in the series is to take Toto’s role and explain the technology behind the “magic.” Without this foundation, it’s difficult to understand several useful digital tools, or make informed decisions about implementing them in your healthcare organization.
Unfortunately, one of the problems in developing this foundation is the mismatch between the language clinicians and patients use and the language used by computer scientists. Clinicians and patients usually talk about signs and symptoms, lab test results, diagnoses, and treatment benefits, while technologists typically speak of inputs instead of symptoms, signs, and test results, and of outputs instead of diagnoses or treatments. Complicating things further, they may call the inputs features and the outputs labels. To understand machine learning (ML), it helps to stay current on the latest terminology.
It also helps to compare ML to other forms of artificial intelligence. Many AI systems require programmers to write code that gives a computer explicit instructions to perform a certain task. For example, IBM Deep Blue, which defeated world chess champion Garry Kasparov in 1997, was programmed with specific instructions on what moves to make in response to his moves. In contrast, years later Google’s AlphaZero was able to accomplish a similar feat by playing millions of chess games against itself and teaching itself how to win, an example of how machine learning can collect massive amounts of data and analyze it, rather than have a human programmer input rules to follow. Deep Blue relied on input commands, while AlphaZero relied on input data, which was then analyzed by a model. The model used advanced statistics and probabilistic reasoning to detect relationships and patterns, and then used all that input to develop a winning strategy.
ML models usually fall into three broad categories: supervised, unsupervised, and reinforcement learning. Supervised learning refers to modeling that starts with labeled data; in healthcare, this might be pathology slides that an expert has already identified as indicating the presence or absence of cancer. To develop an algorithm to help accurately identify cancer, 100,000 or more slides would be divided into training and testing sets, typically with a 70/30 split. The first 70,000 slides would be accompanied by an accurate diagnosis, which the model would then analyze to locate the specific sections of pixels that are consistent with cancer. The completed algorithm would then be tested blindly on the remaining 30,000 slides, whose diagnoses are withheld from the model, to see if it correctly recognizes the cellular hallmarks of the disease. Its predictions would then be compared with the experts’ diagnoses to measure accuracy.
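The 70/30 division described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the slide records are made-up (slide_id, label) pairs rather than real pathology data, and the function name split_train_test is hypothetical, not part of any library.

```python
import random

def split_train_test(records, train_fraction=0.7, seed=0):
    """Shuffle labeled records, then split them into training and testing sets."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # shuffle so neither set is biased by order
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

# Hypothetical stand-in for 100,000 expert-labeled slides: (slide_id, label) pairs.
slides = [(i, "cancer" if i % 4 == 0 else "benign") for i in range(100_000)]
train, test = split_train_test(slides)

# During training the model sees both the slide and its label.
# During testing it sees only the slide; the labels are held back for scoring.
test_inputs = [slide_id for slide_id, _ in test]
test_answers = [label for _, label in test]

print(len(train), len(test))
```

Shuffling before the split is a common precaution, so that the order in which slides were collected does not leak into either set.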
Unsupervised learning, on the other hand, doesn’t start with labeled data. Clustering is an example of this type of machine learning. It looks for hidden patterns in the data. For example, clustering can be used to develop personalized ads for households, or to classify diabetes for personalized treatment protocols. It involves finding central points in a scatter plot and calculating distances to create clusters with useful patterns, as illustrated in the graphs here.
Figure 1

Figure 2

In this case, the ML developer wants to develop a better way to classify diabetes, rather than just type 1 and type 2, to help personalize treatment protocols. The input variables might be body mass index (BMI) and hemoglobin A1c (HbA1c), which are plotted on the X-axis; the Y-axis is incidence of diabetes (Figure 1). The next step is to make an educated guess about which dots are the central points, which become centroids in Figure 2. A geometric equation, typically the Euclidean distance formula, lets you calculate the distance from one data point to another. Using this equation to assign each data point to its nearest centroid, you're able to get a useful set of clusters. In one published study, Swedish researchers used a similar but more sophisticated clustering technique and discovered five subtypes of diabetes: severe autoimmune diabetes, severe insulin-deficient diabetes, severe insulin-resistant diabetes, mild obesity-related diabetes, and mild age-related diabetes.
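For readers who want to see the centroid-and-distance idea in action, here is a minimal sketch of k-means clustering, one common clustering method. It is not the technique the Swedish researchers used. The (BMI, HbA1c) values are invented for illustration, the distance is assumed to be Euclidean, and all function names are hypothetical.

```python
import math

def euclidean(p, q):
    """Straight-line distance between two points."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def assign(points, centroids):
    """Place every point in the cluster of its nearest centroid."""
    clusters = [[] for _ in centroids]
    for p in points:
        nearest = min(range(len(centroids)), key=lambda i: euclidean(p, centroids[i]))
        clusters[nearest].append(p)
    return clusters

def mean_point(cluster):
    """Average position of the points in one cluster."""
    return tuple(sum(dim) / len(cluster) for dim in zip(*cluster))

def k_means(points, centroids, max_iterations=100):
    """Alternate between assigning points and moving centroids until stable."""
    for _ in range(max_iterations):
        clusters = assign(points, centroids)
        updated = [mean_point(c) if c else old for c, old in zip(clusters, centroids)]
        if updated == centroids:  # no centroid moved, so the clusters are settled
            break
        centroids = updated
    return centroids, assign(points, centroids)

# Made-up (BMI, HbA1c) pairs, for illustration only.
patients = [(22, 9.5), (23, 10.0), (21, 9.0), (34, 7.0), (36, 7.5), (35, 6.8)]
# The "educated guess": start with two of the data points as centroids.
centroids, clusters = k_means(patients, [patients[0], patients[3]])
for centroid, members in zip(centroids, clusters):
    print(centroid, members)
```

In practice the inputs are usually rescaled first, so that a variable with a wide numeric range such as BMI does not dominate the distance calculation.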
Reinforcement learning, the third ML category, might be compared to the reinforcement learning we all experience when trying to learn an unfamiliar task. It usually involves lots of trial and error. When developing a model, this approach requires randomly testing a “vast number of possible input combinations and grading their performance,” explains Oliver Theobald in his primer, Machine Learning for Absolute Beginners. In healthcare delivery, these algorithms can help personalize the treatment of patients with chronic conditions that change over time. The input data for this kind of model would include initial signs, symptoms, lab and imaging results, and the patient’s positive and negative responses to treatment. Those responses would then be fed back into the model to help make adjustments to his or her regimen. Reinforcement learning is also being used to analyze medical images, assist in robotic surgery, and improve drug discovery and development.
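The trial-and-error loop can be illustrated with a toy simulation. The sketch below uses an epsilon-greedy strategy, one of the simplest reinforcement learning approaches and our choice for illustration rather than a method named in the sources above. The three "options" and their success rates are invented numbers, not clinical data, and the function name is hypothetical.

```python
import random

def run_epsilon_greedy(success_rates, trials=5000, epsilon=0.1, seed=0):
    """Learn which option works best by trying options and grading the results."""
    rng = random.Random(seed)
    counts = [0] * len(success_rates)       # how often each option was tried
    estimates = [0.0] * len(success_rates)  # running average reward per option
    for _ in range(trials):
        if rng.random() < epsilon:
            # Explore: occasionally try an option at random.
            choice = rng.randrange(len(success_rates))
        else:
            # Exploit: otherwise pick the option that looks best so far.
            choice = max(range(len(estimates)), key=estimates.__getitem__)
        # Simulated feedback: 1 for a positive response, 0 for a negative one.
        reward = 1.0 if rng.random() < success_rates[choice] else 0.0
        counts[choice] += 1
        # Feed the result back in by updating the running average.
        estimates[choice] += (reward - estimates[choice]) / counts[choice]
    return estimates, counts

# Hypothetical success rates, hidden from the learner, for three options.
estimates, counts = run_epsilon_greedy([0.2, 0.5, 0.8])
best = max(range(len(estimates)), key=estimates.__getitem__)
print("estimated best option:", best)
print("times each option was tried:", counts)
```

The learner never sees the true success rates. It only sees the graded outcome of each attempt, which is the feedback loop described above.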
As we have pointed out many times in our articles, ML will never replace the skills of an experienced physician or nurse, but it can significantly enhance their ability to make informed decisions. Pulling back the curtain on the technology is the first step in that direction.