Demystifying Deep Learning Algorithms
One of the goals of our weekly blog is to explain, in plain English, how digital tools can improve patient care. This week we unravel convolutional neural networks and random forest analysis.
By Paul Cerrato, senior research analyst and communications specialist, Mayo Clinic Platform, and John Halamka, M.D., president, Mayo Clinic Platform.
Deep learning algorithms may seem like alchemy to anyone who doesn’t work in data science. The idea that a computer can teach itself to make course corrections, without a human programmer adding new code, feels almost magical. But as Eyal Oren, Ph.D., with Google Brain once pointed out: “It’s not magic, it’s math.” While the math may be complex, with the right graphics and plain-English explanations, clinicians and health care executives can gain the understanding needed to evaluate algorithms based on these modeling approaches and make more informed purchasing decisions.
Machine learning models fall into three broad categories: supervised learning, unsupervised learning, and reinforcement learning. We’ll concentrate on supervised learning in this blog. It’s being used effectively in medical image analysis to help differentiate between melanoma and normal moles, for example. To create this kind of algorithm, developers start with images that are clearly labeled as either melanoma or normal tissue based on the consensus of a panel of expert dermatologists or pathologists—often called ground truth. This approach is referred to as supervised because it starts with labeled images. Unsupervised learning, on the other hand, starts with unlabeled data; clustering, which we will discuss in a later installment of the series, is one example.
Once a labeled dataset of images is available, it is usually divided in half. One half is fed into a deep learning system—a convolutional neural network (CNN), for instance—which is trained to tell the difference between cancer and non-cancer by extracting specific features from the images. After analyzing millions of pixels in tens of thousands of images, it may discover that melanomas have irregular edges while normal moles are round or oval. The system can also learn other differences, such as the fact that skin cancers are more likely to show uneven coloration or bleeding. The second half of the dataset is then used to test the accuracy of the trained model and see how well it identifies the cancer. We explain CNNs in more detail in our most recent book, The Digital Reconstruction of Healthcare.
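For readers who want to see this workflow in code, here is a minimal sketch in Python. It is not the pipeline from any actual melanoma study: the images are random arrays standing in for expert-labeled lesion photos, and the network layout is hypothetical. It simply shows a labeled dataset being split in half, a small CNN being trained on one half, and accuracy being measured on the other.

```python
# A minimal sketch of supervised learning on labeled images; not any
# study's actual pipeline. Real projects would load expert-labeled
# dermatology images; we use random arrays so the code runs as-is.
import numpy as np
import tensorflow as tf
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
images = rng.random((200, 64, 64, 3)).astype("float32")  # stand-ins for lesion photos
labels = rng.integers(0, 2, size=200)  # 1 = melanoma, 0 = normal mole ("ground truth")

# Divide the labeled dataset in half: one half to train, one half to test.
X_train, X_test, y_train, y_test = train_test_split(
    images, labels, test_size=0.5, random_state=0)

# A small convolutional neural network; its filters learn visual features
# (e.g., irregular edges, uneven coloration) directly from the pixels.
model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(16, 3, activation="relu", input_shape=(64, 64, 3)),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Conv2D(32, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(1, activation="sigmoid"),  # probability of melanoma
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

model.fit(X_train, y_train, epochs=3, batch_size=32, verbose=0)

# Test the trained model on the unseen half of the dataset.
loss, accuracy = model.evaluate(X_test, y_test, verbose=0)
print(f"Held-out accuracy: {accuracy:.2f}")
```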
Despite their usefulness, CNNs have their limitations, which is why several developers have turned to other modeling techniques, including random forests. Most clinicians are familiar with decision trees because many diagnostic and treatment decisions rely on such algorithms. Typically, these flow charts include several branches that ask clinicians to travel down different paths depending on an assessment of a patient’s symptoms or lab results, for example. Following the tree imagery, it takes many branches to make up a decision tree; similarly, it takes many trees to make up a forest, hence the term random forest modeling (RFM), a type of machine learning that can be used for both classification and regression.
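To make the flow-chart analogy concrete, here is a minimal sketch using scikit-learn. The two clinical features, their values, and the risk label are entirely made up; the point is that a single decision tree learns a small flow chart from data, and we can print the branches it asks us to travel down.

```python
# A minimal sketch of a single decision tree: a flow chart the model
# learns from data. Features, thresholds, and labels are hypothetical.
import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_text

rng = np.random.default_rng(0)
# Hypothetical features: [hemoglobin A1c, BMI]; label 1 = high risk.
X = rng.normal(loc=[7.0, 30.0], scale=[1.0, 4.0], size=(200, 2))
y = ((X[:, 0] > 7.5) & (X[:, 1] > 32)).astype(int)

tree = DecisionTreeClassifier(max_depth=2).fit(X, y)

# Print the learned flow chart: each branch asks a yes/no question.
print(export_text(tree, feature_names=["hba1c", "bmi"]))
```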
RFM generates a large series of decision trees and then compares these trees, taking a majority vote to determine which decision makes the most sense. As a use case, imagine a cohort of 5,000 overweight people with diabetes at risk of cardiovascular complications; researchers then test a dietary/exercise regimen as a preventive measure. A detailed subgroup analysis of the group’s risk factors will likely include more than the usual suspects: emphysema, kidney disease, amputation, dry skin, loud snoring, marital status, social functioning, hemoglobin A1c, self-reported health, and numerous other characteristics that most researchers rarely consider. An RFM can create 1,000 decision trees from a long list of covariates to determine which combinations of risk factors pose the greatest threat to which subgroups within the 5,000 patients. Aaron Baum and his colleagues performed such an analysis to reevaluate the results of a large clinical trial, the Look AHEAD study, which initially concluded that a diet/exercise program had no effect on such complications. The reanalysis by Baum et al—illustrated in the figure—found that the lifestyle management program did in fact benefit certain subgroups of patients. A more detailed explanation of how this RFM can improve patient care is available in this video presentation.
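Below is a minimal sketch of this idea, not Baum et al.’s actual analysis: a random forest of 1,000 trees fit to made-up data for 5,000 hypothetical patients, with invented covariate names, followed by the importance score the forest assigns to each covariate.

```python
# A minimal sketch (not Baum et al.'s analysis): a forest of 1,000 trees
# fit to hypothetical patient data, then covariates ranked by importance.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(42)
covariates = ["hba1c", "kidney_disease", "emphysema", "self_reported_health",
              "social_functioning", "loud_snoring"]
X = rng.normal(size=(5000, len(covariates)))  # 5,000 hypothetical patients
# Made-up outcome: driven mostly by the first two covariates plus noise.
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(size=5000) > 1).astype(int)

forest = RandomForestClassifier(n_estimators=1000, random_state=0).fit(X, y)

# Importance scores suggest which risk factors drove the classification.
for name, score in sorted(zip(covariates, forest.feature_importances_),
                          key=lambda pair: -pair[1]):
    print(f"{name}: {score:.3f}")
```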
Several other digital health researchers have likewise found creative ways to use RFM. The modeling technique has been deployed to help improve the diagnosis of diabetic peripheral neuropathy (DPN), for instance. Investigators evaluated a large de-identified EHR database and found nine variables associated with DPN: seven involved health care utilization, and the remaining two were patient age and the Charlson Comorbidity Index. This project, like the Baum et al analysis, addressed “a classification problem (e.g., is this a person with or without DPN?) by building several decision trees at once, each utilizing a subset of a group of pre-selected input variables (e.g., known characteristics or events associated with this person). The ‘forest’ of decision trees ‘votes’ for a particular classification and identifies the input variables of greatest utility in the classification process.” Among the most important variables predicting DPN were the number of procedures and services, outpatient prescriptions, outpatient visits, laboratory visits, outpatient office visits, and inpatient prescriptions.
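As a final sketch, here is the “voting” the quoted passage describes, again with entirely hypothetical data and a made-up DPN label: each tree in a fitted forest classifies one patient record, and we tally the individual votes alongside the forest’s overall decision.

```python
# A minimal sketch of forest "voting"; data and the DPN label are hypothetical.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(7)
X = rng.normal(size=(1000, 9))  # nine hypothetical EHR-derived variables
y = (X[:, 0] + X[:, 1] + rng.normal(size=1000) > 1).astype(int)  # 1 = DPN

forest = RandomForestClassifier(n_estimators=500, random_state=0).fit(X, y)

patient = X[:1]  # one hypothetical patient record
votes = [int(t.predict(patient)[0]) for t in forest.estimators_]
print(f"Trees voting 'with DPN': {sum(votes)} of {len(votes)}")
# Note: scikit-learn's forest averages tree probabilities rather than
# counting hard votes, but the result usually matches the simple majority.
print("Forest's decision:", int(forest.predict(patient)[0]))
```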
The AI/ML toolbox continues to evolve, with modeling techniques capable of detecting subtle, unexpected patterns in patient data. While these patterns do not always establish cause-and-effect relationships between risk factors and clinical outcomes, at the very least they provide testable theories. And on occasion they offer insights that directly improve patient care.