A 49-year-old man notices a painless rash on his shoulder but doesn’t seek care. Months later, during a routine physical, his doctor sees the rash and diagnoses it as a benign skin condition. More time passes, and during a routine screening test, a nurse points out the spot to another physician, who urges the patient to see a dermatologist. The dermatologist performs a biopsy and the pathology report reveals a noncancerous lesion. The dermatologist seeks a second reading of the pathology slides. This time, a drastically different verdict: atypical invasive melanoma. The patient is immediately started on chemotherapy. Weeks later, a physician friend asks him why he’s not on immunotherapy instead. The man consults his oncologist, who agrees it might work. The man is still in treatment.
Though this scenario is hypothetical, real-life versions of it play out hundreds or thousands of times a day across America — not because of negligence, but due to sheer human fallibility and systemic errors.
If done right, artificial intelligence could drastically reduce both systemic glitches and errors in the decision-making of individual clinicians, according to a commentary written by scientists at Harvard Medical School and Google.
The article, published April 4 in The New England Journal of Medicine, offers a blueprint for integrating machine learning into the practice of medicine and outlines the promises and pitfalls of a technological advance that has captivated the imaginations of bioinformaticians, clinicians, and nonscientists alike.
The vast processing and analytic capacity of machine learning can amplify the unique features of human decision-making — common sense and the ability to detect nuance. The combination, the authors argue, could optimize the practice of clinical medicine.
Machine learning defined
Machine learning is a form of artificial intelligence not predicated on predefined parameters and rules, but instead involving adaptive learning. With each exposure to new data, an algorithm grows better at recognizing patterns. In other words, machine learning exhibits “cognitive” plasticity not unlike the neural plasticity of the human brain. However, where human brains can learn complex associations from small bits of information, machine learning requires many more examples to learn the same task. Machines are far slower at learning but have greater operational capacity and produce fewer errors of interpretation.
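As a rough illustration of learning from examples rather than from predefined rules, the toy sketch below keeps a running average of one measurement for each of two labels and classifies a new value by whichever average is closer. The class name, the single numeric feature, and the sample values are all hypothetical and chosen only for illustration; real clinical models are far more elaborate.

```python
class RunningMeanClassifier:
    """Toy learner: no hand-written threshold, only averages built from examples."""

    def __init__(self):
        # Per-label running sum and count of the observed feature values.
        self.sums = {0: 0.0, 1: 0.0}
        self.counts = {0: 0, 1: 0}

    def update(self, value: float, label: int) -> None:
        """Expose the model to one new labeled example."""
        self.sums[label] += value
        self.counts[label] += 1

    def predict(self, value: float) -> int:
        """Return the label whose running mean is closer to the value."""
        if self.counts[0] == 0 or self.counts[1] == 0:
            return 0  # Arbitrary default until both labels have been seen.
        mean0 = self.sums[0] / self.counts[0]
        mean1 = self.sums[1] / self.counts[1]
        return 1 if abs(value - mean1) < abs(value - mean0) else 0


# Hypothetical stream of (measurement, label) pairs.
model = RunningMeanClassifier()
for value, label in [(1.0, 0), (9.0, 1), (2.0, 0), (8.0, 1)]:
    model.update(value, label)
```

Every call to `update` shifts the running averages, so the decision boundary is derived from the data the model has seen rather than written in advance.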
“A machine-learning model can be trained on tens of millions of electronic medical records with hundreds of billions of data points without lapses in attention,” said Isaac Kohane, chair of the Department of Biomedical Informatics in the Blavatnik Institute at Harvard Medical School, who co-wrote the commentary with Alvin Rajkomar and Jeffrey Dean of Google. “But it’s impossible … for a human physician to see more than a few tens of thousands of patients in an entire career.”
Thus, according to the authors, deploying machine learning could offer physicians the collective wisdom of billions of medical decisions, patient cases, and outcomes to inform diagnosis and treatment. In situations where predictive accuracy is critical, the ability of a machine-learning system to spot telltale patterns across millions of samples could enable “superhuman” performance.
To err is human
“To Err is Human,” a 1999 report by the Institute of Medicine, now known as the National Academy of Medicine, recognized the imperfections of human decision-making and the limits of individual clinician knowledge. The latter is poised to become a growing problem for front-line clinicians, who must synthesize, interpret, and apply an ever-growing mountain of biomedical knowledge stemming from the exponential rate of new discoveries.
“We must have the humility to recognize that keeping up with the pace of biomedical knowledge and new discoveries is humanly impossible for the individual practitioner,” Kohane said. “AI and machine learning can help reduce [or] even eliminate errors, optimize productivity, and provide clinical decision support.”
According to the Institute of Medicine’s report, clinical errors encompass four broad categories:
- Diagnostic: failure to order appropriate tests or properly interpret test results; use of outdated tests; wrong diagnosis or delay of accurate diagnosis; and failure to act on test results.
- Treatment: choosing suboptimal, outdated, or incorrect therapies; errors in administering the treatment; errors of medication dosing; and treatment delays.
- Prevention: failures in preventive follow-up and administration of prophylactic therapies such as vaccinations.
- Other: errors involving communication or equipment failures, among others.
Machine learning has the potential to reduce many of these errors and even eliminate some, the authors of the commentary said. A well-designed system could alert providers when suboptimal medication is chosen; it could eliminate dosing errors; and it could triage the records of patients with vague, mysterious symptoms to a panel of rare-disease experts for remote consults.
Machine-learning models hold the greatest promise in the following areas:
- Prognosis: the ability to identify patterns predictive of outcomes based on vast numbers of already documented outcomes. For example, what is a patient’s likely trajectory? How soon will the patient return to work? How quickly might the patient’s disease progress?
- Diagnosis: the capacity to help identify likely diagnoses during clinical visits and raise awareness of possible future diagnoses based on a patient’s profile and the totality of previous laboratory and imaging test results and other available data. Machine-learning models could be used as backup intelligence to prod physicians to consider alternative conditions or ask probing questions. This could be particularly valuable in scenarios with high diagnostic uncertainty or when patients present with particularly confounding symptoms.
- Treatment: Machine-learning models can be “taught” to identify the optimal treatment for a given patient with a given condition based on vast data sets of treatment outcomes for patients with the same diagnosis.
- Clinical workflow: Machine learning could improve and simplify current electronic medical record (EMR) keeping, which places a significant burden on clinicians. Greater efficiency and less time spent on EMRs would allow physicians to spend more time in direct contact with patients.
- Expanding access to expertise: the ability to improve access to care for patients living in remote geographic locations or regions with a scarcity of medical specialists. Such models could provide patients with nearby care options or alert them when symptoms demand urgent attention or a visit to an emergency room.
Deus ex machina … not
AI and machine learning are not perfect, nor will they solve all glitches in clinical care.
Machine-learning models can only be as good as the data they are provided. “The adage ‘garbage in, garbage out’ very much applies here,” Kohane said.
The most significant barrier to developing optimal machine-learning models is the scarcity of high-quality clinical data that includes ethnically, racially, and otherwise diverse populations, the authors said. Other hurdles are more technical in nature. For example, the current separation of clinical data across and within institutions is a significant, yet not insurmountable, barrier to building robust machine-learning models. One solution would be to put the data in the hands of patients to enable patient-controlled databases.
Other obstacles include various legal requirements and policies, as well as a mishmash of technical platforms across health systems and tech providers that may not be easily compatible with one another and thus could compromise access to data.
One unintended consequence of machine learning could be overreliance on computer algorithms and a reduction in physician vigilance — outcomes that would increase clinical errors, the authors cautioned.
“Understanding the limitations of machine learning is vital,” Kohane said. “This includes understanding what the model is designed and, more importantly, what it’s not designed, to do.”
One way to minimize such risks would be to include confidence ranges for all machine-learning models, informing clinicians exactly how accurate a model is likely to be. Even more importantly, all models should be subject to periodic reevaluations and exams, not unlike the periodic board exams physicians must take to maintain certifications in a given field of medicine.
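To make the idea of a confidence range concrete, here is a minimal sketch that computes a Wilson score interval for a model's accuracy on a held-out test set and uses the lower bound as a trigger for reevaluation. The commentary does not prescribe this particular method; the function names, the 95 percent level (z = 1.96), and the accuracy threshold are assumptions made for illustration.

```python
import math
from typing import Tuple


def wilson_interval(correct: int, total: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for a proportion, here the model's accuracy."""
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= correct <= total:
        raise ValueError("correct must be between 0 and total")
    p = correct / total
    denom = 1 + z ** 2 / total
    center = (p + z ** 2 / (2 * total)) / denom
    half_width = z * math.sqrt(p * (1 - p) / total + z ** 2 / (4 * total ** 2)) / denom
    return center - half_width, center + half_width


def needs_review(correct: int, total: int, required_accuracy: float) -> bool:
    """Flag the model when the lower bound falls below the required accuracy."""
    lower, _ = wilson_interval(correct, total)
    return lower < required_accuracy


# Hypothetical evaluation: 90 correct predictions out of 100 test cases.
low, high = wilson_interval(90, 100)
print(f"accuracy 0.90, 95% interval roughly {low:.3f} to {high:.3f}")
```

With the same observed accuracy, a larger test set gives a narrower interval, which is why a model evaluated on only a few cases may be flagged for review while one evaluated on many is not. Rerunning the check on fresh data at regular intervals is one simple way to approximate the periodic reexamination described above.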
If done right, machine learning will act as a backup to targeted human intelligence, enhancing the clinician-patient encounter rather than substituting for physicians.
The human encounter — a physician’s sensibility, sensitivity, and appreciation for nuance and the complexity of human life — will never go away, the authors said.
“This is very much a case of ‘together with’ and not ‘instead of,’” Kohane said. “This is not about machine versus human, but very much about optimizing the human physician and patient care by harnessing the strengths of AI.”