Artificial intelligence (AI) large language models (LLMs), and the systems built on this technology, have recently emerged onto the medical scene with grand promises to “disrupt” and improve upon traditional clinical practice. Setting aside the hype and the boom-bust adoption cycles that accompany most novel technologies, AI will eventually settle into the daily workflow of many physicians, whether through direct use or indirectly, through integration within electronic medical records (EMRs), insurance approval pipelines, or even preliminary diagnostic screenings. Clinical medicine has been fairly cautious about embracing AI, likely for medicolegal reasons. In contrast, medical education, at least on the admissions front, appears more willing to incorporate AI into the “holistic admissions process,” with institutions such as the Donald and Barbara Zucker School of Medicine at Hofstra/Northwell using AI to preliminarily screen medical school applicants and the University of Miami Miller School of Medicine employing it to compile myriad preceptor reviews and evaluations into a summative Medical Student Performance Evaluation (MSPE) for each student.
Proponents of incorporating AI into the admissions pipeline extol the efficiency gained from such technologies, especially given the ever-growing arms race that has produced a larger pool of prospective applicants whose résumés are embellished with ever more extracurricular activities, research items, and accomplishments. This pursuit of efficiency, however, overlooks a unique aspect of the medical school admissions process that is critical to creating a humanistic health care workforce. We need look no further than industries outside of medicine, which lack the streamlining afforded by the guild-like nature of our profession (epitomized by institutions such as the National Resident Matching Program), to notice the detrimental impact of AI on hiring practices for both employees and employers. Eighty-seven percent of employers are utilizing some form of AI in their hiring practices, with most citing efficiency as the underlying reason for implementation. In response, applicants are increasingly using AI to tailor their résumés and CVs to get past standard AI screeners, creating a flourishing market for résumé optimization that tacitly acknowledges a breakdown, an incongruity between the AI screening process and the ultimate selection of ideal job candidates. Many extremely qualified applicants are lost in the noise generated by this arms race, and commonly used screening tools have grown more dystopian, going so far as to incorporate facial expression and body language data from video interviews into their predictions.
Given the “black box” nature of the predominant AI models used for language processing, this “breakdown” is not only inherent but also extremely dangerous. In medicine, these models would likely be trained on historical and current medical school class demographics and traits. There is an inevitable risk that they will recapitulate, perhaps even amplify, our field’s history of discrimination against underrepresented applicants and its overall failure to create a physician workforce reflective of this country’s demographic composition, grievances that have required continuous, human-driven initiatives to even attempt to redress. Furthermore, the current federal administration’s hostile stance on diversity, equity, and inclusion (DEI) may substantially disincentivize schools from auditing and monitoring such models, increasing the likelihood that they go rogue.
Notwithstanding the clear challenges under Title VI, Title IX, the ADA, and the ACA that rogue AI model use would invite, there is something particularly ominous about our profession’s gravitation toward a technology that seems antithetical to the core principles of what it means to be a physician. While AI is adept at analyzing swaths of academic metrics and generating a report, its inferential capabilities fall drastically short. AI fails to capture the nuanced qualities integral to compassionate care: qualities that would ideally be observed through direct patient care but can more feasibly be inferred through interviews and an applicant’s own writing. Many schools already employ some in-house quantitative metric for the preliminary screening of applicants, but the variables feeding such a metric are clearly demarcated, unlike the non-deterministic outputs and “hallucinations” that plague AI language models. These metrics can be easily audited, modified, and restructured to advance the goal of creating an incoming class consistent with an institution’s principles and values. In contrast, relying on inchoate LLM-based systems to cull the pool of physician applicants is woefully inadequate. At best, it risks letting many competent individuals fall through the cracks; at worst, it may exacerbate the inequity of the admissions arms race, selecting candidates more skilled at overcoming AI hurdles over those who embody the qualities of a good physician.
While machine learning and AI will undoubtedly continue to play a role in medical education, their current use in the admissions process remains marred by a multitude of inherent risks, requiring substantial caution on the part of both applicants and institutions going forward. Ultimately, empathy, ethical reasoning, and a commitment to service are more than lines on a CV composed of buzzwords meant to be picked up by a screening tool. Just as we expect applicants to behave according to an honor code emblematic of a future physician, not misrepresenting clinical and volunteering hours or exaggerating accomplishments and awards, we are obligated, at the very least, to afford their applications due diligence commensurate with their efforts.
Newlyn Joseph is a psychiatry resident.
