Toward expert-level medical question answering with large language models
nature.com

by Karan Singhal, Tao Tu, Juraj Gottweis, Rory Sayres, Ellery Wulczyn, Mohamed Amin, Le Hou, Kevin Clark, Stephen R. Pfohl, Heather Cole-Lewis, Darlene Neal, Qazi Mamunur Rashid, Mike Schaekermann, Amy Wang, Dev Dash, Jonathan H. Chen, Nigam H. Shah, Sami Lachgar, Philip Andrew Mansfield, Sushant Prakash, Bradley Green, Ewa Dominowska, Blaise Agüera y Arcas, Nenad Tomašev, Yun Liu, Renee Wong, Christopher Semturs, S. Sara Mahdavi, Joelle K. Barral, Dale R. Webster, Greg S. Corrado, Yossi Matias, Shekoofeh Azizi, Alan Karthikesalingam, Vivek Natarajan • 22 days ago

Large language models (LLMs) have advanced medical question answering: Med-PaLM 2 achieves 86.5% accuracy on USMLE-style questions. The model improves reasoning and grounding through ensemble refinement and chain of retrieval. Human evaluations indicate that Med-PaLM 2's answers are preferred over physician answers in several areas, though specialist answers remain superior overall. The model shows promise for real-world applications, particularly where access to medical specialists is limited, but it still needs further validation and alignment with human values.