AI outperforms doctors in Harvard emergency room study

OpenAI o1-preview outdiagnosed ER doctors in a Harvard study of 76 real cases. See how raw EHR text helped the model spot cases faster and more accurately.

A Harvard study has found that OpenAI’s o1-preview model diagnosed patients more accurately than two attending emergency room physicians in a test using 76 real ER cases. The model worked only from raw electronic health record text, and its performance was evaluated across multiple stages of patient care. According to the report, the AI identified the correct diagnosis in 67.1% of initial triage cases, compared with 55.3% and 50.0% for the two doctors. Independent physician reviewers could not distinguish which diagnoses came from the model and which came from humans. The findings add to growing evidence that AI tools may support clinical decision-making, especially as more people turn to chatbots for health-related questions. The study also included examples in which the model helped flag serious conditions, in one case earlier than the treating doctor.