AI Chatbots Show Promise but Critical Gaps in Emergency Department Triage — SMART — Salzburg Medical AI Research in Traumatology

A recent comparative study published in Postgraduate Medicine has provided the first systematic evaluation of popular AI chatbots in emergency department triage scenarios, revealing significant limitations that challenge current enthusiasm for AI-assisted healthcare applications.

Researchers Fatma Tortum and Kamber Kaşali conducted a prospective analysis comparing triage decisions made by emergency physicians, triage nurses, and three widely used AI models (ChatGPT, Gemini, and Pi) across 500 emergency department patients over a one-week period.

Critical Safety Concerns Identified

The study uncovered concerning patterns in AI performance. While ChatGPT demonstrated the closest approximation to physician-level triage decisions among the AI models tested, it still undertriaged 26.5% of moderate-urgency (yellow-coded) patients and 42.6% of high-urgency (red-coded) patients.

These undertriage rates represent potential safety risks, as delayed recognition of patient severity could lead to postponed interventions in time-sensitive emergency conditions.
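
To make the undertriage figures concrete, the following is a minimal sketch of how such a rate could be computed. It assumes the physician's triage code serves as the reference and that a case counts as undertriaged when the model assigns a less urgent code. The green/yellow/red ordering, the function name `undertriage_rates`, and the sample labels are illustrative assumptions, not the study's actual method or data.

```python
from collections import defaultdict

# Assumed urgency order for illustration: green < yellow < red.
SEVERITY = {"green": 0, "yellow": 1, "red": 2}


def undertriage_rates(reference, predicted):
    """For each reference category, return the share of patients
    that the model rated as less urgent than the reference did."""
    totals = defaultdict(int)
    under = defaultdict(int)
    for ref, pred in zip(reference, predicted):
        totals[ref] += 1
        if SEVERITY[pred] < SEVERITY[ref]:
            under[ref] += 1
    return {cat: under[cat] / totals[cat] for cat in totals}


# Made-up labels for seven hypothetical patients (not study data).
physician = ["red", "red", "yellow", "yellow", "yellow", "yellow", "green"]
model = ["red", "yellow", "yellow", "green", "yellow", "yellow", "green"]

rates = undertriage_rates(physician, model)
for category, rate in rates.items():
    print(f"{category}: {rate:.1%} undertriaged")
```

Read this way, a 42.6% undertriage rate for red-coded patients would mean that roughly four in ten patients the reference rated as high urgency received a lower-urgency code from the model.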

Implementation Reality Check

The research provides valuable real-world data on AI tools that are readily accessible to healthcare providers today. Unlike studies focusing on specialized medical AI systems, this investigation tested consumer-grade AI platforms that emergency departments could theoretically implement immediately.

However, the results suggest that current consumer AI models are not yet ready for standalone clinical decision-making in emergency triage scenarios. Only 23.8% of cases received the same triage classification from every evaluator, human and AI alike.
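
The agreement figure above can be read as a full-agreement rate: the fraction of patients for whom every rater chose the same code. The sketch below shows one way to compute it, assuming each patient has one code per rater. The function name `full_agreement_rate`, the rater order, and the sample cases are hypothetical and do not come from the study.

```python
def full_agreement_rate(ratings_per_patient):
    """Return the fraction of patients whose raters all assigned the same code."""
    if not ratings_per_patient:
        raise ValueError("at least one patient is required")
    unanimous = sum(1 for ratings in ratings_per_patient if len(set(ratings)) == 1)
    return unanimous / len(ratings_per_patient)


# Made-up codes, one tuple per hypothetical patient, in the assumed order
# (physician, nurse, ChatGPT, Gemini, Pi). Not study data.
cases = [
    ("red", "red", "red", "red", "red"),
    ("yellow", "yellow", "green", "yellow", "yellow"),
    ("green", "green", "green", "green", "green"),
    ("yellow", "red", "yellow", "yellow", "green"),
]

agreement = full_agreement_rate(cases)
print(f"Full agreement in {agreement:.1%} of cases")
```

Because a single dissenting rater breaks unanimity, this measure falls quickly as raters are added, so it is best read alongside pairwise comparisons such as the undertriage rates.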

Evidence-Based Path Forward

This study establishes important baseline performance metrics for AI-assisted triage while highlighting the continued need for human oversight in emergency healthcare settings. The research methodology—direct comparison against both physician and nursing triage decisions—provides a robust framework for evaluating future AI developments in emergency medicine.

The findings support continued development of healthcare-specific AI systems while cautioning against premature deployment of general-purpose AI tools in critical clinical applications.

Research Details: Tortum F, Kaşali K. “Exploring the potential of artificial intelligence models for triage in the emergency department.” Postgraduate Medicine. 2024;136(8):841-846. DOI: 10.1080/00325481.2024.2418806