LLM-Based AI Outperforms Traditional Approaches in Emergency Department Triage Prediction — SMART — Salzburg Medical AI Research in Traumatology

A comparative study published in JMIR Medical Informatics reports that a large language model (LLM) based system outperformed both a traditional natural language processing pipeline and a Joint Embedding Predictive Architecture model in predicting emergency department triage levels. The research, conducted at Roger Salengro Hospital in Lille, France, is presented as the first head-to-head comparison of these AI architectures for emergency triage prediction.

Study Design and Methodology

Researchers developed and compared three distinct AI models using seven months of emergency department data (June-December 2024):

TRIAGEMASTER: Traditional natural language processing using Doc2Vec + multilayer perceptron
URGENTIAPARSE: LLM-based system using FlauBERT (a French-language BERT model) + Extreme Gradient Boosting (XGBoost)
EMERGINET: Joint Embedding Predictive Architecture with specialized regularization

The models were trained to predict triage levels according to the French Emergency Nurses Classification in Hospital (FRENCH) scale and compared against both nurse triage decisions and clinical expert consensus.

Breakthrough Performance Results

The LLM-based URGENTIAPARSE system performed best of the three models on every reported metric:

90.0% exact agreement with expert clinical consensus
92.8% near-agreement (within ±1 triage level)
F1-score of 0.900 (95% CI 0.876-0.924)
AUC-ROC of 0.879 (95% CI 0.851-0.907)
Weighted κ of 0.800 (P<.001)
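
For readers unfamiliar with these agreement measures, the following is a minimal Python sketch of how exact agreement, near-agreement (within ±1 level), and a weighted κ can be computed from paired triage levels. It is an illustration, not the study's code: the function names, the integer coding of levels, the toy data, and the choice of quadratic weights are all assumptions, since the summary above does not state which weighting scheme the authors used.

```python
from collections import Counter

def exact_agreement(pred, ref):
    """Fraction of cases where the predicted level equals the reference level."""
    return sum(p == r for p, r in zip(pred, ref)) / len(ref)

def near_agreement(pred, ref, tolerance=1):
    """Fraction of cases where prediction and reference differ by at most `tolerance` levels."""
    return sum(abs(p - r) <= tolerance for p, r in zip(pred, ref)) / len(ref)

def weighted_kappa(pred, ref, levels, weighting="quadratic"):
    """Cohen's weighted kappa for ordinal levels (assumes at least two levels are in use)."""
    n = len(ref)
    k = len(levels)
    index = {level: i for i, level in enumerate(levels)}
    observed = Counter((index[p], index[r]) for p, r in zip(pred, ref))
    pred_counts = Counter(index[p] for p in pred)
    ref_counts = Counter(index[r] for r in ref)
    power = 2 if weighting == "quadratic" else 1
    observed_disagreement = 0.0
    expected_disagreement = 0.0
    for i in range(k):
        for j in range(k):
            # Disagreement weight grows with the distance between the two levels.
            w = (abs(i - j) / (k - 1)) ** power
            observed_disagreement += w * observed[(i, j)] / n
            expected_disagreement += w * (pred_counts[i] / n) * (ref_counts[j] / n)
    return 1.0 - observed_disagreement / expected_disagreement

# Toy data (hypothetical): ten cases on an assumed 1-5 integer scale.
reference = [1, 2, 3, 3, 4, 5, 2, 4, 3, 5]
predicted = [1, 2, 3, 4, 4, 5, 2, 2, 3, 5]

print(exact_agreement(predicted, reference))  # 0.8
print(near_agreement(predicted, reference))   # 0.9
print(weighted_kappa(predicted, reference, levels=[1, 2, 3, 4, 5]))
```

Weighted κ penalizes a two-level miss more heavily than a one-level miss, which is why it is commonly reported alongside raw agreement for ordinal scales such as triage levels.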

These results exceeded those of the traditional NLP approach (TRIAGEMASTER), the Joint Embedding Predictive Architecture (EMERGINET), and current nurse triage practice.

Clinical Implications and Implementation Potential

The study’s findings have profound implications for emergency medicine workflow optimization. Emergency department triage represents a critical decision point that directly impacts patient outcomes, resource allocation, and system efficiency. Current triage systems face increasing pressure from rising patient volumes and staffing challenges.

“The integration of LLM-based AI triage support systems shows promise but demands rigorous validation, bias mitigation, and transparent uncertainty quantification to ensure patient safety,” the researchers emphasize.

Significant Limitations Identified

Despite promising results, the study revealed concerning limitations that must be addressed before clinical deployment:

Severe selection bias: Only 657 out of 73,236 ED visits (0.90%) had the complete audio recordings and structured data required for analysis
Overfitting concerns: Training achieved perfect accuracy but validation performance dropped to approximately 50%
Monocentric design: Single hospital site limits generalizability across diverse emergency departments
Sparse high-acuity representation: Only 0.61% of cases were highest priority, limiting assessment of undertriage risks
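
The scale of the first and last limitations is easier to see with the arithmetic written out. The short Python sketch below reproduces the inclusion rate from the reported counts and estimates how many highest-priority cases the sample contained. The second step assumes that the 0.61% figure refers to the 657 analyzed visits, which the summary does not state explicitly; the variable names are illustrative.

```python
total_visits = 73_236    # all ED visits in the study period (reported)
included_visits = 657    # visits with complete audio and structured data (reported)

inclusion_rate = included_visits / total_visits
print(f"Inclusion rate: {inclusion_rate:.2%}")  # Inclusion rate: 0.90%

# Assumption: the 0.61% highest-priority share applies to the 657 included visits.
high_acuity_share = 0.0061
high_acuity_cases = round(high_acuity_share * included_visits)
print(f"Estimated highest-priority cases: {high_acuity_cases}")  # Estimated highest-priority cases: 4

# With so few cases, each single missed case moves the detection rate by a large step.
step_per_case = 1 / high_acuity_cases
print(f"Detection-rate change per missed case: {step_per_case:.0%}")  # 25%
```

Under that assumption, roughly four highest-priority cases were available, so any estimate of undertriage in that group rests on a handful of patients.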

Future Research Directions

The research team identifies several critical steps required before clinical implementation:

Model regularization to address overfitting concerns
External validation across diverse emergency departments
Prospective testing in real clinical environments
Comprehensive safety evaluation, particularly for undertriage detection
Bias mitigation strategies to address selection and representation issues
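
As an illustration of what a safety evaluation focused on undertriage might measure, here is a minimal Python sketch that separates errors by direction and checks how often high-acuity cases are recognized. It is a sketch under stated assumptions, not the authors' protocol: levels are coded as integers where a lower number means higher urgency, and the function names, the `cutoff` parameter, and the toy data are hypothetical.

```python
def triage_direction_rates(pred, ref):
    """Split cases into undertriage, overtriage, and exact matches.

    Assumes integer levels where a LOWER number means MORE urgent, so a
    prediction numerically higher than the reference is undertriage.
    """
    if len(pred) != len(ref) or not ref:
        raise ValueError("pred and ref must be non-empty and of equal length")
    n = len(ref)
    under = sum(p > r for p, r in zip(pred, ref))
    over = sum(p < r for p, r in zip(pred, ref))
    return {
        "undertriage": under / n,
        "overtriage": over / n,
        "exact": (n - under - over) / n,
    }

def high_acuity_sensitivity(pred, ref, cutoff=1):
    """Share of reference cases at or above the urgency cutoff that the model also flags.

    Returns None when the sample contains no such cases, which is a real
    possibility when high-acuity visits are rare.
    """
    preds_for_urgent = [p for p, r in zip(pred, ref) if r <= cutoff]
    if not preds_for_urgent:
        return None
    return sum(p <= cutoff for p in preds_for_urgent) / len(preds_for_urgent)

# Toy data (hypothetical): expert consensus vs. model output.
reference = [1, 2, 3, 3, 4, 5, 2, 4, 3, 5]
predicted = [1, 2, 3, 4, 4, 5, 2, 2, 3, 5]

print(triage_direction_rates(predicted, reference))
print(high_acuity_sensitivity(predicted, reference, cutoff=2))
```

Reporting the two error directions separately matters because an overall agreement figure treats a missed critical patient and an over-cautious classification as equivalent.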

Industry Context

This research comes at a time when healthcare systems worldwide are exploring AI integration to address workforce challenges and improve clinical decision-making. The study’s emphasis on rigorous validation and transparent limitation reporting sets an important precedent for responsible AI development in emergency medicine.

The work builds upon growing evidence that large language models may offer advantages over traditional machine learning approaches in clinical applications, particularly those involving complex, multimodal patient data interpretation.

Research Team and Publication

The study was led by Edouard Lansiaux and colleagues at Roger Salengro Hospital, with contributions from Emmanuel Chazard, Amélie Vromant, and Eric Wiel. The research was published as an open-access article in JMIR Medical Informatics on March 10, 2026.

Access the full study: DOI 10.2196/83318 | PubMed: PMID 41805589

This analysis was generated by HERBIE (Scientific Content Intelligence Agent) as part of SMART’s daily evidence surveillance program.