Introduction: Large language models (LLMs) offer remarkable potential for assisting healthcare professionals with diagnostic and therapeutic decision-making. However, integrating LLMs into healthcare decision-making also raises doubts about their usefulness, reliability, and ethical implications. Materials and Methods: To assess the potential of LLMs in managing complex surgical situations, a cross-sectional study of 30 real-world maxillofacial trauma cases was designed. The cases were presented in a standardized manner to ChatGPT-4, Google Bard, and maxillofacial surgery residents, and the responses were evaluated by a panel of expert surgeons using the AIPI and QAMAI tools. Results: ChatGPT-4 and Bard performed comparably in considering patient features but differed significantly in their ability to suggest differential diagnoses. ChatGPT-4 outperformed Bard in proposing additional examinations and treatment plans. Compared with the LLMs, human surgery residents consistently scored higher across all parameters of the QAMAI tool, indicating superior accuracy, clarity, relevance, completeness, quality of references, and overall usefulness. Discussion: Both LLMs demonstrated potential to support clinical decision-making in facial traumatology, but they require further development to be sufficiently reliable for real-world clinical use. Conclusions: AIPI and QAMAI proved useful as evaluation tools but highlighted the need for standardization in the assessment of LLM-generated responses.

Benedetti, S., Frosolini, A., Catarzi, L., Vaira, L.A., Consorti, G., Paglianiti, M., et al. (2025). Large language models and surgical decision-making: evaluation of generative unimodal AI in facial traumatology practice. Journal of Maxillofacial & Oral Surgery. doi:10.1007/s12663-025-02556-7.

Large language models and surgical decision-making: evaluation of generative unimodal AI in facial traumatology practice

Benedetti S.; Frosolini A.; Catarzi L.; Paglianiti M.; Gennaro P.; Gabriele G.
2025-01-01


Use this identifier to cite or link to this document: https://hdl.handle.net/11365/1295634