Background: Artificial intelligence (AI) and large language models (LLMs) are increasingly used in healthcare, with applications in clinical decision-making and workflow optimization. In head and neck surgery, postoperative rehabilitation is a complex, multidisciplinary process requiring personalized care. This study evaluates the feasibility of using LLMs to generate tailored rehabilitation programs for patients undergoing major head and neck surgical procedures. Methods: Ten hypothetical head and neck surgical clinical scenarios were developed, representing oncologic resections with complex reconstructions. Four LLMs, ChatGPT-4o, DeepSeek V3, Gemini 2, and Copilot, were prompted with identical queries to generate rehabilitation plans. Three senior clinicians independently assessed their quality, accuracy, and clinical relevance using a five-point Likert scale. Readability and quality metrics, including the DISCERN score, Flesch Reading Ease, Flesch–Kincaid Grade Level, and Coleman–Liau Index, were applied. Results: ChatGPT-4o achieved the highest clinical relevance (Likert mean of 4.90 ± 0.32), followed by DeepSeek V3 (4.00 ± 0.82) and Gemini 2 (3.90 ± 0.74), while Copilot underperformed (2.70 ± 0.82). Gemini 2 produced the most readable content. A statistical analysis confirmed significant differences across the models (p < 0.001). Conclusions: LLMs can generate rehabilitation programs with varying quality and readability. ChatGPT-4o produced the most clinically relevant plans, while Gemini 2 generated more readable content. AI-generated rehabilitation plans may complement existing protocols, but further clinical validation is necessary to assess their impact on patient outcomes.
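The readability metrics named in the Methods are defined by standard published formulas. The abstract does not say which tool the authors used to compute them, so the following is only a minimal sketch of those formulas, assuming the word, sentence, syllable, and letter counts have already been obtained by some other means. The function names and the example counts are hypothetical and are not taken from the study.

```python
def flesch_reading_ease(words: int, sentences: int, syllables: int) -> float:
    """Flesch Reading Ease: higher scores indicate easier text."""
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)


def flesch_kincaid_grade(words: int, sentences: int, syllables: int) -> float:
    """Flesch-Kincaid Grade Level: approximate US school grade."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


def coleman_liau_index(letters: int, words: int, sentences: int) -> float:
    """Coleman-Liau Index: uses letters and sentences per 100 words."""
    letters_per_100 = letters / words * 100
    sentences_per_100 = sentences / words * 100
    return 0.0588 * letters_per_100 - 0.296 * sentences_per_100 - 15.8


if __name__ == "__main__":
    # Hypothetical counts for an illustrative passage (not study data).
    words, sentences, syllables, letters = 100, 5, 150, 450
    print(round(flesch_reading_ease(words, sentences, syllables), 2))
    print(round(flesch_kincaid_grade(words, sentences, syllables), 2))
    print(round(coleman_liau_index(letters, words, sentences), 2))
```

Flesch Reading Ease rises as text gets easier, whereas the two grade-level indices rise as text gets harder, so the scores must be read in opposite directions when comparing models.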
Marcaccini, G., Seth, I., Novo, J., Mcclure, V., Sacks, B., Lim, K., et al. (2025). Leveraging Artificial Intelligence for Personalized Rehabilitation Programs for Head and Neck Surgery Patients. Technologies, 13(4) [10.3390/technologies13040142].
Leveraging Artificial Intelligence for Personalized Rehabilitation Programs for Head and Neck Surgery Patients
Marcaccini G.; Cuomo R.
2025-01-01
https://hdl.handle.net/11365/1294416
