Background: Artificial intelligence (AI) has the potential to transform preoperative planning for breast reconstruction by enhancing the efficiency, accuracy, and reliability of radiology reporting through automatic interpretation and perforator identification. Large language models (LLMs) have recently advanced significantly in medicine. This study aimed to evaluate the proficiency of contemporary LLMs in interpreting computed tomography angiography (CTA) scans for deep inferior epigastric perforator (DIEP) flap preoperative planning.
Methods: Four prominent LLMs (ChatGPT-4, BARD, Perplexity, and BingAI) answered six questions on CTA scan reporting. A panel of expert plastic surgeons with extensive experience in breast reconstruction assessed the responses using a Likert scale. Readability was evaluated using the Flesch Reading Ease score, the Flesch-Kincaid Grade Level, and the Coleman-Liau Index. The DISCERN score was used to assess the responses' suitability. Statistical significance was assessed with a t-test, and P-values < 0.05 were considered significant.
Results: BingAI provided the most accurate and useful responses to prompts, followed by Perplexity, ChatGPT, and then BARD. BingAI had the highest Flesch Reading Ease (34.7 ± 5.5) and DISCERN (60.5 ± 3.9) scores. Perplexity had higher Flesch-Kincaid Grade Level (20.5 ± 2.7) and Coleman-Liau Index (17.8 ± 1.6) scores than the other LLMs.
Conclusion: LLMs remain limited in their ability to report CTA for preoperative planning of breast reconstruction, yet rapid advances in the technology hint at a promising future. AI stands poised to enhance education in CTA reporting and to aid preoperative planning. In the future, AI technology could provide automatic CTA interpretation, enhancing the efficiency, accuracy, and reliability of CTA reports.
© 2024 The Author(s). Published by Elsevier Ltd on behalf of British Association of Plastic, Reconstructive and Aesthetic Surgeons. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
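For readers unfamiliar with the three readability measures named in the Methods, the sketch below shows how they are conventionally computed from sentence, word, letter, and syllable counts. It is an illustration only, not the authors' tooling: the abstract does not say which software was used, the function names are hypothetical, and the syllable counter is a crude vowel-group heuristic that will differ from dictionary-based counters.

```python
import re


def count_syllables(word: str) -> int:
    """Crude heuristic (an assumption): count vowel groups, drop a silent final 'e'."""
    word = word.lower()
    n = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and not word.endswith("le") and n > 1:
        n -= 1
    return max(n, 1)


def text_counts(text: str) -> tuple[int, int, int, int]:
    """Return (sentences, words, letters, syllables) for a plain-text string."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)?", text)
    letters = sum(ch.isalpha() for w in words for ch in w)
    syllables = sum(count_syllables(w) for w in words)
    return len(sentences), len(words), letters, syllables


def flesch_reading_ease(sentences: int, words: int, syllables: int) -> float:
    # Higher scores indicate easier text.
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)


def flesch_kincaid_grade(sentences: int, words: int, syllables: int) -> float:
    # Approximate US school grade level needed to read the text.
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


def coleman_liau_index(sentences: int, words: int, letters: int) -> float:
    # Uses letters and sentences per 100 words instead of syllables.
    letters_per_100 = letters / words * 100
    sentences_per_100 = sentences / words * 100
    return 0.0588 * letters_per_100 - 0.296 * sentences_per_100 - 15.8


if __name__ == "__main__":
    s, w, l, syl = text_counts("The cat sat. It ran!")
    print(flesch_reading_ease(s, w, syl))
    print(flesch_kincaid_grade(s, w, syl))
    print(coleman_liau_index(s, w, l))
```

On the conventional interpretation of the Flesch scale, scores of roughly 30 to 50 correspond to difficult, college-level text, so even the highest mean score reported above (34.7) would indicate demanding prose.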
Lim, B., Cevik, J., Seth, I., Sofiadellis, F., Ross, R.J., Rozen, W.M., et al. (2024). Evaluating Artificial Intelligence's Role in Teaching the Reporting and Interpretation of Computed Tomographic Angiography for Preoperative Planning of the Deep Inferior Epigastric Artery Perforator Flap. JPRAS OPEN, 40, 273-285 [10.1016/j.jpra.2024.03.010].
Evaluating Artificial Intelligence's Role in Teaching the Reporting and Interpretation of Computed Tomographic Angiography for Preoperative Planning of the Deep Inferior Epigastric Artery Perforator Flap
Cuomo, Roberto
2024-01-01
File | Size | Format |
---|---|---|
Evaluating Artificial Intelligence s Role in Teaching-Lim-2024.pdf (open access; description: article; type: publisher's PDF; license: Creative Commons) | 673.68 kB | Adobe PDF |
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.
https://hdl.handle.net/11365/1268357