Research Article

Performance Analysis of Large Language Models in Radiology and Histopathology Reporting: From Diagnostic Support to Patient Communication

Year 2025, Volume: 28 Issue: 2, 141 - 156, 30.06.2025
https://doi.org/10.7126/cumudj.1588132

Abstract

Objectives: The aim of this study was to evaluate the effectiveness of two versions of Chat-GPT, a large language model (LLM), in the diagnosis and interpretation of cone beam computed tomography (CBCT) and histopathology reports.
Materials and Methods: Chat-GPT 3.5 and Chat-GPT 4 were tasked with generating preliminary and differential diagnoses based on the findings from ten CBCT reports and ten histopathology reports. Both versions were also asked to simplify these reports to a level understandable by patients. Dentomaxillofacial radiologists and pathologists with varying levels of expertise evaluated the LLMs' responses, and the performance of Chat-GPT 3.5 and Chat-GPT 4 on these tasks was then compared based on these expert assessments.
Results: For radiology reports, the diagnostic performance of Chat-GPT 4 was statistically superior to that of Chat-GPT 3.5 (p < 0.001), while no significant difference was observed between the two models in report simplification scores (p > 0.05). For histopathology reports, by contrast, Chat-GPT 4 performed significantly better than Chat-GPT 3.5 in both diagnostic accuracy and report simplification (p < 0.05).
Conclusions: The results demonstrated that Chat-GPT 4 outperformed Chat-GPT 3.5 in the interpretation and evaluation of CBCT reports. The strong performance of this latest version highlights the potential for LLMs to become valuable tools in radiology and histopathology reporting, as well as in numerous other fields, as technological advances continue to improve their capabilities.
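
The abstract reports p-values for paired comparisons of expert scores but does not name the statistical test that was used. As an illustration only, the following minimal sketch shows one way such a comparison could be run on ten paired report scores, using an exact two-sided sign-flip permutation test. The test choice, the function name `paired_sign_flip_p_value`, the 1-5 rating scale, and all score values are assumptions made for this example; none of them come from the study.

```python
from itertools import product
from statistics import mean


def paired_sign_flip_p_value(scores_a, scores_b):
    """Exact two-sided sign-flip permutation test on paired differences.

    Hypothetical helper for illustration; not the authors' method.
    """
    if len(scores_a) != len(scores_b):
        raise ValueError("paired samples must have equal length")
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    observed = abs(sum(diffs))
    # Under the null hypothesis, each paired difference is equally likely
    # to be positive or negative, so enumerate every sign assignment.
    extreme = 0
    total = 0
    for signs in product((1, -1), repeat=len(diffs)):
        total += 1
        if abs(sum(s * d for s, d in zip(signs, diffs))) >= observed:
            extreme += 1
    return extreme / total


# Hypothetical expert ratings (1-5) for ten reports; NOT data from the study.
gpt35_scores = [3, 2, 3, 4, 2, 3, 3, 2, 4, 3]
gpt4_scores = [4, 4, 4, 5, 3, 4, 5, 3, 5, 4]

p_value = paired_sign_flip_p_value(gpt4_scores, gpt35_scores)
mean_gain = mean(g4 - g35 for g4, g35 in zip(gpt4_scores, gpt35_scores))
print(f"mean difference = {mean_gain:.2f}, p = {p_value:.4f}")
```

With only ten reports per modality, exhaustive enumeration (1,024 sign assignments) is cheap and avoids relying on large-sample approximations. A rank-based test such as the Wilcoxon signed-rank test would be another reasonable choice for ordinal ratings.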

Project Number

2024/112

References

  • 1. Buldur B, Teke F, Kurt MA, Sağtaş K. Perceptions of Dentists Towards Artificial Intelligence: Validation of a New Scale. Cumhuriyet Dent J 2024;27(2):109-117.
  • 2. Elkassem AA, Smith AD. Potential Use Cases for ChatGPT in Radiology Reporting. AJR Am J Roentgenol 2023;221(3):373-376.
  • 3. Sacoransky E, Kwan BYM, Soboleski D. ChatGPT and assistive AI in structured radiology reporting: A systematic review. Curr Probl Diagn Radiol 2024;53(6):728-737.
  • 4. Sun Z, Ong H, Kennedy P, et al. Evaluating GPT4 on Impressions Generation in Radiology Reports. Radiology 2023;307(5):e231259.
  • 5. Zhong T, Zhao W, Zhang Y, et al. ChatRadio-Valuer: A Chat Large Language Model for Generalizable Radiology Report Generation Based on Multi-institution and Multi-system Data. Published online October 9, 2023. doi:10.48550/arXiv.2310.05242
  • 6. Persigehl T, Baumhauer M, Baeßler B, et al. Structured Reporting of Solid and Cystic Pancreatic Lesions in CT and MRI: Consensus-Based Structured Report Templates of the German Society of Radiology (DRG). ROFO Fortschr Geb Rontgenstr Nuklearmed 2020;192(7):641-656.
  • 7. Brook OR, Brook A, Vollmer CM, Kent TS, Sanchez N, Pedrosa I. Structured reporting of multiphasic CT for pancreatic cancer: potential effect on staging and surgical planning. Radiology 2015;274(2):464-472.
  • 8. Kabadi SJ, Krishnaraj A. Strategies for Improving the Value of the Radiology Report: A Retrospective Analysis of Errors in Formally Over-read Studies. J Am Coll Radiol 2017;14(4):459-466.
  • 9. Nobel JM, van Geel K, Robben SGF. Structured reporting in radiology: a systematic review to explore its potential. Eur Radiol 2022;32(4):2837-2854.
  • 10. Kaka H, Zhang E, Khan N. Artificial Intelligence and Deep Learning in Neuroradiology: Exploring the New Frontier. Can Assoc Radiol J 2021;72(1):35-44.
  • 11. Hosny A, Parmar C, Quackenbush J, Schwartz LH, Aerts HJWL. Artificial intelligence in radiology. Nat Rev Cancer 2018;18(8):500-510.
  • 12. Keshavarz P, Bagherieh S, Nabipoorashrafi SA, et al. ChatGPT in radiology: A systematic review of performance, pitfalls, and future perspectives. Diagn Interv Imaging 2024;105(7-8):251-265.
  • 13. Omar M, Ullanat V, Loda M, Marchionni L, Umeton R. ChatGPT for digital pathology research. Lancet Digit Health 2024;6(8):e595-e600.
  • 14. Bhayana R, Krishna S, Bleakney RR. Performance of ChatGPT on a Radiology Board-style Examination: Insights into Current Strengths and Limitations. Radiology 2023;307(5):e230582.
  • 15. Cao JJ, Kwon DH, Ghaziani TT, et al. Accuracy of Information Provided by ChatGPT Regarding Liver Cancer Surveillance and Diagnosis. AJR Am J Roentgenol 2023;221(4).
  • 16. Lyu Q, Tan J, Zapadka ME, et al. Translating radiology reports into plain language using ChatGPT and GPT-4 with prompt learning: results, limitations, and potential. Vis Comput Ind Biomed Art 2023;6:9.
  • 17. Haver HL, Ambinder EB, Bahl M, Oluyemi ET, Jeudy J, Yi PH. Appropriateness of Breast Cancer Prevention and Screening Recommendations Provided by ChatGPT. Radiology 2023;307(4):e230424.
  • 18. Darzidehkalani E. ChatGPT in Medical Publications. Radiology 2023;307(5):e231188.
  • 19. Gunn AJ. Commentary: The Emerging Role of Artificial Intelligence for Patient Education. J Vasc Interv Radiol 2023;34(10):1769-1770.
  • 20. Sacoransky E, Kwan BYM, Soboleski D. ChatGPT and assistive AI in structured radiology reporting: A systematic review. Curr Probl Diagn Radiol 2024;53(6):728-737.
  • 21. Bhayana R, Krishna S, Bleakney RR. Performance of ChatGPT on a radiology board-style examination: insights into current strengths and limitations. Radiology 2023;307(5):e230582.
  • 22. Massey PA, Montgomery C, Zhang AS. Comparison of ChatGPT-3.5, ChatGPT-4, and Orthopaedic Resident Performance on Orthopaedic Assessment Examinations. J Am Acad Orthop Surg 2023;31(23):1173-1179.
  • 23. Rao A, Kim J, Kamineni M, Pang M, Lie W, Dreyer KJ, Succi MD. Evaluating GPT as an adjunct for radiologic decision making: GPT-4 versus GPT-3.5 in a breast imaging pilot. J Am Coll Radiol 2023;20(10):990-997.
  • 24. Doshi R, Amin K, Khosla P, Bajaj S, Chheang S, Forman HP. Utilizing Large Language Models to Simplify Radiology Reports: a comparative analysis of ChatGPT3.5, ChatGPT4.0, Google Bard, and Microsoft Bing. Published online June 5, 2023:2023.06.04.23290786.
  • 25. Hirosawa T, Harada Y, Yokose M, Sakamoto T, Kawamura R, Shimizu T. Diagnostic Accuracy of Differential-Diagnosis Lists Generated by Generative Pretrained Transformer 3 Chatbot for Clinical Vignettes with Common Chief Complaints: A Pilot Study. Int J Environ Res Public Health 2023;20(4):3378.

Radyoloji ve Histopatoloji Raporlamasında Büyük Dil Modellerinin Performans Analizi: Tanı Desteğinden Hasta İletişimine


Abstract (translated from Turkish)

Objectives: The aim of this study was to evaluate the effectiveness of two different versions of Chat-GPT, one of the large language models, in the diagnosis and interpretation of cone beam computed tomography (CBCT) and histopathology reports.
Materials and Methods: In this study, Chat-GPT 3.5 and Chat-GPT 4 were tasked with generating preliminary and differential diagnoses based on ten CBCT reports and ten histopathology reports. Both versions were also asked to simplify these reports to a level that patients could understand. Oral and maxillofacial radiologists and oral pathologists with varying levels of expertise evaluated Chat-GPT's responses, and the performance of Chat-GPT 3.5 and Chat-GPT 4 on these tasks was compared based on these expert assessments.
Results: The comparison of diagnostic performance on radiology reports showed that Chat-GPT 4 was statistically significantly superior to Chat-GPT 3.5 (p < 0.001). However, no significant difference was observed between the two models in report simplification scores (p > 0.05). For histopathology reports, by contrast, Chat-GPT 4 performed significantly better than Chat-GPT 3.5 in both diagnostic accuracy and report simplification (p < 0.05).
Conclusions: The results showed that Chat-GPT 4 outperformed the older version in the interpretation and evaluation of CBCT reports. The strong performance of this latest version indicates that large language models may become valuable tools in radiology and histopathology reporting processes, and in many other fields, as technological advances continue to improve their capabilities.

There are 25 citations in total.

Details

Primary Language English
Subjects Oral and Maxillofacial Radiology, Oral Medicine and Pathology
Journal Section Original Research Articles
Authors

Sümeyye Çelik 0009-0003-0676-5098

Alican Kuran 0000-0001-9677-8690

Oğuz Baysal 0009-0000-0360-0050

Umut Seki 0000-0002-0286-9792

Merva Soluk Tekkeşin 0000-0002-7178-3335

Enver Alper Sinanoğlu 0000-0002-8349-3239

Project Number 2024/112
Publication Date June 30, 2025
Submission Date November 19, 2024
Acceptance Date April 27, 2025
Published in Issue Year 2025, Volume: 28 Issue: 2

Cite

EndNote Çelik S, Kuran A, Baysal O, Seki U, Soluk Tekkeşin M, Sinanoğlu EA (June 1, 2025) Performance Analysis of Large Language Models in Radiology and Histopathology Reporting: From Diagnostic Support to Patient Communication. Cumhuriyet Dental Journal 28 2 141–156.

Cumhuriyet Dental Journal (Cumhuriyet Dent J, CDJ) is the official publication of Cumhuriyet University Faculty of Dentistry. CDJ is an international journal dedicated to the latest advances in dentistry. The aim of the journal is to provide a platform for scientists and academics all over the world to promote, share, and discuss new issues and developments in different areas of dentistry. The first issue of the Journal of Cumhuriyet University Faculty of Dentistry was published in 1998. In 2010, the journal's name was changed to Cumhuriyet Dental Journal. The journal's publication language is English.

CDJ accepts articles in English. Submitting a paper to CDJ is free of charge, and CDJ does not levy article processing charges.

Frequency: Four times a year (March, June, September, and December)
