Using a Local LLM to Extract SUVmax and Anatomical Terms from Radiology Reports of FDG-PET/CT (2025)


Meeting Report

PIDS: Data Sciences & Imaging Informatics

Kenji Hirata, Akie Katsuki, Masatoyo Nakajo, Shiro Watanabe, Junki Takenaka, Naoto Wakabayashi, Takaaki Yoshimura, Minghui Tang, Kazutaka Minami, Nozomu Uetake and Kohsuke Kudo

Journal of Nuclear Medicine June 2025, 66 (supplement 1) 251030;


Abstract

251030

Introduction: SUVmax is the de facto standard for representing uptake intensity in FDG-PET/CT reports. We previously demonstrated that SUVmax values documented in FDG-PET/CT reports are useful for identifying lesion locations within the images (Hirata et al. Front Med 2021). Building on this finding, we aim to develop systems for treatment response evaluation and automated report generation. Although SUVmax values are typically written as numerical strings with decimal points, such as "3.14," rule-based approaches such as regular expressions often fail because of exceptions. Since the emergence of ChatGPT in 2022, large language models (LLMs) have become valuable tools for analyzing medical texts such as radiology reports. However, cloud-based systems such as ChatGPT and Gemini are not permitted to process reports containing sensitive information directly. To address this, we ran an open-source LLM locally and used it to extract and structure "location and SUVmax" information from FDG-PET/CT reports.
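The abstract does not say which exceptions defeat rule-based extraction, so the following is only a hypothetical illustration in Python. The two report sentences are invented for this sketch and are not taken from the study data. The naive rule treats every number containing a decimal point as an SUVmax.

```python
import re

# Hypothetical report sentences, invented for illustration only.
reports = [
    "A 2.5 cm nodule in the right upper lobe shows FDG uptake with SUVmax 3.14.",
    "The hepatic lesion shows uptake (SUVmax = 7).",
]

# Naive rule: any number with a decimal point is assumed to be an SUVmax.
DECIMAL = re.compile(r"\d+\.\d+")


def naive_extract(text: str) -> list[str]:
    """Return every decimal number found in the text."""
    return DECIMAL.findall(text)


for text in reports:
    print(naive_extract(text))
```

In the first sentence the rule also returns the lesion size "2.5", and in the second it misses the SUVmax written without a decimal point. These are two plausible kinds of exception, not a list taken from the study.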

Methods: The Institutional Review Board approved this retrospective study (#23-0128). We reviewed 949 patients who underwent FDG-PET/CT examinations at our institute from the beginning of 2017. All reports were written in Japanese by certified nuclear medicine specialists. The LLM used was "Llama-3-ELYZA-JP-8B." Each report was input into the LLM with a prompt instructing it to generate JSON-format text such as {site: "pancreas", SUVmax: "3.141"}. To mitigate hallucinations, additional instructions told the LLM not to output an answer if the SUVmax was unclear or unavailable. For organs such as the lung and liver, the LLM was instructed to include laterality and the specific lobe in the output. The ground truth was determined by an experienced nuclear medicine physician for all cases. The accuracy of the LLM was evaluated using the Dice similarity coefficient (DSC).
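The evaluation step can be sketched as follows. This is not the authors' code, and the abstract does not state how the DSC was computed. The sketch assumes the model returns strict JSON with quoted keys (the example above uses unquoted keys, which a strict parser rejects) and that the DSC is taken per patient over the multiset of SUVmax strings. The function names `parse_llm_output` and `dice` are hypothetical.

```python
import json
from collections import Counter


def parse_llm_output(raw: str) -> list[dict]:
    """Parse the model's reply; return [] if it is not usable JSON."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return []
    if isinstance(data, dict):  # a single finding instead of a list
        data = [data]
    if not isinstance(data, list):
        return []
    findings = []
    for item in data:
        # Keep only entries that carry both expected keys.
        if isinstance(item, dict) and "site" in item and "SUVmax" in item:
            findings.append({"site": str(item["site"]), "SUVmax": str(item["SUVmax"])})
    return findings


def dice(predicted: list[str], truth: list[str]) -> float:
    """Dice similarity coefficient over two multisets of SUVmax strings."""
    p, t = Counter(predicted), Counter(truth)
    overlap = sum((p & t).values())  # values present in both, with multiplicity
    total = sum(p.values()) + sum(t.values())
    return 1.0 if total == 0 else 2 * overlap / total


# Hypothetical reply and ground truth for one patient.
raw = '[{"site": "pancreas", "SUVmax": "3.141"}, {"site": "left lung, upper lobe", "SUVmax": "5.2"}]'
predicted = [f["SUVmax"] for f in parse_llm_output(raw)]
print(dice(predicted, ["3.141"]))
```

Here the second finding has no counterpart in the ground truth, so the overlap is 1 out of 3 values in total and the DSC is 2/3. Comparing SUVmax as strings is a design choice of this sketch. It treats "3.1" and "3.10" as different values, which a numeric comparison would not.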

Results: Among the 949 cases reviewed, 591 reports (62%) contained at least one SUVmax description. In total, 1,135 SUVmax values were documented, comprising 614 single-digit, 25 double-digit, and 496 triple-digit values. Applying the criterion of SUVmax > 5 or triple-digit values, 842 lesions (74%) met the specified conditions. With respect to laterality, 354 lesions involved the left side and 411 involved the right side. Among major anatomical regions, the thorax accounted for the largest proportion with 276 lesions, followed by the abdomen with 199 lesions and the head with 97 lesions. At the organ level, the most frequently identified sites were the lung (208 lesions), bone (198 lesions), and pharynx (54 lesions), among others. The ground truth dataset included 1,135 SUVmax values, whereas the LLM output contained 1,353 values, a 19% increase over the ground truth, suggesting the presence of hallucinated, non-existent SUVmax values. The patient-based DSC was 0.792. In 479 of the 591 cases (81%), the sensitivity was 100%, indicating that the LLM identified all SUVmax values present in the ground truth. Perfect matches were achieved in 407 of the 591 cases (69%). The overall sensitivity was 83.6%.
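The gap between the 81% of cases with full sensitivity and the 69% with perfect matches is what extra, hallucinated values would produce. The sketch below is an assumption about how the two per-case measures could be defined, not the authors' implementation, and the example case is hypothetical. The last lines recompute the reported percentages from the counts given above.

```python
from collections import Counter


def case_sensitivity(predicted: list[str], truth: list[str]) -> float:
    """Fraction of ground-truth SUVmax values found in the output (truth must be non-empty)."""
    p, t = Counter(predicted), Counter(truth)
    return sum((p & t).values()) / sum(t.values())


def is_perfect_match(predicted: list[str], truth: list[str]) -> bool:
    """True only if the output has exactly the ground-truth values, with no extras."""
    return Counter(predicted) == Counter(truth)


# Hypothetical case: the true value is found, plus one value that is not in the report.
predicted, truth = ["3.1", "9.9"], ["3.1"]
print(case_sensitivity(predicted, truth), is_perfect_match(predicted, truth))

# Recompute the reported percentages from the counts in the abstract.
print(round(100 * (1353 / 1135 - 1)))  # excess of LLM output over ground truth
print(round(100 * 479 / 591))          # cases with 100% sensitivity
print(round(100 * 407 / 591))          # cases with a perfect match
```

The hypothetical case has 100% sensitivity but is not a perfect match, so it would count toward the 479 cases and not toward the 407.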

Conclusions: While the LLM demonstrated a tendency to output non-existent SUVmax values due to hallucinations, it achieved high sensitivity. Unlike cloud-based systems such as ChatGPT, the local LLM can be operated securely, making it a viable tool for efficiently extracting SUVmax values. Further improvements in accuracy are likely to require refinements in the prompts.


Journal of Nuclear Medicine

Vol. 66, Issue supplement 1

June 1, 2025
