Thesis
Author: Simon Hartl
Published: 2025
Supervisor: Markus Bödenler
Cohort: EHT22
Master Thesis
EVALUATION OF OPEN-SOURCE LANGUAGE MODELS FOR TEXT EXTRACTION IN PATIENT INFORMATION LEAFLETS
Abstract: The rapid advancements in Natural Language Processing (NLP) have enabled significant improvements in extracting and summarizing medical texts, which is crucial for enhancing patient comprehension of medication package inserts. However, many state-of-the-art NLP solutions remain proprietary, raising concerns about accessibility, privacy, and long-term dependency on commercial providers. This thesis investigates the feasibility of open-source NLP models as an alternative for extracting and interpreting drug-related information within the MediScan framework. The primary objective is to assess whether these models can achieve comparable performance to proprietary solutions while maintaining reliability for medical applications. To address this, a comparative evaluation was conducted between GPT-4 and two open-source models, Llama 2 (7B) and Llama 2 (13B). The models were tested on their ability to extract key medical variables. The assessment combined Semantic Textual Similarity scoring for quantitative analysis with manual validation for qualitative insights. The results show that while GPT-4 consistently outperforms the open-source alternatives, Llama 2 (13B) exhibits promising efficiency given its significantly smaller parameter count. However, both open-source models struggle with processing semantically complex medical information, particularly in the administration method and indication categories, highlighting challenges in domain-specific understanding. Despite advantages such as customizability, independence from cloud-based services, and potential cost reductions, current open-source models do not yet meet the accuracy and reliability standards required for high-stakes medical applications. In digital healthcare and medication guidance, where precision, consistency, and regulatory compliance are essential, proprietary solutions like GPT-4 remain the preferred choice.
Future research should explore domain-specific fine-tuning and optimization techniques to enhance the robustness of open-source models for critical medical applications.
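The abstract mentions Semantic Textual Similarity (STS) scoring as the quantitative metric but does not specify its implementation. A minimal sketch of one common approach, assuming sentence embeddings have already been produced by some sentence encoder (the encoder itself and the mapping to a 0–5 STS-style scale are assumptions, not details from the thesis), could look like:

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def sts_score(reference_embedding, candidate_embedding):
    """Map cosine similarity from [-1, 1] onto a 0-5 STS-style scale.

    reference_embedding: embedding of the ground-truth leaflet text
    candidate_embedding: embedding of the model-extracted text
    (Both are hypothetical inputs; the thesis's actual scoring
    pipeline may differ.)
    """
    sim = cosine_similarity(reference_embedding, candidate_embedding)
    return 5 * (sim + 1) / 2
```

With this scaling, identical embeddings score 5.0 and orthogonal embeddings score 2.5; in practice the embeddings would come from a pretrained sentence encoder applied to the reference and extracted texts.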
