Thesis

Author: Joachim Waltl

Published: 2025

Supervisor: Markus Bödenler

Cohort: EHT23

Master Thesis

LETHE-CHAT: SCHEMA-AWARE LLM AGENTS FOR TEXT-TO-SQL ON CLINICAL TRIAL DATABASES

Abstract: Large Language Models (LLMs) have advanced rapidly and demonstrated remarkable capabilities in natural language processing, but their safe use with privacy-sensitive clinical data remains underexplored. In the LETHE project, which investigates preventive dementia interventions, longitudinal health and risk-factor data are stored in relational databases that require SQL proficiency for access and insight generation. Because healthcare professionals typically lack this technical expertise, a considerable knowledge-access gap emerges. This thesis addresses that gap by developing LETHE-Chat, a conversational agent that can be operated on-premise and enables clinicians to query trial data in natural language. Two alternative system architectures were implemented and evaluated; both translate questions into schema-aware SQL, execute it, and return the query results as easily interpretable answers. The first is a dynamic reasoning and acting (ReAct) agent with tool use, intermediate traces, and conversational memory for follow-up questions. The second is a deterministic workflow graph that follows a predefined sequence of schema retrieval, SQL generation, and correction steps. Both systems were built with LangGraph as the agent orchestration framework and SQLAlchemy as a flexible Python SQL toolkit, and are powered by locally deployable open-source models served via Ollama. For the ReAct agent, a lightweight Streamlit chat interface was created that lets clinicians pose questions, review SQL execution traces, ask conversational follow-ups, and interact with results through sorting, filtering, and download options. Evaluation on a manually curated benchmark of 65 queries demonstrated execution accuracies of 83.3%, with qualitative error analysis highlighting complementary strengths and residual risks. Transparency, read-only execution, and private deployment promote explainability, informed by key principles of AI governance as declared in the EU AI Act.
The prototype illustrates the feasibility and potential of privacy-preserving, clinically usable LLM agents, specifically for improving data access and ad-hoc analysis. It also establishes a foundation for future research on hybrid designs, robustness, and safety measures.
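
The deterministic workflow named in the abstract (schema retrieval, SQL generation, correction) combined with read-only execution can be illustrated with a minimal sketch. It is not the thesis implementation: it uses Python's standard `sqlite3` module in place of LangGraph, SQLAlchemy, and Ollama, and `generate_sql` is a hypothetical stand-in for the model call that returns canned queries. The `visits` table and its columns are invented for the example and do not reflect the LETHE schema.

```python
import sqlite3


def get_schema(conn):
    """Schema retrieval: collect CREATE statements as context for the generator."""
    rows = conn.execute(
        "SELECT sql FROM sqlite_master WHERE type = 'table' AND sql IS NOT NULL"
    ).fetchall()
    return "\n".join(row[0] for row in rows)


def is_read_only(sql):
    """Conservative guard (an assumption): accept one single SELECT statement only."""
    stripped = sql.strip().rstrip(";").strip()
    return stripped.lower().startswith("select") and ";" not in stripped


def generate_sql(question, schema, error=None):
    """Hypothetical stand-in for the LLM call; a real system would prompt a model
    with the question, the schema, and the previous error message."""
    if error is None:
        return "SELECT AVG(score) FROM visit"  # deliberately wrong table name
    return "SELECT AVG(score) FROM visits"  # "corrected" query after feedback


def answer(conn, question, max_attempts=3):
    """Fixed sequence: retrieve schema, generate SQL, execute, correct on failure."""
    schema = get_schema(conn)
    error = None
    for _ in range(max_attempts):
        sql = generate_sql(question, schema, error)
        if not is_read_only(sql):
            error = "only single SELECT statements are allowed"
            continue
        try:
            return sql, conn.execute(sql).fetchall()
        except sqlite3.Error as exc:
            error = str(exc)  # fed back into the next generation attempt
    raise RuntimeError(f"no valid query after {max_attempts} attempts: {error}")


def demo():
    """Build a toy in-memory database (invented data) and run one question."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE visits (participant_id INTEGER, score REAL)")
    conn.executemany("INSERT INTO visits VALUES (?, ?)", [(1, 28.0), (2, 24.0)])
    return answer(conn, "What is the average score?")
```

In this sketch the first generated query fails because the table name is wrong, the database error is passed back to the generator, and the second attempt succeeds. The string-based guard is only illustrative; a database role without write privileges would be a more robust way to enforce read-only access.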

Full text: Download