Blog

Delphyr Engineering

10 Techniques to Make LLMs Safer in Healthcare

In this blog series, Delphyr Engineering, we share practical insights from building AI systems for real clinical use. Last month, we explored how multiple layers of guardrails keep our medical AI reliable and trustworthy. In this post, our engineer Tim explains why large language models (LLMs) are inherently probabilistic systems and walks through 10 techniques to reduce risk and improve reliability when applying LLMs in healthcare.

Managing uncertainty in clinical AI


Large language models (LLMs) do not always produce exactly the same answer twice. This is because LLMs generate text based on probabilities rather than fixed rules. If you slightly rephrase a question, the wording, emphasis, or level of detail in the response may change. This contrasts with deterministic systems, where the same input always produces the same output.


In most contexts, that’s harmless. In healthcare, though, where even small differences can matter, the stakes are real. But here’s the thing: medicine itself isn’t fully predictable. Two doctors reviewing the same patient chart might highlight different details, interpret the results differently, or reach slightly different conclusions. That’s why clinical practice relies on layered safeguards such as protocols, peer reviews, and audit trails to manage uncertainty and keep patients safe.


A similar approach is emerging in AI. Rather than attempting to make LLMs deterministic, AI used in healthcare systems can be designed to introduce determinism where it matters, while managing variability elsewhere through layered safeguards. In this blog, we discuss 10 techniques that are commonly used or are being actively developed to make LLM-based systems safer and more reliable in clinical contexts.

  1. Output guards


AI-generated responses can be passed through multiple filtering layers during and after generation. These guard mechanisms are designed to detect and block unsafe, inappropriate, or out-of-scope content. In practice, such systems reduce the likelihood of problematic outputs, although no filtering approach can be assumed to be perfect.
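The layering described above can be sketched as a simple guard pipeline. This is a minimal illustration, not a production rule set: the banned phrases, the scope check, and the function names are all assumptions made for the demo.

```python
def blocklist_guard(text):
    """Block known-unsafe phrases; return a reason string, or None if clean."""
    banned = ("guaranteed cure", "stop your medication")
    for phrase in banned:
        if phrase in text.lower():
            return f"blocked phrase: {phrase!r}"
    return None

def scope_guard(text):
    """Block content outside the clinical-summary scope (illustrative check)."""
    if "investment advice" in text.lower():
        return "out of scope"
    return None

def apply_guards(text, guards):
    """Run a draft response through each guard layer in order.

    Returns (True, None) if all guards pass, else (False, reason).
    """
    for guard in guards:
        reason = guard(text)
        if reason is not None:
            return False, reason
    return True, None
```

In a real system the guards would typically be classifiers or policy models rather than keyword lists, but the structure, several independent layers each able to veto an output, stays the same.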

  2. Refusal and clarification strategies


When confidence is low or input is ambiguous, models can be guided to ask for clarification or decline to answer. This behavior is particularly useful in open-ended interactions, where the risk of incorrect assumptions is higher. Encouraging “don’t know” responses in uncertain situations can reduce overconfident errors.
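As a sketch of this gate: the confidence score would come from the model or a separate calibrator; here it is passed in directly, and the 0.7 threshold is an illustrative assumption.

```python
def respond(answer, confidence, threshold=0.7):
    """Return the answer only when confidence clears the threshold;
    otherwise fall back to a clarification request."""
    if confidence < threshold:
        return ("I'm not confident enough to answer this reliably. "
                "Could you clarify or rephrase the question?")
    return answer
```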

  3. Non-authoritative and critical response behavior


Models can be explicitly instructed to avoid presenting outputs as definitive or fully certain, especially when the data is limited or ambiguous. Instead of making absolute claims (e.g., “this treatment will definitely work”), responses should reflect uncertainty and the limits of available information.


Just as importantly, models should be guided to remain critical of source data, including information retrieved from the EHR. Rather than unthinkingly copying values, even in summarization tasks, the model should identify and flag inconsistencies or implausible entries. For example, if a patient record contains “heart rate 600 bpm,” the system should include a note such as: “A heart rate of 600 bpm is physiologically implausible and may indicate a data entry error.”


This type of behavior is not just theoretical. A recent study published in Nature Communications Medicine evaluated LLMs on 300 synthetic clinical cases containing deliberately inserted false details, such as fabricated conditions or nonexistent lab markers. Models frequently accepted these details as valid and built convincing clinical reasoning around them. For example, when presented with a fictitious condition, a model might respond:

“The patient’s chronic Lyme-induced cardiomyopathy likely explains their fatigue and arrhythmia. Management should include beta-blockers and continued antibiotic therapy to address the underlying Borrelia infection, along with cardiac monitoring.”


In this case, the model:

  • accepts a non-existent condition as real

  • constructs a full treatment plan around it

  • adds plausible-sounding detail, making the error harder to detect


Across models in the study, hallucination rates were high: around 66% on average, and up to 82% for some models. Prompting models to be more skeptical reduced this significantly, to roughly 44% on average, and in some cases more substantially, but errors remained common.


These findings highlight a key point: without explicit guidance, LLMs tend to trust and elaborate on flawed inputs rather than question them. Encouraging critical, non-authoritative responses is therefore an important safeguard, not to eliminate errors, but to reduce overconfident and potentially misleading outputs. This approach reinforces transparency and supports clinical judgment, extending the principles of traceability. 
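The plausibility flagging described above can also be pre-checked deterministically, so the model is handed the flags rather than being trusted to notice the problem. A minimal sketch, where the ranges are broad illustrative bounds, not clinical reference ranges:

```python
# Illustrative plausibility bounds; a real system would use a vetted table.
PLAUSIBLE_RANGES = {
    "heart_rate_bpm": (20, 300),
    "temperature_c": (30.0, 44.0),
}

def flag_implausible(record):
    """Return a note for every value outside its plausible range."""
    notes = []
    for field, value in record.items():
        lo, hi = PLAUSIBLE_RANGES.get(field, (float("-inf"), float("inf")))
        if not lo <= value <= hi:
            notes.append(
                f"{field}={value} is outside the plausible range "
                f"[{lo}, {hi}] and may indicate a data entry error"
            )
    return notes
```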

  4. Deterministic data extraction at ingest


Much of the structuring work can happen before any user interaction. At the point of data ingestion (e.g., when a patient record is uploaded), information can be extracted, normalized, and validated. This reduces the need for real-time interpretation during clinician interaction and helps ensure consistency across later queries.
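A small example of what "deterministic at ingest" can mean: parsing a free-text medication line into structured fields with fixed rules, so later queries read validated data instead of re-interpreting raw text. The pattern and unit list are simplified assumptions for the sketch.

```python
import re

# Illustrative pattern: "<drug> <dose> <unit> <frequency>".
MED_PATTERN = re.compile(
    r"^(?P<drug>[A-Za-z]+)\s+(?P<dose>\d+)\s*(?P<unit>mg|mcg)\s+(?P<freq>\w+)$"
)

def parse_medication(line):
    """Deterministically extract medication fields, or None if no match."""
    m = MED_PATTERN.match(line.strip())
    if not m:
        return None  # defer to human review rather than guessing
    return m.groupdict()
```

Lines that fail the rules are deferred rather than guessed at, which is the point: determinism where the data is well-formed, explicit escalation where it is not.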

  5. Structured outputs instead of free text


For critical data elements, systems can favor structured outputs over free-text generation. Instead of asking a model to “describe” a medication list, the system can require specific fields (drug, dosage, frequency), optionally with validation or confidence signals. When uncertainty is high, the system can defer to the clinician rather than attempting to guess.
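A minimal sketch of the structured-output-with-deferral pattern; the field names, the confidence signal, and the 0.9 threshold are assumptions for illustration.

```python
from dataclasses import dataclass

@dataclass
class MedicationEntry:
    """Structured fields the system requires instead of free text."""
    drug: str
    dosage: str
    frequency: str
    confidence: float  # per-entry confidence signal from the extraction step

def accept_or_defer(entry, threshold=0.9):
    """Accept high-confidence entries; defer low-confidence ones to the clinician."""
    if entry.confidence < threshold:
        return f"DEFER: please verify the {entry.drug} entry manually"
    return f"OK: {entry.drug} {entry.dosage} {entry.frequency}"
```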

  6. Source citation and traceability


To support transparency, outputs can include references to underlying data sources, such as patient data from the electronic health record (EHR), as well as relevant clinical guidelines. This allows clinicians to verify the source of information, much like they would review a patient chart or consult established medical guidance. Traceability helps users assess reliability rather than relying solely on the model’s wording.
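One concrete shape this can take: every generated statement carries references to the record entries it was derived from, and a check before display confirms each reference actually resolves. The identifier scheme and field names below are illustrative.

```python
def check_traceability(statements, record):
    """Verify that every cited source id resolves to an entry in the record.

    Each statement is a dict like:
        {"text": "HbA1c is 7.2%", "sources": ["ehr:labs/hba1c"]}
    Returns a list of problems (empty means fully traceable).
    """
    missing = []
    for s in statements:
        for ref in s["sources"]:
            if ref not in record:
                missing.append(f"{s['text']!r} cites unknown source {ref!r}")
    return missing
```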

  7. Citation verification


Beyond including references, more advanced approaches attempt to verify whether cited sources actually support the generated claims. This can involve matching not only source identifiers but also the cited text against the original content, helping detect cases where a reference is present but not fully aligned with the statement.
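As a sketch of the text-matching step: a simple token-overlap ratio stands in here for more robust entailment or semantic-similarity checks, and the 0.8 threshold is an assumption for the demo.

```python
def citation_supported(quote, source_text, min_overlap=0.8):
    """Check whether a cited snippet is actually present in the source text,
    using token overlap as a crude stand-in for entailment checking."""
    quote_tokens = set(quote.lower().split())
    source_tokens = set(source_text.lower().split())
    if not quote_tokens:
        return False
    overlap = len(quote_tokens & source_tokens) / len(quote_tokens)
    return overlap >= min_overlap
```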

  8. Hallucination detection


To detect hallucination, statistical techniques can be used to test whether a model’s output is sensitive to changes in the underlying data. For example, removing or perturbing specific inputs and observing the effect on the response may provide signals about whether the output is truly grounded.
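The perturbation idea can be sketched as follows. If removing a source field does not change the model's output at all, the output may not be grounded in that field. `generate` is a stand-in for a model call, not a real API.

```python
def grounding_signal(generate, context, field):
    """Return True if the output is sensitive to `field` (a grounding signal);
    False suggests the output may not be derived from that field at all."""
    full = generate(context)
    perturbed = dict(context)
    perturbed.pop(field, None)  # remove the field and regenerate
    return generate(perturbed) != full
```

Real implementations aggregate many such perturbations statistically rather than relying on a single comparison, but the underlying signal is the same.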

  9. Deterministic routing for simple queries


Not all queries require a language model. For straightforward requests, such as retrieving a lab value or listing current medications, systems can bypass the LLM entirely and rely on deterministic logic. This ensures consistent outputs for well-defined tasks and reduces unnecessary exposure to generative variability.
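A minimal router illustrating the bypass: well-defined requests hit fixed lookup logic, and everything else falls through to the generative path. The query patterns and record layout are assumptions for the sketch.

```python
import re

def route(query, record):
    """Route well-defined queries to deterministic handlers;
    return ("llm", None) for anything open-ended."""
    q = query.lower()
    if re.fullmatch(r"(list|show) current medications", q):
        return ("deterministic", record.get("medications", []))
    m = re.fullmatch(r"latest (\w+) value", q)
    if m:
        return ("deterministic", record.get("labs", {}).get(m.group(1)))
    return ("llm", None)  # open-ended: hand off to the generative path
```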

  10. Evaluation and regression testing


Reliability depends not only on design but also on continuous evaluation. Many systems are tested using predefined questions or “evals” to assess performance on specific tasks. Extending this approach to full scenario-based testing, in which multi-step clinical workflows are simulated, can be valuable. 
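A minimal regression-eval harness: fixed question/expectation pairs run against the system on every change, and any failure is reported. The cases and the `answer` function are placeholders; real evals would also cover multi-step scenarios.

```python
def run_evals(answer, cases):
    """Run (question, expected_substring) cases against the system.

    Returns a list of failure descriptions; an empty list means all passed.
    """
    failures = []
    for question, expected in cases:
        got = answer(question)
        if expected.lower() not in got.lower():
            failures.append(f"{question!r}: expected {expected!r}, got {got!r}")
    return failures
```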

Why variability in wording is often acceptable


Even with these safeguards, LLM-based systems may still produce slightly different wording or emphasis for the same input. In many cases, this is not inherently problematic:

  • Clinicians themselves vary in phrasing and emphasis

  • Underlying data can remain consistent even if explanations differ

  • Deterministic retrieval can ensure that critical facts (e.g., medication doses, lab values) remain stable


For natural language summaries, some variability may be acceptable, provided that outputs remain grounded, traceable, and verifiable.

What this means in practice


Taken together, these techniques illustrate a broader pattern. LLMs are not made safe by eliminating uncertainty, but by designing systems that manage and constrain it. Ideally, some components, such as data retrieval and structured outputs, are made deterministic while others, such as natural language explanations, remain probabilistic but are bounded by safeguards, validation, and transparency. This mirrors how safety is approached in clinical environments, where uncertainty is managed through layered systems rather than removed altogether.

The bottom line


Probabilistic AI models in healthcare settings introduce real challenges. However, they can still be used responsibly when embedded within systems that reduce risk and support verification. By combining deterministic components, structured data handling, guard mechanisms, and continuous evaluation, it becomes possible to build AI systems that are more transparent, more traceable, and better aligned with clinical workflows. Rather than making LLMs deterministic, the goal is to design systems in which their variability is controlled, monitored, and appropriately constrained.



Safe AI for healthcare

Designing safe healthcare AI is an ongoing process of iteration, evaluation, and refinement. If you're interested in how these techniques are applied in practice, or how such systems integrate into real clinical workflows, feel free to reach out or request a demo.

Delphyr

Helping healthcare professionals reclaim their time.

Contacts

Delphyr B.V.

IJsbaanpad 2

1076 CV Amsterdam

Netherlands

Follow us

2026 Delphyr. All rights reserved.
