Wolters Kluwer Releases Validation Framework for Evaluating Clinical AI at Point of Care
The framework goes beyond binary review of answers to assess context, uncertainty, and clinical impact; designed to support governance committees
How can hospital governance committees evaluate answers provided by clinical AI?
As generative AI becomes more prominent in clinical workflows, traditional evaluation methods such as benchmarks, test questions, or user ratings fall short because they do not capture whether an answer aligns with clinical intent, whether it omits critical information, or whether it behaves appropriately under uncertainty. By contrast, UpToDate’s approach addresses these gaps through a multi-method framework that evaluates the answers clinicians receive and interpret when making care decisions.
The approach, detailed in the new report, A Measured Approach to Evaluating Clinical AI at the Point of Care, evaluates AI performance across three core dimensions: clinical intent, knowledge integrity, and clinical impact. Together, these dimensions provide hospital governance committees with a meaningful assessment of clinical reliability that goes beyond generic measurement benchmarks.
“Assessing the reliability and clinical validity of AI is complex and inadequately captured by current benchmarks,” said
Clinical AI stress tested by physicians and AI experts; continuously monitored for reliability
The evaluation framework applied to UpToDate Expert AI combines automated testing with structured human review by leading physician editors and clinical AI experts. It includes rubric-based assessment, stress testing, “red teaming” (where professionals work to “break” a system), and ongoing monitoring to detect omissions, unsupported claims, loss of context, and other output anomalies that generic evaluations may miss.
In the most recent evaluation using this model, UpToDate Expert AI was tested on 1,669 clinical queries comprising more than 15,000 criteria. The results showed that UpToDate Expert AI provided clinically aligned information for 99.9% of assessed criteria. UpToDate Expert AI also had a significantly lower rate of omissions when compared with two general-purpose LLM comparators; both comparators had a rate of omission that was 15% higher than UpToDate Expert AI.
What are the key elements of a governance-ready clinical AI validation approach?
- Clinically meaningful evaluation: Solution should be tested against point-of-care criteria written by top physician experts in their field, rather than relying solely on generic benchmarks.
- Performance where it matters: The evaluation approach emphasizes clinical intent by measuring whether the response itself is clinically relevant and includes the information that matters most at the point of care.
- Grounded answers: Responses are evaluated for knowledge integrity and traceability to trusted databases and clinical content, e.g., UpToDate.
- Risk-aware by design: “Red teaming,” expert review by medical specialists, bias testing, and regression monitoring are used to identify and reduce potential output issues. The most recent evaluation of UpToDate Expert AI included 200 hours of adversarial red-team testing.
- Clinical reasoning embedded: Solution should be designed to help clinicians think through next steps using evidence, assumptions, and expert judgment – and not simply return an AI-generated answer. As concerns grow about de-skilling, in which overreliance on AI tools reduces clinicians’ ability to exercise independent clinical judgment, preserving a transparent view of every step in the reasoning process becomes vital.
This system-level approach reflects growing expectations from health systems, regulators, and clinicians for transparency, accountability, and governance in clinical AI.
UpToDate Expert AI is enabled by Wolters Kluwer Expert AI. Solutions built on Wolters Kluwer Expert AI use deep domain expertise and a decade of AI experience, empowering customers to work faster and make smarter decisions based on trusted and verified content.
As of
For more information about
About
For more information, visit www.wolterskluwer.com, and follow us on LinkedIn, Facebook, YouTube, and Instagram.
View source version on businesswire.com: https://www.businesswire.com/news/home/20260521660539/en/
Media Contact
Associate Director,
+1 781-255-5843
suzanne.moran@wolterskluwer.com
Source: