Clinical Large Language Model Evaluation by Expert Review (CLEVER): Framework Development and Validation

The proliferation of both general-purpose and health care–specific large language models (LLMs) has made it harder to evaluate and compare them effectively. Data contamination undermines the validity of public benchmarks, self-preference distorts LLM-as-a-judge approaches, and there is a gap between the tasks used to test models and those performed in clinical practice.

Medigy Insights

Large language models (LLMs) can significantly improve how patients are matched to suitable clinical trials by enhancing retrieval accuracy and handling complex eligibility criteria at scale. This AI-driven approach could streamline trial recruitment, reduce manual screening effort, and help more patients access appropriate research opportunities.



