When FDA Says Yes and NICE Says Not Yet: The Same Trial, Two Different Questions
In July 2023, the FDA granted full approval to lecanemab (Leqembi) for early Alzheimer's disease — the first therapy to demonstrate disease-modifying benefit at that level of evidence. Fourteen months later, the National Institute for Health and Care Excellence (NICE) declined to recommend it for the NHS. At a third appraisal committee meeting in May 2025, NICE rejected both lecanemab and donanemab. As of this writing, eligible patients in England and Wales still cannot access either drug through the NHS, and the story has since added another chapter.
Both decisions were based on the same underlying trial.
This is not a story about one agency being right and the other wrong. It is a story about two different questions being asked of the same evidence, and what happens when the data are only strong enough to answer one of them.
One trial, two verdicts
The Clarity AD trial enrolled 1,800 patients with early-stage Alzheimer's disease, randomised to lecanemab or placebo over 18 months. The primary endpoint was the CDR-SB, a composite measure of cognitive and functional decline. Lecanemab slowed decline by 27% relative to placebo: a treatment difference of 0.45 points (95% CI: 0.27 to 0.63). The result was statistically significant.
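These headline figures can be cross-checked in a few lines. The placebo-arm decline of 1.66 CDR-SB points used below is the published Clarity AD value (it is not stated above); the standard error is backed out of the reported interval:

```python
# Reported Clarity AD results (CDR-SB change over 18 months)
diff = 0.45             # treatment difference vs placebo
lo, hi = 0.27, 0.63     # 95% confidence interval
placebo_decline = 1.66  # published placebo-arm decline (not stated in the text)

relative_slowing = diff / placebo_decline  # ~0.27, the quoted 27%
se = (hi - lo) / (2 * 1.96)                # ~0.092 CDR-SB points
z = diff / se                              # ~4.9, comfortably significant
```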
The FDA's conclusion: clinical benefit demonstrated. Approval granted.
NICE's conclusion: a four-to-six month delay in decline, combined with the costs of administration and monitoring, "was not considered a cost-effective use of NHS funding." More evidence was needed to generate robust cost-effectiveness estimates. The drug was not recommended.
ICER, the Institute for Clinical and Economic Review, an independent US HTA body, was more explicit about the numbers: at a list price of $26,500 per year, lecanemab would be cost-effective only if priced between $8,900 and $21,500 per year. Its independent appraisal committee voted 12-3 that the evidence was insufficient to demonstrate net health benefit.
The divergence is not about whether lecanemab works. There is broad agreement that it slows cognitive decline. The divergence is about whether the evidence is strong enough to support the long-term extrapolations that cost-effectiveness analysis requires.
What the confidence interval is actually saying
The Clarity AD confidence interval excludes zero. By conventional standards, the trial succeeds.
But the width of that interval carries information that a binary significant/non-significant decision discards.
HTA models do not evaluate drugs at 18 months. They evaluate them over patient lifetimes, often 10 to 30 years in Alzheimer's disease. The trial result must be extrapolated across that horizon using assumptions about how the treatment effect persists, wanes, or accumulates. Each assumption introduces uncertainty. Each uncertainty propagates through the probabilistic sensitivity analysis (PSA).
A wide confidence interval from a short trial translates directly into higher PSA variance. High PSA variance means the ICER estimate spans a wide range: in this case, a 2.4x spread between lower and upper thresholds. Whether lecanemab is cost-effective depends critically on which end of that range you consider plausible.
NICE's problem is not that lecanemab is ineffective. It is that the trial cannot anchor the long-term assumptions the model requires. The evidence is too short, and the interval too wide, to distinguish between a modest transient benefit and a durable modification of disease progression, and those scenarios have fundamentally different cost-effectiveness profiles.
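A toy simulation makes that propagation concrete. Everything economic here is invented for illustration (the QALY mapping, the lifetime cost figure, the two persistence scenarios); only the effect estimate and its standard error come from the trial:

```python
import random
from statistics import median

random.seed(0)

# Trial evidence: 18-month CDR-SB difference and its standard error
effect, se = 0.45, 0.092

# Invented economic inputs, for illustration only
qaly_per_point = 0.4     # lifetime QALYs gained per CDR-SB point of slowing
lifetime_cost = 120_000  # drug + administration + monitoring, in dollars

# PSA: sample the treatment effect, then extrapolate under two scenarios
draws = [random.gauss(effect, se) for _ in range(10_000)]
scenarios = {
    "durable": [d * 3.0 * qaly_per_point for d in draws],  # effect accumulates
    "waning":  [d * 1.0 * qaly_per_point for d in draws],  # effect fades
}

for name, qalys in scenarios.items():
    icers = sorted(lifetime_cost / q for q in qalys)
    lo, hi = icers[250], icers[9750]  # approximate 95% range
    print(f"{name}: ICER roughly ${lo:,.0f} to ${hi:,.0f} per QALY")
```

Same trial evidence, same code: the ICER range within each scenario reflects the width of the confidence interval, and the gap between the scenarios reflects the extrapolation question the trial cannot answer.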
What the CPI reveals, and what it doesn't
The Reverse-Bayes framework makes one part of this problem precise. (For a full introduction to the Analysis of Credibility and how the CPI, the critical prior interval, is computed, see Bayes through the Looking-Glass.)
Applying the Analysis of Credibility to the Clarity AD result — a CDR-SB difference of 0.45 (95% CI: 0.27 to 0.63) — yields a scepticism limit of 0.08 CDR-SB points. To dismiss this result, a sceptic would need to believe, before seeing the data, that any treatment benefit exceeding 0.08 points on the CDR-SB was implausible. That is roughly 17% of the observed effect. It is an extraordinarily demanding prior to hold.
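The 0.08 figure follows from the closed-form scepticism limit in Matthews' Analysis of Credibility, which for a significant positive result with 95% CI (L, U) is (U − L)² / (4√(LU)):

```python
import math

def scepticism_limit(lo: float, hi: float) -> float:
    """Scepticism limit for a statistically significant result with
    95% CI (lo, hi), both bounds on the same side of zero
    (closed form from Matthews' Analysis of Credibility)."""
    return (hi - lo) ** 2 / (4 * math.sqrt(lo * hi))

sl = scepticism_limit(0.27, 0.63)
print(f"Scepticism limit: {sl:.2f} CDR-SB points")            # 0.08
print(f"As a share of the observed effect: {sl / 0.45:.0%}")  # 17%
```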
In other words: the CPI supports the FDA's decision to approve. The 18-month result from Clarity AD is credible. The evidence clears the AnCred bar.
But here is what the CPI also reveals: the question NICE is asking is not the same question.
NICE is not asking whether the 18-month result is credible. It is asking whether that 18-month result can anchor a model that projects benefit across a patient lifetime. The scepticism limit addresses the former. The latter requires an additional inferential leap — from short-term clinical endpoint to long-term disease modification — that the CPI cannot adjudicate and the trial was not designed to resolve.
The CPI makes the distinction visible. A tight scepticism limit tells you the data are strong enough to believe the 18-month effect is real. It does not tell you the 18-month effect is large enough, or durable enough, to generate cost-effective lifetime benefit at $26,500 per year. Those are separate questions, and the CPI correctly declines to conflate them.
This is not a limitation of the framework. It is the framework working as intended: making explicit what the evidence can and cannot establish, rather than allowing the inference to stretch further than the data support.
The structural problem
The FDA asks: does the trial demonstrate clinical benefit?
An 18-month randomised trial with a statistically significant and credible result answers that question.
HTA bodies ask: is the drug cost-effective over a patient lifetime?
That requires extrapolation, uncertainty quantification, and a model capable of distinguishing between scenarios the trial was never designed to resolve.
These are different evidentiary standards, and they are diverging in practice.
The FDA's January 2026 Bayesian guidance reinforces a single-trial paradigm for primary inference in pivotal studies. This is sensible for regulatory decision-making: it enables faster access and reduces unnecessary patient burden. But it also means that more approvals will rest on evidence optimised for the regulatory question and underpowered for the HTA one.
Lecanemab is not an outlier. It is a preview.
Across neurodegenerative disease, rare disease, and oncology, the pipeline is full of therapies where regulatory and HTA standards are structurally misaligned. Each will produce its own version of approval without access.
What this means for trial design
The standard power calculation asks: what sample size is needed to detect a clinically meaningful effect with 80-90% power at α = 0.05?
That question is optimised for regulatory approval.
It is not optimised for the combined evidentiary burden — regulatory plus HTA — that determines whether a drug is actually used.
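As a sketch of that standard calculation: the per-arm sample size for a difference in means is n = 2((z₁₋α/₂ + z₁₋β)·σ/δ)². The σ of roughly 1.95 below is not a published figure; it is backed out of the reported Clarity AD interval assuming about 900 analysed patients per arm, so treat it as illustrative:

```python
import math
from statistics import NormalDist

def n_per_arm(delta: float, sigma: float,
              alpha: float = 0.05, power: float = 0.9) -> int:
    """Two-arm sample size for a difference in means (normal approximation)."""
    z_a = NormalDist().inv_cdf(1 - alpha / 2)
    z_b = NormalDist().inv_cdf(power)
    return math.ceil(2 * ((z_a + z_b) * sigma / delta) ** 2)

# sigma ~1.95 backed out from the reported CI, assuming ~900 per arm
print(n_per_arm(delta=0.45, sigma=1.95))  # 395 per arm at 90% power
```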
A CPI-informed design asks a different question: given the expected confidence interval from this trial, what scepticism limit will it produce? And will that limit require beliefs about long-term benefit that are not supported by prior evidence?
That is necessary, but not sufficient.
A tight SL tells you the short-term effect is credible. But the HTA question imposes an additional requirement: the CI must be narrow enough, and the trial long enough, that the result can anchor a lifetime extrapolation without the cost-effectiveness estimate collapsing under PSA variance.
For Clarity AD, the SL was tight; credibility was not the problem. The problem was duration and extrapolation uncertainty. A CPI-informed design would have flagged this distinction at the outset: not "will this trial produce a credible result?" but "will this trial produce a result credible enough, and precise enough, to support the downstream HTA question?"
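A design-stage version of that question can be sketched directly: project the confidence interval a planned trial would produce if it observed the anticipated effect, then compute the scepticism limit that interval would yield. The function below is a hypothetical illustration, not an established design tool, and the effect size of 0.45 and σ of 1.95 are illustrative Clarity AD-scale values rather than inputs from any actual protocol:

```python
import math
from statistics import NormalDist

def projected_sl(delta: float, sigma: float, n_per_arm: int) -> float:
    """Project the scepticism limit a trial would yield if it observed
    exactly `delta`, given sigma and the per-arm sample size."""
    z = NormalDist().inv_cdf(0.975)
    half_width = z * sigma * math.sqrt(2 / n_per_arm)
    lo, hi = delta - half_width, delta + half_width
    if lo <= 0:
        return float("inf")  # result would not even be significant
    return (hi - lo) ** 2 / (4 * math.sqrt(lo * hi))

for n in (300, 600, 900, 1800):
    print(f"n = {n:4d} per arm: projected SL {projected_sl(0.45, 1.95, n):.3f}")
```

Note what this does and does not buy: a larger n tightens the projected scepticism limit, but the 18-month horizon is unchanged, so the duration and extrapolation question survives any sample size.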
If the answer is no, the sponsor faces the lecanemab problem: approval followed by an HTA process that effectively re-litigates the evidence from scratch.
This is not an argument for delaying access or requiring HTA-grade evidence upfront. It is an argument for making the trade-off explicit at the design stage, rather than discovering it during an ICER review or a third NICE committee meeting.
Looking forward
As of April 2026, lecanemab is approved in the US, Japan, China, South Korea, and the UK. The third NICE committee meeting was held on May 14, 2025. Final draft guidance followed on June 19, 2025, rejecting both lecanemab and donanemab on the same grounds: benefits too modest, long-term evidence too thin, cost per QALY too high.
Both manufacturers appealed. In January 2026, the appeal panels upheld several grounds for both drugs. For lecanemab, the panel found that Eisai had insufficient time to respond to NHS England's infusion cost estimates before the third meeting, and that the committee's assessment of carer utility values was unreasonable: the EQ-5D instrument used to measure them substantially underestimates the burden on carers. For donanemab, the panel upheld equivalent findings, and additionally found that the committee failed to consider unpaid care costs through a non-reference case scenario analysis.
The appraisals have been remitted for reconsideration on these specific points. No dates for the next committee meetings have been confirmed.
The carer utility issue is not incidental. If EQ-5D systematically underestimates carer burden — and there is substantial evidence that it does for dementia caregivers — then the cost-effectiveness models used in all three prior meetings have been understating the societal value of slowing disease progression. That is precisely the kind of structural modeling problem that propagates silently through PSA and never surfaces in the headline ICER estimate. The patients waiting for the next decision are still waiting on the same underlying trial evidence. The question being asked of it keeps getting refined.
Many evidentiary problems appear during trial design, not after the analysis. I work with teams to review trial designs and run simulation studies to evaluate operating characteristics before protocols are finalized.
For consulting inquiries: maggie@zetyra.com
For more essays on statistical design and regulatory evidence, subscribe to the Evidence in the Wild newsletter.