What BATTLE Got Right That Most Adaptive Trials Get Wrong
Everyone says they want adaptive designs.
Almost no one actually runs them.
In decks, protocols, and FDA briefing books, “adaptive” has become a fashionable adjective—usually meaning a trial with one interim look, a conditional power calculation, and a long list of things you are not allowed to change. The learning benefit is promised, then quietly pre-specified out of existence.
The BATTLE trial was different. Not because it was Bayesian—that’s the easy part—but because it allowed the data to redirect patients in real time. That distinction is subtle, hard to execute, and still rare more than a decade later.
This piece is about what made that possible, why it worked in biomarker-driven oncology, and where this model breaks down.
The Real Interim Decision (Not the Hypothetical One)
At an interim analysis in a Phase II oncology trial, the uncomfortable question is rarely “Did we cross a boundary?”
It’s this:
Given what we know right now, is continuing to randomize patients the same way still defensible?
In most so-called adaptive trials, the answer is predetermined. The protocol allows a dose to drop, a sample size to inflate, or a cohort to close—if a narrow statistical criterion is met. If not, the machinery keeps running.
BATTLE confronted a harder decision: whether the trial should learn while it treated, not after.
What BATTLE Actually Did
BATTLE (Biomarker-integrated Approaches of Targeted Therapy for Lung Cancer Elimination) enrolled patients with non–small cell lung cancer into an umbrella design before “umbrella trials” were standard.
Three design features mattered:
- Biomarkers were defined up front. Patients were biopsied on entry and classified into molecular subgroups (e.g., EGFR, KRAS).
- Multiple targeted therapies ran in parallel.
- Randomization probabilities were allowed to change continuously.
If accumulating data suggested that patients with a particular biomarker profile were responding better to one therapy, future patients with that same profile were increasingly likely to receive it.
This wasn’t a cosmetic adaptation layered onto a fixed design. The randomization itself was the learning engine.
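To make the mechanism concrete, here is a minimal sketch of outcome-adaptive randomization within a single biomarker stratum. BATTLE's actual engine was a more elaborate hierarchical Bayesian model; the Beta-Binomial posteriors, the probability floor, and the "allocate in proportion to the probability of being best" rule below are simplifying assumptions of mine, not the trial's exact specification.

```python
import numpy as np

rng = np.random.default_rng(0)

def randomization_probs(successes, failures, prior=(1, 1), n_draws=10_000, floor=0.10):
    """Outcome-adaptive randomization for ONE biomarker stratum.

    successes/failures map arm name -> counts of 8-week disease control (yes/no).
    Returns arm -> probability of assigning the next patient to it, proportional
    to the posterior probability that the arm has the highest disease control
    rate, with a floor so no arm is ever starved of patients.
    """
    arms = list(successes)
    # Posterior draws of each arm's disease control rate (Beta-Binomial conjugacy).
    draws = np.column_stack([
        rng.beta(prior[0] + successes[a], prior[1] + failures[a], n_draws)
        for a in arms
    ])
    # Posterior probability that each arm is the best one in this stratum.
    p_best = np.bincount(draws.argmax(axis=1), minlength=len(arms)) / n_draws
    # Enforce a minimum allocation per arm, then renormalize.
    p = np.maximum(p_best, floor)
    return dict(zip(arms, p / p.sum()))

# Hypothetical interim counts for a KRAS-like stratum (numbers invented).
successes = {"sorafenib": 11, "erlotinib": 5}
failures = {"sorafenib": 7, "erlotinib": 12}
print(randomization_probs(successes, failures))
```

The point is the shape of the loop: every cleaned outcome changes the distribution the next patient is drawn from.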
The result: BATTLE didn’t just test drugs—it routed patients. In the KRAS-mutant subgroup, the 8-week disease control rate with sorafenib was roughly 60%, compared with about 30% for erlotinib—a difference that emerged during the trial, not years later in a pooled analysis.
Why the DSMB Said Yes
The usual explanation for why we don’t see more trials like BATTLE is regulatory fear. That’s incomplete.
What actually made BATTLE feasible was a combination of clear boundaries on adaptation and operational credibility.
1. The Adaptation Was Constrained to the Right Level
BATTLE did not allow:
- Changing the biomarker definitions
- Inventing subgroups midstream
- Fishing across endpoints
The learning happened within pre-specified biomarker strata. This mattered.
It meant the DSMB and sponsors weren’t being asked to bless open-ended flexibility—only probability updating conditional on information everyone already agreed was biologically meaningful.
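One way to read "constrained to the right level" in code: the strata and arms are frozen at design time, and the only thing the interim machinery may touch is the allocation probabilities. A sketch under that assumption; the stratum and arm labels are illustrative, not a faithful rendering of BATTLE's marker groups.

```python
import numpy as np

# Design-time objects: fixed before the first patient and never edited midstream.
# Labels are illustrative; the trial pre-specified its own marker groups and arms.
STRATA = ("EGFR", "KRAS_BRAF", "VEGF_VEGFR2", "RXR_CCND1")
ARMS = ("erlotinib", "vandetanib", "erlotinib+bexarotene", "sorafenib")

# The only state the adaptive engine is allowed to change:
# per-stratum allocation probabilities over the pre-specified arms.
allocation = {s: {a: 1.0 / len(ARMS) for a in ARMS} for s in STRATA}

rng = np.random.default_rng(1)

def assign(patient_stratum: str) -> str:
    """Assign the next patient from their stratum's current probabilities.

    A stratum that was not pre-specified raises immediately: no subgroups
    invented midstream.
    """
    probs = allocation[patient_stratum]  # KeyError if the stratum wasn't pre-specified
    arms, p = zip(*probs.items())
    return str(rng.choice(arms, p=p))

print(assign("KRAS_BRAF"))
```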
2. The Outcome Was Fast Enough to Matter
Adaptive randomization only works when feedback arrives on a timescale relevant to enrollment.
In BATTLE:
- The primary endpoint was disease control at 8 weeks, not survival.
- Signals emerged quickly enough to influence subsequent assignments.
If outcomes had taken two years to mature, the adaptive machinery would have been mostly decorative.
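The back-of-the-envelope version: if patients accrue at rate r per week and the endpoint takes d weeks to read out, roughly r × d patients are always "in flight," assigned before any of their outcomes can inform the model. A quick illustration with invented accrual numbers:

```python
# How much of the trial gets assigned "blind" to the adaptive engine,
# as a function of endpoint latency? Accrual rate and trial size are made up.
accrual_per_week = 3          # hypothetical enrollment rate
total_patients = 250          # roughly the scale of a Phase II umbrella trial

for endpoint_weeks in (8, 26, 104):   # 8-week DCR vs. ~6-month PFS vs. ~2-year OS
    in_flight = accrual_per_week * endpoint_weeks
    blind_fraction = min(in_flight / total_patients, 1.0)
    print(f"{endpoint_weeks:>3}-week endpoint: ~{in_flight} patients enrolled "
          f"before any of their outcomes mature ({blind_fraction:.0%} of the trial)")
```

With an 8-week endpoint the blind fraction is small; with a two-year endpoint it is the whole trial.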
3. The Infrastructure Was Real, Not Aspirational
Real-time adaptation requires:
- Rapid data cleaning
- Near–real-time model updates
- Tight coordination between statisticians, operations, and investigators
Most trials fail here—not statistically, but operationally. I’ve reviewed multiple protocols in the past few years that called themselves adaptive yet would not have changed a single patient’s assignment based on accumulating data. BATTLE invested in this infrastructure up front, which made adaptation routine rather than heroic.
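In code, the infrastructure requirement is just a loop, but every step in that loop is a real system that has to exist and run on schedule. A deliberately skeletal sketch; the three callables are placeholders for data management, the statistical refit, and the randomization service, none of which come for free.

```python
import time

def adaptive_update_cycle(fetch_cleaned_outcomes, refit_model, push_probabilities,
                          cadence_days=7):
    """The operational loop most 'adaptive' trials never actually build.

    The callables stand in for real systems: EDC export plus data cleaning,
    the statistical model refit, and the randomization service at the sites.
    """
    while True:
        outcomes = fetch_cleaned_outcomes()   # dirty or stale data stalls everything downstream
        new_probs = refit_model(outcomes)     # per-stratum allocation probabilities
        push_probabilities(new_probs)         # must reach sites before the next patient enrolls
        time.sleep(cadence_days * 86_400)     # in practice a scheduled job, not a sleeping process
```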
The Honest Limitation: Where This Model Breaks
BATTLE worked because several conditions aligned:
- Biomarker subgroups were biologically motivated and fixed
- Treatment–biomarker interactions were plausible
- Outcomes were observable quickly
Remove any one of these and the value proposition collapses.
Adaptive randomization struggles when:
- Biomarkers are exploratory or high-dimensional
- Endpoints are long-term (OS, PFS with delayed separation)
- Patient populations are small enough that probability updates become unstable
In those settings, “adaptation” often adds noise without meaningfully improving patient allocation or inference.
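The instability point is easy to see in a toy simulation: with only a handful of outcomes per arm, each new patient swings the posterior noticeably, even when the two arms are truly identical. Everything below is invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(3)

# Two arms with the SAME true 30% disease control rate in one small stratum.
# Watch how much "probability arm A is better" jumps with each new outcome.
true_rate = 0.30
history = {"A": [0, 0], "B": [0, 0]}   # [successes, failures] per arm

def p_a_better(h, draws=20_000):
    a = rng.beta(1 + h["A"][0], 1 + h["A"][1], draws)
    b = rng.beta(1 + h["B"][0], 1 + h["B"][1], draws)
    return (a > b).mean()

for patient in range(1, 21):
    arm = "A" if patient % 2 else "B"          # alternate arms for simplicity
    outcome = rng.random() < true_rate
    history[arm][0 if outcome else 1] += 1
    print(f"after patient {patient:>2}: P(A better) = {p_a_better(history):.2f}")
```

When single outcomes move the allocation this much, adaptation is amplifying noise, not learning.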
Where Adaptive Randomization Is Underrated—and Overhyped
Underrated:
- Phase II signal-finding in biomarker-defined populations
- Platform trials where routing patients is itself a scientific objective
- Settings where patient benefit within the trial is ethically salient
Overhyped:
- Late-phase confirmatory trials
- Designs where adaptation is so constrained it can’t change decisions
- Situations where operational latency means the learning arrives too late to change anyone’s assignment
Calling these designs adaptive doesn’t make them so.
The Takeaway
Most adaptive trials fail not because the statistics are wrong, but because the adaptation is too timid to matter.
BATTLE succeeded because it treated the trial as a learning system, not a hypothesis test with optional decorations. It accepted operational pain in exchange for genuine information gain—and patient benefit—during the trial, not after it.
That tradeoff isn’t always worth it.
BATTLE is what “adaptive” was supposed to mean before it became a checkbox.
📬 Want more insights on experimental design across domains? Subscribe to the newsletter or explore the full archive of Evidence in the Wild.