RAR: From Myth to Method — The Complete Series

Last updated: April 2026

Response-adaptive randomization (RAR) has been debated for forty years. The disagreements are real, but they tend to repeat the same moves: advocates cite patient benefit and efficiency, critics cite power loss and type I error inflation, and neither side specifies when the tradeoffs are actually favorable.

This four-post series works through the critiques systematically — not to vindicate RAR, but to establish what it actually costs, when those costs are acceptable, and what design work is required to justify using it. The anchor is Robertson et al. (2023), "Response-Adaptive Randomization in Clinical Trials: From Myths to Practical Considerations," Statistical Science — a comprehensive methodological review that finally gave the field a common empirical reference point.


The series, in order

Post 1. When RAR Is the Right Design — What response-adaptive randomization actually does, how the allocation mechanism works, and what conditions need to hold for the design to make sense. The framing post that establishes what the following posts are arguing about.

Post 2. The Power Penalty [Coming Soon]

Post 3. Time Trends [Coming Soon]

Post 4. When RAR Is Worth the Complexity [Coming Soon]


What the series covers

Patient benefit is often overstated. RAR allocates more patients to the better-performing arm — but only if the outcome is fast enough relative to enrollment pace for the mechanism to actually function. Slow endpoints, short trials, or two-arm settings undermine the ethical justification.
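The dependence on outcome speed is easy to see in a toy simulation. The sketch below is illustrative only (the two-arm setup, arm names, success probabilities, and fixed-delay model are invented for this example, not taken from the series): Thompson Sampling skews allocation toward the better arm when outcomes arrive before the next assignment, but if every outcome is still pending at the end of enrollment, the posteriors never update and allocation stays near 50:50.

```python
import random

def simulate_ts(n_patients, delay, p_a=0.3, p_b=0.6, seed=0):
    """Thompson Sampling with a fixed outcome delay (in enrollment steps).

    A patient's binary outcome becomes visible to the allocation rule only
    `delay` enrollments after assignment; the posterior sees observed data only.
    """
    rng = random.Random(seed)
    post = {"A": [1, 1], "B": [1, 1]}   # Beta(1,1) priors: [alpha, beta] per arm
    pending = []                         # (observe_at, arm, outcome)
    n_assigned = {"A": 0, "B": 0}
    for t in range(n_patients):
        # reveal outcomes whose delay has elapsed
        for obs_t, arm, y in [p for p in pending if p[0] <= t]:
            post[arm][0 if y else 1] += 1
        pending = [p for p in pending if p[0] > t]
        # Thompson draw: sample a success probability from each posterior,
        # assign the next patient to the arm with the larger draw
        draw = {a: rng.betavariate(*post[a]) for a in ("A", "B")}
        arm = max(draw, key=draw.get)
        n_assigned[arm] += 1
        y = rng.random() < (p_a if arm == "A" else p_b)
        pending.append((t + delay, arm, y))
    return n_assigned
```

With `delay=0` the better arm (B) accumulates most of the allocation; with a delay longer than the trial, no outcome is ever observed and the "adaptive" design degenerates to a coin flip, which is the slow-endpoint failure mode the paragraph describes.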

The power penalty is procedure-specific, not inherent. Thompson Sampling incurs the largest penalty; the doubly adaptive biased coin design (DBCD) is considerably more robust. The Wald test compounds the problem under imbalanced allocation, but the score test corrects it. Knowing which procedure you are using matters more than the aggregate critique.
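The Wald/score distinction is concrete enough to compute. A minimal sketch of the two z-statistics for a binary endpoint (the example counts are invented for illustration): the Wald test plugs in each arm's own estimated variance, which becomes unstable when RAR leaves one arm small, while the score test pools under the null.

```python
from math import sqrt

def wald_z(x1, n1, x2, n2):
    """Wald z for a difference in proportions (unpooled variance)."""
    p1, p2 = x1 / n1, x2 / n2
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return (p1 - p2) / se

def score_z(x1, n1, x2, n2):
    """Score z for a difference in proportions (variance pooled under H0)."""
    p1, p2 = x1 / n1, x2 / n2
    p = (x1 + x2) / (n1 + n2)
    se = sqrt(p * (1 - p) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se

# An imbalanced split of the kind RAR produces: 150 vs 30 patients
w = wald_z(90, 150, 9, 30)   # ≈ 3.24
s = score_z(90, 150, 9, 30)  # ≈ 3.02
```

The two statistics diverge precisely when allocation is lopsided and the small arm's variance estimate is noisy, which is why the choice of test interacts with the choice of allocation procedure.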

Time trends are manageable, not disqualifying. Drift in the patient population over the course of enrollment is the most serious systematic risk, but it is not unique to extreme procedures. The fix is cheap: pre-specify a temporal sensitivity analysis and adjust for enrollment period in the primary model, which adds one term to a logistic regression. Most RAR trials do neither. They should.
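The "one extra term" is literally an enrollment-period indicator in the design matrix. A self-contained sketch (the simulated drift, effect sizes, and two-period split are invented for illustration; a real analysis would use a standard GLM routine rather than this hand-rolled Newton-Raphson fit):

```python
import numpy as np

def fit_logistic(X, y, iters=25):
    """Newton-Raphson fit for logistic regression; returns coefficients."""
    beta = np.zeros(X.shape[1])
    for _ in range(iters):
        p = 1 / (1 + np.exp(-X @ beta))
        grad = X.T @ (y - p)                     # score vector
        hess = X.T @ (X * (p * (1 - p))[:, None])  # observed information
        beta += np.linalg.solve(hess, grad)
    return beta

rng = np.random.default_rng(1)
n = 400
arm = rng.integers(0, 2, n).astype(float)
period = (np.arange(n) >= n // 2).astype(float)  # late vs. early enrollment
logit = -0.5 + 0.8 * arm + 0.6 * period          # baseline rate drifts upward
y = (rng.random(n) < 1 / (1 + np.exp(-logit))).astype(float)

# design matrix: intercept, treatment arm, and the one extra period term
X = np.column_stack([np.ones(n), arm, period])
beta = fit_logistic(X, y)
```

Without the `period` column, the drifting baseline is partly attributed to whichever arm enrolled more patients late in the trial; with it, the treatment coefficient is estimated within enrollment period.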

Related posts

  • In Defense of 50:50 Randomization — The argument the series is in dialogue with. Fixed equal allocation is more efficient than it looks; adaptive allocation is less efficient than it sounds. The BATTLE power comparison is the worked example.
  • Your Randomization Scheme Is a Design Decision, Not a Coin Flip — How covariate-adaptive randomization (minimization, stratified permuted blocks) differs from response-adaptive randomization, and why the choice has inferential consequences.
  • What BATTLE Got Right That Most Adaptive Trials Get Wrong — BATTLE used response-adaptive randomization across biomarker-defined cohorts. What it required to work — fast outcomes, real biomarker infrastructure, operational discipline — and why most trials claiming to be adaptive are not.

Tools and reference


Reading paths

Evaluating RAR for a specific trial? Start: When RAR Is the Right Design → [The Power Penalty] → [Time Trends] → [When RAR Is Worth the Complexity] → Technical Reference → Calculator

Skeptical about RAR and want the strongest case against? Start: In Defense of 50:50 Randomization → When RAR Is the Right Design → [The Power Penalty] → [When RAR Is Worth the Complexity]

Designing an adaptive trial that combines RAR with Bayesian monitoring? Start: [When RAR Is Worth the Complexity] → The Simulations Behind Comment 3 → Regulatory Design Package


Have questions about RAR in a specific trial context? Get in touch or comment on any post.