Blog
18.11.2025

Building better evidence: How smart study design drives real-world change

At our latest Brown Bag seminar, Drew Dimmery, PhD, Professor of Data Science for the Common Good at the Hertie School, invited us to rethink how we approach methodological problems. Instead of treating data as given, he argued that design – choosing what data to collect – can be a powerful tool for solving problems before they even arise.

Through a lively discussion of design-based problem solving, Prof. Dimmery showed how careful planning and mathematical rigor can make studies more reliable, generalizable, and impactful – long before a single data point is gathered. His message was clear: good design isn't an exercise in intellectual self-indulgence; it's about making research more useful for the real world.

 

You emphasise solving research problems before collecting data. How does this approach change the kind of insights or impact a study can have in the long run?

The key is having methods that let you solve design problems rigorously, not just thinking ahead. For example, with optimal experimental design we can formally minimize variance or maximize power for specific estimands before seeing any data. These aren't just plans—they're mathematical solutions to practical problems that determine what you can learn from the data.
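
To make that concrete, here is a minimal sketch of one classical optimal-design calculation (not Prof. Dimmery's specific method): Neyman allocation, which splits a fixed sample between treatment and control to minimize the variance of the difference-in-means estimator. The only inputs are anticipated noise levels, so the problem is solved before any data are collected; the numbers below are assumed purely for illustration.

```python
def neyman_allocation(sigma_treat, sigma_control, n_total):
    """Split a fixed sample across arms to minimize the variance of the
    difference-in-means estimator: Var = s_t^2/n_t + s_c^2/n_c is smallest
    when n_t / n_c = s_t / s_c (classical Neyman allocation)."""
    share_treat = sigma_treat / (sigma_treat + sigma_control)
    n_treat = int(round(n_total * share_treat))
    return n_treat, n_total - n_treat

# Illustrative assumption: outcomes are expected to be twice as noisy under
# treatment, so the optimal design puts two thirds of the units in that arm.
print(neyman_allocation(sigma_treat=2.0, sigma_control=1.0, n_total=300))  # (200, 100)
```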

You mention that a small change in how treatments are assigned can make results more generalisable. How might this improve the way we apply research findings in real-world settings like policy or social programmes?

The method for this setting uses sampling weights in the randomization itself, not just in analysis. If you know your sample differs from the target population—say, younger people or certain regions are overrepresented—you can design the treatment assignment probabilities to account for that. The result is an estimator that directly targets the population average treatment effect, not just the sample effect. So when a policymaker asks about rolling out a program nationally based on a pilot study, the experiment was designed from the outset to answer that question.
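
A stylized sketch of the design-analysis link being described (with assumed weights, a simple Bernoulli design, and hypothetical names, not the estimator from the talk): assignment probabilities are fixed at design time, and sampling weights enter the estimate so it targets the population rather than the sample. The method discussed goes further and folds the weights into the assignment probabilities themselves, which this toy version does not attempt.

```python
import numpy as np

rng = np.random.default_rng(42)

def weighted_bernoulli_experiment(w, p, y1, y0):
    """Toy sketch: p holds assignment probabilities fixed at design time and
    w holds sampling weights (how many population members each sampled unit
    represents); the estimate targets the population average treatment effect."""
    t = rng.binomial(1, p)                    # randomize with the design probabilities
    y_obs = np.where(t == 1, y1, y0)          # outcome actually observed for each unit
    w = np.asarray(w, dtype=float) / np.sum(w)
    # Horvitz-Thompson-style contrast, inverse-weighted by the known probabilities
    contrast = t * y_obs / p - (1 - t) * y_obs / (1 - p)
    return float(np.sum(w * contrast))

# Toy pilot: young respondents are overrepresented, so they carry smaller weights.
young = np.array([1] * 70 + [0] * 30)
w = np.where(young == 1, 0.5, 2.0)            # assumed sampling weights
p = np.full(100, 0.5)                         # simple 50/50 design for illustration
y0 = rng.normal(0.0, 1.0, 100)
y1 = y0 + np.where(young == 1, 0.2, 1.0)      # effect assumed larger for older respondents
print(weighted_bernoulli_experiment(w, p, y1, y0))
```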

Your work bridges computer science and causal inference. How do these algorithmic approaches help researchers produce more reliable or actionable results?

The algorithmic contribution is making optimal design practical for sequential experiments. We developed methods that maintain covariate balance in real time, with provable guarantees on variance reduction, all while running in constant time per assignment. Companies like Meta or Google run tens of thousands of experiments with millions of users, and until recently simple randomization was the only practical option at that scale. Now we have algorithms that achieve near-optimal allocation – matching what you'd get if you designed the whole experiment in advance – while working sequentially as participants arrive.
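
To give a feel for sequential, constant-time balancing, here is a toy covariate-aware biased coin; it conveys the flavor of the idea but is not the algorithm with the guarantees described above. Each arrival costs one dot product against a running imbalance vector, plus a small random tilt toward the arm that would reduce that imbalance.

```python
import numpy as np

class BiasedCoinBalancer:
    """Toy sequential design: a covariate-aware biased coin that nudges each
    new unit toward the arm that reduces the running covariate imbalance.
    Each assignment costs one dot product and one vector update."""

    def __init__(self, n_covariates, tilt=0.75, seed=0):
        self.imbalance = np.zeros(n_covariates)   # sum of covariates, signed by arm
        self.tilt = tilt                          # probability of the balance-restoring arm
        self.rng = np.random.default_rng(seed)

    def assign(self, x):
        """Return 1 (treatment) or 0 (control) for a unit with covariates x."""
        x = np.asarray(x, dtype=float)
        score = float(self.imbalance @ x)
        if score > 0:        # treating this unit would push imbalance further out
            p_treat = 1 - self.tilt
        elif score < 0:      # treating this unit would pull imbalance back in
            p_treat = self.tilt
        else:                # no signal yet: fair coin
            p_treat = 0.5
        treated = self.rng.random() < p_treat
        self.imbalance += x if treated else -x
        return int(treated)
```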

You've developed methods for assigning treatments as data comes in. Where do you see this approach being most useful – like in online platforms, surveys, or field studies?

The methods work anywhere you have sequential arrival and need low-latency assignment. Online platforms are obvious—someone visits a website and needs instant assignment. But the same algorithmic approach applies to surveys where you want to balance as responses come in, or field experiments where you're recruiting participants over weeks. The technical requirement is constant-time assignment that maintains balance guarantees. The methods scale gracefully while ensuring the statistical properties you'd want from a carefully designed batch experiment.
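
Continuing the toy sketch above, streaming use looks like this: assignments are made instantly as simulated participants arrive, with the same small per-arrival cost no matter how many units have already been processed. This snippet assumes the BiasedCoinBalancer class from the previous sketch is in scope.

```python
import numpy as np

rng = np.random.default_rng(1)
balancer = BiasedCoinBalancer(n_covariates=3, seed=1)  # from the sketch above

assignments = []
for _ in range(10_000):                   # simulated sequential arrivals
    x = rng.normal(size=3)                # covariates observed at arrival time
    assignments.append(balancer.assign(x))

print(sum(assignments))                   # roughly half the units treated
print(np.round(balancer.imbalance, 2))    # running covariate imbalance stays modest
```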

If more researchers adopt design-based problem solving, what changes do you think we'd see in how research influences decisions or public understanding?

The practical impact is that design-based methods make the statistical properties of your study verifiable and transparent: you can formally specify the randomization procedure and prove properties about your estimators before collecting any data. This means you can optimize for specific goals – minimize variance, ensure valid inference under interference, or target population-level effects. Smaller organizations benefit because optimal design means fewer participants for the same statistical power. And for decision-makers, you can show exactly what your study can and can't tell them, based on the design itself.
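
The point about fewer participants for the same power can be made concrete with a standard two-arm sample-size formula (the effect size and the variance reduction below are assumed for illustration): if a better design removes some share of outcome variance, the required sample size shrinks proportionally.

```python
from statistics import NormalDist

def n_per_arm(delta, sigma, alpha=0.05, power=0.8):
    """Standard per-arm sample size for detecting a mean difference delta
    when outcomes have standard deviation sigma."""
    z = NormalDist().inv_cdf(1 - alpha / 2) + NormalDist().inv_cdf(power)
    return 2 * (z * sigma / delta) ** 2

baseline = n_per_arm(delta=0.2, sigma=1.0)
# Assumed for illustration: a design whose balancing removes 30% of outcome variance.
improved = n_per_arm(delta=0.2, sigma=0.7 ** 0.5)
print(round(baseline), round(improved))  # roughly 392 vs 275 participants per arm
```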

 

As experiments increasingly guide decisions in policy, industry, and social programs, the design choices researchers make today shape the evidence of tomorrow. Prof. Dimmery’s work in Data Science for the Common Good provides a roadmap for doing this better—helping researchers build studies that are transparent, efficient, and designed with purpose. By putting design at the heart of discovery, his approach brings us closer to research that truly serves the public good.

You can learn more about Prof. Dimmery’s ongoing work at the intersection of data science and public good by visiting his website: https://ddimmery.com