Working paper

20 October 2025

Cheaper (and more effective) by the dozen: Evidence from 12 randomised A/B tests optimising tutoring for scale

Authors:

Noam Angrist, Claire Cullen and Janica Magat

Suggested bibliographic citation: Angrist, N., Cullen, C. & Magat, J. 2025. Cheaper (and more effective) by the dozen: Evidence from 12 randomised A/B tests optimising tutoring for scale. What Works Hub for Global Education Working Paper Series. 2025/001. https://doi.org/10.35489/BSG-WhatWorksHubforGlobalEducation-WP_2025/001

Abstract

Over the course of 12 rapid randomised experiments, we optimise an educational tutoring programme. Tutoring is one of the most effective educational approaches yet has remained difficult to scale due to high costs. We adaptively test and improve a technology-enabled tutoring programme to enhance cost-effectiveness and scalability. Results show that seven of twelve tests led to efficiency improvements, a ‘rate of discovery’ of 58 percent. This compares favourably to the tech sector, where 10 to 40 percent of tests generate improvements, demonstrating the potential for A/B testing to yield large efficiency gains in the education sector. The largest efficiency gains were driven by cost-reducing modifications that streamlined labour-intensive implementation processes and effectiveness-enhancing innovations that actively involved caregivers in their child’s education, more than doubling impact at minimal additional cost. We explicitly measure practitioner prior and posterior beliefs, and find that rigorous testing facilitates more accurate identification of ‘what works’. Our findings both reveal the returns to iterative testing in social programmes and contribute new evidence on simple, cost-effective strategies to improve learning outcomes.
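The ‘rate of discovery’ arithmetic in the abstract can be reproduced with a minimal sketch. It assumes the rate is simply the share of tests that produced an efficiency improvement; the function name `rate_of_discovery` and the constant `TECH_SECTOR_RANGE` are illustrative and do not come from the paper.

```python
def rate_of_discovery(improving_tests: int, total_tests: int) -> float:
    """Share of tests that led to an improvement (assumed definition)."""
    if total_tests <= 0:
        raise ValueError("total_tests must be positive")
    if not 0 <= improving_tests <= total_tests:
        raise ValueError("improving_tests must lie between 0 and total_tests")
    return improving_tests / total_tests


# Figures quoted in the abstract: seven of twelve tests improved efficiency.
rate = rate_of_discovery(7, 12)

# Tech-sector range quoted in the abstract (10 to 40 percent of tests).
TECH_SECTOR_RANGE = (0.10, 0.40)

print(f"Rate of discovery: {rate:.0%}")
print(f"Above the tech-sector upper bound: {rate > TECH_SECTOR_RANGE[1]}")
```

With only twelve tests the estimate is necessarily coarse, so the comparison with the tech-sector range is best read as indicative rather than precise.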
