Insight note
A practical approach to developing A/B testing systems for digital-first organisations
Andrés Parrado, Mariana Rodríguez and Eduardo Vargas
Abstract
Social impact organisations face growing pressure to learn and improve their programmes with limited resources. A/B testing offers a rigorous and practical tool to test programme adaptations aimed at improving cost-effectiveness. This brief provides a step-by-step guide for digital-first organisations seeking to embed A/B testing into their Monitoring, Evaluation, and Learning (MEL) system. Drawing on the advisory experience of Innovations for Poverty Action’s Right-Fit Evidence (RFE) Unit and a growing body of evidence from the social sector, it introduces the Learning Roadmap for A/B Testing, a structured four-step process for developing the organisational capabilities, technological infrastructure, and learning culture needed to test, learn, and improve continuously.