Bayesian A/B testing has gained much popularity over the years. It seems, however, that the examples stop at two groups. This begs the questions: should we not be able to do more than simple two-group, case/control comparisons? Is there a special procedure that's necessary, or is there a natural extension of commonly-used Bayesian methods?
In this talk, I will use life-like, simulated examples, inspired from work and from meeting others at conferences, to show how to generalize A/B testing beyond the rigid assumptions commonly highlighted. Specifically, I will show two examples, one involving Bayesian estimation on click data on a website, and another on 4-parameter dose-response curves.
There will be plenty of code from the modern PyData stack, involving the use of PyMC3, pandas, holoviews, and more.