What Is a Multi-Armed Bandit?
Typically in an A/B test, the website’s traffic is split evenly among two or more variations. Multi-armed bandit testing instead allocates traffic dynamically: variations that are performing well receive more traffic, while variations that are not working receive progressively less.
The approach you will hear about today is faster than the traditional method and requires fewer components and less manual adjustment.
The multi-armed bandit algorithm is a machine-learning optimization algorithm. It shifts traffic in real time toward the winning variation, automatically allocating more or less traffic to each variation as results come in.
Optimization: The Bandit Approach
The term “multi-armed bandit” describes a problem to which many solutions can be applied. Large marketing companies offer more than classic A/B testing: they use the bandit approach and provide multiple algorithms for dealing with different problems in order to achieve optimal results.
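The multi-armed bandit problem illustrates the difficulty of selecting the best of several options when the payoff of each option is unknown in advance: every visitor sent to a variation to learn more about it is a visitor not sent to the variation that currently looks best.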
One possible solution to this problem is Thompson sampling, in which at each turn you choose each variation with a probability that matches how likely it is, given the results so far, to be the best one. A variation that is unlikely to be optimal is therefore chosen less often.
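As a rough illustration of Thompson sampling, the sketch below assumes each variation has a fixed but unknown conversion rate and models it with a Beta posterior, which is a common choice but not the only one. The function names `thompson_choose` and `simulate`, the uniform Beta(1, 1) prior, and the example conversion rates are all assumptions made for this example.

```python
import random


def thompson_choose(successes, failures, rng=random):
    """Pick a variation by drawing one sample from each Beta posterior.

    successes[i] and failures[i] count conversions and non-conversions
    seen so far for variation i. A Beta(1, 1) (uniform) prior is assumed.
    """
    samples = [
        rng.betavariate(s + 1, f + 1)
        for s, f in zip(successes, failures)
    ]
    # The variation with the highest sampled rate gets this visitor.
    return max(range(len(samples)), key=samples.__getitem__)


def simulate(true_rates, visitors=5000, seed=0):
    """Simulate visitors against hypothetical true conversion rates.

    Returns how many visitors each variation received.
    """
    rng = random.Random(seed)
    n = len(true_rates)
    successes = [0] * n
    failures = [0] * n
    for _ in range(visitors):
        arm = thompson_choose(successes, failures, rng)
        # Simulated visitor converts with the arm's (hidden) true rate.
        if rng.random() < true_rates[arm]:
            successes[arm] += 1
        else:
            failures[arm] += 1
    return [s + f for s, f in zip(successes, failures)]


if __name__ == "__main__":
    # Hypothetical rates; the third variation is the best one.
    print(simulate([0.05, 0.10, 0.30]))
```

In a run like this, the weaker variations still receive some visitors early on, but most of the traffic drifts to the variation with the highest conversion rate as evidence accumulates.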
A similar algorithm can work in batches: every 30 minutes or every hour it recalculates the traffic allocation and reweights the variations accordingly.
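One way such a periodic reallocation might look is sketched below. It assumes the weights are set to each variation's estimated probability of being the best, computed by Monte Carlo sampling from Beta posteriors. The function name `allocation_weights`, the number of draws, and the example counts are hypothetical, and a real system may well compute its weights differently.

```python
import random


def allocation_weights(successes, failures, draws=10_000, seed=0):
    """Estimate the share of traffic each variation should receive.

    For each draw, sample a conversion rate for every variation from its
    Beta posterior (uniform prior assumed) and record which one wins.
    The fraction of wins is used as that variation's traffic weight.
    """
    rng = random.Random(seed)
    wins = [0] * len(successes)
    for _ in range(draws):
        samples = [
            rng.betavariate(s + 1, f + 1)
            for s, f in zip(successes, failures)
        ]
        wins[samples.index(max(samples))] += 1
    return [w / draws for w in wins]


if __name__ == "__main__":
    # Hypothetical counts collected since the test started:
    # variation A converted 10 of 1000 visitors, variation B 30 of 1000.
    weights = allocation_weights([10, 30], [990, 970])
    print(weights)  # weights to apply until the next recalculation
```

A scheduler would call a function like this once per interval with the latest counts and route incoming visitors according to the returned weights until the next update.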