Monthly Archives: August 2017

Contextual Bandits and Reinforcement Learning

If you are building a personalized user experience for a website or an app, contextual bandits can help. With contextual bandits, you can choose which content to display to the user, rank advertisements, optimize search results, select the best image to show on the page, and much more.

There are many names for this class of algorithms: contextual bandits, multi-world testing, associative bandits, learning with partial feedback, learning with bandit feedback, bandits with side information, multi-class classification with bandit feedback, associative reinforcement learning, one-step reinforcement learning.

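Researchers approach the problem from two different angles. You can think of contextual bandits as an extension of multi-armed bandits, where the best action depends on side information (the context), or as a simplified version of reinforcement learning, where each episode lasts a single step.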
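
To make the "extension of multi-armed bandits" view concrete, here is a minimal sketch, assuming a small set of discrete contexts and an epsilon-greedy rule (one simple strategy among many, not a recommended production algorithm). The class name, the "mobile"/"desktop" contexts, and the click probabilities are all made up for illustration. With a single context, the code reduces to an ordinary multi-armed bandit.

```python
import random
from collections import defaultdict


class EpsilonGreedyContextualBandit:
    """Epsilon-greedy bandit with one reward estimate per (context, arm) pair."""

    def __init__(self, n_arms, epsilon=0.1, seed=0):
        self.n_arms = n_arms
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        self.counts = defaultdict(int)    # (context, arm) -> number of pulls
        self.values = defaultdict(float)  # (context, arm) -> mean observed reward

    def choose(self, context):
        # Explore with probability epsilon, otherwise exploit the best estimate.
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(self.n_arms)
        return max(range(self.n_arms), key=lambda arm: self.values[(context, arm)])

    def update(self, context, arm, reward):
        # Partial (bandit) feedback: only the chosen arm's estimate changes.
        key = (context, arm)
        self.counts[key] += 1
        self.values[key] += (reward - self.values[key]) / self.counts[key]


# Hypothetical click probabilities: the best arm differs by context.
CLICK_PROB = {"mobile": [0.1, 0.6], "desktop": [0.7, 0.2]}


def simulate(steps=5000, seed=0):
    rng = random.Random(seed)
    bandit = EpsilonGreedyContextualBandit(n_arms=2, epsilon=0.1, seed=seed)
    for _ in range(steps):
        context = rng.choice(list(CLICK_PROB))
        arm = bandit.choose(context)
        # The environment reveals a reward only for the arm that was shown.
        reward = 1.0 if rng.random() < CLICK_PROB[context][arm] else 0.0
        bandit.update(context, arm, reward)
    return bandit


bandit = simulate()
for (context, arm), value in sorted(bandit.values.items()):
    print(context, arm, round(value, 2))
```

Each round is a single decision with an immediate reward and no state carried over to the next round, which is why the same setup is also described as one-step reinforcement learning.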


OpenAI Deep Learning for DotA 2

OpenAI’s bot is the first ever to defeat the world’s best players in DotA 2, which it did at The International 2017. It is a major step for AI in eSports. The bot was trained through self-play, but some tactics were hardcoded.

The bot doesn’t play DotA in the regular 5v5 setup. It can only beat humans in 1v1 play. Teamwork will be harder to learn. Also, the bot plays only one character and has a few unfair advantages compared to human players, e.g. it likely has access to exact information such as health and the distance to other players on the map.
