Wednesday, March 3, 2021

How to win over 70% of Rock Paper Scissors games – Towards Data Science

An elementary AI approach to a popular game

Have you ever wondered how algorithms play chess? How do you build a game-playing program? Why does an AI bot beat you at your favorite game? Well, you will not find those answers in this article. The game I will write about is much easier to play and to implement.

Although rock-paper-scissors (RPS) may seem like a trivial game, it actually involves the difficult computational problem of temporal pattern recognition. This problem is fundamental to the fields of machine learning, artificial intelligence, and data compression. In fact, it may also be essential to understanding how human intelligence works.

The text above comes from the Rock Paper Scissors programming competition page. It hosts a free and open competition where anyone can submit their own game algorithm. The submissions are written in Python 2 and they are visible to everyone, which gives everybody the opportunity to study the details of the best solutions. However, they are not always easy to understand.

A submitted algorithm plays a thousand rounds against another program; this is called a match. The algorithm that wins more rounds takes the match. Players are listed on a leaderboard according to the ranking points gained or lost in their matches.

Let's implement the simplest possible algorithm: it always plays Rock.

Note that the only thing the program needs to do is assign one of the values ('R', 'P', 'S') to the global output variable. The opponent's move from the previous round is passed to you in the global input variable, so each round you can see what happened in the round before. In the first round the variable is an empty string.
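Under that interface, the always-Rock bot is essentially a one-liner. The original gist is not reproduced here, so this is a minimal sketch of what a contest submission looks like (the contest runs Python 2, but nothing here is version-specific):

```python
# An RPSContest-style bot in its entirety: ignore the opponent's
# previous move (the global `input`) and always answer Rock by
# setting the global `output` variable that the contest engine reads.
output = 'R'
```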

Well, this first bot would be easy to neutralize for any algorithm that learns from the data. Now let's take the opposite approach: a completely random player.

How do you think it would go? It is fairly easy to predict: on average it loses a third of the rounds, ties a third, and wins a third. This strategy gives you a 50% chance of winning any match. To get a higher win rate you have to take some risk. Would it be fun to watch only random models competing? Rather not, so please do not submit entirely random solutions. However, we will use randomness in certain situations.
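The random player is just as short as the Rock player; a sketch under the same contest interface:

```python
import random

# A purely random bot: each round is an independent uniform draw,
# so on average no opponent can do better than break even against it.
output = random.choice(['R', 'P', 'S'])
```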

Note that you can test your model offline; just download the test code from the site. You could submit bots just to check how they perform, but there is not much computing power available there. I made a mistake at the beginning and posted some code that was unintentionally random. It is easy to test: check how the code performs against constant inputs (like the first model we implemented). If it loses about 50% of its matches even there, your model is probably random.

The model we will implement is a discrete Markov chain. It is built on a simple idea. Say a process has two possible states, A and E, and we are currently in state A. What is the probability that we stay in state A? And what is the probability that we move to state E? For a Markov chain with two possible states, these two probabilities must sum to 1. Similarly, there is a pair of probabilities for the current state E. You can see the mechanism in the figure below.
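In code, such a two-state chain is just a table of transition probabilities whose rows each sum to 1. The numbers below are invented purely for illustration:

```python
# Transition table for the two-state example: transitions[s][t] is
# the probability of moving from state s to state t in the next step.
transitions = {
    'A': {'A': 0.7, 'E': 0.3},  # from A: stay in A, or move to E
    'E': {'A': 0.4, 'E': 0.6},  # from E: move to A, or stay in E
}

# Each row is a probability distribution, so it must sum to 1.
for state, row in transitions.items():
    assert abs(sum(row.values()) - 1.0) < 1e-9
```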

There is an excellent visualization of how Markov Models work here.

How can we use such a model in the context of the RPS competition? A natural way is to take the input and the output from the last round and try to predict the opponent's next move, then play the move that beats it. In this setup our current state is a pair such as "RP", and the next state is the opponent's next move. It should look like this:

Here is the implementation. I have also added an n_obs key that keeps the number of observations seen so far; we will use it in the learning process.
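The embedded gist is not reproduced here, but under the setup above the transition table could be sketched like this (the `n_obs` key is the one mentioned in the text; the other names are my own choices):

```python
CHOICES = ['R', 'P', 'S']

def init_matrix():
    # One row per "my move + opponent's move" pair (9 states in total).
    # Each row starts as a uniform distribution over the opponent's
    # next move, plus an n_obs counter of observations seen so far.
    return {
        me + opp: {'R': 1 / 3, 'P': 1 / 3, 'S': 1 / 3, 'n_obs': 0}
        for me in CHOICES
        for opp in CHOICES
    }

matrix = init_matrix()
```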

The decay parameter represents the memory of the model. A value of one means the model has a perfect memory. With a value between zero and one, the model gradually forgets previous observations and therefore adapts more quickly to changes in the opponent's behavior.
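A decayed update of one row could look like the sketch below; the default of 0.9 is an arbitrary example, not a tuned constant from the article:

```python
CHOICES = ['R', 'P', 'S']

def update_row(row, observed, decay=0.9):
    # Shrink the effective observation count by `decay` (decay=1.0 is
    # perfect memory; smaller values forget old rounds), then fold in
    # the newly observed opponent move as one fresh observation.
    n = row['n_obs'] * decay
    for move in CHOICES:
        row[move] = (row[move] * n + (1.0 if move == observed else 0.0)) / (n + 1)
    row['n_obs'] = n + 1

row = {'R': 1 / 3, 'P': 1 / 3, 'S': 1 / 3, 'n_obs': 0}
update_row(row, 'R')  # first observation: the row now fully trusts 'R'
```

Because each update renormalizes by the new count, every row stays a valid probability distribution.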

To predict the opponent's next move, the model chooses the move with the highest probability, given the last output-input pair.

Below you can find the entire game program:
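The original program is not embedded here. Below is a self-contained sketch of the whole bot, written in Python 3 rather than the contest's Python 2, with the round logic wrapped in a class of my own naming (`MarkovBot`) so it can be run and tested offline; in an actual submission the `play` logic would read the global `input` and set the global `output` instead:

```python
import random

CHOICES = ['R', 'P', 'S']
BEATS = {'R': 'P', 'P': 'S', 'S': 'R'}  # BEATS[x] is the move that beats x

class MarkovBot:
    """Markov-chain RPS player. States are 'my move + opponent move'
    pairs; each row estimates the opponent's next move."""

    def __init__(self, decay=0.9):
        self.decay = decay
        self.matrix = {
            me + opp: {'R': 1 / 3, 'P': 1 / 3, 'S': 1 / 3, 'n_obs': 0}
            for me in CHOICES
            for opp in CHOICES
        }
        self.last_pair = None  # state observed in the round before last
        self.my_move = None

    def _update(self, pair, observed):
        # Decayed running estimate of the opponent's next-move frequencies.
        row = self.matrix[pair]
        n = row['n_obs'] * self.decay
        for move in CHOICES:
            row[move] = (row[move] * n + (1.0 if move == observed else 0.0)) / (n + 1)
        row['n_obs'] = n + 1

    def play(self, opponent_last):
        # `opponent_last` mirrors the contest's `input` variable:
        # '' in the first round, else the opponent's previous move.
        if opponent_last == '':
            self.my_move = random.choice(CHOICES)
        else:
            pair = self.my_move + opponent_last  # the round just finished
            if self.last_pair is not None:
                self._update(self.last_pair, opponent_last)
            self.last_pair = pair
            row = self.matrix[pair]
            if row['n_obs'] == 0:
                self.my_move = random.choice(CHOICES)  # no data for this state yet
            else:
                predicted = max(CHOICES, key=lambda m: row[m])
                self.my_move = BEATS[predicted]
        return self.my_move
```

Against a predictable opponent the bot locks on quickly: versus a player that always throws Rock, it settles on Paper after a handful of rounds.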

You can play with the algorithm yourself.

I must warn you: this model would not do well in the contest. It is too elementary, and most of the stronger algorithms should be able to counter this strategy. You can see its performance under this link. It is just a place for you to start an adventure in the Rock-Paper-Scissors contest. You can modify the parameters, train many models and choose the best one to play, or create an ensemble; it depends only on your imagination. The best algorithms win about 80% of their matches. You can check my over-70% algorithm here. Can you do any better?
