Rock Paper Scissors
Intro
This is the first project in the freeCodeCamp Machine Learning with Python Certification. The objective is to create a program that plays Rock Paper Scissors against four bots provided by freeCodeCamp and wins at least 60% of the games against each bot. You can read more about it on the freeCodeCamp Rock Paper Scissors project page.
Check out the full code for this project at https://replit.com/@yasakdogra/boilerplate-rock-paper-scissors
Planning
The average win rate for random play is about 50%. To do better, we will have to predict the behavior of the bots. We don't have any training data about the bots, so we can't train a model ahead of time. Even if we generated some data by playing matches against them, a model trained on that data would be limited and might do worse against bots that use different strategies from the ones we trained against. At best, we can play some matches against the bots and manually tune a few hyperparameters.
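That 50% baseline is easy to sanity-check with a quick simulation (my own sketch, not part of the project; the names here are illustrative). Note that ties are excluded from the rate, which matches how the freeCodeCamp tests score a match:

```python
import random

MOVES = ['R', 'P', 'S']
BEATS = {'R': 'S', 'P': 'R', 'S': 'P'}  # each key beats its value

def random_win_rate(games=10000, seed=42):
    """Estimate the win rate of random play against a random opponent,
    counting only decisive (non-tie) games."""
    rng = random.Random(seed)
    wins = losses = 0
    for _ in range(games):
        ours, theirs = rng.choice(MOVES), rng.choice(MOVES)
        if BEATS[ours] == theirs:
            wins += 1
        elif BEATS[theirs] == ours:
            losses += 1
    return wins / (wins + losses)

print(round(random_win_rate(), 3))  # hovers around 0.5
```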
We will use a Markov chain to build our program. The program will keep track of what the bot has played in the last few games, use that information to predict what the bot will play next, and play the move that counters it.
In Markov chain terms, the states are all possible sequences of m moves, and the transition probabilities can be estimated from the last n games played by the bot. For example, a state would be `R` when m is one, `RS` when m is two, and `RSR` when m is three. The higher m is, the more states there are. We can tune this parameter to whatever works best in our trials.
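The state space can be enumerated with `itertools.product` (permutations with repetition), which is what the code later in this post uses. There are 3^m states:

```python
import itertools

MOVES = ['R', 'P', 'S']

# The number of states grows as 3**m: 3, 9, 27, ...
for m in (1, 2, 3):
    states = [''.join(x) for x in itertools.product(MOVES, repeat=m)]
    print(m, len(states))
```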
We don't actually need to compute the probabilities: since we always pick the most probable next move, we can simply pick the one with the highest count. So our table will look like this (with m equal to two in this example):
| State | R | P | S |
| --- | --- | --- | --- |
| RR | 2 | 3 | 1 |
| RP | 4 | 4 | 2 |
| RS | 4 | 5 | 3 |
| PR | 5 | 4 | 1 |
| PP | 6 | 2 | 1 |
| PS | 2 | 1 | 5 |
| SR | 3 | 5 | 2 |
| SP | 1 | 6 | 1 |
| SS | 5 | 3 | 2 |
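For example, reading the `RR` row of the table above: the opponent followed `RR` with `R` twice, `P` three times, and `S` once, so we predict `P` and play its counter, `S`. A minimal sketch of that lookup (the `row` dict and variable names are illustrative):

```python
MOVES = ['R', 'P', 'S']
COUNTERS = {'R': 'P', 'P': 'S', 'S': 'R'}

# Counts from the "RR" row of the example table.
row = {'R': 2, 'P': 3, 'S': 1}

# Predict the move with the highest count, then counter it.
pred = max(MOVES, key=lambda move: row[move])
print(pred, COUNTERS[pred])  # P S
```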
Code
We will use the boilerplate project provided by freeCodeCamp and create a player function that receives the bot's last move as an argument.
We also need to keep track of the bot's move history and the count table for our Markov chain. We cannot store these in local variables inside the function, since they would be reset on every call. We will use a little trick here: two additional optional parameters whose default values are lists. Since lists are mutable and default values are created only once, the changes we make to these parameters inside the function persist across calls.
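Here is that mutable-default behavior in isolation (a standalone sketch; `remember` is just an illustrative name):

```python
def remember(move, history=[]):
    # The default list is created once, at function definition time,
    # so mutations to it persist across calls.
    history.append(move)
    return history

print(remember('R'))  # ['R']
print(remember('P'))  # ['R', 'P'] -- the earlier move is still there
```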
In `RPS.py`, import `itertools`. We will use this library to create the permutations for our states:

```python
import itertools
```
Create the hyperparameters. `SEQ_LEN` is the number of moves in a state (m). `NUM_RECORDS` is the number of past moves used to build our counts table (n).

```python
SEQ_LEN = 3
NUM_RECORDS = 100
```
Create the moves and their counter moves:

```python
MOVES = ['R', 'P', 'S']
COUNTERS = {'R': 'P', 'P': 'S', 'S': 'R'}
```
Create the `player` function, define an inner function to initialize the model, and set a default move:

```python
def player(prev_play, opponent_history=[], model=[]):
    def init_model():
        if len(model) == 1:
            model.pop(0)
        model.append({''.join(x): 0
                      for x in itertools.product(MOVES, repeat=SEQ_LEN)})

    guess = 'R'
```
*NOTE: All the code below also goes inside the `player` function.*
We can tell when a new bot starts playing by checking for an empty move. When a new bot starts, we reset our model, clear the history, and return the default guess:

```python
    if prev_play == '':
        init_model()
        opponent_history.clear()
        return guess
```
If there is not enough history to form even a single state, we just add the play to the history and return the default guess:

```python
    if len(opponent_history) < SEQ_LEN:
        opponent_history.append(prev_play)
        return guess
```
If we have reached the maximum amount of history we want to use, we remove the oldest move from the history and decrement its sequence's count in the model:

```python
    if len(opponent_history) >= NUM_RECORDS:
        oldest_seq = ''.join(opponent_history[0:SEQ_LEN])
        model[0][oldest_seq] -= 1
        opponent_history.pop(0)
```
Now that we have the checks in place, we can add the previous move to the history and update the counts in the model:

```python
    opponent_history.append(prev_play)
    new_seq = ''.join(opponent_history[-SEQ_LEN:])
    model[0][new_seq] += 1
```
Get the set of all possible next states:

```python
    possible_moves = [new_seq[1:] + x for x in MOVES]
```
Generate a prediction by taking the possible state with the maximum count in the table; its last character is the predicted move:

```python
    pred = max(possible_moves, key=lambda x: model[0][x])[-1]
```
Finally, counter the predicted move and return the guess:

```python
    guess = COUNTERS[pred]
    return guess
```
Results
Running the tests provided by freeCodeCamp with `SEQ_LEN = 3` and `NUM_RECORDS = 100`:

```
Testing game against abbey...
Final results: {'p1': 483, 'p2': 301, 'tie': 216}
Player 1 win rate: 61.60714285714286%
.Testing game against kris...
Final results: {'p1': 942, 'p2': 50, 'tie': 8}
Player 1 win rate: 94.95967741935483%
.Testing game against mrugesh...
Final results: {'p1': 843, 'p2': 153, 'tie': 4}
Player 1 win rate: 84.63855421686746%
.Testing game against quincy...
Final results: {'p1': 995, 'p2': 2, 'tie': 3}
Player 1 win rate: 99.79939819458376%
.
----------------------------------------------------------------------
Ran 4 tests in 0.075s

OK
```
Thank you for reading. You can also check out my other projects for this series below.