Rock Paper Scissors

Intro

This is the first project in the freeCodeCamp Machine Learning with Python Certification. The objective is to create a program that plays Rock Paper Scissors. The program must play against four bots provided by freeCodeCamp and win at least 60% of the games against each bot. Read more about it in Rock Paper Scissors.

Check out the full code for this project at https://replit.com/@yasakdogra/boilerplate-rock-paper-scissors

Planning

The average win rate for random choice is about 50%. To increase the win rate we will have to predict the behavior of the bots. We don't have any training data about the bots, so we can't train our algorithm ahead of time. Even if we generate some data by playing matches against the bots, a model trained on such data will be quite limited and might do worse against bots that use different strategies than the ones we trained against. At best, we can play some matches against the bots and manually tune some hyperparameters.

We will use a Markov chain to build our program. Our program will keep track of what the bot has played in the last few games, use that information to predict what the bot will play next, and play the move that counters it.

In Markov chain terms, the states are all permutations (with repetition) of m moves, and the transition probabilities can be estimated from the last n games played by the bot.

Examples of states would be R when m is one, RS when m is two, and RSR when m is three. The higher m is, the more states (permutations) there are. We can tune this parameter to whatever works best in our trials.
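To see how quickly the state space grows, here is a small sketch that enumerates the states for a few values of m (this snippet is illustrative only; the variable names are not part of the project code):

```python
import itertools

MOVES = ['R', 'P', 'S']

for m in (1, 2, 3):
    # itertools.product gives every length-m sequence of moves: 3**m states
    states = [''.join(p) for p in itertools.product(MOVES, repeat=m)]
    print(m, len(states))  # 1 3, then 2 9, then 3 27
```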

We don't really need to compute the probabilities. We will just go with the highest count, which also corresponds to the highest probability. So our table will look like this (m is two in this example):

State | R | P | S
RR    | 2 | 3 | 1
RP    | 4 | 4 | 2
RS    | 4 | 5 | 3
PR    | 5 | 4 | 1
PP    | 6 | 2 | 1
PS    | 2 | 1 | 5
SR    | 3 | 5 | 2
SP    | 1 | 6 | 1
SS    | 5 | 3 | 2
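As a sketch of how a prediction is read off such a table, suppose the bot's last two moves were RS. We look up that row, take the move with the highest count, and play its counter (the toy values below are copied from the table above; this is not the project's actual model object):

```python
# A few rows of the counts table, keyed by state
counts = {
    'RR': {'R': 2, 'P': 3, 'S': 1},
    'RP': {'R': 4, 'P': 4, 'S': 2},
    'RS': {'R': 4, 'P': 5, 'S': 3},
}

COUNTERS = {'R': 'P', 'P': 'S', 'S': 'R'}

state = 'RS'                                      # bot's last two moves
pred = max(counts[state], key=counts[state].get)  # most frequent next move
guess = COUNTERS[pred]                            # play the counter
print(pred, guess)
```

Here the highest count in the RS row is 5 for P, so we predict P and answer with S.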

Code

We will use the boilerplate project provided by freeCodeCamp. We will create the player function that receives the last move played by a bot as an argument.

We also need to keep track of the bot's move history and our counts table for the Markov chain. We cannot save these in a local variable inside our function, since it would be reset on every call. We will use a little trick here: two additional parameters for our function, both optional, with lists as default values. Since lists are mutable and default values are created only once, the lists persist across calls when we mutate them inside the function.
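A minimal demonstration of this trick (the function name here is made up for illustration):

```python
def remember(value, seen=[]):
    # The default list is created once, at function definition time,
    # so mutations to it survive across calls.
    seen.append(value)
    return seen

print(remember('R'))  # ['R']
print(remember('P'))  # ['R', 'P']
```

This behavior is usually considered a Python pitfall, but here we use it deliberately as cheap per-function state.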

In RPS.py, import itertools. We will use this library to create the permutations for our states.

import itertools

Define the hyperparameters. SEQ_LEN is the number of moves in a state (m). NUM_RECORDS is the number of past moves used to build our counts table (n).

SEQ_LEN = 3
NUM_RECORDS = 100

Create moves and counter moves

MOVES = ['R', 'P', 'S']
COUNTERS = {'R': 'P', 'P': 'S', 'S': 'R'}

Create the player function, define an inner function to initialize the model and a default move

def player(prev_play, opponent_history=[], model=[]):
    def init_model():
        if len(model) == 1:
            model.pop(0)
        model.append({''.join(x): 0 for x in itertools.product(MOVES, repeat=SEQ_LEN)})

    guess = 'R'

NOTE: All the code below also goes inside the player function

We can detect when a new bot starts playing by checking for an empty move. When a new bot starts, we clear the history, reset our model, and return the default guess.

    if prev_play == '':
        init_model()
        opponent_history.clear()
        return guess

If there is not enough data to add even a single state, we can just add the play to the history and return the default guess.

    if len(opponent_history) < SEQ_LEN:
        opponent_history.append(prev_play)
        return guess

If we have reached the maximum amount of history we want to use, we remove the oldest move from both the history and the model.

    if len(opponent_history) >= NUM_RECORDS:
        oldest_seq = ''.join(opponent_history[0:SEQ_LEN])
        model[0][oldest_seq] -= 1
        opponent_history.pop(0)

Now that we have the checks in place, we can move forward: add the previous move to the history and update the counts in the model.

    opponent_history.append(prev_play)
    new_seq = ''.join(opponent_history[-SEQ_LEN:])
    model[0][new_seq] += 1

Get a list of all possible next states.

    possible_moves = [new_seq[1:] + x for x in MOVES]

Generate a prediction using the maximum count in the table.

    pred = max(possible_moves, key=lambda x: model[0][x])[-1]

Finally, counter the predicted move and return the guess.

    guess = COUNTERS[pred]
    return guess

Results

Running the test provided by freeCodeCamp and using SEQ_LEN = 3 and NUM_RECORDS = 100

Testing game against abbey...
Final results: {'p1': 483, 'p2': 301, 'tie': 216}
Player 1 win rate: 61.60714285714286%
.Testing game against kris...
Final results: {'p1': 942, 'p2': 50, 'tie': 8}
Player 1 win rate: 94.95967741935483%
.Testing game against mrugesh...
Final results: {'p1': 843, 'p2': 153, 'tie': 4}
Player 1 win rate: 84.63855421686746%
.Testing game against quincy...
Final results: {'p1': 995, 'p2': 2, 'tie': 3}
Player 1 win rate: 99.79939819458376%
.
----------------------------------------------------------------------
Ran 4 tests in 0.075s

OK

Thank you for reading. You can also check out my other projects for this series below.