
Python Posts

Mindstorms EV3 Football

posted in Python, Hardware at 08 September 2017

In my sixth form, I worked as part of a team to make a Mindstorms EV3 robot play football using ev3dev (a Linux-based OS that can be loaded onto the EV3) and Python.

Our Robot (sorry about the image quality; for some reason all of my images of it were missing, so all I had was a video of it)

The Robot

DARWIN, our robot, consists of the following components. It has two independent motors and a caster for flexible movement, plus a third motor that applies backspin to the ball so that it stays in the robot's grasp. It has a rotation sensor to detect whether it has been pushed off course, and a colour sensor pointed at the pitch to spot the boundary lines and make sure it isn't leaving the playing area. It also has an ultrasonic sensor to look for the goal and other robots, and an infrared (IR) sensor to look for the ball.

Our Strategy

This robot was designed to score goals rather than just defend, so with that in mind it has a simple plan: find the ball, grab the ball and then score a goal with it. This essentially means it has two states, which we'll get to later. While it doesn't have the ball, it uses the IR sensor to find out what direction the ball is in and moves towards it. If it can't find it at all, it just spins on the spot until it sees it.

The biggest problem we faced was that the IR sensor was unreliable at very short distances, producing strange values that suggested the robot should take a sharp right when the ball was directly in front of it. To deal with this, the robot ignores values corresponding to very sharp turns and, rather than acting directly on each reading, keeps a running average of the last few. We also ignore the IR sensor completely once the robot has the ball, which we detect by monitoring the speed of the backspin motor: if it has slowed slightly, the ball is likely underneath it, so we have it.
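
As a rough sketch of that filtering idea (the cutoff and window size here are illustrative, not our actual values):

from collections import deque

SHARP_TURN = 80              # illustrative cutoff for implausible readings
recent = deque(maxlen=5)     # rolling window of trusted IR readings

def filtered_heading(raw):
    # Discard readings that would mean a very sharp turn (the IR sensor
    # produces these erroneously at short range), then steer using the
    # average of the last few readings we kept.
    if abs(raw) < SHARP_TURN:
        recent.append(raw)
    return sum(recent) / len(recent) if recent else 0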

To detect the goal, we use both our rotation and ultrasonic sensors. If we have the ball, are facing the right way and detect an object, we assume it is the goal and attempt to score. For collision avoidance, we turn left or right at random to avoid the object, then return to our original path.

The Code

Because our robot essentially has a handful of states it can be in, Kirsty cleverly realised that the code could be structured the same way. So our main loop is:

# Dispatch table mapping each state name to its handler function.
functions = {"moveToGoal": moveToGoal, "retreat": retreat, "shoot": shoot, "lookForBall": lookForBall, "realign": realign}
while True:
    # Each handler returns the next state's name and the updated robot object.
    state, ro = functions[state](ro)

Here ro is our Robot Object, which holds key-value pairs of information about the robot. Every function returns a state and the newly modified object once done, so we get effortless switching between states.

These states are organised as follows. Our main state is lookForBall, which in a surprising twist looks for the ball. It reads the IR sensor, keeps track of the average and turns in the relevant direction. It returns its current state unless it turns out the robot now has the ball, in which case it moves to realign. Realign rotates the robot until it is facing the right way, then moves on to moveToGoal (or back to lookForBall if it loses the ball while rotating). In moveToGoal we move in a straight line and try to score (changing to the state shoot), switching to realign if we get knocked in the wrong direction. The state shoot is self-explanatory.
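
As an illustration, a handler in this scheme might look like the sketch below; the sensor and motor helpers (read_ir_heading, turn_towards, has_ball) are hypothetical stand-ins, not our actual code:

def lookForBall(ro):
    # Sketch of one state handler: update the rolling average of IR
    # readings stored in ro, steer towards the ball, and hand over to
    # realign once the ball is caught.
    reading = read_ir_heading()                  # assumed sensor helper
    ro["ir_readings"] = (ro.get("ir_readings", []) + [reading])[-5:]
    average = sum(ro["ir_readings"]) / len(ro["ir_readings"])
    turn_towards(average)                        # assumed motor helper
    if has_ball(ro):                             # e.g. backspin motor slowed
        return "realign", ro
    return "lookForBall", ro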

The final state that we've not covered is retreat, which is triggered when the robot detects the white line surrounding the outside of the pitch. When it does, it reverses, makes a quarter turn and then goes back to realign. This can happen while the robot is in moveToGoal. Illustrated, this state system looks like this:

State Diagram

Turns out if you can split a program up like this, it looks really nice.

Words With Friends Solver

posted in Python at 04 September 2017

For my A-level Computing project, I worked on a rudimentary Words With Friends solver. When emailed a screenshot from a particular size of phone (along with the letters to make a word from), it replies with the best move (computed from either a dictionary of commonly used words or a full dictionary) as well as a series of hints. Here's an example of the output it gives:

Hint #1: The best move is located in the top-left quadrant, attached to (or creating the word GI). If you're only looking at commonly used words, the best move is in the bottom-left quadrant, attached to (or creating) the word DO...

How does this work? It is split into three parts: image processing, email processing and the actual solver.

Processing an Image

The first thing the program does when receiving an email is take the screenshot and convert the grid shown in it to a Python array. This is the most rudimentary bit - the program has black and white images of every letter, so we simply compare predefined sections of the screenshot against those and look for the closest fit. These predefined sections were found by hand: since the aim was for this to work with only one screen size, everything is always in the same place and the program can take advantage of that. Every tile bearing a given letter looks exactly the same, so knowing what one of them looks like at a given screen size tells us what all of them look like, which lets us check pixel by pixel rather than using more advanced image processing techniques.
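
A minimal sketch of that matching step, assuming the Pillow library for image handling and that the per-tile boxes and letter templates have already been prepared (the names here are illustrative, not the project's actual code):

from PIL import Image

def match_tile(screenshot, box, templates):
    # Crop one predefined tile region and find the letter template
    # with the fewest differing pixels.
    tile = screenshot.crop(box).convert("1")     # to black and white
    best_letter, best_diff = None, float("inf")
    for letter, template in templates.items():
        diff = sum(p != q for p, q in zip(tile.getdata(), template.getdata()))
        if diff < best_diff:
            best_letter, best_diff = letter, diff
    return best_letter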

Solving a Grid

To do this, we first consider the across words in our grid. This is done by taking each row in turn and determining what tiles can be placed there, if any. If a tile is already present, then it is certain we should leave it there; otherwise we work out which tiles would continue to make a valid word if placed on that square. This is possible because our dictionary is stored in a trie.

A trie is a tree with letters as edges. Our root is the empty word, and words are constructed by branching out from the root. For example, in our image below, inn can be found by going to i, then in, then inn.

Trie Example (source: Public Domain from Wikipedia)

In my solver, each node also has a flag stating whether it forms a complete word or not. Leaves are automatically complete words; however, internal nodes can be complete words too, such as "in" (which has "inn" as a child). This trie structure lets us work out whether a fragment is a valid start of a word: if its node exists in the trie, complete or otherwise, it can be completed.
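
A minimal trie along these lines (a sketch, not the project's exact classes):

class TrieNode:
    def __init__(self):
        self.children = {}    # letter -> TrieNode
        self.is_word = False  # does the path to this node spell a word?

def insert(root, word):
    # Walk (and create) nodes letter by letter, flagging the last one.
    node = root
    for letter in word:
        node = node.children.setdefault(letter, TrieNode())
    node.is_word = True

def is_valid_fragment(root, fragment):
    # A fragment can be completed into a word iff its node exists.
    node = root
    for letter in fragment:
        if letter not in node.children:
            return False
        node = node.children[letter]
    return True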

The actual algorithm is broken up into two steps. First we look through every square of the grid and mark it with the tiles that could legally sit there given the tiles already placed directly above and below it. This matters because we might create new vertical words as we build horizontal ones, and those words must also be complete and legal, so only certain tiles can go in each square.

Then, to find the words, we go through each line: start at a square, place one of the possibilities in it (either the tile already on the board, or one of the tiles we worked out in the previous step could be played there), then make a recursive call with the remaining tiles and the new board state. Keep doing this, and whenever a valid word is formed, add it to the word list. Once all the possibilities have been exhausted, score the words in the list and return the one with the highest score.
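
In outline, the recursion for one row might look like this sketch (using the trie above; anchoring rules and scoring are omitted, and all names are illustrative):

def extend(row, allowed, node, square, rack, word, found):
    # row: letters already on the board (None for empty squares)
    # allowed: per-square sets of letters legal under the vertical cross-checks
    # node: current trie node; rack: our unused letters; word: letters so far
    if node.is_word:
        found.append(word)     # a real solver also checks the word ends here
    if square >= len(row):
        return
    existing = row[square]
    candidates = [existing] if existing else set(rack) & allowed[square]
    for letter in candidates:
        child = node.children.get(letter)
        if child is not None:
            remaining = rack if existing else rack.replace(letter, "", 1)
            extend(row, allowed, child, square + 1, remaining, word + letter, found)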

To do the same with down words, we simply flip the grid along its diagonal (i.e. transpose it) and repeat the above.

Hints and the Final Email

We then construct the final email using our two solutions (one for the normal dictionary, one for the most common words). It consists of two hints and then the solutions, separated by a large amount of whitespace so that on a phone you have to scroll down; users therefore only ever see the first hint unless they scroll for more. The hints are constructed from the data in the solutions: the first looks at the coordinates of the middle of the word to find the quadrant and the attached word, while the second looks at the word itself.

Finally, this email is sent off.

Markov Chains Music

posted in Python, Music, Generative at 03 September 2017

For my Extended Project Qualification, I decided to look into using Markov Chains to generate music, with surprisingly okay results. For this I use the music21 Python library to import, analyse and export musicXML. This was finished in November 2015.

Example Output

Markov Chains

A Markov Chain is essentially a probability distribution that carries state: the probability of each outcome depends on what came before. To give an example, consider a naive music generator that plays each note with some fixed probability. It will produce what is essentially a random string of notes, like hitting a keyboard with your fist, landing on some good notes and some bad ones. The important note to take away from that thought experiment is that there won't be a flowing tune at all - each note is independent of the last. If you try to alleviate this problem by massively weighting the selection towards the best notes, you'll end up with something like the head of C Jam Blues - which is a perfectly fine tune, but if I wanted a computer to generate music, I'd quite like it to generate something more complex than a rehash of C Jam Blues every time.

A good way to make music flow is by considering the previous notes in a phrase, as notes are evidently not composed independently of each other. Otherwise composition would be as easy as rolling some dice and putting dots on a page! To introduce this phrasing, we consider what note we are currently playing when writing the next. Rather than having an absolute list of probabilities (e.g. there's a 0.5 chance of playing a C, 0.1 of a D...) we now have a transition matrix, where each row adds up to 1. When picking our next note, we look at the row of our current note for the probabilities we should use.

x |  C   E   G
C | 0.1 0.5 0.4
E | 0.2 0.3 0.5
G | 0.3 0.5 0.2

So if our first note was C, there is a 0.1 probability that we choose a C to be our next note.
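
In code, stepping the chain is just a weighted choice from the current note's row; a sketch using the matrix above:

import random

NOTES = ["C", "E", "G"]
TRANSITIONS = {
    "C": [0.1, 0.5, 0.4],
    "E": [0.2, 0.3, 0.5],
    "G": [0.3, 0.5, 0.2],
}

def next_note(current):
    # Sample the next note from the row for the current note.
    return random.choices(NOTES, weights=TRANSITIONS[current])[0]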

We can expand this to look at multiple previous notes in a higher-order Markov Chain, where we consider the previous few notes as well as the current one. An order-N Markov Chain is one where the next note depends on the last N notes in total (the current note and the N-1 before it). This gives us rows labelled by sequences of notes, e.g. in this order-2 transition matrix:

x  |  C   E   G
CC | 0.1 0.5 0.4
CE | 0.2 0.3 0.5
CG | 0.3 0.5 0.2
...| ... ... ...

This allows us to improve the phrasing of our music even more. However, this comes with a caveat which will be explained as we begin to look into how we determine these probabilities.

Training our Markov Chains

To train our Markov Chains, we observe what real music does and derive probabilities like those above. We do this by taking training melodies, transposing them all into the same key, and going through each note, recording the frequencies of the notes that directly follow it. First, we convert the notes to numbers, where 1 is the note of the key, 2 the note above that, and so on. Then we construct a 12x12 matrix to fill with frequencies, initialising every entry to zero.

Now we go through each training melody. For each note, on the row for the current note, we record what the next note is by increasing the count at that entry by 1. We repeat this for every note in every melody, then divide each row by its total to give probabilities that sum to 1.
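
A sketch of that counting and normalising step (degrees are the 1-based numbers from above; the uniform fallback for unseen rows is my assumption, not necessarily what the project did):

def train(melodies, size=12):
    # melodies: lists of scale degrees 1..size, already transposed
    # into a common key. Returns a row-normalised transition matrix.
    counts = [[0] * size for _ in range(size)]
    for melody in melodies:
        for current, following in zip(melody, melody[1:]):
            counts[current - 1][following - 1] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        # Rows with no observations fall back to uniform (an assumption).
        matrix.append([c / total if total else 1 / size for c in row])
    return matrix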

We do the same with rhythm, except we look at note lengths rather than pitches. In my program, rather than working note by note, we record full beats of rhythm and transition between those. This unfortunately detaches the rhythm from the melody, making the music less pleasing.

Actually Generating Music

This is the easiest bit. Start at a note, then use our Markov Chain to generate each note after it. Do the same for rhythm, and map the pitches to notes. Then you're done: choose a key, turn the numbers into notes from that key, generate a title (my program does this by applying an order-4 Markov Chain to song titles from The Beatles) and output the result as musicXML.
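
A sketch of that walk for the order-1 case (the matrix comes from the train function above; mapping degrees onto pitches in the chosen key and attaching rhythm is then left to music21):

import random

def generate(matrix, length, start=1):
    # Repeatedly sample the next scale degree from the current one's row.
    degrees = [start]
    for _ in range(length - 1):
        row = matrix[degrees[-1] - 1]
        degrees.append(random.choices(range(1, len(row) + 1), weights=row)[0])
    return degrees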

After experimentation, I decided to use an order-2 Markov Chain for the melody generation. The problem with higher-order Markov Chains is that they need a lot of input data: long note sequences are less likely to appear in multiple tunes than short ones, so with too little data each row has only a handful of options and the program essentially picks one of those few tunes and follows it verbatim.

Extras in the program

The program also implements "Tintinnabuli", a technique borrowed from the composer Arvo Pärt that derives a bassline from the melody. More can be read about it here.