Sniff is a "Scratch-like" programming language that's designed to help Scratchers move gently from Scratch to more conventional languages. They can start writing programs, without having to learn a new language because Sniff is based on Scratch. They learn a little more about variables, compiling, syntax errors (!), and they can have fun controlling real hardware while they're doing it.

Friday, 31 July 2015

Countdown 2 (Extinction!). Genetic Algorithms!

In the last post we saw how how we could encode solutions to the Countdown numbers game as strings, and write Sniff code to both generate potential solutions and evaluate them. Using brute force we could generate thousands of solutions, and test them until we found a good one.

It actually worked pretty well but it required a lot of trials to find solutions for trickier problems.  In comparison we can be pretty sure that a human player could only check perhaps at most 50 solutions in the 30 seconds available to play the game? There's obviously a better solution, but what?

The computer generates a guess, tests it and if it doesn't work throws it away. By comparison, a human would also make guess and test it. Leaving aside that the humans "guess" probably isn't entirely random, when a human gets a "close" guess they don't throw it away for not being perfect - they try and "fix" it. If I need to make 804 and I gave 50 4 4 and 1, my first try might be 50x4x4 to get 800. That's pretty good. Rather than starting over I'd realise I can tweak the solution to give (50x4+1)x4=804.

Of course humans use all sorts of "skill and judgement" to arrive at good solutions, but the basic idea of taking a good solution and tweaking it to see if we can find a better one is the basis of "Genetic Algorithms".

A GA operates by generating a whole load of random solutions (we've done that already), and then testing them against a "fitness function" - how good a solution are they? Bad solutions are thrown away, and good ones get to produce offspring - solutions which are similar, but not exactly the same as the known good solution. The hope is that the offspring will keep the best bits of their parents, and maybe even improve, producing new, better solutions.

There are two main ways of generating new potential offspring from a known good solution. The first is mutation - just go through and randomly change some of the characters. For example if we want 101, and we have a promising solution of 4x25-1=99, then it might mutate next time of "4x25+1=101.

The other option is "crossover" where you take two good solutions, and pick the best bits from each. For example to make 125 we could have tried 5x5+50+10+10=95 and 50-5-5+10x10=140. They're both ok-ish solutions so but if we take the first part of the first one, and the second half of the second we get 5x5+10x10.

As we already have our solutions encoded as strings we can mutate them pretty easily:

when mutateCandidate
.make index number
.make newCandidate string
.make l string
.make r number
.set newCandidate to ""
.repeat length of candidate using index
..set l to letter index of candidate
..if pick random 1 to length of candidate=1
...if value of l = 0
....set r to pick random 1 to 9
....if r < 4
.....set l to "*"
.....if r <7
......set l to "*"
......if r <9
.......set l to "-"
.......set l to "/"
..set newCandidate to join newCandidate l
.set candidate to newCandidate

Mutating the card numbers would be tricky, so to keep things simple we just mutate the operations - that way we're sure that the new rule is valid. If we were to start changing the numbers we'd need to make sure we didn't use the same one twice, and swapping numbers of operators would upset the "balance".

Similarly crossover is a bit tricky, but there's a simpler way to handle it, which in many ways resembles a human approach - if we want 803, then we would start with 100x8=800, then start tagging things on the end to get a better solution. If they don't work we go back to 8x100 and try different things. With a but of tweaking we can modify our random solution generator to be a random solution comption script (making a new solution is just completing the empty string).

.repeat 10
..broadcast makeCandidate and wait
..add candidate to population

Now we can start by making 10 potential solutions and adding them to a population list.

.repeat until evalCount>1000000
..set index to 1
..repeat until index> length of population
...set candidate to item index of population
...broadcast evalCandidate and wait
...set offspring to 10
...if  abs of (eval - target) < bestErr
....set bestErr to abs of (eval - target)
....broadcast prettyPrintCandidate and wait
....say candidate
....say pretty
....say join "="[eval]
....if bestErr=0
.....say join "Evaluations:" [evalCount]
.....stop script
...if abs of (eval - target) > 10 * bestErr
....delete item index of population
....set offspring to 0
....change index by 1

We then go through the population evaluating them. We keep track of the best one we've found, and if a solution is 10 times worse than the best we throw it away.

...set parent to candidate
...repeat offspring
....set candidate to parent
....broadcast truncateCandidate and wait
....broadcast mutateCandidate and wait
....#say join join parent "=>" candidate
....if not (population contains candidate)
.....add candidate to population

Then for each good solution we generate some new ones by truncating (and completing) the good one, mutating it and adding it into the new populations.

Lets test this.
Numbers:75 8 6 4 2 9

Using brute force we find the solution:((8+((75*(9+4))-6))+2) after trying 56779 potential solutions. Our GA tries only 6296. However it didn't always work so well.

Given 803, and the numbers 100, 75, 10, 2, 2 ,8 the obvious solution is to start by multiplying 8x100 to get 800, and then mess around with 10, 2, 2 and 75 to get the remaining three. I suspect I'd end up with 802, or 804. It turns out there is an "easy" exact solution: (9+2)+(75-2)=803. The reason we miss it is that we're stuck in a "local minima". We've found a "good" solution, and we think we can tweak it to get a perfect one, when in fact the perfect solution looks nothing like the "good" one. The random generator will eventually find the perfect solution, but our smarter GA gets stuck trying to tweak the good one.

Of course a human would eventually give up. In fact our truncation stategey will occasionally throw out a rule completly and generate an entirely new candidate, but it still wastes a lot of trials. Looking at some  of the really bad results for the GA (When it did much worse than random), we saw that often is was the result of a divide. The better the solution we have, the more likely we are to throw out other solutions: if our solution is 5 away, then anything that is within 50 gets to play the next round. However we occasionally say results where we were 0.3 away, so almost all other solutions were discarded. Adding some code to block non-integer divides improves perfornance further, and we find our solution to 979 in only 729 trials! That's almost 100 times faster than brute force!!!

That gain however isn't representative: Over 1000 test puzzles, brute force averaged 60,597 attempts before getting the solution, while the GA averages 33,255. That's still a 50% improvement. The GA did better than random on a slightly disappointing 623 of the 1000 tests, but that's not really the whole story - subjectively brute did better on very easy and very hard puzzles. Sometimes brute force got lucky and found the solution to easy puzzles in under 1000 trials. For such easy puzzles, just guessing was likely to find and answer more quickly than trying  to work it out. For some the harder puzzles the GA got locked into local minima. However when GA did work (as in the 979 test) it showed some really good results.

The code for all this will be in the next Sniff release. There's lots more you could do with this - at least try changing the initial population size, number of offspring, and criterea for killing candidates. More complex mutation and crossover strategies would also be worth investigation, or try applying the approach to different number puzzles.

Countdown to extinction is of course also the title of a Megadeth album, and not one but three Transformers episodes!

No comments:

Post a Comment