Like so many of my mathematical interests, this one began with a

*Scientific American*column by Martin Gardner (or perhaps I encountered it in one of his books). The theme was that a machine can learn by experience. I had a very limited knowledge of computers, naturally, and one of the things people said at the time that distinguished human from machine intelligence is that a machine's programme is hardwired and cannot learn from its experience. Gardner's article showed that this is not so.

His example (if I remember correctly - I am often surprised how wrong my memories are when I check up, so I won't!) was a simple game with chess pawns on a 3x3 board. I implemented it for Nim's Game, with three piles of three matchsticks. (Why matches? Because I had a lot of matches to use up, as we will see.) In Nim's game, players take it in turn to remove as many matches as they wish (but at least one) from any single pile, and the player who takes the last match loses. (Or wins- counter-intuitively, it makes little difference to the strategy which version you choose.)

There is a mathematical strategy for Nim's game, but my "computer" didn't do it that way. I enumerated all the different positions, from (0,0,1) to (3,3,3). I reckon there are nineteen essentially different positions, with (0,0,1) being an immediate loss for the player about to move. These days, one would do this on a real computer, but this was long before teenagers had access to computers so, following Gardner, I labelled an empty matchbox for each position. In each box I placed a counter indicating a legal move from that position, so for example the box for (3,3,3) would contain counters for (2,3,3), (1,3,3) and (0,3,3), representing the three possible moves: one can take 1,2 or 3 counters from any one pile.

The matchbox "computer" is now ready to play against a human. When it is the computer's turn, one opens the box for the current position and draws a counter randomly: that is the computer's move. The human then tales their turn, and then opens the box for the next position and draws another counter.

The beauty of this is that the machine can learn from its mistakes! When the machine loses, we discard the last counter chosen. (All other counters are replaced in the box they were drawn from.) Consequently the computer will never make the same final losing move twice. So, for example, consider the box (0,0,2). There are two moves: to (0,0,0) (immediate defeat) or (0,0,1) (immediate victory). The first time the machine reaches this position, the counter for (0,0,0) may be drawn. The machine will lost that game (from a position where any human would have won), but that counter will be discarded, and the machine will always win from that position in future.

When a box is empty, that means the machine knows that there is no move from that position that does not result in eventual defeat (assuming perfect play from the opponent), so it resigns. As a consequence, counters from boxes which allow the opponent to move to that position will begin to be removed.

After a reasonable number of games, the only counters left in any boxes will be those which never lead to defeat. The machine will start playing randomly, but will end by playing perfectly!

This fascinated me. I called my machine Angela, for "A Nim's Game Experience Learning Automaton". (I'm not sure "automaton" is right for a device which requires one physically to draw counters from matchboxes, but it served the needs of the acronym.)

What does this have to do with the Vampires and Maidens problem? Remember that we have three vampires and three maidens at the foot of a building, needing to get to the top in a lift which can only carry two people at a time and which needs one person inside to operate it. If at any time, vampires outnumber maidens in any location, then something unspeakable happens. How can all six safely get to the top?

Well, the matchbox computer idea will solve this. As suggested in my previous post on this problem, w can enumerate the possible positions - these will be of the form (v,m,l) where v is the number of vampires at the bottom, m is the number of maidens at the bottom and l is the position of the lift. So v and m are between 0 and 3 and l is -1 for lift at ground level and +1 for lift at the top. The problem is to go from (3,3-1) to (0,0,1), where we fail if at any stage m is non-zero and m is m<v.

Now create matchboxes for each position, and use counters to indicate possible moves. From the starting position (3,3-1) there will be five possible moves, to (2,3,1), (1,3,1), (3,2,1), 3,1,1) and (2,2,1). (Two of these leave maidens outnumbered at the bottom.) We then begin at the starting point and draw a counter, move to the indicated matchbox and draw another. Any time we reach a failing position, we discard the last counter drawn.

Of course, the counter we draw may take us back to a position we were in earlier. To avoid looping for ever, we discard any counter which returns us to a state we have previously encountered.

The result is that, eventually, drawing counters at random and backtracking when we fail, we are left with a solution to the puzzle.

What is wonderful, or appalling, about this is that, as with my Nim's Game device, we solve the problem with no deep understanding of the strategy. On the one hand it is amazing that we can find solutions to difficult problems without any serious thinking about the problem: on the other hand, the standard strategy for playing Nim's Game (which I already knew from earlier articles by Martin Gardner) is better maths!

So is it better to solve problems by mathematical thinking or by the brute force approach of my matchbox computer?

Your memory does serve you correctly (on this occasion), Martin Gardner's machine learning game was 3 pawn-a-side chess on a 3x3 board. I remember making the 'matchbox computer' for it when I was younger. It's described in Chapter 8 of Further Mathematical Diversions, or Chapter 35 of the compendium volume 'The Colossal Book of Mathematics'

ReplyDelete