Saturday, 23 June 2012

In praise of Knuth - the joy of algorithms

Since I am taking part in a Turing-related Lates evening at the Science Museum on Wednesday 27 June, in which my contribution will look at a curious mathematical algorithm and its applications in computing, I have been thinking about some of the algorithms I have used in programming, and this post will describe one particular favourite (not the algorithm I will be demonstrating on Wednesday).

I used to be a keen bridge player.  Sometimes in competitions hands were "computer-dealt" and people noticed that the card distributions seemed to be less regular than in hands shuffled and dealt physically.  Good players even adjusted their play according to whether the hand was dealt by a human or a computer.  The moral is that human shuffling isn't perfectly random.

Anyway, I wanted to programme the shuffle of a pack of cards so that I could use computers to explore strategies for playing various forms of patience ("solitaire" to Americans).  In particular a student project verified that the probability of clock patience coming out is indeed 1/13, as I had proved as a schoolboy.  How do you get a computer to shuffle a pack of cards (or equivalently to generate a random permutation of the numbers 1 to 52, or more generally 1 to n)?  (In this post I am using "random" to mean that all possibilities are equally likely, which is what one hopes for in shuffling cards.)

I thought the obvious approach was to generate successive random (equally likely) integers between 1 and n, simply rejecting those that had already been drawn.  As we go through the shuffle, we are more and more likely to generate random numbers corresponding to cards already dealt, so towards the end it can take a long time to find an undealt card - when there is only one number left to be drawn, the probability of hitting it is only 1/n and the expected number of draws required to get it is n.  In practice this algorithm seemed to work OK for a pack of 52 cards - the probability that it takes an absurdly long time to get the last few cards is in fact quite small, and some clumsy ad hoc adjustments could improve it.  The result was a messy piece of code which might take some time to complete the shuffle for a large pack.

Then a colleague (thanks, Don) told me about Knuth's shuffling algorithm (which it turns out is a variation of one devised by Fisher and Yates before the computer era).  It's beautiful.  To shuffle n cards, you simply do the following: for i between 1 and n, we successively swap card i with a random card in the pack.  It's easy to see that the result is a random permutation of the pack and that all possible permutations are equally likely.  We have a simple algorithm, very easy to programme, that generates a random shuffle in a fixed number of steps. 

One of the joys of mathematics can be finding that there is a beautiful, simple solution to a problem on which one has spent a lot of time working with messy, unsatisfactory methods.  This was such an example.  I remember how happy I felt when I learned about this algorithm! 

Of course, the downside is that I am reminded that people like Knuth always seem to produce beautiful, simple algorithms in contrast to my clumsy efforts.  It's a bit like watching Messi play football - how many can aspire to play like that? - except that I am left with the frustrating feeling that I should have been able to come up with the algorithm myself.  Luckily perhaps I have now reached an age where I can appreciate beautiful mathematics without beating myself up over my failure to do it for myself.

Tuesday, 19 June 2012

When conceding a goal doesn't matter

As a follow-up to yesterday's post, here's more on the maths of football tournament mini-groups.

In last night's UEFA Euro 2012 matches, Croatia played Spain and Italy played Ireland.  Because Ireland had lost both previous matches and the others had been drawn, it was likely that the tie-breaking rules would come into effect and the score in the Croatia-Spain match would be crucial.  Assuming Italy beat Ireland, the winners of Croatia-Spain would go through and the losers would be out.  Croatia and Spain would both qualify if they drew 2-2, Spain and Italy if the result was 0-0, while if Spain and Croatia drew 1-1 then Italy would have to beat Ireland by two goals and score three times to deny Croatia.

So we're in the last couple of minutes of both matches, Spain and Croatia is goalless and Italy lead Ireland 1-0.  At this point Spain and Italy are on course to qualify. Croatia need a goal, but the interesting thing is that conceding a goal doesn't damage Croatia.  In fact Spain score just before the end, but Croatia need to score just one goal to go through, and the Spanish goal makes no difference to them.   At this point there is no likely situation in which a 1-0 win for Croatia is better for them than a 1-1 draw, unless Italy were to score two more goals in injury time.   (In fact Italy scored one more to win 2-0 so perhaps that wasn't an impossible scenario.)

Indeed, perhaps Croatia are more likely to score if Spain have scored, since Spain are likely to be less worried about conceding an equaliser which won't put them out.

Spain's goal did change the order of the top two in the group, but, coming when it did, it made no difference at all to Croatia's chances.

I am reminded of the Scottish Premier Division 1990/91.  Rangers and Aberdeen are fighting it out with two matches left.  In these days it is two points for a win (not three) and one for a draw, and if two teams finish level then the deciding factor is goal difference, with goals scored then being decisive if both teams have the same goal difference.  The last match of the season is Rangers against Aberdeen.  With two matches left, Rangers are two points above Aberdeen.  Aberdeen are winning their penultimate match but, in injury time, Rangers are trailing Motherwell 1-0.  If it finishes like that, the teams will go into their last match level on points.  Rangers will have scored 60 and conceded 21, Aberdeen have scored 62 and conceded 25.   Rangers' goal difference is better and they need only draw the last match against Aberdeen to win the league.

So what do Rangers do?  They throw caution to the wind in seeking an equaliser.  Had they got an equaliser they would have gone into the last match a point above Aberdeen, and needed a draw to win the league.  So  scoring an equaliser would not have benefited them in the slightest: they need exactly the same from the last match as if they lose 1-0 to Motherwell.  But in fact the reckless attack allowed Motherwell to score twice more in injury time.  Rangers lost 3-0, their goal difference is now the same as Aberdeen's, they have scored fewer goals, and now they need to win the final match if they are to win the championship.  The reckless pursuit of an equaliser which could make no difference has actually made the league title much harder to win.

(Rangers beat Aberdeen 2-0 so this mathematical madness didn't actually cost them the championship.)

Monday, 18 June 2012

The maths of "The Group of Death"

Tonight Spain play Croatia in the UEFA Euro 2012 football tournament.  At this stage teams are playing in groups of four - each team plays each of the other three (six matches) - with the top two in each group progressing to the next round.  The results of the first four matches mean that if Spain and Croatia draw tonight and both teams score at least twice, then Italy are knocked out whether or not they win their match against Ireland.

While sporting integrity (a strange concept currently being invoked in discussion about the future of Rangers in the Scottish Premier League) will ensure that Spain and Croatia both play to win, history suggests that if the score is 2-2 with five minutes remaining it would be irrational either side were to take risks in attempting to score a late winner.   A similar situation arose in 1982 when West Germany played Austria in the World Cup. A one-goal victory for the Germans would ensure that both sides progressed.  West Germany scored an early goal and thereafter, according to Wikipedia"Onlookers noted that both teams played as though they were content with the result", the 1-0 outcome seeing both teams through at the expense of Algeria.  There have been other examples in international and club matches.  Robert Axelrod writes about an English league match in his book about the Prisoners' Dilemma and the mathematics of altruism, "The Evolution of Co-operation".

In fact there is a lot of (relatively simple) mathematics in sporting league tables.  Quite possibly my interest in mathematics began around my eighth birthday in 1965 when Kilmarnock had to beat Hearts 2-0 in the final match to win the league on goal average (the ratio of goals for to goals against), a much more complicated tie-break mechanism than the goal difference used now.

The current rules for deciding ties in mini-leagues in tournaments like Euro 2012 is that, rather than look at goal difference over all matches, instead we look at results over the matches involving the teams who have tied.  So if A and B finish with the same number of points (3 for each win and one for each draw), and A won the match between A and B, then A finish above B.  This avoids the situation which eliminated Scotland in the 1974 World Cup, when all three of Scotland, Yugoslavia and Brazil beat Zaire while the matches between these three were drawn.  Scotland played the weak Zaire side first which meant that Brazil knew exactly how many goals they had to score against Zaire to finish above Scotland.

While the newer rules work well in some circumstances, they still give rise to situations like tonight's Spain-Croatia match.  

As a boy I loved mathematical puzzles based around sporting league tables - typically one was given an incomplete table and had to work out the results of every match.  These appealed to my twin loves of sport and logic.  I noticed however that on the final edition of Dara O'Briain's School of Hard Sums, when such a problem was presented, it didn't enthuse the students on the programme: perhaps these problems no longer have wide appeal.  Nevertheless I still enjoy the mathematics of league tables.

For example, (1) in mini-leaguers of four teams as in Euro 2012, can a team win two matches out of three and still not qualify for the next round?  (2) Can a team win one and lose two, and still not qualify?  (3)What is the smallest number of points with which a team can qualify?

Answers: (1) Yes if three teams win two matches and the fourth loses all three - this would have happened if Denmark had beaten Germany in the "Group of Death" last night, since Germany would have been out despite winning their first two matchers.  (2) Yes, if one team wins three matches and the other three each beat one of the others - this would have happened if Netherlands had beaten Portugal last night. (3) A team can go through with one defeat and two draws, if one team win all three matches and the other three matches are all drawn.

One point which was obvious when league tables were displayed during the TV coverage of last night's matches is that, under today;s tie-breaking rules, traditional league tables (recording numbers of matches, wins, draws,losses, goals for and against, and points) do not contain enough information to show which teams will qualify.  If both last night's matches had been drawn, Denmark and Portugal would each have had four points. Portugal would have finished second, and qualified, because they beat Denmark in the match between these two teams.  But one couldn't have deduced that from thr group table.  Exactly the same numbers could have arisen in the group table with Denmark beating Portugal and drawing with Netherlands, and Portugal beating Netherlands.  

So whereas once a group table gave all the information you needed to decide the order of the teams, this is no longer the case.  How could we create an improved group table which actually included all the information we need?

Sunday, 10 June 2012

A tree which is not a tree

Graph theory is perhaps my favourite area of mathematics.  It features diverse and appealing modes of reasoning, and many of the results are relatively accessible,  rather than requiring a lot of theory.  
A graph-theory tree
tree is a special kind of graph, like that above (the numbers serve only to identify the vertices).  Technically a connected graph is a tree if there is exactly one path connecting any two vertices, or equivalently the graph contains no cycle. In lay terms this means a diagram where points are joined by such a way that there is no loop.  The branching is like the branches of a tree (in the natural sense of the word) or of a family tree, as one sees in the diagram below, where one of the vertices is chosen as a root.
Rooted Tree

There are lots of results one can prove about graph-theoretical trees.  For example, a tree with n vertices must have exactly n-1 edges.  Trees are particularly useful in computer science.  But what I am interested in today is the definition.  

Books like Robin Wilson's excellent Introduction to Graph Theory motivate the definition by referring to family trees and botanical trees. Now there is a nice sycamore tree in our garden:

Our sycamore tree

But this is not a tree in the graph theory sense!  It contains a cycle:

Two squirrels could climb to the top of the tree taking different routes!  And that's impossible in a graph which is a tree, where there is only one route between any two points.

So what's gone wrong?  Why is my tree not a tree?  Why does the mathematical definition which seeks to encapsulate the "tree-ness" of trees, exclude this perfectly respectable specimen of a tree in our garden?

Well, mathematics is about abstraction.  The definition of "tree" abstracts something typical of real trees, which does reflect the nature of branching (our sycamore is slightly unusual).  It leads to a very rich branch of graph theory, with many applications.  So it's a good definition.  (But it doesn't exactly fit family trees either.  If you go back a few generations you get connections which give loops, when the same person is an ancestor in two different lines.  Otherwise we would each have far more ancestors than all the humans who have ever lived.)

Here's another illustration.  If I take a square piece of paper, it has four corners.  Let's get rid of the corners by cutting them off with scissors (cutting off a triangular piece at each corner).  I've taken four away from four: mathematically I might expect to have zero corners left, but in fact I have eight since the remaining paper forms an octagon.  So 4-4=8 in this situation.  (Peter Flom's blog has some other examples where 2+2 is not 4.)

So does maths fail?  No.  When I was younger I'd have felt that these examples of arithmetic failure were contrived, obviously artificial and not to be taken seriously.  I now think I was wrong.  These are real examples: it's mathematics that is contrived.  We abstract the rules of arithmetic from a world in which they don't always apply, to create a system which is enormously useful.  The power of mathematics comes from simplifying the real world.  The mathematical idea of a tree is powerful, even if it excludes some legitimate trees.  Arithmetic is powerful even if it doesn't apply to some real-life situations. 

So it's amusing that our sycamore tree is not a tree, but it isn't a flaw in graph theory!

Thursday, 7 June 2012

Dara's Big Mistake

I've just watched the latest (and final) episode of Dara O'Briain's School of Hard Sums, with Andi Osho again demonstrating a remarkable ability to get straight to the solution by what appeared to be inspired guesswork.  If I were doing undergraduate groupwork, I'd want her in my team!

SPOILER ALERT: If you're planning to watch this programme (it's repeated over the next few days) you may not want to read this yet as I give some hints as to what happens (though I don't give away the maths)..

The final problem involves runners going round a race-track at different speeds and asks how long it is after the start before all three runners are again together.  Dara produces reams of incomprehensible algebra to get an answer in which he doesn't have full confidence - he presents it reluctantly, convinced that he has gone wrong.  His delight when he finds his answer is correct was the highlight of the series.

I think Dara got it badly wrong.  He didn't check his answer!  His algebra was complex but it gave a solution: it would have taken only a moment to see where the three runners were at the time in question and establish that they were all together.  ((OK, only a partial check: it might not have been the first time, but it would at least have shown that his answer was plausible.)

In any maths problem, if you can check your answer is valid, it gives you confidence in your solution.  I worked for many years in the mathematical modelling of industrial plant.  There were essentially two kinds of problems.  Engineers wanted either to find the steady state of the model - the normal plant operating condition when nothing is happening to disturb it - or to find out what happened dynamically when some disturbance was applied (say if a pipe broke and the cooling system failed).

The models were essentially defined by a set of differential equations.  Finding the steady state involved finding values such that all the derivatives were zero - that is, finding a zero of a system of a couple of thousand non-linear equations.  Not easy  - sometimes the equation-solving algorithms failed and we were reduced to starting from a random initial state and running the model for a very long time in the hope that it would naturally arrive at the steady state.  But if the algorithm gave a solution, you could test it by plugging the numbers into the equations and making sure they all gave you zero.

But running dynamically involved numerical integration, and there is no simple way to check your answer!  Since numerical integration can give completely incorrect results (try integrating dx/dt = -1000x by Euler's method with step-size 0.01, say, to see output which is not only quantitatively but also qualitatively completely wrong) this is a serious problem.  (There are ways one establishes confidence in the output from numerical integration, but they aren't as simple as plugging the answers in to the equations to check that you get zero!)

Dara's algebra may or may not have been right.  But if he had checked where the runners were after the time he had deduced, he could have had a lot more confidence in his calculations.  Of course, it wouldn't have made such good TV.

Saturday, 2 June 2012

Atiyah and Villani at Tate Modern - the value of blackboards

A wonderfully stimulating afternoon at the Tate Modern gallery this afternoon, for a mathematical event in a series entitled "Topology".  The afternoon had two parts - a showing of the film "Au Bonheur des Maths" by Raymond Depardon and Claudine Nougaret, followed by discussion between Michael Atiyah and C├ędric Villani, two mathematical superstars, chaired by Ian Stewart.

The film comprised a series of short statements by mathematicians, essentially presented in black and white as talking heads, though Villani alone used the blackboard to communicate (more on this later).  They all had interesting things to say.   I was slightly puzzled by Atiyah's comment afterwards on the diversity of the mathematicians: most seemed to be white, male and middle-aged, and presenting similar views of the subject.

The discussion, ably chaired by Ian Stewart, was absorbing.  It helped that the questions from the audience were all interesting (or were made interesting by Villani and Atiyah).  It was being recorded, and I hope the result will be made widely available.

There were so many interesting ideas that I can only pick out a few that resonated with me.   One was the extent to which mathematics is a frustrating activity: 'You spend most of your time thinking "I can't understand this"' (and if that's the experience of Stewart, Atiyah and Villani, what is it for the rest of us?)  This is something that must influence the characters of mathematicians: how can you be arrogant when you are always wrestling with problems you can't solve?  It must be off-putting for undergraduates who found mathematics easy at school to find at university that there are hard problems on which they get stuck: appreciating that being stuck is the natural state of even the greatest mathematicians, and isn't a sign of one's inadequacy, might help keep them in the profession.  So I'll be quoting this to my students.

I was really impressed by the evident breadth of knowledge of both Atiyah and Villani.  They had read Kepler; Villani told us about Voltaire's comment that Kepler's achievement was a humiliation for philosophy in that such a deluded thinker could make such great discoveries.  Both seemed keen on reading the writings of the great mathematicians of the past.  Villani made fascinating comments about the different ways in which the composers Xenakis and Ligeti made use of maths, and showed how a work of Ligeti refutes a remark in Euler's writing on music. When there is such a focus on research outputs that reading outside one's immediate subject can seem like a luxury one can't afford, I found Atiyah and Villani inspirational in their range of serious interests.

There were insights into similarities between mathematicians and artists - both proceed by breaking rules (for example, non-Euclidean geometry arising by discarding a previously unchallenged axiom) and both work with a mix of individual and collaborative activity.  There was wisdom on teaching - the personality of the teacher is the most important factor, and Villani commented that saying that we need more people to study science is not the best way to encourage young scientists.

Villani was asked about the "enemies" a mathematician faces, and identified three - lack of imagination; a priori ideas, which you have to eliminate in order to make progress; and too much information, hiding the things that really matter in fog.  Stewart told a fascinating anecdote about a friend discussing three-dimensional topology over the phone with a mathematician who had been blind since birth (a remarkable idea in itself) and noting that the blind mathematician was obviously visualising the problem very clearly but without taking any single point of view - a very interesting concept.  There was discussion of ugliness in mathematics - with, I think, agreement that a good proof is about understanding, and a proof which doesn't tell you why something is true is inherently unsatisfactory.

Artists in the audience asked some of the most interesting questions.  One was about symbols - how do mathematicians choose their symbols?  Villani's answer made me aware just how important choice of symbols can be - they permit a level of automatic checking and most symbols carry meaning for the reader which helps comprehension.  (I remember struggling with the German letters symbolising group varieties in an algebra book: I couldn't immediately tell which was which and, not surprisingly, I never got on top of the material. My fault, not the authors: I should have spent time copying the letters until I could read them at a glance!)

The final question, again an insight into how the practice mathematics relates to that of the visual artist, was about the "extraordinary mark-making" mathematicians do on blackboards, and whether this can survive in the digital age. Atiyah's answer agreed on the importance of the hand in doing maths, and that communication of mathematics involves body movement, props, blackboards - the garnish that makes something palatable.  Villani, noting that blackboards are now not always available in lecture rooms, felt that the blackboard is unrivalled: much more than with a computer, one can improvise, erase, keep some material and lose other bits; it forces you not to overflow, and when  you're frustrated, you can bang your head on it!

This was an amazing afternoon, rich in provocative ideas.  The mathematicians' passion came across; there were insights about mathematics, art and music.  If these notes are garbled, it was because there was so much to think about!  I do hope the recording will be made available, even though it will probably show that I have misunderstood and misinterpreted a lot of what was said.  I have rarely spent such a stimulating afternoon in a gallery (and that is a very strong statement).