Saturday, 14 May 2011
Compromise in Wargames - (3) Probability: the Ludic Fallacy and Other Stuff
This is the last of my three posts considering the basic assumptions on which wargames depend, and the need for a commonsense approach when applying them. This one will concern itself with the gamer’s need for convenient mechanisms to simulate chance events, or events which are subject to the laws of probability. The obvious areas of focus – maybe the only important ones – are casualty rates and the maintenance of some measure of combat effectiveness during a battle. To protect my sanity a little, and save some typing, let’s call this effectiveness CE, for short, and let’s not fuss too much about how it is assessed – let’s just assume that there is such a thing.
I don’t know what the Very Beginning is, in absolute terms, but Young & Lawford’s excellent Charge! seems a Very Good Place to Start. In the opening chapter, the authors discuss the introduction of random events into wargames, mentioning topics such as Military Chess, a variant of the noble game in which a determined pawn may occasionally fight off an attack from a knight, for example. Random events – simulators of battlefield probabilities – are introduced as a characteristic of wargames.
In the basic game of Charge!, infantry fire requires the player to throw 1 normal dice for every 8 figures firing. The score on the dice gives the basic number of hits. For long range (over 3”) halve the dice score. For incomplete volleys (4 to 7 odd men), halve the dice score. Hits on gunners and cavalry are halved; for troops in cover, hits are halved. All these halves are cumulative, and adjusted hits of less than ½ a man are ignored.
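For anyone who would rather let a machine do the 2am arithmetic, the fire rule as summarised above can be sketched in a few lines of Python. This is only an illustration under my own assumptions, not anything taken from the book: the function names are invented, the halves are applied to each dice separately, a leftover group of 1 to 3 men is assumed not to fire, and a fraction of exactly ½ or more is assumed to count as a whole man (the rule as quoted only says that less than ½ is ignored).

```python
import math
import random
from fractions import Fraction


def adjusted_hits(score, halvings):
    """Hits from one dice after applying cumulative halves.

    Assumption: fractions of 1/2 or more round up to a whole man;
    anything less than 1/2 is ignored, as the quoted rule states.
    """
    value = Fraction(score, 2 ** halvings)
    return math.floor(value + Fraction(1, 2))


def volley_hits(figures, long_range=False, gunners_or_cavalry=False,
                in_cover=False, rng=random):
    """Total hits for a firing unit: one dice per 8 figures."""
    # Each condition that applies contributes one halving.
    halvings = sum([long_range, gunners_or_cavalry, in_cover])
    full_groups, odd_men = divmod(figures, 8)
    total = 0
    for _ in range(full_groups):
        total += adjusted_hits(rng.randint(1, 6), halvings)
    if 4 <= odd_men <= 7:
        # Incomplete volley: one more dice, with one extra halving.
        total += adjusted_hits(rng.randint(1, 6), halvings + 1)
    # Assumption: 1 to 3 odd men do not fire at all.
    return total


# Example: a 20-figure battalion firing at long range.
print(volley_hits(20, long_range=True))
```

Passing a seeded `random.Random` as `rng` makes a result repeatable, which is handy if you want to argue about it afterwards.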
This is a practical, standard approach to the problem – some contemporary rule writers allowed saving throws in addition, but this was the state of the art in the 1960s. The implied theory is fine – circumstances which reduce the probability of a hit (range, cover, type of target, etc) are allowed for by reducing the number of hits. Whether the numbers which result are reasonable or correct might be a very subjective judgement – we could compare the results with known recorded events from history, but the main criteria are whether the game works, and whether the players are happy with it. Charge! gives a good, rollicking game which is easy to understand, though the arithmetic can still become troublesome at 2am after a bottle of wine.
Possibly as a reaction to what had become the establishment method, some dissatisfaction began to appear among gamers who felt this was too crude, that it was not “scientific” enough. Charge! uses large units – about 60 figures to a battalion, so the relatively large numbers of dice in use would cause some averaging of the results, but people with 20-man units would be throwing 2 or 3 dice, which gives greater volatility. I can imagine some disgruntled player whose grenadier battalion had just rolled two 1s at long range, feeling this was unreasonable, that he had been cheated by the rules. He might point out that the 20 figures represent 750-odd men, who could get off something like 1500 shots in a 1 minute bound. If we know the probability of a single shot finding its target, we should really be throwing 1500 dice (or similar), which would give a much more predictable, much more even result. I would be prepared to bet that some hero, somewhere, did attempt to throw a dice for each musket shot. However, “if we know the probability” is the key phrase – in fact we don’t really, but we’ll come back to this point later.
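The averaging effect is easy to put rough numbers on. The sketch below assumes ordinary six-sided dice whose scores are simply added, ignores all the halving, and uses invented function names; it reports the standard deviation of the total as a fraction of the expected total, which shrinks with the square root of the number of dice.

```python
import math

# One fair six-sided dice: mean 3.5, variance 35/12.
DIE_MEAN = 3.5
DIE_VARIANCE = 35 / 12


def relative_spread(num_dice):
    """Standard deviation of the total score divided by its mean."""
    mean = num_dice * DIE_MEAN
    std_dev = math.sqrt(num_dice * DIE_VARIANCE)
    return std_dev / mean


def chance_all_ones(num_dice):
    """Probability that every dice comes up 1."""
    return (1 / 6) ** num_dice


for n in (2, 7, 1500):
    print(n, "dice: spread is", round(100 * relative_spread(n), 1), "% of the mean")
print("two 1s from two dice:", round(100 * chance_all_ones(2), 1), "% of the time")
```

On these assumptions a 2-dice unit scatters by roughly a third of its expected score, a 7-dice battalion by under a fifth, and the hypothetical 1500-dice volley by about 1%. The disgruntled grenadier's pair of 1s should turn up about once in 36 volleys.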
The Wargames Research Group produced their famous table – you worked out the combat factor for the kind of weapon and the circumstances, threw a dice or two, and looked up the table, and it would tell you that the target unit had lost, say, 27 men (not figures) which at 20:1 figure scale meant you’d lost 1 figure plus 7/20 of a figure. You kept a note of all the bits, and removed complete figures when appropriate, and this was widely accepted as a step forward – it was now pretty much impossible for your grenadiers to miss – they just hit very small parts of a figure, which would eventually accumulate to something which represented discernible damage. There were those of us, admittedly, who considered the extra record-keeping something of a nuisance, but progress can often have a small cost.
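The record-keeping amounts to carrying a running remainder from one volley to the next. A minimal sketch, with a made-up class name, the 20:1 figure scale from the example above as the default, and a second loss of 15 men invented purely to show the carry-over:

```python
class CasualtyLedger:
    """Tracks losses in men and converts them to whole figures removed."""

    def __init__(self, men_per_figure=20):
        self.men_per_figure = men_per_figure
        self.carried_men = 0  # losses not yet amounting to a full figure

    def record(self, men_lost):
        """Add new losses; return how many figures to remove now."""
        figures, self.carried_men = divmod(
            self.carried_men + men_lost, self.men_per_figure)
        return figures


ledger = CasualtyLedger()
# 27 men lost: 1 figure comes off and 7 men are carried forward.
print(ledger.record(27), "figure(s) removed,", ledger.carried_men, "men carried")
# A further 15 men (invented figure): 7 + 15 = 22, so another figure goes.
print(ledger.record(15), "figure(s) removed,", ledger.carried_men, "men carried")
```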
Combat losses still had some variability, but using this approach they were generally closer to expectation. An extreme case of this was developed in Arthur Taylor’s Rules for War Gaming, published by Shire Publications in 1971, which set out diceless rules; in a given situation, the casualties inflicted are always the same. I am not proposing to dismiss this approach – it was regarded as returning something of a chess-like precision and dignity to the wargames, but in its way it is just as daft as completely random results. [I used to have this book, but don’t seem to have it now – entirely out of idle curiosity, did anyone ever fight battles using Taylor’s rules?]
A big problem is that we do not actually know what the probability of a hit is – we do not know what it is in general terms, and we certainly do not understand the variations from man to man, from moment to moment. I remember that, like a lot of other gamers, I used to search for some clues which might give some evidence of what hit rates really were in history – just something factual to hang a hat on.
Contemporary diarists like George Simmons (95th Rifles) would occasionally give a tantalising glimpse of the reality – he might say that in a smart skirmish with the French outposts his company lost, say, 5 men wounded and 1 killed, which was considered light in view of the severity of the fighting. Very clearly, Simmons had some view of what sort of casualties you might suffer on such an occasion – it would not be a probability calculation or a dice throw, it would be what his experience led him to expect, and he probably could not tell you what the expected number was, just when it seemed heavy or light to him. That’s entirely subjective, but at least he knew what he was talking about, which most of us patently do not.
I was thrilled to bits when Bill Leeson translated and published Von Reisswitz’ Kriegspiel in the early 1980s. I was fascinated by a number of aspects of the game and the book, but in particular I spent many hours poring over the tables – here, at last, was something entirely relevant to horse and musket warfare, written by serving soldiers in the Prussian Army, no less – guys who would certainly know what was what. I confess I was surprised that the hit rates were so high – I would be reluctant to say I viewed them with suspicion, but Kriegspiel was bloodier than I had expected. That was when I first started to have doubts about how helpful actual casualty returns are when constructing wargame rules. [It’s appropriate to remind ourselves that Kriegspiel is alive and well, and nurtured these days by the splendid chaps at TooFatLardies.]
Let’s go back to my nice new CE acronym – if I find that the 50th Foot have a casualty return of 74 all ranks at some battle or other, out of a morning strength of 428, does that mean that their CE was reduced to 82.7% of what they started with? Well, 74 and 428 are definitely real, official looking numbers, and it’s tempting to use them in this way, but it doesn’t seem very likely, does it? We’ve had some discussion of this in this blog before – when a unit is fired on, over and above the initial problem that we don’t fully understand the maths which would give us the likely number of hits, what happens to the target’s CE, as I have chosen to call it? Some of the men will be physically disabled – some permanently – and some slightly hurt; some of them will be shocked into a state of reduced capability, some will be discouraged – some may even be discouraged enough to seek a change of location to somewhere less stressful. A unit of Prussian guard might be so outraged by the insult that their performance is actually enhanced; a unit of Napoleon’s 16-year-old Marie-Louises might suffer no loss at all, but be so upset by being fired at that they take no further part. Almost anything is possible – as we have discussed before, the concept of morale is central to this, the level of optimism in the army, the fact that they may be fighting on home soil for their liberty, the inspirational qualities of their leaders, the level of training and experience of the troops, their physical state, the weather (probably) – and so on.
So if Von Reisswitz reckons that a combat will result in a number of losses, probably what he means – or should mean – is that the effect of the combat is a reduction in CE equivalent to the loss of this number of men. Whether or not this number of men actually make it into the casualty returns is of no interest at all until we work out strengths at the end of the day to feed back into our campaign. Separate issue.
To those of us who have ever felt a temptation to snort at Little Wars’ simple blood-bath melees, in which equal sized units simply eliminate each other, just think – what are the chances of an evenly matched melee leaving the winners in a position to do much else for the remainder of the day? They are not dead, they are merely resting.
The big godsend to everyone with this sort of appetite for numbers was Maj-Gen BP Hughes’ Firepower, which was published in 1974. The timing was spot-on, and it presented a lot of fascinating and authoritative material in a readable and understandable way. I still think this is a great book, though I am a little saddened by the fact that some writers have used it subsequently to justify some pretty crazy extrapolations from the factual bits.
Hughes describes field trials of artillery pieces, and I would love to see contemporary pictures of the trials being carried out. Case shot, for example, was fired at a number of ranges at a large (battalion-sized) canvas screen, to estimate numbers of hits at various ranges. Brilliant. I have a lovely vision of gentlemen with large moustaches, solemnly marking off the holes in the sheet with the official crayon, to avoid double counting, and presenting a double-checked return to the officer in charge (lots of saluting and stamping boots). The Army would be in its element, ordering some poor grunt to count holes.
Hughes reports similar trials with various kinds of artillery projectile and small arms volleys, and painstakingly tabulates and explains the results. He also spends some time discussing the shortcomings of the data, and he examines Albuera, Talavera and a couple of other battles by analysing losses and the estimated effect of fire. Excellent.
One of the parts which most of the wilder enthusiasts did not read was Chapter 3 – Inefficiencies of the battlefield. In this he points out that the trials were designed to examine the optimal capabilities of the weapons, not to estimate their effectiveness in battle. The test circumstances were abstract, artificial, calm. Everyone would be on his best behaviour, the best gunners would be selected, all distractions would be eliminated, and anything which did not work would, presumably, be repeated. In a real battle, Hughes says, other elements would come into play which would change the situation out of all recognition:
1. The “animate” target – not only would they be moving and taking shelter, but the beggars might even shoot back
2. Technical failures – this includes routine misfires as well as more dramatic failures
3. Human error – now you’re talking – the sergeant can try to make you fire, but he can’t make you hit anything
4. The nature of the ground – unfavourable slopes, hidden areas, cover, variable bounce
5. Ammunition – the need to conserve it, and the variable quality of its manufacture and condition
6. Smoke – we think they’re out there somewhere...
What relevance do the field trials have to actual battle experience, then? Probably not very much, in truth.
While we are on this topic of the hopelessness of estimating probabilities of a hit, it seems appropriate to introduce a gentleman named Nassim Nicholas Taleb. He is a writer, quite a celebrity, in fact, and variously regarded as anything from a guru to one of the most irritating men around. I cannot claim to be an expert on his work, though what I know of him suggests that he has the rare gift of being able to present a limited number of important ideas in sufficiently different ways, with different wording, to allow him to publish a surprising number of books featuring them. I recall that Edward De Bono used to be adept at the same strategy, but that was some years ago, and is, in any case, a digression. This is not to say, of course, that the ideas are incorrect – merely that over-exposure does not seem to improve their level of general acceptance.
In his The Black Swan: The Impact of the Highly Improbable (Penguin, 2008), Mr Taleb makes the important point that mathematical models are unreliable for anything other than artificially simple games of chance and similar. Basically, what he says is correct, which is faintly disappointing for sad souls like me who spent years working with models to perform stochastic testing on populations, funds, stock markets and the like. He coins the expression Ludic Fallacy to describe the practice of treating the real world as if it were such a game, which he sees as inaccurate and even dangerously misleading – his main target is the world of finance. He argues that economists, fund managers and investment analysts who grow to trust computer models set themselves up for catastrophic disillusionment and failure, since the model will not cover everything.
The world, says Taleb, is a dirty place, in which the things we do not know, or cannot measure, or (most importantly) just haven’t thought about will swamp the things which we can actually calculate. Tinkering with the decimal places of how many canister balls hit the canvas screen is worse than pointless when trying to simulate real battle action, when the numbers will be changed out of recognition by a whole raft of interacting intangibles, most of which we cannot predict or even fully understand. We may be doing our best with what we can actually get a numerical handle on, but we are – to quote my grandmother yet again – whistling into a gale.
Even the simple world of games is not clean. The chance of a head (or an eagle, or a zarg, or whatever) when tossing a coin is one half – 50% – every schoolboy knows this. If a coin turns up four tails in a row, what is the chance of a head? Again, the theory says it is still 50% – in an infinite series of tosses of our coin, we would expect 50% of the results to be heads, but 4-on-the-trot is a very small sample, and not significant. OK then – what about 99 tails in a row? What then? Well, 99-on-the-trot is not very likely, but it can happen, and the theory reassures us that there is still a 50% chance of a head on the next toss. However, at this point, you or I – or even a statistician – would start to suspect that the coin is dodgy, and tend to bet on another tail next time.
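That suspicion can be given a rough numerical shape. The sketch below is my own illustration, with every number invented: it assumes that before tossing we are 99.9% sure the coin is fair, that the only alternative is a doctored coin which comes up tails 90% of the time, and it applies Bayes' rule after a run of tails.

```python
def chance_of_head_next(tails_in_a_row, prior_fair=0.999, dodgy_tail_rate=0.9):
    """Probability of a head on the next toss after a run of tails.

    Assumptions (all invented for illustration): the coin is either
    fair or a dodgy one with a fixed tail rate, and prior_fair is our
    belief that it is fair before any toss.
    """
    # Likelihood of the observed run under each hypothesis.
    like_fair = 0.5 ** tails_in_a_row
    like_dodgy = dodgy_tail_rate ** tails_in_a_row
    # Bayes' rule: updated belief that the coin is fair.
    weight_fair = prior_fair * like_fair
    weight_dodgy = (1 - prior_fair) * like_dodgy
    posterior_fair = weight_fair / (weight_fair + weight_dodgy)
    # Average the head probability over the two hypotheses.
    return posterior_fair * 0.5 + (1 - posterior_fair) * (1 - dodgy_tail_rate)


for run in (0, 4, 99):
    print(run, "tails in a row ->", round(chance_of_head_next(run), 3))
```

With these made-up figures, four tails barely shifts the answer from 50%, while 99 tails pushes it down to about 10%, which is simply the dodgy coin's own head rate. The theory about a fair coin is untouched; what has changed is our belief that this particular coin is a fair one.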
So where does that leave us? To be honest, I’m not entirely sure. I was brought up to trust in the purity of mathematics, but I can appreciate that calculating, for example, the effect on a raw battalion of a single volley is beset with all sorts of unknowns and things that can vary wildly from instance to instance. The WRG might expect them to lose an average of 4 figures plus 11/20 of a figure, give or take a few; even Rifles officer Simmons would have had some kind of expectation of that sort, but I suspect the fact of the matter is that a volley of 300 muskets in clear conditions at 100 paces might be expected to injure about 80 men (say), but the standard deviation is high, because of the unstable nature of the underlying probabilities, and the mixture which they present. It was not unknown for such a volley to hit no-one at all, and there must be a very slight chance that 200 men could be laid low.
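To see how much difference the unstable probabilities make, compare two toy models of that volley. Both are sketches on my own assumptions, not anything drawn from Hughes or the WRG. In the first, every one of the 300 shots has the same fixed chance of a hit (80 in 300). In the second, the hit chance for the whole volley is itself drawn afresh each time from a Beta distribution with the same average; the shape parameters 2 and 5.5 are simply picked for illustration.

```python
import math


def binomial_sd(shots, hit_chance):
    """Standard deviation of hits when every shot has the same fixed chance."""
    return math.sqrt(shots * hit_chance * (1 - hit_chance))


def beta_binomial_sd(shots, a, b):
    """Standard deviation of hits when the volley's hit chance is itself
    random, following a Beta(a, b) distribution (mean a / (a + b))."""
    variance = (shots * a * b * (a + b + shots)) / ((a + b) ** 2 * (a + b + 1))
    return math.sqrt(variance)


SHOTS = 300
A, B = 2, 5.5  # invented shape parameters; mean hit chance = 2 / 7.5 = 80 / 300

print("expected hits:", SHOTS * A / (A + B))
print("fixed chance, sd:", round(binomial_sd(SHOTS, A / (A + B)), 1))
print("varying chance, sd:", round(beta_binomial_sd(SHOTS, A, B), 1))
```

Both models expect 80 hits, but the first gives a standard deviation of under 8 men, which makes a volley that hits no-one effectively impossible. The second gives about 46, which leaves room for both the occasional blank volley and the rare 200-man disaster.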
We need mechanisms which give results which can be seen to be reasonable over extended experience of their use in gaming. The mechanisms should be simple to use, and they should allow a fair amount of variance – maybe more than the scientific wargamers would have claimed. We should give due weight to factors like first volley of the action (perfect loading under the NCO’s eye), and the steadiness and calibre of troops, but what exactly is due weight? Maj-Gen Hughes and our new friend Mr Taleb would agree that the things for which we cannot come up with exact numbers probably overwhelm the things for which we can.
You know what? The game is the most important thing – paramount. The more I think about this, the more attractive are the rules in Charge!, which seems a Very Good Place to End.