Sunday, November 26, 2006

One-Day Batting: Part 2

As a nation mourns, professional parliamentarians act suitably outraged, and the cricinfo page for a lowly Ranji Trophy match gets frantically loaded and refreshed on desi desktops across the world, it is time for me to take a second look at one-day batting numbers.

When I introduced the ground-breaking concept of batting productivity, I duly noted the dominance of Australians. Until Ponting and Gilchrist started to explode, Australian one-day batting, unlike its subcontinental or Caribbean counterparts, used to be more about solidity and consistency than flair and bursts of unpredictable greatness. In order to get some quantitative support for our intuitive sense of predictability, I always wanted to take a look at the standard deviation of the scores of major batsmen and finally managed to excel it this weekend from cricinfo innings-by-innings data.

Here is a comparison of the top ten Indian and Australian batsmen, selected by productivity and ranked by averages in the spreadsheet. Yes, I am sad to see Kaif and Vengsarkar among the top ten -- Laxman and Manjrekar narrowly lost to them -- but the scarier fact is that there are no other batsmen even in the reckoning except those twelve. The Australian list, on the other hand, is missing Boon, Border, Slater, Taylor, Moody, Hussey and Clark among others. But that is a different blog post.

(Click on the chart for a full-size version)

The average of the Australian top ten is about ten percent greater than that of the Indian top ten -- a number slightly higher than what I would have guessed. However, a lot more interesting is the fact that, in spite of their higher averages, the Australian batsmen have a lower standard deviation than their Indian counterparts. As the means are different, we need to look at the coefficient of variation -- the ratio of standard deviation to mean -- expressed as a percentage in the spreadsheet. The greater the coefficient of variation, the lower the predictability. A higher average and a lower standard deviation together give Australian batsmen a significantly higher predictability.
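For readers who would rather script this than excel it, here is a minimal sketch of the coefficient of variation calculation. It is not the spreadsheet itself: the function name and the two lists of scores are made up for illustration, it treats every innings as a plain score (not-outs are ignored), and it assumes the sample standard deviation, which may or may not be the flavour used in the spreadsheet.

```python
from statistics import mean, stdev


def coefficient_of_variation(scores):
    """Return the coefficient of variation of innings scores, in percent.

    Assumption: sample standard deviation divided by the simple mean of
    all innings scores, with not-outs treated like any other score.
    """
    avg = mean(scores)
    if avg == 0:
        raise ValueError("mean score is zero; coefficient of variation is undefined")
    return 100.0 * stdev(scores) / avg


# Hypothetical scores, not real career data: both batsmen average 40.
steady = [30, 40, 50, 40, 40]
streaky = [0, 100, 5, 90, 5]

print(f"steady:  {coefficient_of_variation(steady):.1f}%")
print(f"streaky: {coefficient_of_variation(streaky):.1f}%")
```

Both made-up batsmen have the same mean score, but the streaky one has a far larger coefficient of variation, which is exactly the lower predictability described above.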

The numbers do provide some support for our intuitive feelings about which batsmen are more consistent than others, i.e. who are more likely to match their expected scores. For example, Dravid is expectedly more reliable than Ganguly, Tendulkar or Sehwag, but perhaps surprisingly is not quite as predictable as Azhar or Jadeja were. On average, middle-order batsmen have lower standard deviations than the openers. This is expected, as openers have more overs at their disposal to score a really big one from time to time. On the other hand, they are also a lot more likely to receive unplayable deliveries and face aggressive field settings, resulting in a cheap dismissal. However, I think the comparison between the sets of Australian and Indian batsmen is a fair one, as all batting positions are almost equally represented in the two sets, which are also comparable mixes of batsmen from different time periods.

Apart from the batting order, another explanation of differences in standard deviations could be the playing style. If that is indeed the case, one would expect that to show up in career strike rates. The unpredictability of Gilchrist, Tendulkar or Symonds is at least partially a consequence of their aggressive batting style, which invariably involves taking more risks, but it is less so for Ganguly, Sidhu, Mark Waugh or Kaif. Risk effectiveness -- the ratio of strike rate to coefficient of variation -- can be used as an indicator of how much of a batsman's inconsistency is a result of taking risks and can be excused because of his high strike rate. One thing that stands out, and might to some extent explain why Ganguly -- the batsman -- is such a polarizing figure among Indian cricket followers, is the fact that his risk effectiveness, along with Sidhu's, is the lowest among this set of twenty batsmen. This is so even though his average is extremely high and he is at a very creditable twentieth position in the all-time all-country productivity ranking, which does take strike rate into account. It is the combination of a low career strike rate and a high variation in scores that possibly leads to heated arguments and selective memory about his one-day batting record. As an aside, the average strike rate of the top ten Australian batsmen is about five percent greater than that of the top ten Indian batsmen.
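The risk effectiveness ratio is simple enough to sketch in a few lines. This is only an illustration under stated assumptions: the function name is mine, the coefficient of variation is taken in percent as in the spreadsheet, and the two sets of figures below are invented, not any real batsman's numbers.

```python
def risk_effectiveness(strike_rate, cv_percent):
    """Return strike rate divided by coefficient of variation.

    Assumption: strike_rate is runs per 100 balls and cv_percent is the
    coefficient of variation expressed in percent.
    """
    if cv_percent <= 0:
        raise ValueError("coefficient of variation must be positive")
    return strike_rate / cv_percent


# Hypothetical figures for two batsmen with similar inconsistency.
aggressive = risk_effectiveness(strike_rate=95.0, cv_percent=110.0)
sedate = risk_effectiveness(strike_rate=72.0, cv_percent=105.0)

print(f"aggressive: {aggressive:.2f}")
print(f"sedate:     {sedate:.2f}")
```

With roughly the same variation in scores, the quicker scorer gets the higher number, so his inconsistency is more easily excused; the slower scorer ends up near the bottom of the list.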

Finally, ceteris paribus, is it desirable to have a lower standard deviation in individual scores? I think so, and I think it is so because of the importance of partnerships. When the variation within a team is high, the chances of multiple batsmen firing in the same match are lower, and that implies fewer big partnerships. As an example, the brilliancies of Gilchrist and Ponting are cherished a lot and the outlier Bevan's incredible records are deservedly celebrated, but Martyn and Lehmann -- with strike rates close to 80, very low standard deviations and very high 'risk effectiveness' numbers -- and their partnerships with the likes of Gilchrist, Ponting and Bevan have not been any less significant to Australian dominance.


vivek said...

Great deal of good work you have done with these 2 posts :)

Hope you will fine-tune it further and let us know!

Dipanjan said...

Thanks Vivek. I have some ideas about fine-tuning - excluding performance against minnows, trying to weigh home/away scores, trying to factor in the fact that batsmen do not face the bowlers of their own countries.

Not sure if I will get to it though, maybe someone else will. Excel and the CricInfo database make life a lot easier for amateur cricket statisticians. Still remember the childhood days of copying scorecards from newspapers onto the pages of maths exercise books before the stack of papers was bartered for utensils.

Anonymous said...

Hi Dipanjan,

Your painstaking effort is highly commendable.

Are you a statistician like the late B B Mama or Mohandas Menon ?

Hope I can ask you about any interesting statistics.

Anonymous said...

The deconstruction of Ganguly's performance (high variation of scores, not-so-high strike rate) is undoubtedly the most concise and comprehensive I have come across! Usually, any analysis of his performance is too mired in parochialism and jingoism to make cricketing sense.
(in fact, I am guilty of this myself!)

You should be posting much more often!
