Technology continues to play an integral part in sports, and cricket is no exception. Machine learning in particular is set to play an important role in sports analytics.
We believe that Machine Learning can be used to analyze historical cricket games, and that the results can be used to continuously improve the Duckworth Lewis (D/L) Method of computing target scores in rain-shortened matches.
The current D/L method is a statistical method invented by statisticians Frank Duckworth and Tony Lewis. It is designed to calculate the target score (or the par score) that the team batting second needs to achieve in a rain-interrupted match. Today, two editions of the D/L model are available to the cricket community: the Standard Edition and the Professional Edition. The Standard Edition is a chart-based model used for non-ICC and local matches. The Professional Edition is a software-based black-box model, and is used by the ICC for all official matches.
After Duckworth and Lewis retired, Professor Steven Stern (from the Queensland University of Technology) became the custodian of the method, and it was renamed the Duckworth-Lewis-Stern method (or D/L/S method). Much of the existing literature continues to refer to it as the D/L method.
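To make the chart-based Standard Edition concrete, here is a minimal sketch of how a target is usually derived once the two resource percentages have been read off the chart. The function name, parameter names, and the default G50 value of 245 are assumptions made for illustration; the resource percentages themselves must come from the published table, which is not reproduced here.

```python
import math


def standard_edition_target(team1_score, r1, r2, g50=245):
    """Sketch of the Standard Edition target for the team batting second.

    team1_score: runs scored by the team batting first
    r1, r2:      resource percentages available to team 1 and team 2,
                 read from the D/L chart (not computed here)
    g50:         average 50-over score used when team 2 has MORE resources;
                 245 is an assumed default, check the current playing conditions
    """
    if r2 <= r1:
        # Team 2 has fewer (or equal) resources: scale the score down.
        par = team1_score * r2 / r1
    else:
        # Team 2 has more resources: add runs for the extra resources.
        par = team1_score + g50 * (r2 - r1) / 100.0
    # The target is one run more than the par score, rounded down.
    return math.floor(par) + 1


# Hypothetical example: team 1 made 250 using 100% of its resources,
# and rain leaves team 2 with only 80%.
target = standard_edition_target(250, 100.0, 80.0)
```

In this hypothetical example the par score is 200, so team 2 would need 201 to win.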
Improving the D/L Method
The D/L table is static and does not take into consideration the latest game statistics (e.g., which teams are playing better this season, ranking of players, etc.).
We believe we can use historical Twenty20 data to derive an always up-to-date D/L table that takes these latest statistics into account. This can be operationalized using Azure Machine Learning and run on a frequent basis so that the D/L table is always current.
To achieve this, we analyze the T20 data from http://cricsheet.org/, which provides ball-by-ball data for international and IPL cricket matches. The historical T20 data covers about 620 matches, or roughly 153K rows of ball-by-ball data.
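As a rough illustration of how ball-by-ball rows can be turned into training points for a D/L-style table, the sketch below walks through one innings and records, at the start of each over, how many overs remain, how many wickets have fallen, and how many runs the team went on to score. The dictionary keys ('over', 'runs', 'wicket') and the function name are hypothetical and do not reflect the actual cricsheet schema; real data would first need to be mapped into this shape.

```python
def innings_samples(deliveries, total_overs=20):
    """Turn one innings of ball-by-ball rows into (overs_left, wickets_lost,
    runs_still_to_come) samples, one per over.

    deliveries: list of dicts in bowling order, each with hypothetical keys
        'over'   - 0-based over number
        'runs'   - total runs off the delivery, extras included
        'wicket' - True if a wicket fell on the delivery
    """
    final_total = sum(d["runs"] for d in deliveries)
    samples = []
    scored = 0
    wickets = 0
    seen_overs = set()
    for d in deliveries:
        if d["over"] not in seen_overs:
            # First ball of a new over: snapshot the state before it is bowled.
            seen_overs.add(d["over"])
            samples.append((total_overs - d["over"], wickets, final_total - scored))
        scored += d["runs"]
        wickets += int(d["wicket"])
    return samples


# Tiny made-up innings fragment, for illustration only.
example = [
    {"over": 0, "runs": 1, "wicket": False},
    {"over": 0, "runs": 4, "wicket": False},
    {"over": 0, "runs": 0, "wicket": True},
    {"over": 1, "runs": 2, "wicket": False},
    {"over": 1, "runs": 0, "wicket": False},
]
samples = innings_samples(example)
```

Pooling such samples across all innings gives, for each wickets-lost value, a cloud of (overs left, runs to come) points to which a curve can be fitted.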
Using a Jupyter notebook, we walk through the data exploration and show how a better D/L table can be derived by applying constrained quadratic curve fitting to the T20 data. This Jupyter notebook is now available to the data science and cricket communities so they can take this foundational information and work together to improve the state of cricket analytics.
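The idea of constrained quadratic curve fitting can be sketched as follows. The text does not spell out which constraints the notebook uses, so this example assumes just one: the curve must pass through the origin, since a team with zero overs left has no resources. That constraint is enforced by fitting y = a*u^2 + b*u with no intercept term. The function names and the synthetic data points are made up for illustration; further constraints, such as monotonicity or separate curves per wickets lost, would typically call for a constrained optimizer.

```python
def fit_quadratic_through_origin(overs_left, runs_to_come):
    """Least-squares fit of y = a*u**2 + b*u (intercept fixed at zero).

    Solves the 2x2 normal equations directly:
        a*sum(u^4) + b*sum(u^3) = sum(u^2 * y)
        a*sum(u^3) + b*sum(u^2) = sum(u * y)
    """
    s_u2 = sum(u ** 2 for u in overs_left)
    s_u3 = sum(u ** 3 for u in overs_left)
    s_u4 = sum(u ** 4 for u in overs_left)
    s_uy = sum(u * y for u, y in zip(overs_left, runs_to_come))
    s_u2y = sum(u * u * y for u, y in zip(overs_left, runs_to_come))
    det = s_u4 * s_u2 - s_u3 ** 2
    if det == 0:
        raise ValueError("need at least two distinct non-zero overs values")
    a = (s_u2y * s_u2 - s_u3 * s_uy) / det
    b = (s_u4 * s_uy - s_u3 * s_u2y) / det
    return a, b


def resource_percentages(a, b, total_overs=20):
    """Express the fitted curve as a percentage of a full innings."""
    full = a * total_overs ** 2 + b * total_overs
    return {u: 100.0 * (a * u * u + b * u) / full for u in range(total_overs + 1)}


# Synthetic points lying exactly on y = -0.2*u^2 + 12*u (not real match data).
overs = list(range(1, 21))
runs = [-0.2 * u * u + 12.0 * u for u in overs]
a, b = fit_quadratic_through_origin(overs, runs)
table = resource_percentages(a, b)
```

A negative quadratic coefficient, as in this synthetic example, gives the diminishing-returns shape one would expect: each additional over remaining adds slightly fewer resources than the one before it.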