This article contains information concerning the mathematics involved in the spreadsheets which are described in Part 1. That article was revised on 5 March 2017 and the spreadsheets were revised on 19 February 2017. It is assumed that the readers of this article have some knowledge of statistical mathematics.
Any row which is concerned with only partial fulfilment of a purchase order is weighted accordingly.
For each row, for the purpose of testing the null hypothesis that the dates are correct, I assumed a lognormal distribution of supplier lead times, an example of which is shown in the following graph:
The estimation of the natural logarithms of the mean and standard deviation of the lead times was done separately for each row using all rows except the one concerned. That was necessary in order to prevent outliers or multi-modal lead time distributions from severely adversely affecting the hypothesis testing. Another thing which I did for the same reasons was to base the estimate of the standard deviation of the logarithms of the lead times on mean absolute deviation (MAD) because of its robustness.
Cells L4 and Q5 were calculated using the fact that the variance of the sum of independently and identically distributed random variables is equal to the sum of the variances of the individual variables. For manually set mean supplier lead times, it is assumed that the amount of historical data used in setting them is the same as in the spreadsheet.
The testing which I have done appears to show that the techniques which I have used are effective at ensuring that suspect lead times are flagged, either immediately or after the initially flagged rows are checked and corrected.