An Empirical Investigation of Lead Time Distributions

Abstract. This paper proposes a methodology for analyzing lead time behavior. The method focuses on identifying whether lead times are in fact independent and identically distributed (i.i.d.). It combines time series analysis, the Kolmogorov-Smirnov test for equality of distributions, and data sampling to arrive at its result. The method is applied to data obtained from a manufacturing company. The conclusions are that while the lead time to customers can, for some products, be assumed to be i.i.d., this is not uniformly true. Some products' lead times are in fact neither independently nor identically distributed.


Introduction
Supply chain management has long been one of the leading topics in both management practice and academia. The focus is the management of activities across a chain of companies, partitioning the supply chain into a number of echelons. As in all fields, there are a number of assumptions built into the way the supply chain management literature addresses planning and control activities (Otto and Kotzab, 2003). This paper investigates lead times and the associated assumptions found in the supply chain literature.
In practice, most research into supply chain management assumes that lead times take one of two forms. Either lead times are assumed to be constant (see e.g. Chen et al. 2000) or they are assumed to be i.i.d. (Kim et al., 2006; Michna et al., 2013). Rather than attempt to model the impact of a given type of lead time behavior on a supply chain, this paper investigates the actual lead time behavior in a manufacturing company. The aim is to identify the actual lead time behavior and subsequently, in further research, model this behavior and its impact on supply chain performance.
The remainder of the paper is structured as follows. First, a brief literature review of the current state of supply chain management with regard to modeling lead time behavior is presented. Second, a method for analyzing the distributions of lead times is presented and then applied to data from a company. Finally, implications from the case investigation and future avenues of research are presented.
Otto and Kotzab (2003) define six specific approaches to supply chain management with varying focus. This paper adopts the perspective Otto and Kotzab (2003) term Systems Dynamics and the papers related to it. Systems Dynamics is characterized by a focus on the distortion of demand patterns for various reasons (demand forecasting, non-zero lead time, supply shortage, order batching, and price fluctuation (Duc et al., 2008)), and typically this distortion is quantified by the bullwhip effect (the variance of downstream orders divided by the variance of upstream demand, e.g. Chen et al. (2000)).

State-of-the-art
This research is narrowly focused on lead times and the assumptions built into analytical or simulation-based models for determining the bullwhip effect in a given supply chain. It is also based on the notion that IT can be used to estimate and register lead times in real supply chains (Arshinder and Deshmukh, 2008). Chen et al. (2000) is one of the main contributions to quantifying the bullwhip effect in supply chains; however, in that work lead times are assumed to be constant. The same goes for the control theory approach used in Dejonckheere (2003). This is not the case in the more recent work of Duc et al. (2008), where the bullwhip effect is quantified for a system with stochastic lead times, assumed to be stationary and i.i.d. The same assumption is found in the work by Kim et al. (2006), who use an analytical approach similar to Chen et al. (2000) and Duc et al. (2008). This is very significant, as Chatfield et al. (2004) note that stochastic lead times are a major source of bullwhip effect. Kim et al. (2006) choose another approach by assuming that both demand and lead times are stochastic, but rather than predicting demand and assuming stochastic lead times, they choose to predict lead time demand. Another interesting question, addressed by Chaharsooghi and Heydari (2010), is which has the larger effect for a specific supply chain: reducing the average lead time or reducing the variance of lead time. Using simulation and multivariate models, Chaharsooghi and Heydari (2010) conclude that lead time variance is in fact a major cause of the bullwhip effect. This underlines the importance of determining actual lead time behavior in order to reduce the bullwhip effect in supply chains. If lead times are not in fact i.i.d., the bullwhip effect will in all likelihood be higher than expected, and also higher than standard models are able to explain. Nielsen et al. (2010a) offer an approach to improve supply chain planning through the use of RFID technology to track products and thus provide higher-quality information, while Sitek and Wikarek (2013) offer approaches for improving planning through optimization. The main issue in applying optimization methods to supply chain management is the quality of the information available to the methods.
It is interesting to note that very little research actually investigates lead times, despite the fact that considerable research has been conducted assuming a given lead time behavior. This research addresses this gap and suggests a method for determining whether lead times are in fact i.i.d.

Method of analysis
The method of analysis is composed so that it addresses both the independence of observations (in this case investigated only as whether L_i is independent of L_j, where L_j is a lead time observation at an arbitrary lag relative to L_i) and whether any set of observations stems from the same distribution as any other set.
The data used is from a manufacturing company and represents the sales of the ten most frequently sold (in terms of number of order lines) make-to-stock products over a two-year period. The most frequently sold product has 6,967 orders in the period and the tenth most sold has 2,158 orders, so no testing is done on fewer than 2,158 observations. The data is cleaned only insofar as any lead time more than six standard deviations from the mean is removed, since the test methods tend to overfit to the few extreme observations. The outliers are removed in one step.
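The six-standard-deviation cleaning rule can be sketched as follows; this is a minimal illustration assuming the lead times are available as a simple numeric series (function and variable names are illustrative, not taken from the study):

```python
import numpy as np

def trim_outliers(lead_times, k=6.0):
    """Remove lead times more than k standard deviations from the mean.

    Single-pass trimming, as described in the paper: the mean and
    standard deviation are computed once and the outliers are removed
    in one step (they are not recomputed iteratively).
    """
    x = np.asarray(lead_times, dtype=float)
    mu, sigma = x.mean(), x.std()
    return x[np.abs(x - mu) <= k * sigma]

# Illustrative use: a single extreme value is dropped, the rest kept.
clean = trim_outliers([5] * 100 + [1000])
```

Because the cut-off is computed only once, a second pass over the cleaned data could in principle flag further observations; the one-step choice mirrors the procedure stated above.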
The analysis methodology addresses both the question of identical distribution and the question of independence. The literature combines these two assumptions, and the methodology must therefore address both. This means that two hypotheses must be tested:
1) Lead times are identically distributed, i.e. for any given practical purpose it is possible to assume that the lead time distribution is the same over a given planning horizon.
2) Lead times are independently distributed. This may have different implications, but is typically taken to mean that a lead time observation does not depend on any previously observed lead time value. It may, however, also be taken to mean that lead times do not depend on any other variable. In this research only the first aspect is investigated.
To investigate both these aspects the analysis methodology shown in Figure 1 is used (notation in Figures 1 and 2: SL, a sample of lead times; Z, the sample size of a sample of lead times; J, the number of pairs for pairwise comparison). The Kolmogorov-Smirnov (KS) test is used to determine whether or not the lead times stem from identical distributions. The KS test is a widely used, robust test for identical distributions (Conover, 1971) and does not suffer from some of the weaknesses of other tests such as the Chi-squared test. The method (as seen in Figures 1 and 2) relies on comparing samples of lead times and using the KS test to determine whether or not these pairwise samples are identically distributed. In this research a 0.05 significance level is used, and the ratio of pairwise comparisons that pass this significance test is the output of the analysis. Different sample sizes are used to determine whether the lead times can be assumed to be similarly distributed in smaller time periods, and thus whether it is fair to sample previous lead time observations to estimate lead time distributions for planning purposes. For the (in)dependence tests the Box-Jenkins autocorrelation function is used (Box and Jenkins, 1976). In this case the test of independence is run on the mean lead time achieved on any given actual delivery date. The mean value is used because it makes no sense to order lead times for individual orders chronologically in time intervals smaller than one day; for all practical purposes it is not reasonable in most manufacturing environments to discuss manufacturing lead time in intervals smaller than whole days, and the data is not sampled at that level of detail. Only the first lag is considered, as no systematic behavior is expected beyond piecewise increasing/decreasing lead times in the particular case study. In other contexts, seasonality or customer order cycles could be incorporated and other lags should be included in the analysis.
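The two building blocks of the methodology can be sketched as below. The pairing scheme for the KS comparisons is an assumption here (random disjoint samples are drawn; the paper does not state whether pairs are formed randomly or from chronological blocks), and all names are illustrative:

```python
import numpy as np
from scipy.stats import ks_2samp

def ks_pass_ratio(lead_times, sample_size, n_pairs=1000, alpha=0.05, rng=None):
    """Ratio of pairwise KS comparisons that do NOT reject identical
    distributions at significance level alpha.

    For each pair, two disjoint samples of `sample_size` observations
    are drawn from the lead time series and compared with the
    two-sample KS test; a p-value above alpha counts as a pass.
    """
    rng = np.random.default_rng(rng)
    x = np.asarray(lead_times)
    passes = 0
    for _ in range(n_pairs):
        idx = rng.choice(len(x), size=2 * sample_size, replace=False)
        a, b = x[idx[:sample_size]], x[idx[sample_size:]]
        passes += ks_2samp(a, b).pvalue > alpha
    return passes / n_pairs

def lag1_autocorrelation(daily_mean_lead_times):
    """First-order (lag-1) autocorrelation of the daily mean lead
    times, in the Box-Jenkins sense."""
    x = np.asarray(daily_mean_lead_times, dtype=float)
    x = x - x.mean()
    return float(np.sum(x[:-1] * x[1:]) / np.sum(x * x))
```

For an i.i.d. series the pass ratio should sit near 1 - alpha (about 95%), which is what the benchmark study below against known distributions makes precise.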

Case application of method
The following section explores the results of the case study, with the results from applying the methodology to the case data shown in Figure 3.

Fig. 3. An overview of first order autocorrelation
As seen in Figure 3, the first order autocorrelation values for the ten products range from 0 to 0.4. For practical purposes it would be reasonable to assume that lead time sets with first order autocorrelation values below 0.2 are in fact independently distributed. Out of the ten products, six have first order autocorrelations below 0.2, while the remaining four have first order autocorrelations in the interval 0.25-0.42, indicating some form of dependence in the mean values of the lead times. It is also interesting to note that all the first order autocorrelation values are positive, indicating that to some extent large average lead times are followed by large average lead times. This suggests that the average lead times may be linked to capacity issues, despite the products being purely make-to-stock. If capacity constraints are periodically present, one would expect to find that the average lead times are in fact dependently distributed, with positive autocorrelation values for a number of lags. In this respect it is also interesting to note that the four products that are not independently distributed are the third, fourth, seventh and tenth most sold products; so there is no indication that the number of observations is a relevant criterion when determining whether a product's lead times are independently distributed. Furthermore, it should be noted that the products are all produced on the same production line, with very similar components and process times. This could indicate that the company, in periods of constrained capacity or material shortages, chooses to deliver specific products first, e.g. owing to prioritization of certain customers. As can also be seen from Figure 3, there is a clear tendency that the larger the sample size used for the pairwise comparison with the KS test for identical distributions, the fewer of the samples are significantly identically distributed.
However, to determine whether the distributions can reasonably be considered identical one needs a benchmark, and for this reason a benchmark study is conducted. In the benchmark study, 10,000 pairs of samples of sizes 50, 100, 150 and 200 (the same sample sizes as shown in Figure 3) are generated and compared. Three distributions are used in the experiment (normal, exponential and uniform), and the sampled values are rounded to the nearest integer (as lead times are integers). The benchmark value is then how many of these pairs, for a given distribution and sample size, are found to be identically distributed when they are known to be sampled from the same distribution. The benchmark values can be seen in Table 1. The indication from the values in Table 1 is that even for small samples (50 observations) more than 95% of the comparisons should be found significantly identical using the KS test. None of the sampled lead times achieve this level of confidence. It is noteworthy that the KS test performs equally well for all three benchmark distributions, so no bias can be expected in the test due to the shape of the lead time distribution. It would be fair to conclude that most of the products exhibit a behavior indicating that, for reasonable sample sizes, the lead times actually stem from the same distribution. The data covers around 500 working days of observations, meaning that even the product with the highest order frequency only has approximately 15 orders per day. Assuming a lead time of 7 days, on average 105 orders can be observed during a normal lead time, and thus in c. 80% of the cases the lead time distribution for the next 7 days would be the same as for the previous average lead time period. This means that for most of the products it is actually fair to assume that the lead time distribution for the next lead time period can be estimated from the lead times observed during the last lead time period.
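The benchmark procedure can be sketched as follows. The distribution parameters below are illustrative assumptions, not values from the study; only the three distribution families, the rounding to integers, and the pass-rate computation follow the description above:

```python
import numpy as np
from scipy.stats import ks_2samp

def benchmark_pass_rate(sampler, sample_size, n_pairs=10_000, alpha=0.05, rng=None):
    """Share of KS comparisons that accept 'identically distributed'
    when both samples truly come from the same distribution.

    Sampled values are rounded to the nearest integer, as lead times
    are integers.
    """
    rng = np.random.default_rng(rng)
    passes = 0
    for _ in range(n_pairs):
        a = np.round(sampler(rng, sample_size))
        b = np.round(sampler(rng, sample_size))
        passes += ks_2samp(a, b).pvalue > alpha
    return passes / n_pairs

# The three benchmark families used in the paper; the parameters are
# hypothetical, chosen only to produce plausible lead time values.
samplers = {
    "normal":      lambda rng, n: rng.normal(10, 3, n),
    "exponential": lambda rng, n: rng.exponential(10, n),
    "uniform":     lambda rng, n: rng.uniform(0, 20, n),
}
```

Since the null hypothesis is true by construction, the pass rate should be close to 1 - alpha; rounding introduces ties, which tends to make the KS test slightly conservative and the pass rate, if anything, higher.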
However, it also means that lead time distributions should be updated with relatively high frequency, which can lead to nervous planning systems. It is also interesting to note that the products with high first order autocorrelation values also exhibit the poorest performance on the test for identical distributions. This is not unexpected, as the KS test for identical distributions is also sensitive to the means of the samples. If there is a tendency for lead times to increase/decrease systematically, it is also fair to assume that the distributions change. Here it is critical to note that the shape of the lead time distribution may actually remain the same, an obvious subject for further research. Taken together with the knowledge that demand distributions may exhibit similarly time-dependent distributions even within a planning period (Nielsen et al., 2010b), the planning problem is very much more complex than the assumptions allow for.

Implications & further research
There are a number of implications for both academia and practitioners that can be inferred from the presented analysis. First, it is obvious that lead times are, in the presented case, not constant as assumed in most supply chain models (see e.g. Chen et al. (2000)). Second, some of the products exhibit lead times that are neither identically nor independently distributed, while a number (six out of ten) can in fact reasonably be assumed to be i.i.d. For research purposes this indicates that even assuming that lead times are i.i.d. may be an oversimplification compared to real-life conditions; it also underlines the folly of assuming that lead times are constant, as this is a gross oversimplification. For practitioners there is also a significant implication: lead times are in all likelihood not constant, and this uncertainty should therefore be included in the planning and control approaches used. Another interesting implication is that the more historical information is used to estimate the lead time distribution, the worse the estimate of the distribution in fact becomes, since the pairwise comparison of identical distributions indicates that large sample sizes seldom lead to the same lead time distributions. In practice this means that companies should update their lead time information frequently and disregard old observations.
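The practical recommendation of frequent updating with a short memory can be sketched as a rolling-window estimator of the empirical lead time distribution; this is one possible reading of the recommendation, with illustrative names, not a procedure prescribed by the paper:

```python
import numpy as np

def rolling_lead_time_distribution(lead_times, window):
    """Empirical lead time distribution estimated from only the most
    recent `window` observations, discarding older ones.

    Returns a dict mapping each observed lead time value (in days)
    to its relative frequency within the window.
    """
    recent = np.asarray(lead_times)[-window:]
    values, counts = np.unique(recent, return_counts=True)
    return dict(zip(values.tolist(), (counts / counts.sum()).tolist()))
```

Choosing `window` near one average lead time period of orders reflects the finding above that the last lead time period is usually a fair predictor of the next, while larger windows risk mixing observations from different underlying distributions.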