Impact of linear correlation on construction project performance using stochastic linear scheduling
 Ricardo Eiris Pereira^{1} and
 Ian Flood^{1}
Received: 15 December 2016
Accepted: 16 April 2017
Published: 27 April 2017
Abstract
Background
In the construction industry, the productivity of all trades is directly impacted by uncertainty and variability. For repetitive projects, smooth work flow of productive resources is necessary to minimize or eliminate interruptions and idle time with the objective of reducing costs. An ideal or near optimal solution requires careful planning of the sequence, timing and resource allocations for each activity. Earlier research has demonstrated that uncertainty in the duration of repeated activities can have a significant impact on what is determined to be the optimum project plan. This suggests that correlation in the duration of repeated activities (where durations are stochastic) may also be important in determining the most favorable plan.
Methods
This study assesses the significance of correlation in this respect, using a Linear Scheduling framework for modeling repetitive construction work. Numerical and graphical results are used in a case study to evaluate and compare the optimal solution derived for both deterministic and stochastic environments. For the stochastic environment, a range of levels of correlation is considered using linear correlation between immediate successor repetitions of an activity.
Results
The results provide insight into the effects of different degrees of correlation on the expected project duration, cost, crew and equipment idle times and interruptions. The correlation level has effects that translate into performance tradeoffs depending on the initial plan assumptions applied to the activities.
Conclusions
The impact of full correlation on the optimality of a project plan was found, on average, to introduce idle time to crews equal to 7% of their active time, and to cause avoidable delays to the completion of a project equal to 12% of the project duration. The authors believe these inefficiencies justify further investigation of the impact of correlation on construction projects, including the development of more sophisticated models of correlation.
Background
Many of the activities performed in construction are repetitive in nature. Activity repetition is most prevalent at a low level in a work breakdown, such as the cycling of equipment in an earthmoving operation or the laying of bricks, but it is also common at intermediate and high levels, such as the laying of utility lines or the construction of many similar floors in a high-rise building. Repetitive activities can be either discrete or continuous processes, but most planning tools are limited to one or the other perspective. The critical path method (CPM), for example, treats all activities as discrete units; this is convenient for activities that are inherently discrete, but activities that are continuous in nature (such as the operation of a tunnel boring machine) must be converted into a series of discrete units of work.
Planning projects where there is significant repetition of activities becomes challenging using traditional activity network methods (such as CPM) because of the difficulty of ensuring continuity in resource utilization (Harris and Ioannou 1998) and the consequently large number of activities and dependencies that must be defined and maintained. As a result, alternative planning methodologies have been considered in construction, such as the Linear Scheduling Method (LSM), which represents work as discrete-continuous activities plotted across space and time. LSM provides a visually insightful framework for representing activity progress and understanding how interactions between activities impact that progress.
Regardless of the planning methodology adopted, modeling repetitive activities requires careful attention to ensure accuracy, since a small error in the estimate of a single repetition translates to a large error over many repetitions. Moreover, effects such as learning and forgetting (Gates and Scarpa 1972) in repetitive activities can be dramatic and, if not properly addressed, can lead to significant errors in the estimation of project performance. Uncertainty in activity performance must also be taken into account since it can significantly impact the accuracy of project performance estimates. Ignoring uncertainty (using a deterministic analysis) leads to optimistic estimates of project performance for concurrent interacting processes, the so-called fallacy of averages. The PERT method is a relatively popular tool used for modeling uncertainty in construction schedules, but it only considers uncertainty along the deterministically derived critical path and therefore underestimates both project uncertainty and project duration. Consequently, the PERT method, while simple to use, is only suitable for projects that have a dominant critical path with a low probability of other paths becoming critical. Indeed, interactions between construction processes are usually sufficiently complicated that stochastic effects can only be modeled accurately using statistical sampling techniques, the most popular of which is the Monte Carlo method.
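The fallacy of averages can be illustrated with a minimal Monte Carlo sketch. The two-path project, the triangular duration distribution, and all parameter values below are hypothetical and chosen only for illustration; they are not taken from the study.

```python
import random
import statistics

def simulate_project(n_samples=20000, seed=1):
    """Estimate the expected duration of two parallel paths that must both finish.

    Hypothetical setup: each path's duration is triangular on [8, 12]
    with mode 10, so each path has an expected duration of 10.
    """
    rng = random.Random(seed)
    durations = []
    for _ in range(n_samples):
        path_a = rng.triangular(8, 12, 10)
        path_b = rng.triangular(8, 12, 10)
        # The project finishes only when the later path finishes.
        durations.append(max(path_a, path_b))
    return statistics.mean(durations)

deterministic = max(10, 10)      # analysis using expected durations only
stochastic = simulate_project()  # Monte Carlo estimate of E[max(A, B)]
print(f"deterministic: {deterministic:.2f}, stochastic: {stochastic:.2f}")
```

Because the expected value of the maximum of two uncertain durations exceeds the maximum of their expected values, the sampled estimate comes out longer than the deterministic one, even though neither path is longer on average.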
Recent years have seen an interest in developing optimization and satisficing methods for planning repetitive construction work. Ioannou and Srisuwanrat (2006, 2007, 2007a, 2007b) proposed and evaluated a technique for planning a smooth work flow for productive resources operating in conditions of uncertainty, set within a linear scheduling (LSM) framework.
Trofin (2004) and Flood et al. (2004) implemented a Monte Carlo analysis using the LSM framework to assess the impact of uncertainty on project duration and activity idle time. It was shown that increasing the level of uncertainty not only increased the expected project duration but also changed the optimal schedule.
Rachmat et al. (2009) investigated stochastic simulation on repetitive projects to incorporate activity performance uncertainty in look-ahead scheduling. A case study was undertaken for the construction of a pipeline, where real data were collected in the field, fitted to a statistical distribution, and processed by a simulation package that took into account uncertainty using Monte Carlo sampling. The output from these models was a variable production rate linear schedule of all the activities that comprised the project. In this analysis it was concluded that including uncertainty in linear schedules improves the forecasting capability of project performance and thus helps a scheduler anticipate problem areas and formulate new plans that improve project performance.
Processes that are naturally stochastic can also demonstrate correlation between the durations of repeated activities. Positive correlation means that if one activity (or repetition of an activity) takes longer than expected then the correlated activities (or repetitions of that activity) are also more likely to take longer, and vice versa. Work on correlation between construction activities (repeated or otherwise) is minimal, but it is easy to demonstrate that positive correlation affects the statistical performance of a project by increasing kurtosis, meaning that more of the variance in the performance of a project results from occasional larger deviations as opposed to more frequent smaller deviations. An outstanding question, however, is whether the effects of correlation significantly impact the optimality of a plan. This paper reports on ongoing research into this question. It introduces the questions being investigated and their rationale, the proposed approach to resolving them, and the results from a series of experiments designed to assess the potential impact of correlation on project plan optimality. If correlation is found to impact plan optimality, then this will justify further work into the development and validation of more accurate models of correlation.
The paper provides a review of the concept of activity correlation in the Activity correlation section. This is followed by a description of the modeling approach adopted in this study (Modeling approach section), and the experimental plan (Research plan section). The results and their analysis are then presented in the Results and Discussion section, followed by a summary of the conclusions and recommendations for future work in the final section.
Methods
Activity correlation
Correlation in the context of this study is concerned with the relationship between the duration of activities that are intrinsically uncertain. That is, once the uncertainty about the duration for one of the correlated activities has been resolved (such as when the activity has been executed on site and thus we have a measure of its actual duration) then we can make a statistically more accurate estimate of the duration of the correlated activities. Correlation occurs when the durations of different activities are determined by common factors, such as excavation activities that operate in similar ground conditions, utilize the same crew, and/or are overseen by the same superintendent.
If correlation is perfect (often represented as correlation = ±1.0) then resolving the uncertainty for one activity will in effect fully resolve the uncertainty for the correlated activity. If, on the other hand, there is no correlation (correlation = 0.0) then resolving the uncertainty for one activity will not resolve any of the uncertainty in the other activity. It follows from this that correlation could be partial, somewhere between 0.0 and 1.0 or 0.0 and −1.0. This study is concerned with determining the significance of correlation in terms of its impact on the optimality of a project schedule. For this reason, a range of levels of correlation from 0.0 to 1.0 will be considered. Negative correlation will be outside the scope of this study, although it should be considered in future work since it is likely to occur in systems that, for example, include information feedback such as the optimization of work processes in real time through ‘lessons learned’.
Activities that are repeated would seem strong candidates for demonstrating correlation since the durations of the repetitions will likely be determined by many common factors. Moreover, the impact such correlation may have on the optimality of a plan will likely compound over many repetitions and thus could quickly become significant. For this reason, this study will consider correlation that occurs between repetitions of an activity but not between different activities.

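The duration of each repetition was modeled as a linear combination of a newly generated stochastic component and the duration of the preceding repetition:

\( {D}_n=\left(1-k\right)\cdot D{\prime}_n+k\cdot {D}_{n-1} \)  (1)

where:

\( {D}_n \) = the duration for the n^{th} repetition of the activity;

\( D{\prime}_n \) = a stochastically generated component for the duration of the n^{th} repetition of the activity;

\( {D}_{n-1} \) = the duration of the (n-1)^{th} repetition of the activity; and

\( k \) = the correlation between the durations of subsequent repetitions (ranging from 0.0 for no correlation to 1.0 for perfect correlation).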
While there is no published work to support any particular model for representing correlation of durations for repetitive construction activities, the authors chose this approach since it is simple to implement. An alternative approach, for example, could be to combine weighted durations of more than one previous repetition of an activity. Further work in this aspect of correlation modeling is required to determine the most appropriate model.
While the studies by Trofin (2004) and Flood et al. (2004) were concerned with how stochastic effects in repetitive activities impact project performance, they gave no consideration to correlation. Implicitly their model had correlation set to k = 1.0 between the durations of repetitions of an activity in that all repetitions of a given activity had the same duration. Rachmat et al. (2009) also gave no consideration to correlation, although in their case correlation was set implicitly to k = 0.0 in that a new duration was generated stochastically for each repetition of a given activity with no back referencing to any previously generated values.
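The two limiting cases described above can be checked with a short sketch. It assumes the linear form D_n = (1 - k) * D'_n + k * D_(n-1), which is consistent with the definitions of k and with both limiting cases. The treatment of the first repetition (taken as D'_1) and the scaled Beta(2, 3) component are assumptions made for this illustration.

```python
import random

def correlated_durations(k, n_reps, sample_component, rng):
    """Generate durations with D_n = (1 - k) * D'_n + k * D_{n-1}.

    The first repetition has no predecessor, so it is taken to be D'_1
    (an assumption of this sketch).
    """
    durations = []
    for n in range(n_reps):
        component = sample_component(rng)   # D'_n, the stochastic component
        if n == 0:
            durations.append(component)
        else:
            durations.append((1 - k) * component + k * durations[-1])
    return durations

def sample_component(rng):
    # Hypothetical stochastic component: Beta(2, 3) scaled to [7, 14] days.
    return 7 + 7 * rng.betavariate(2, 3)

rng = random.Random(42)
for k in (0.0, 0.5, 1.0):
    d = correlated_durations(k, 10, sample_component, rng)
    print(k, [round(x, 2) for x in d])
```

With k = 1.0 every repetition repeats the first sampled duration, matching the implicit assumption in Trofin (2004) and Flood et al. (2004); with k = 0.0 each repetition is an independent draw, matching Rachmat et al. (2009).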
Modeling approach
Monte Carlo sampling
Stochastic analysis refers to the study and modeling of uncertainty with the objective of understanding its impact on performance, and is typically adopted in high-risk projects. Uncertainty can be apparent in many different aspects of a construction project (such as labor productivity, the timing of information supply, and the cost of material resources), complicating the tasks of producing accurate estimates of project cost and duration. This, in turn, makes it difficult to determine a competitive bid for a project or produce a supportive plan. Several approaches have been applied to handle uncertainty in construction, the two most widely accepted being the Program Evaluation and Review Technique (PERT) and Monte Carlo sampling. The Monte Carlo approach has been implemented within a number of construction planning tools including the linear scheduling method (Wyrozębski and Wyrozębska 2013), and is generally more accurate and versatile than the PERT method. Indeed, of the two methods Monte Carlo is the only one capable of incorporating correlation. For this reason, the Monte Carlo sampling method was adopted for this study.
Monte Carlo sampling allows many project outcomes to be evaluated statistically. Random sampling is performed to select values for uncertain parameters, typically activity durations, with a probability of occurrence that is characteristic of the uncertainties in the real project. The accuracy of the approach increases with the number of samples made of possible project outcomes. The procedure of fitting observed data to an appropriate duration distribution, and then randomly sampling for possibly thousands of alternative scenarios, is computationally intensive. This study will use the SciPy package within the IPython environment (Pérez and Granger 2007) for Monte Carlo sampling since it provides a convenient framework for model development and analysis.
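The relationship between sample count and accuracy can be sketched as follows. This is an illustration only: it uses the Python standard library rather than SciPy, and the five-activity project and its Beta(2, 3) durations on [7, 14] days are hypothetical.

```python
import random
import statistics

def sample_project_duration(rng):
    """One hypothetical outcome: five sequential activities,
    each with a Beta(2, 3) duration scaled to [7, 14] days."""
    return sum(7 + 7 * rng.betavariate(2, 3) for _ in range(5))

def monte_carlo(n_samples, seed=0):
    """Return the sample mean and its standard error for n_samples outcomes."""
    rng = random.Random(seed)
    outcomes = [sample_project_duration(rng) for _ in range(n_samples)]
    mean = statistics.mean(outcomes)
    std_error = statistics.stdev(outcomes) / n_samples ** 0.5
    return mean, std_error

for n in (100, 1000, 10000):
    mean, se = monte_carlo(n)
    print(f"n={n:>6}  mean={mean:.2f}  standard error={se:.3f}")
```

The standard error of the estimated mean shrinks roughly in proportion to the square root of the number of samples, which is why thousands of scenarios are typically needed for stable estimates.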
For this study, the expected durations and variances of activities will be allowed to differ but the expected durations and variances for repetitions of a given activity will be fixed.
Linear scheduling method
Concepts such as float can be read from a linear schedule (LS), and dependences are represented in the form of buffers which can be measured in the progress dimension as well as the time dimension. Referring to Fig. 4, the dashed lines between the two activity progress lines are a graphical representation of time buffer and space buffer. This creates a simple but visually insightful representation of the production rate of the crews and their potential space-time conflicts.
Once the visual representation of the work flow is established, less apparent patterns are revealed to the planner, simplifying the task of plan optimization. This may involve adding more crews (including equipment) to certain activities, adding more crew members (and/or other types of resources such as equipment) to certain activities, and/or changing the direction of flow of work.
Research plan
Objective function and objective variables
The aim of the study is to determine the impact of correlation between the durations of repeated activities on the optimality of the project plan. Specifically, the question addressed is: How does a change in the level of correlation from that expected affect the performance of the optimized project plan?
Missed Opportunities and Crew Idle Time are caused by stochastic and correlation effects that result in crew performance rates that are different from the deterministically derived optimum base plan. This base plan assumes that each activity progresses at its expected rate and the start times for each activity are set to ensure that crews neither spend time idle nor miss any opportunities for starting and finishing work sooner. Crew Idle Time represents an additional direct cost to the project in that the crews are employed for longer periods of time to complete the specified amount of work. Missed Opportunities represent an indirect cost to the project in that they lead to a longer than necessary project duration and therefore result in unnecessary project overhead costs.
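The idea of a deterministically derived base plan, and of idle time arising when actual durations depart from it, can be sketched for two activities. This is a simplified illustration, not the model used in the study: the function names, the two-activity project, and the definition of idle time as waiting after the planned start are all assumptions of the sketch.

```python
def finish_times(start, durations, release=None):
    """Finish time of each unit for one crew working through units in order.

    start     -- planned start time of the crew
    durations -- duration of each repetition (one per unit)
    release   -- earliest time each unit becomes available (the
                 predecessor's finish times); None means no predecessor
    Returns (finish time per unit, total idle time after the planned start).
    """
    t, idle, finishes = start, 0.0, []
    for i, d in enumerate(durations):
        if release is not None and release[i] > t:
            idle += release[i] - t     # crew waits for the predecessor
            t = release[i]
        t += d
        finishes.append(t)
    return finishes, idle

def earliest_continuous_start(pred_finish, durations):
    """Earliest start that lets the successor crew work without interruption."""
    elapsed, start = 0.0, 0.0
    for i, d in enumerate(durations):
        start = max(start, pred_finish[i] - elapsed)
        elapsed += d
    return start

# Base plan from expected durations (hypothetical values, in days).
a_expected = [4.0] * 5          # predecessor: 4 days per unit
b_expected = [3.0] * 5          # successor is faster: 3 days per unit
a_finish, _ = finish_times(0.0, a_expected)
b_start = earliest_continuous_start(a_finish, b_expected)

# Actual outcome: the predecessor's last two repetitions overrun.
a_actual = [4.0, 4.0, 4.0, 5.0, 5.0]
a_finish_actual, _ = finish_times(0.0, a_actual)
b_finish, b_idle = finish_times(b_start, b_expected, a_finish_actual)
print(b_start, b_idle, b_finish[-1])
```

Under the expected durations the successor crew starts on day 8 and works continuously; when the predecessor overruns, the same start time leaves the successor crew waiting before its final unit.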
Synthetically generated test projects
Investigation of the research question was completed for a sample of synthetically generated projects, similar to the approach reported by Trofin (2004) for assessing the impact of uncertainty on LSM plan optimality. The number of activities in each synthetically generated project was set to 10, a large enough number to permit complicated interactions between crews. Each activity was represented by its own Beta distribution which was used to generate the stochastic component of the duration of each repetition of that activity, the parameter \( D{\prime}_n \) in Equation (1). For construction simulations, the Beta distribution has been found to provide a good representation of the stochastic variance apparent in construction activities (AbouRizk et al. 1994).
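One way to generate the stochastic component from a bounded Beta distribution is sketched below. It derives shape parameters by the method of moments, which is an assumption of this sketch and not necessarily the fitting procedure used in the study or in AbouRizk et al. (1994). The activity values are hypothetical.

```python
import random
import statistics

def beta_shape_from_moments(mean, std, low, high):
    """Method-of-moments shape parameters for a Beta distribution on [low, high]."""
    m = (mean - low) / (high - low)          # mean rescaled to [0, 1]
    v = (std / (high - low)) ** 2            # variance rescaled to [0, 1]
    common = m * (1 - m) / v - 1
    if common <= 0:
        raise ValueError("std is too large for the given bounds")
    return m * common, (1 - m) * common

def sample_duration(rng, alpha, beta, low, high):
    """Draw one duration from the Beta distribution rescaled to [low, high]."""
    return low + (high - low) * rng.betavariate(alpha, beta)

# Hypothetical activity: expected 10 days, std 1.5 days, bounded to [7, 15] days.
alpha, beta = beta_shape_from_moments(mean=10.0, std=1.5, low=7.0, high=15.0)
rng = random.Random(7)
samples = [sample_duration(rng, alpha, beta, 7.0, 15.0) for _ in range(20000)]
print(round(alpha, 3), round(beta, 3))
print(round(statistics.mean(samples), 2), round(statistics.stdev(samples), 2))
```

The sample mean and standard deviation should land close to the targets, and every sampled duration stays within the stated bounds, which is the main practical appeal of the Beta family for activity durations.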
Scope of study

The study is focused on linear construction work, such as occurs in tunneling, highway, high-rise, and pipeline construction operations.

Correlation was considered to occur between repetitions of an activity but not between different activities.

Correlation was assumed to have a linear relationship as described by Eq. 1.

The expected duration for an activity did not vary with repetition. For example, deterministic variance resulting from learning and forgetting effects was not considered.

Stochastic variance in the durations of repetitions of an activity was assumed to be Beta distributed, and the parameters for this family of distributions were kept within the ranges observed in the study by AbouRizk and Halpin (1990).

Stochastic standard deviation was set to 16.9% of the expected duration for an activity.

Stochastic low boundary for the Beta distribution was set to 71.7% of the expected duration for an activity.

Stochastic high boundary for the Beta distribution was set to 39.5% of the expected duration for an activity.
Results and Discussion
The results of the experiments described above indicated that lower levels of correlation, between 0.00 and 0.80, did not show a significant impact on either Crew Idle Time or Missed Opportunities. However, for higher levels of correlation both variables were found to increase geometrically. Therefore, an additional 1000 LSM scenarios were generated for each level of correlation ranging from k = 0.80 to 1.00 in increments of 0.025, to provide a higher resolution in the results for the region where performance was found to change most dramatically.
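Experimental margin of error for each correlation level

| Correlation level (k) | 0.0 | 0.1 | 0.2 | 0.3 | 0.4 | 0.5 | 0.6 | 0.7 | 0.8 | 0.9 | 1.0 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Error, Crew Idle Time (%) | 3.74 | 3.69 | 3.81 | 3.82 | 3.73 | 3.80 | 3.75 | 3.76 | 3.92 | 3.64 | 3.06 |
| Error, Missed Opportunities (%) | 3.76 | 3.62 | 3.73 | 3.65 | 3.58 | 3.74 | 3.67 | 3.47 | 3.32 | 3.45 | 3.73 |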
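The text does not state how the margin of error was calculated. One common choice, sketched here purely as an assumption, is the half-width of an approximate 95% confidence interval for the mean, expressed as a percentage of that mean. The idle-time values below are synthetic and are not the study's results.

```python
import random
import statistics

def relative_margin_of_error(values, z=1.96):
    """Half-width of an approximate 95% confidence interval for the mean,
    expressed as a percentage of the sample mean."""
    mean = statistics.mean(values)
    std_error = statistics.stdev(values) / len(values) ** 0.5
    return 100.0 * z * std_error / mean

# Hypothetical idle-time results from 1000 simulated scenarios (days).
rng = random.Random(3)
idle_times = [rng.expovariate(1 / 5.0) for _ in range(1000)]
print(f"{relative_margin_of_error(idle_times):.2f}%")
```

Under this definition the margin narrows as more scenarios are simulated, so a fixed number of scenarios per correlation level gives margins of similar size across levels.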
Conclusion
The impact of correlation between activities on the performance of construction projects is not well understood. Moreover, existing models of correlation are limited in sophistication and largely untested in terms of their accuracy. Before investing resources in the development of more appropriate models of correlation for construction it was decided to first test whether correlation may affect project performance significantly. Specifically, this study had the goal of determining whether the optimality of a project plan is prone to disruption by unaccounted correlation. Project performance was assessed in terms of two optimality indicators: Crew Idle Time and project Missed Opportunities. The results showed that both performance indicators are significantly impacted if the level of correlation is high (between k = 0.8 and k = 1.0), in the worst case having an expected crew idle time of 7% of crew active time and an expected extension to the project duration of 12%. These results are applicable to problems that fall within the scope of this study.
These results provide justification for investing resources in developing our understanding of how correlation can best be modeled in construction. These studies would, in part, be aimed at extending the study beyond the limitations listed in the Scope of study section. For example, studies are required to determine whether correlation between the durations of construction activities should use linear or nonlinear relationships, and whether correlation between repetitions of an activity is best modeled as a dependence between immediate repetitions or between the first and current repetition. Future research must also include real-world data from a comprehensive range of construction project types.
Alternative indicators for project optimality should also be considered. In particular, there is a need to determine the extent to which the Crew Idle Time and Missed Opportunities could be reduced if an accurate assessment of the level of correlation was available and used to determine the optimal plan; the question is not straightforward as the system's performance is subject to stochastic variance. Finally, work is required to determine how different levels of uncertainty in the duration of an activity affect the relationship between the level of correlation and plan optimality.
Declarations
Acknowledgements
No acknowledgement section contained within the document.
Funding
This paper is part of an invitation from 2016 ICCCBE from Visualization in Engineering. No funding was required for this submission as it was waived by the journal.
Authors’ contributions
IF conceived of the study, participated in its design and coordination and the final draft arrangements of the manuscript. REP conceived of the study, participated in its design and interpretation of the data (statistical analysis) and drafted the manuscript. Both authors read and approved the final manuscript.
Competing interests
The authors declare that they have no competing interests.
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
References
 AbouRizk, S. and Halpin, D. (1990). Probabilistic Simulation Studies For Repetitive Construction Processes. Journal of Construction Engineering and Management, Vol. 116, No. 4, ASCE, ISSN 0733-9364/90/0004-0575.
 AbouRizk, S., Halpin, D., and Wilson, J. (1994). Fitting Beta Distributions Based on Sample Data. Journal of Construction Engineering and Management, 10.1061/(ASCE)0733-9364(1994)120:2(288), 288–305.
 Eiris Pereira, R. (2016). Impact of Linear Correlation on Construction Project Performance Using Stochastic Linear Scheduling. Master Thesis, Rinker School of Construction, University of Florida, Gainesville, FL.
 Flood, I., Issa, RRA, and Trofin, I. (2004). “Optimization of Stochastic Linear Schedules”, in proceedings of 11th International Workshop, EGICE, Weimar, May 2004, pp 182–189.
 Gates, M. and Scarpa, A. (1972). Learning and experience curves. Journal of the Construction Division, ASCE, 98(CO1), March, proceedings papers 8778, pp. 79–101.
 Harris, R. and Ioannou, P. (1998). “Scheduling Projects with Repeating Activities.” Journal of Construction Engineering and Management, 10.1061/(ASCE)0733-9364(1998)124:4(269), 269–278.
 Ioannou, P. G., and Srisuwanrat, C. (2006). Sequence Step Algorithm for Continuous Resource Utilization in Probabilistic Repetitive Projects. Conference Proceedings of the Winter Simulation Conference, WSC 2006, Monterey, California, USA, December 3–6, 2006.
 Ioannou, P. G., and Srisuwanrat, C. (2007). Optimal Scheduling of Probabilistic Repetitive Projects Using Complete Unit and Genetic Algorithm. Conference: Proceedings of the Winter Simulation Conference, WSC 2007, Washington, DC, USA, December 9–12, 2007.
 Ioannou, P., and Srisuwanrat, C. (2007a). Probabilistic Scheduling for Repetitive Projects. Conference: Lean Construction: A New Paradigm for Managing Capital Projects - 15th IGLC Conference 2007, pp. 498–507.
 Ioannou, P. G., and Srisuwanrat, C. (2007b). Simulation and Optimization for Construction Repetitive Projects Using Promodel and Simrunner. Conference: Proceedings of the 2008 Winter Simulation Conference, Global Gateway to Discovery, WSC 2008, InterContinental Hotel, Miami, Florida, USA, December 7–10, 2008.
 Pérez, F., & Granger, B. E. (2007). IPython: a system for interactive scientific computing. Computing in Science and Engineering, 9(3), 21–29. doi:10.1109/MCSE.2007.53.
 Rachmat, F., Song, L., and Lee, S. (2009). “Applying a Stochastic Linear Scheduling Method to Pipeline Construction”. Conference: 3rd International Conference on Construction Engineering and Management, ICCEM & ICCPM, May 2009, pp 154–162.
 Trofin, I. (2004). Impact of Uncertainty on Construction Project Performance Using Linear Scheduling. Master Thesis, Rinker School of Construction, University of Florida, Gainesville, FL.
 Wyrozębski, P., & Wyrozębska, A. (2013). Challenges of project planning in the probabilistic approach using PERT, GERT and Monte Carlo. Journal of Management and Marketing, 1(1), 1–8.