
Crowdsourcing BIM-guided collection of construction material library from site photologs



With advances in technologies that enable massive visual data collection and BIM, the AEC industry now has an unprecedented amount of visual data (e.g., images and videos) and BIMs. One past effort to leverage these data is the Construction Material Library (CML), which was created for inferring construction progress by automatically detecting construction materials. CML has a limited number of construction material classes because it is nearly impossible for an individual or a group of researchers to collect all possible variations of construction materials.


This paper proposes a web-based platform that streamlines the data collection process for creating annotated material patches guided by BIM overlays.


Construction site images with BIM overlays are automatically generated after image-based 3D reconstruction. These images are deployed on a web-based platform for annotations.


The proposed crowdsourcing method using this platform has potential to scale up data collection for expanding the existing CML. A case study was conducted to validate the feasibility of the proposed method and to improve the web interface before deployment to a public cloud environment.


Over the past few years, the advances in image-based 3D reconstruction and UAVs enabled the creation of accurate and dense 3D as-built point clouds of building structures. Moreover, 3D point clouds aligned with BIM (denoted as integrated information models (IIMs)) enabled new approaches for managing construction (Han and Golparvar-Fard 2017). There is a growing interest from practitioners and researchers in the architecture, engineering, and construction (AEC) industry in creating these IIMs. For instance, there is a startup company providing a web service that uses IIMs to support project controls. The biggest advantage of IIMs is visualizing as-planned (4D BIM) and as-is conditions (3D point clouds and images) in the same environment, which provides enhanced communication of progress deviation and quality issues. One of the benefits of IIMs is an automatic alignment of images to BIM. The images that are used for generating 3D point clouds are automatically aligned with BIM (see Fig. 1) after their 3D point clouds are aligned with BIM either manually or automatically.

Fig. 1

Image-to-BIM alignment: An image used for 3D reconstruction (left); aligned BIM from the same viewpoint (middle); and image and BIM aligned (right)

To fully leverage these visual data, researchers have worked on developing data analytics that could potentially automate construction performance monitoring - progress, quality, and safety (Bosché et al. 2013, Han et al. 2015, Kim et al. 2013a,b, 2015). For progress detection, Han and Golparvar-Fard (2015) proposed an appearance-based progress detection based on a machine learning method proposed by Dimitrov and Golparvar-Fard (2014). As part of these efforts, the Construction Material Library (CML), consisting of over 3000 construction images and 22 material categories (denoted as classes), was collected and used as a training dataset for the proposed machine learning method. This dataset was sufficient for small-scale case studies, such as buildings with a similar architectural style (e.g., brick exterior walls; Dimitrov and Golparvar-Fard 2014) and progress detection of concrete structures (Han and Golparvar-Fard 2015). However, many more construction materials need to be added for CML to be practically useful. CML is intended to evolve over time, with more researchers contributing to a much larger dataset that can be shared and used as a benchmark. For that reason, it is shared online with the research community.

However, it is nearly impossible for a small group of researchers to collect images of every construction material. Adding more images and classes to CML requires a series of tasks that are very labor intensive. First, someone has to visit one or more construction sites and take photos. Then, someone has to go through every image and manually annotate the predetermined material classes. This constraint of limited human resources prevented the authors and Dimitrov and Golparvar-Fard (2014) from including more material classes and images.

Therefore, this paper proposes a streamlined data collection process for cognitive computing research in the AEC industry by utilizing IIMs. Cognitive computing research in this paper refers to computer vision and machine learning research that requires training and testing phases. The authors intend to attract practitioners to share their project data and, in return, provide visualized IIMs online that the practitioners can access via web browsers. The authors have created a web-based platform for managing IIMs and crowdsourcing the collection of CML from the images in these IIMs. Amazon Mechanical Turk (MTurk) is used for creating annotations. This approach can easily exceed the 3000 images of the current CML and is more suitable for expanding the number of material classes.

Moreover, sharing it with the whole AEC research community gives that community a dataset that is larger and has more material classes. The authors have observed a growing interest from researchers in applying computer vision techniques in the AEC domain. These researchers can also contribute by sharing their IIMs. They can also use the platform for various machine learning research projects. The platform provides a generic tool that allows users to annotate and label construction materials.

The number of images used for image-based 3D reconstruction typically ranges from a few hundred to a few thousand. These visual data can add up to a significant size throughout the construction duration. For instance, Han and Golparvar-Fard (2017) collected about 30,000 images (about 108 GB) in 11 months just using unmanned vehicles (both ground and aerial). As more practitioners and researchers collect visual data for creating IIMs to improve construction performance monitoring and management, there will be an abundance of visual data that are valuable to cognitive computing research for automating construction management processes. The main goal of this paper is partial automation of data processing and collection with minimal administrative effort, enabling mass generation of annotated visual data of construction materials, which can potentially advance cognitive computing within the domain of construction engineering and management.

Proposed method

Figure 2 shows the overall workflow of the proposed method that utilizes IIMs. For each IIM, images and videos are collected from a construction site. They are processed and 3D point clouds are generated. Then, a 4D BIM is aligned with them, automatically creating hundreds to thousands of images that are registered with the BIM. These images and BIM overlays, shown in Fig. 1, are input to the proposed crowdsourcing method. MTurk, which allows crowdsourcing of labor, is used for annotating construction materials in these images. The annotators are given a list of materials. Materials that are not included in this list and/or that the annotators do not know are classified as “Others/not sure.” The compiled annotations under this class are reviewed periodically by the administrators (currently the authors) for further entry as a new material class. This is an important process because users may often be faced with materials with which they are not familiar. There are also materials that have similar textures and are therefore hard to distinguish even for construction experts. The “Others/not sure” class helps avoid creating false labels when users are faced with these challenges. This work process is continuously repeated and the number of classes will increase, enabling cognitive computing research that requires additional material classes. The following sections present related work, describe each step in detail, and present several case studies for validating the quality of the annotations done by MTurk annotators who are not construction experts.

Fig. 2

Overall workflow of collecting CML
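
The handling of the “Others/not sure” class in this workflow can be illustrated with a minimal sketch. The `Annotation` record, the function name, and the example class names below are hypothetical and are not part of the platform described in this paper; the sketch only assumes that each annotation carries a polygon and a label chosen from the list given to annotators.

```python
from dataclasses import dataclass

# Label used for materials that are unknown or not in the list.
FALLBACK = "Others/not sure"


@dataclass
class Annotation:
    """Hypothetical annotation record: a labeled polygon on one image."""
    image_id: str
    polygon: list  # [(x, y), ...] in pixel coordinates
    label: str


def split_for_review(annotations, known_classes):
    """Separate annotations with a known class from those needing review.

    Anything labeled with the fallback class (or with a label outside the
    current class list) goes to the review list, which administrators can
    inspect periodically when deciding whether to add a new material class.
    """
    accepted, review = [], []
    for ann in annotations:
        if ann.label != FALLBACK and ann.label in known_classes:
            accepted.append(ann)
        else:
            review.append(ann)
    return accepted, review
```

Keeping the uncertain annotations in a separate list, rather than discarding them, mirrors the periodic review step at the end of the workflow.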

Related work

Over the past decade, the computer vision research community has experienced significant advances in object recognition with the help of large image databases (Bell et al. 2013; Dana et al. 1999; Deng et al. 2009; Endres et al. 2010; Liu et al. 2010; Hu et al. 2011; Russell et al. 2008; Torralba et al. 2008; Xiao et al. 2010). Researchers now have access to a vast amount of training and testing data. Many of these datasets are shared on their project websites and are used as benchmarks.

These datasets are primarily for image classification of objects and materials. The datasets by Deng et al. (2009), Endres et al. (2010), Russell et al. (2008), and Torralba et al. (2008) contain large collections of objects and scenes with relatively smaller collections of materials. These are more suitable for studies on object recognition than on material recognition.

On the other hand, the datasets by (Bell et al. 2013; Hu et al. 2011; Liu et al. 2004,2010) focus primarily on material recognition and therefore consist of collections of close-up photos of objects and scenes that are suitable for collecting annotated materials. Han, Dimitrov, and Golparvar-Fard (2015,2014) have worked on creating a dataset of construction materials (CML). Due to limited resources, their work resulted in a relatively small number of material classes and patches. This paper presents a possible solution for further expanding CML with relatively less effort from the researchers.

Crowdsourcing, especially using MTurk, has gained popularity among image classification research projects. The work by Bell et al. (2013) and Russell et al. (2008) has included tools that utilize MTurk for crowdsourcing annotations. Similarly and more closely related to the AEC industry, Liu and Golparvar-Fard (2015) have used MTurk for creating annotations of construction workers. MTurk enables quick annotation of a large number of images at relatively low cost. However, it still requires an administrator to manage web servers (e.g., the interface for annotation, updating image sets), annotators, and the quality of annotations.

The main characteristics that differentiate the proposed work from the above-mentioned literature are: 1) focusing on construction materials, 2) using BIM to guide users with the areas to be annotated (i.e., automated BIM-based segmentation using IIMs), and 3) its adaptiveness to new classes through the last step in Fig. 2.


Collection and creation of images-to-BIM

This section describes two ways to gather images that are aligned with BIMs. The very first step of the workflow (see Fig. 2) is generating an IIM. Practitioners and researchers (denoted as contributors) who are not familiar with image-based 3D reconstruction and registration of two 3D models (in this case, a point cloud and BIM) would not be able to provide the output of the IIM, such as camera parameters and point clouds in required formats. To deal with this challenge, the authors provide two options to the contributors:

  • 1. They create IIMs by following the provided instructions (detailed in this section), which requires the use of opensource packages that might not be straightforward to use; or

  • 2. They provide images to the authors, who create and return a 3D point cloud. They then register the point cloud with the BIM by following instructions (also detailed in this section), given by the authors, on registering a BIM using commercially available software that is straightforward to use.

The former option is more suitable for experts in image-based 3D reconstruction and BIM (e.g., researchers). The latter option is more suitable for practitioners who might not be familiar with using opensource packages. Contributing practitioners can potentially benefit from the visualized IIMs on the web that the proposed approach offers. A project webpage is created for providing the instructions on these two options (Fig. 3). Once all required inputs are shared, the authors will send the contributors URLs to a web visualizer loaded with their projects. In return, the authors collect images to annotate, which are crowdsourced to MTurk users.

Fig. 3

Snapshot of the project webpage with a set of instructions for creating an IIM and starting MTurk annotations

Figure 4 presents steps required for creating an IIM. Each step is associated with required tools. These steps are the typical workflow for creating an image-based 3D point cloud and aligning it with a BIM. The following subsections present required tools and a step-by-step guide in detail.

Fig. 4

Steps for creating an IIM and required tools

Image-based 3D reconstruction

The very first step in Fig. 4, collecting images, is rather straightforward. Any device that can take images and videos can be used. Camera-equipped UAVs are gaining popularity because they can provide wider views of an entire construction site and collect hundreds to thousands of images that are suitable for 3D reconstruction (high-resolution images with 60% or more overlap between images).

What is less common for practitioners is the second step. Although many “black box” software packages now generate 3D reconstructions, non-experts are not familiar with dealing with camera parameters. Moreover, many commercial software packages do not provide this information to users. The authors use the Bundler output (.out format) initially proposed by Snavely et al. (2006), which many authors of opensource 3D reconstruction packages commonly use (Moulon et al. 2016; Wu 2016; Wu et al. 2011).

The instructions for the first option for creating IIMs include the use of Bundler (Snavely et al. 2006), VisualSfM (Wu 2016; Wu et al. 2011), and openMVG (Moulon et al. 2016). VisualSfM includes a set of instructions for gathering binaries (compiled and ready-to-install files) of the required dependencies. It is the most user-friendly of the three because it provides a graphical user interface (GUI). Moreover, even people who are not familiar with software engineering can install it if they carefully follow the instructions. Bundler and openMVG, on the other hand, require at least minimal experience with command-line interfaces and compilation of opensource packages (e.g., using Microsoft Visual Studio or GCC on Linux operating systems). All three packages allow users to output camera parameters in the Bundler format and to run dense reconstruction packages, such as Patch-based Multi-view Stereo (PMVS) (Furukawa and Ponce 2010), Multi-View Environment (Goesele et al. 2007), and CMP Multi-View Stereo (CMPMVS) (Jancosek and Pajdla 2011). The common output among them is the “.ply” format.
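
For readers unfamiliar with the Bundler output, the following is a minimal sketch of reading the camera parameters from a “.out” file. It assumes the v0.3 text layout described in the Bundler documentation: a comment header, a line with the numbers of cameras and points, and then five lines per camera (focal length and two radial distortion coefficients, three rows of the rotation matrix, and the translation vector). The function name and the dictionary keys are hypothetical; the point records that follow the cameras are not parsed here.

```python
def parse_bundler_cameras(text):
    """Parse the camera blocks of a Bundler v0.3 '.out' file.

    Returns (cameras, num_points), where each camera is a dict with
    focal length 'f', radial distortion 'k1' and 'k2', a 3x3 rotation
    'R' (list of rows), and a translation 't'.
    """
    # Drop blank lines and the '# Bundle file v0.3' comment header.
    lines = [
        ln for ln in text.splitlines()
        if ln.strip() and not ln.lstrip().startswith("#")
    ]
    num_cameras, num_points = (int(v) for v in lines[0].split())

    cameras = []
    for i in range(num_cameras):
        # Each camera occupies five consecutive lines after the counts.
        block = lines[1 + 5 * i: 1 + 5 * (i + 1)]
        f, k1, k2 = (float(v) for v in block[0].split())
        rotation = [[float(v) for v in row.split()] for row in block[1:4]]
        translation = [float(v) for v in block[4].split()]
        cameras.append(
            {"f": f, "k1": k1, "k2": k2, "R": rotation, "t": translation}
        )
    return cameras, num_points
```

In this format, cameras that could not be registered are typically written with all-zero parameters, so a caller may want to skip entries whose focal length is zero.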

Aligning point clouds to BIMs

The next step is to align a point cloud to a BIM. There are two approaches: 1) finding corresponding coordinates between the point cloud and BIM and then performing a similarity transformation (Horn 1987; Golparvar-Fard et al. 2009); and 2) using commercially available software to manually transform one model into the other model’s coordinate system.

For the first approach, a user needs to visualize these two models. Packages that can visualize point clouds and extract coordinates include CloudCompare (Girardeau-Montaut 2016), Autodesk packages, and the Rhinoceros series (please note that the authors do not intend to make any recommendation on which commercial products users should use). These packages can also visualize BIMs. However, to use CloudCompare, an Industry Foundation Classes (IFC) file (an open BIM format that many commercial BIM software packages can output) needs to be converted to an object file (“.obj”). The user selects four or more correspondences from the two models. The coordinates of these correspondences are input to the least-squares problem of absolute orientation (Horn 1987), which minimizes e (see Eq. 1). Through this process, a scale s, rotation matrix R, and translation T are retrieved to align the point cloud to the BIM. Using these outputs, the similarity transformation transforms the point cloud to the BIM or, alternatively, vice versa by changing the order of P_BIM and P_ptC (see Eq. 2).

For this approach, a user will provide an IFC file, coordinates of correspondences (P_BIM and P_ptC), and outputs from one of the three abovementioned tools for 3D reconstruction (Bundler, VisualSfM, and openMVG).

For the second approach, a user changes the scale, rotation, and translation of the point cloud and aligns it with the BIM through an iterative process until the two visually look aligned in commercially available software (e.g., Autodesk Navisworks). For this approach, a user will provide an IFC file, outputs from one of the three 3D reconstruction tools, and values of Origin, Rotation, and Scale (used during manual alignment and as shown in Fig. 5).

$$ \sum_{i=1}^{n} \| e_{i} \|^{2} = \sum_{i=1}^{n} \| P_{BIM,i} - sR(P_{ptC,i}) - T \|^{2} $$
Fig. 5

Values used for manual alignment in Navisworks

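where P_BIM denotes the selected points of the BIM, P_ptC denotes the corresponding selected points of the point cloud, n is the number of correspondences, and e is the registration error between the BIM and the point cloud that is minimized.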

$$ \left[\begin{array}{c} P_{BIM} \\ 1 \end{array}\right]_{4 \times 1} = \left[\begin{array}{cc} sR & T \\ 0 & 1 \end{array}\right] \left[\begin{array}{c} P_{ptC} \\ 1 \end{array}\right]_{4 \times 1} $$

where P_BIM denotes the transformed points in the site coordinate system, and P_ptC denotes the selected points of the point cloud.
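
To make the roles of s, R, and T concrete, the following minimal sketch applies the similarity transformation of Eq. 2 to point-cloud coordinates and evaluates the sum of squared residuals that the absolute orientation problem minimizes. It does not solve for s, R, and T (that is the closed-form step of Horn 1987); the values are assumed to be given, and all function names are hypothetical.

```python
def similarity_matrix(s, R, T):
    """Build the 4x4 homogeneous matrix [[sR, T], [0, 1]] of Eq. 2.

    s is a scalar scale, R a 3x3 rotation (list of rows), and T a
    3-element translation.
    """
    M = [[s * R[i][j] for j in range(3)] + [T[i]] for i in range(3)]
    M.append([0.0, 0.0, 0.0, 1.0])
    return M


def transform_point(M, p):
    """Map a point-cloud point p = (x, y, z) into BIM coordinates."""
    h = [p[0], p[1], p[2], 1.0]  # homogeneous coordinates
    out = [sum(M[i][j] * h[j] for j in range(4)) for i in range(4)]
    return out[:3]


def registration_error(M, pts_bim, pts_ptc):
    """Sum of squared residuals between BIM points and transformed
    point-cloud points, i.e. the quantity minimized in Eq. 1."""
    total = 0.0
    for p_bim, p_ptc in zip(pts_bim, pts_ptc):
        q = transform_point(M, p_ptc)
        total += sum((a - b) ** 2 for a, b in zip(p_bim, q))
    return total
```

For example, with s = 2, a 90° rotation about the z-axis, and T = (1, 2, 3), the point (1, 0, 0) maps to (1, 4, 3). A residual near zero over the selected correspondences indicates that the retrieved parameters align the two models well.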

The first option might overwhelm practitioners and prevent them from contributing. Thus, the second option is offered, which inevitably imposes more work on the authors because they have to run 3D reconstruction for contributors. Although outside the scope of this paper, this process could be automated by writing a script that runs 3D reconstruction when a contributor uploads a set of images.

The contributors have an option to create IIMs and share them with the authors (option 1 above). Alternatively, if the contributors do not wish to deal with the complication of generating IIMs from scratch, they can choose to provide only visual data. The authors generate 3D point clouds and return them to the contributors. The contributors find corresponding coordinates from the BIM and point clouds, and the authors transform and align the point clouds to the BIM (option 2). In either case, the authors provide accounts for the demo web page that displays shared IIMs (Fig. 6). BIM overlays and images are sent to MTurk for annotation.

Fig. 6

Demo webpage for visualizing IIMs

Assigning annotations through MTurk

MTurk is a crowdsourcing marketplace in which requesters (e.g., the authors of this paper) create a web-based working environment for human workers (e.g., annotators in this paper). It provides a way to scale the workforce by assigning as many tasks as possible to workers throughout the world (Amazon 2016). Creating training datasets for machine learning algorithms is one common type of work. The workers (denoted as annotators from now on) annotate and label the given images as instructed. One of the commonly used opensource annotation tools is LabelMe (Russell et al. 2008). LabelMe is a web annotation tool that provides generic functionality for annotating images. An annotator can create a polygon by clicking points on an image and closing the loop. Then, a window that allows labeling of the chosen area pops up. The annotator can annotate as many objects as possible in the image and move on to the next image.

The annotation tool presented in this paper is built on LabelMe. The main addition to the existing functionality is showing transparent BIM overlays on top of images that are outputs from IIMs, as discussed in the previous sections. With these inputs, a Human Intelligence Task (HIT) is generated. Annotators have multiple HITs from which they can choose. If an annotator clicks on a HIT that the authors generated, the annotator is provided with instructions on how to perform the task (see Fig. 7). Since there is a high chance that the annotator is not familiar with construction materials, the instructions show examples of construction materials (see Fig. 8). Moreover, examples of good and bad annotations are provided to help annotators make quality annotations (see Fig. 9). BIM overlays are provided to guide annotators in deciding which construction materials to annotate. Annotators can turn these BIM overlays on and off to help themselves identify the construction materials to be annotated (see Fig. 10).

Fig. 7

MTurk instruction on creating annotations

Fig. 8

MTurk instruction to annotators: Examples of construction materials for annotators without construction background

Fig. 9

MTurk instruction to annotators: Good and bad examples of annotations

Fig. 10

MTurk HIT interface - BIM overlay on (left); and BIM overlay off (right)

Quality of annotations

To test and validate the quality of annotations submitted by the annotators, a case study of three projects was conducted. For each project, three duplicates were generated to compare the accuracies of three different annotators on the same projects. This test also serves as a means for improving the interface and formulating a quality control mechanism suitable for a sustained data collection effort. Therefore, the MTurk Sandbox was used. The Sandbox is a simulated environment for testing the generated HITs prior to actual deployment to the marketplace. The main benefit is that the interface can be tested and improved before the deployment. Once HITs for the Sandbox are generated, the authors can directly hire annotators and share the URLs of the generated HITs. To simulate realistic cases, three annotators without a construction background were chosen. They were assigned to work on the three projects. At the end of the testing, their feedback will be used for future improvement before actual deployment to the marketplace.

Experimental setup

The three projects consist of construction images of a residential hall (RH), a dining hall (DH), and a hotel (HP). For the first two projects, ground images (taken by a field engineer walking around the structures) were used. For HP, aerial images taken by a camera-equipped UAV were used. The number of images used for 3D reconstruction and the number of images used for the annotation task are summarized in Table 1. Often, an image covers a very small area of the structure and may not be suitable for annotation (see Fig. 11). BIM overlays are used to filter out these images by calculating the area of BIM shown in each image. If the area A of the BIM overlay shown in an image is less than a threshold β, the image is excluded from the annotation task (i.e., β > A_BIM/A_im). Since UAVs often have to be flown at a very high altitude due to safety concerns related to cranes, HP had many images with smaller coverage of BIM elements. This is the main reason that the set of images for annotation is much smaller than the set of images used for 3D reconstruction. β was chosen based on the image patch size commonly used in the machine learning community (i.e., 256 × 256) and the number of patches expected per image (see Eq. 3).

$$ \beta=w_{pixel} \times h_{pixel} \times nPatch $$
Fig. 11

Images with small areas of construction materials to be annotated (filtered by Algorithm 1)

Table 1 Numbers of images used for 3D reconstruction and annotation

where w_pixel and h_pixel are the width and height of the commonly used patch size and nPatch is the number of expected patches per image.

The chosen patch size was 256×256 pixels and nPatch was 20. Therefore, all images with BIM coverage of less than 1,310,720 pixels (256×256×20) were skipped. Moreover, images used for 3D reconstruction have high percentages of overlapping areas, meaning that multiple images are taken from similar locations and orientations. To avoid similar images (which cause redundant image samples), every nth image in the sequential order determined during the 3D reconstruction process is chosen for annotation.
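
The two selection rules above can be sketched as follows. This is a minimal illustration under stated assumptions: the number of image pixels covered by the BIM overlay is assumed to be precomputed for each image, the sampling step is applied before the coverage filter (the text does not specify the order), and the function name and the example step value are hypothetical.

```python
PATCH_W = 256   # w_pixel in Eq. 3
PATCH_H = 256   # h_pixel in Eq. 3
N_PATCH = 20    # nPatch in Eq. 3
BETA = PATCH_W * PATCH_H * N_PATCH  # threshold in pixels


def select_images(bim_pixel_counts, step, beta=BETA):
    """Choose images for the annotation task.

    bim_pixel_counts: list of (image_name, bim_pixels) pairs in the
        sequential order determined during 3D reconstruction, where
        bim_pixels is the number of pixels covered by the BIM overlay.
    step: keep every `step`-th image to avoid near-duplicate views.
    beta: images whose BIM coverage is below this value are skipped.
    """
    sampled = bim_pixel_counts[::step]
    return [name for name, pixels in sampled if pixels >= beta]
```

With the values reported above, `BETA` evaluates to 1,310,720 pixels, so an aerial image in which the structure occupies only a small region would be dropped even if it was useful for 3D reconstruction.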


To test and compare the quality of annotations done by different annotators, three annotators were assigned the same set of three projects (RH, DH, and HP). Their annotations are compared to the ground truth. The results are evaluated for improving the instructions on the webpage and the annotation tool for future annotators. The annotations were grouped into four categories: good annotations (denoted as G), acceptable annotations that could be better (denoted as C), redundant annotations (denoted as R), and not good annotations (false positives, denoted as NG). Note that the number of elements missed is not recorded. There are too many of them in each image and certain elements consist of multiple materials. The division of large elements in a BIM varies among designers/drafters, which adds to the complexity of comparing the number of elements shown by the BIM with the actual number. Figure 12 shows examples of the categories other than G.

Fig. 12

Examples of non-G categories: a NG - annotations run over multiple objects; b NG - non-construction material; c C - larger surface could have been selected; d R - two redundant surfaces inside of an annotation; and e NG - multiple surfaces annotated as one

Tables 2, 3, and 4 summarize the findings. There are significant numbers of NGs because the annotators consistently made the same wrong annotations. For instance, Annotator B consistently annotated two separate surfaces as one material - a sliding deck of the core structure (see Fig. 13). There were also cases where the annotators made inconsistent judgments. As can be seen in Fig. 14, Annotators A and B annotated the same material as two different materials. For instance, Annotator A annotated a deck slab as “Others/Do Not Know” on some images and as concrete (which is also incorrect) on other images.

Fig. 13

Example of a consistent mistake

Fig. 14

Example of inconsistent annotations: Annotator A (left) and Annotator B (right)

Table 2 Annotation quality on RH
Table 3 Annotation quality on DH
Table 4 Annotation quality on HP

To better visualize the performance of these annotators, a histogram (Fig. 15) is created. To better compare correct and wrong cases, G and C are combined. G+C is about 75%, 78%, and 82% for Annotators A, B, and C, respectively. NG is about 23%, 17%, and 18%, respectively. Since these high percentages of NG are due to consistent mistakes and a lack of domain knowledge in construction materials, the instructions provided to the annotators should include more material examples to help them make better judgments. Although more manual work is involved, preparing project-specific examples per HIT can potentially address this issue and improve the quality of these annotations. Moreover, similar to Le et al. (2011) and Liu and Golparvar-Fard (2015), having each contributor start with a project that has already been annotated without errors, and educating them by presenting falsely labeled patches alongside the ground truth, can help prevent the consistent mistakes observed in this study. Similarly, randomly assigning projects that have been labeled by a well-established contributor and signaling the administrators when there is a notable difference in results can also control the quality.
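
The percentages above follow from the per-category counts of each annotator. A minimal sketch of the computation is given below; the function names are hypothetical, and the counts in the usage note are made-up illustrative values, not the counts from Tables 2, 3, and 4.

```python
def category_shares(counts):
    """Convert per-category annotation counts into percentages.

    counts: dict with the categories 'G', 'C', 'R', and 'NG' as keys.
    """
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no annotations to summarize")
    return {cat: 100.0 * n / total for cat, n in counts.items()}


def correct_vs_wrong(counts):
    """Return (G + C share, NG share) in percent for one annotator."""
    shares = category_shares(counts)
    return shares["G"] + shares["C"], shares["NG"]
```

For a hypothetical annotator with 60 G, 15 C, 2 R, and 23 NG annotations, `correct_vs_wrong` returns 75.0 and 23.0; the two shares do not sum to 100% because redundant annotations (R) are counted in neither group.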

Fig. 15

Quality of annotations by the annotators

Another finding is that the annotators were confused about which materials to annotate when the majority of the materials they saw were concrete. For this reason, the annotators stopped annotating concrete elements in many cases. In most cases they annotated one to three concrete surfaces even though there were more concrete elements. They felt that there were too many “redundant” materials and “felt like” they should skip some (see Fig. 16). To guide them better, a dynamic list of material classes per project, showing the number of annotations per class, could reduce this confusion. However, Annotator C performed very well in terms of the number of annotations. Annotator C followed the instructions much better than the other two (although there were some consistent errors due to the lack of domain knowledge) and annotated almost all building elements shown by the BIM overlays.

Fig. 16

Examples of lack of annotations due to having too many “redundant” materials: Shown in blue and green boxes are the only annotations in these four images

To assess the quality variations among the three annotators, a bar chart with the numbers of annotations and annotators with standard deviations is generated (see Fig. 17). The intent of this study is to illustrate the effect of an outperforming annotator on the other two annotators who have relatively lower numbers of annotations. The authors, however, acknowledge that a sample of three is not enough to support statistical conclusions. As can be seen from the figure, the standard deviations for the two annotators are much larger because Annotator B produced far more annotations than the others (Tables 2, 3, and 4).

Fig. 17

Number of annotations by varying numbers of annotators

Another factor that needs to be considered before deploying to the public is cost. Rewards to annotators can add up quickly and become significant. According to Bell et al. (2013), 13 cents per surface was an effective rate that attracted many annotators. A reputation for quick approval of tasks is another factor that attracts annotators. With a rate of 13 cents per surface, the rewards that would have been paid to these three annotators (if they were real MTurk annotators) are summarized in Table 5. These costs include only G and C and exclude R and NG, which are assumed to be rejected. For annotating three projects with 149 images, the expected cost is $203.71.

Table 5 Projected costs of annotations
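
The projection above is a product of the reward rate and the number of accepted annotations. The following minimal sketch reproduces that arithmetic in integer cents to avoid rounding issues; the function names are hypothetical, and the split between G and C in the usage note is a made-up illustration.

```python
REWARD_CENTS = 13  # reward per accepted surface, following Bell et al. (2013)


def projected_cost_cents(counts, reward_cents=REWARD_CENTS):
    """Projected reward in cents; only G and C annotations are paid.

    R and NG annotations are assumed to be rejected and cost nothing.
    """
    accepted = counts.get("G", 0) + counts.get("C", 0)
    return reward_cents * accepted


def format_dollars(cents):
    """Format an integer number of cents as a dollar amount."""
    return f"${cents / 100:.2f}"
```

Working backward from the stated total, $203.71 at 13 cents per surface corresponds to 1567 accepted (G and C) annotations across the three annotators. For example, a hypothetical split of 1000 G and 567 C annotations gives `projected_cost_cents` = 20371 cents.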

Conclusions and discussion

An approach for crowdsourcing CML for the research community in the AEC industry is presented. The approach relies on contributions from practitioners and researchers interested in creating and using IIMs in practice and research. In return, the contributors will have access to their IIMs on a web platform. Moreover, the AEC research community will have access to more construction material data that can be used for various machine learning related research.

The current version uses BIM to guide users in annotating BIM elements visible in images. If the material information from IFC files is extracted, the developed platform can further guide users by providing fewer material types to label. This feature is not included in the current version because of inconsistencies created by different BIM tools and by different designers. Efforts like IFC standards can potentially minimize and avoid these inconsistencies in the near future. Then, the second version will be updated to read the material information from IFC files.

Moreover, an additional study on computing percentages of overlapping annotations with the same labels and discrepancies among annotators could be conducted. This study will enable the calculation of a confidence level and reliability of the labels per annotator and per material class.

The proposed method is designed to grow CML in size (both number of annotations and classes) over time. The developed platform requires administrators (the authors in this case) who help contributors create IIMs, generate HITs, and review additional materials to be added to existing CML. Continuously maintaining this effort and limited resources to pay the MTurk users are the remaining challenges. The overall system can be improved to automate manual processes. However, dealing with paying monetary rewards to annotators and quality controls of annotations cannot or perhaps should not be automated.

One possible solution would be to deploy the platform to the cloud (it currently runs on the authors’ server) and have the researchers who use it contribute to the CML along the way. The research community, in this case, would serve as both administrators and annotators. Gamifying the platform could motivate more people to contribute while reducing the cost of paying annotators. However, an obvious challenge of these solutions is the cost of maintaining the cloud space as the CML grows. Until the necessary funding to fully deploy the platform to the cloud is secured, the authors will continue to use and manage the platform, and the annotated projects will be publicly available upon request.


References

  • Amazon (2016). Amazon Mechanical Turk. Accessed 20 Dec 2016.

  • Bell, S, Upchurch, P, Snavely, N, Bala, K (2013). OpenSurfaces: A richly annotated catalog of surface appearance. ACM Trans. on Graphics (SIGGRAPH), 32.

  • Bosché, F, Guillemet, A, Turkan, Y, Haas, C, Haas, R (2013). Tracking the built status of MEP works: Assessing the value of a Scan-vs-BIM system. Journal of Computing in Civil Engineering.

  • Dana, KJ, van Ginneken, B, Nayar, SK, Koenderink, JJ (1999). Reflectance and texture of real-world surfaces. ACM Trans. Graph, 18(1), 1–34.

  • Deng, J, Dong, W, Socher, R, Li, LJ, Li, K, Fei-Fei, L (2009). ImageNet: A large-scale hierarchical image database. In 2009 IEEE Conference on Computer Vision and Pattern Recognition. doi:10.1109/CVPR.2009.5206848 (pp. 248–255). Miami.

  • Dimitrov, A, & Golparvar-Fard, M (2014). Vision-based material recognition for automated monitoring of construction progress and generating building information modeling from unordered site image collections. Advanced Engineering Informatics, 28, 37–49.

  • Endres, I, Farhadi, A, Hoiem, D, Forsyth, DA (2010). The benefits and challenges of collecting richer object annotations. In 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition - Workshops, (pp. 1–8).

  • Furukawa, Y, & Ponce, J (2010). Accurate, dense, and robust multiview stereopsis. IEEE Transactions on Pattern Analysis and Machine Intelligence, 32, 1362–1376.

  • Girardeau-Montaut, D (2016). CloudCompare. Accessed 15 Nov 2016.

  • Goesele, M, Snavely, N, Curless, B, Hoppe, H, Seitz, SM (2007). Multi-view stereo for community photo collections. In 2007 IEEE 11th International Conference on Computer Vision, (pp. 1–8).

  • Golparvar-Fard, M, Peña Mora, F, Savarese, S (2009). Application of D4AR - a 4-dimensional augmented reality model for automating construction progress monitoring data collection, processing and communication. ITcon, 14, 129–153.

  • Han, K, Cline, D, Golparvar-Fard, M (2015). Formalized knowledge of construction sequencing for visual monitoring of work-in-progress via incomplete point clouds and low-LOD 4D BIMs. Advanced Engineering Informatics, 29, 889–901.

  • Han, K, & Golparvar-Fard, M (2015). Appearance-based material classification for monitoring of operation-level construction progress using 4D BIM and site photologs. Automation in Construction, 53, 44–57.

  • Han, K, & Golparvar-Fard, M (2017). Potential of big visual data and building information modeling for construction performance analytics: An exploratory study. Automation in Construction, 73, 184–198.

  • Horn, BKP (1987). Closed-form solution of absolute orientation using unit quaternions. Journal of the Optical Society of America A, 4, 629–642.

  • Hu, D, Bo, L, Ren, X (2011). Toward robust material recognition for everyday objects. In Proceedings of the British Machine Vision Conference. BMVA Press, (pp. 48.1–48.11).

  • Jancosek, M, & Pajdla, T. (2011). Multi-view reconstruction preserving weakly-supported surfaces, (pp. 3121–3128). Providence: CVPR. doi:10.1109/CVPR.2011.5995693.

  • Kim, C, Son, H, Kim, C (2013a). Automated construction progress measurement using a 4D building information model and 3D data. Automation in Construction, 31, 75–82.

  • Kim, C, Kim, B, Kim, H (2013b). 4D CAD model updating using image processing-based construction progress monitoring. Automation in Construction, 35, 44–52.

  • Kim, MK, Cheng, JC, Sohn, H, Chang, CC (2015). A framework for dimensional and surface quality assessment of precast concrete elements using BIM and 3D laser scanning. Automation in Construction, Part B, 225–238, 30th ISARC Special Issue.

  • Le, J, Edmonds, A, Hester, V, Biewald, L (2011). Ensuring quality in crowdsourced search relevance evaluation: The effects of training question distribution.

  • Liu, C, Sharan, L, Adelson, EH, Rosenholtz, R (2010). Exploring features in a Bayesian framework for material recognition. In 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. doi:10.1109/CVPR.2010.5540207 (pp. 239–246). San Francisco.

  • Liu, K, & Golparvar-Fard, M (2015). Crowdsourcing construction activity analysis from jobsite video streams. Journal of Construction Engineering and Management, 141, 04015035.

  • Liu, Y, Lin, WC, Hays, J (2004). Near-regular texture analysis and manipulation. In ACM SIGGRAPH 2004 Papers. ACM, New York, (pp. 368–376).

  • Moulon, P, Monasse, P, Marlet, R, et al. (2016). OpenMVG. Accessed 30 May 2016.

  • Russell, B, Torralba, A, Murphy, K, Freeman, W (2008). LabelMe: A database and web-based tool for image annotation. International Journal of Computer Vision, 77, 157–173.

  • Snavely, N, Seitz, SM, Szeliski, R (2006). Photo tourism: Exploring photo collections in 3D. ACM Transactions on Graphics (TOG), 25, 835–846.

  • Torralba, A, Fergus, R, Freeman, WT (2008). 80 million tiny images: A large data set for nonparametric object and scene recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, 30, 1958–1970.

  • Wu, C (2016). VisualSFM: A visual structure from motion system. Accessed 30 May 2016.

  • Wu, C, Agarwal, S, Curless, B, Seitz, SM. (2011). Multicore bundle adjustment, (pp. 3057–3064). Providence: CVPR. doi:10.1109/CVPR.2011.5995552.

  • Xiao, J, Hays, J, Ehinger, KA, Oliva, A, Torralba, A (2010). SUN database: Large-scale scene recognition from abbey to zoo. In 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. doi:10.1109/CVPR.2010.5539970 (pp. 3485–3492). San Francisco.


This work is funded in part by the National Science Foundation (NSF) grants CMMI-1360562 and CMMI-1446765 and by the National Center for Supercomputing Applications (NCSA) Institute for Advanced Computing Applications and Technologies Fellows program. The funding was used for the design of the study and for the collection, analysis, and interpretation of data.



KH developed the platform, reviewed the literature, and wrote this manuscript under the supervision of MG. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Kevin Han.

Ethics declarations

Competing interests

The authors declare that they have no competing interests.

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

About this article

Cite this article

Han, K., Golparvar-Fard, M. Crowdsourcing BIM-guided collection of construction material library from site photologs. Vis. in Eng. 5, 14 (2017).
