
Window detection in facade images for risk assessment in tunneling



Settlements induced by tunneling in inner urban areas can easily damage above-ground structures. This has to be considered already in the early planning of tunneling routes. Assessing beforehand the risk of settlement-induced damage to structures along hypothetical tunneling routes makes the routes comparable. It thereby facilitates the choice of the optimal tunneling route in terms of potential damage and of suitable countermeasures. The risk analyses of structures underlying this assessment obtain relevant data from various sources. Some data even has to be gathered manually. Virtual building models could ease this process and facilitate analyses for entire districts, as they combine several required pieces of information in a single data set. However, these models are commonly still very coarse. Relevant details like facade openings, which strongly affect a structure’s stiffness, are not included.


In this paper, we propose a system which detects windows in facade images. The detections are subsequently used to enrich existing virtual building models, allowing for a precise risk assessment. For this, we apply a sliding window detector which employs a cascaded classifier to detect windows in image patches.


Our system yields sufficient results on facade images of several countries showing its general applicability despite regional and architectural variation in the facades’ and windows’ appearance. In an ensuing case study, we assess the risk of damages to structures based on detections of our system using different analysis methods.


We contrast these results to assessments using manually gathered data. Hereby, we show that the detection rate of our proposed system is sufficient for a reliable estimation of a structure’s damage class.


The risk assessment of settlement-induced damage to structures is a relevant issue in almost all phases of a tunneling project, from early route planning up to the boring process itself. Virtual building models offer the potential to automate this process. Publicly available models, though, lack features which are indispensable for the required risk analyses. Most land registry offices provide 3D models of buildings for large urban areas. However, these are commonly coarse block models of buildings which may be extended by simplified roof shapes as shown in Fig. 1. Any further information about facade elements is usually not included. Web services like Google Earth or OpenStreetMap offer similar models. Although some buildings are modeled manually, resulting in more detailed facade and roof shapes, facade elements are not explicitly modeled but only mapped to the model as textures. Accordingly, facade elements like windows are currently not included in publicly available models. Since windows account for the major proportion of openings in a facade, they are crucial for analyses regarding a structure’s stiffness. Analyses neglecting windows are inaccurate and insufficient for our purpose (Neugebauer et al. 2015; Schindler 2014). A subsequent enrichment of existing models by windows or, more specifically, by the facade’s opening-ratio is thus indispensable for a proper assessment. Currently, the opening-ratio is determined manually. Assessors survey construction plans for each building along possible tunneling alignments to derive its opening-ratio. This is time consuming and demands high human effort, resulting in high costs. In addition, many construction plans of older buildings are obsolete, which necessitates a manual on-site inspection and drastically increases the effort. To reduce time and cost, an automated solution is desirable.

Fig. 1

3D block model. Block model with simplified roof shapes of a group of buildings located near the tunneling alignment of the subway project Wehrhahn-Linie provided by the land registry office of Düsseldorf

Images of facades are publicly available for almost all urban areas from web services like Google Street View or can easily be gathered. For this reason, we propose a pattern recognition approach to detect windows in facade images. This allows for automatically inferring the opening-ratio from existing structures. Through this, required information can be provided to risk analyses with significantly less effort. Accordingly, alternative tunneling routes can be evaluated already in early phases of planning which enables the selection of an optimal route. It can be taken to be optimal if it minimizes structural damages to buildings inflicted by settlements. Structural damages are considered relevant if they cause a tilt and potentially induce cracks in structural and non-structural members that impair appearance, serviceability or even bearing capacities.

Our general approach is visualized in Fig. 2. First, external openings in facades are detected by pattern recognition techniques (a). Second, expected settlements are computed employing relevant parameters from construction drawings or 3D city models (b). Next, this information is integrated into models for damage assessment (c). In general, two types of models must be distinguished here. Simple models, advantageous for quick pre-assessments, account for openings by a generalized factor that distributes the openings evenly over the total facade (LTSM). More extensive models explicitly allow for individual sizes and locations of openings in 2D finite element simulations.

Fig. 2

Single steps for the risk assessment during tunnelling. Window detection (a); structural idealisation and settlement prediction (b); models for damage assessment and damage criteria (c)

The remainder of this paper is organized as follows: In section “Related work” we discuss techniques for risk analyses as well as previous window detection approaches. Section “Methods” gives insight into our detection system consisting of a soft cascaded classifier (see section “Soft cascaded classifier”) in combination with a sliding window detector (see section “Sliding window detector”). In section “Results” we evaluate the performance of our proposed system and discuss its limitations. The obtained insights and results are then used in a case study (section “Case study”) to test our detection system and risk analyses on real data from a tunneling scenario of the reference subway project Wehrhahn-Linie (WHL) in Düsseldorf, Germany. In the course of this, we compare our results to common methods and highlight the advantages over the current idealization of structures. Finally, we summarize our findings and provide an outlook (see section “Discussion”).

Related work

For the damage assessment of tunneling-induced settlements, a variety of established methods is currently available; Obel et al. (2017) give a summary. Section “Damage risk assessment” briefly recalls the basics regarding their application.

Window detection is a challenging task which has not sufficiently been solved yet. Although it is of high relevance for several research areas, it is mostly referred to as a subtask of 3D building reconstruction. In recent decades a comprehensive body of literature arose concerning the 3D reconstruction of existing buildings. Approaches made are highly versatile. In section “Building reconstruction” we discuss the suitability of different kinds of input data with respect to window detection. Furthermore, we outline previously made approaches to window detection and address their application areas and limitations.

Damage risk assessment

Settlements as a damage-causing event are usually assessed by means of analytical models (Peck 1969; Attewell et al. 1986). Most frequently, Peck’s model is applied (Peck 1969). According to this model, settlements transverse to the tunnel’s axis follow a bell-shaped Gaussian curve, which is divided into sagging and hogging parts depending on the curvature (cf. Fig. 3). In practice, however, an analytical assessment is often accompanied and double-checked by continuous structural monitoring on site (Schindler et al. 2016; Mark et al. 2012).

Fig. 3

Relation between settlement and structure. Shape of the settlement trough with sagging and hogging areas and input parameters for the LTSM
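The Gaussian trough can be illustrated with a few lines of Python. This is a minimal sketch of the standard form of Peck's model, S(x) = S_max · exp(−x² / (2i²)), where i is the horizontal distance from the tunnel axis to the point of inflection. The function names and the numerical values are illustrative assumptions and are not taken from the case study.

```python
import math

def settlement(x, s_max, i):
    """Vertical settlement at transverse offset x from the tunnel axis.

    s_max: maximum settlement above the axis
    i:     distance from the axis to the point of inflection (same unit as x)
    """
    return s_max * math.exp(-x * x / (2.0 * i * i))

def zone(x, i):
    """Sagging lies between the inflection points at x = -i and x = +i.

    The inflection point itself (zero curvature) is assigned to hogging here,
    which is an arbitrary choice made for this sketch.
    """
    return "sagging" if abs(x) < i else "hogging"

# Hypothetical trough: 20 mm maximum settlement, inflection points 8 m off-axis.
for x in (0.0, 4.0, 8.0, 16.0):
    print(f"x = {x:5.1f} m  S = {settlement(x, 20.0, 8.0):6.2f} mm  {zone(x, 8.0)}")
```

A facade spanning the range from x = 4 m to x = 16 m in this example would therefore lie partly in the sagging and partly in the hogging zone.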

Building damage is commonly attributed to strains exceeding certain limits. For the assessment, strains are computed on simple or deep beams, either assuming linear elastic material behavior for convenience, as is usual for numerical analyses of tunnels (Kämper et al. 2016), or in a more sophisticated way that also considers non-linear effects. The obtained strains are compared to limit strains characteristic for non-cracked conditions as well as micro or macro cracking of brittle materials like concrete and masonry (Mark and Schütgen 2001). Table 1 assigns damage categories to typical tensile limit strains. While strains in categories 0-2 merely impair the visual appearance of a structure, strains in category 3 already influence the serviceability. Finally, in category 4 even the bearing capacity of a structure is affected (Boscardin and Cording 1989).

Table 1 Relation of damage category and limiting tensile strain, according to Boscardin and Cording (1989)
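The mapping from a computed tensile strain to a damage category amounts to a simple table lookup. The sketch below uses the class limits that are commonly quoted from Boscardin and Cording (1989), namely 0.05%, 0.075%, 0.15% and 0.3%. These limits are an assumption of this example and should be checked against Table 1. The function name and the treatment of values exactly on a limit are also illustrative choices.

```python
from bisect import bisect_right

# Upper limits of categories 0 to 3 in percent tensile strain (assumed values,
# as commonly quoted from Boscardin and Cording 1989).
LIMITS_PERCENT = [0.05, 0.075, 0.15, 0.3]

def damage_category(tensile_strain_percent):
    """Return the damage category 0..4 for a tensile strain given in percent.

    A strain exactly on a limit is assigned to the higher category.
    """
    if tensile_strain_percent < 0:
        raise ValueError("tensile strain must be non-negative")
    return bisect_right(LIMITS_PERCENT, tensile_strain_percent)

for strain in (0.02, 0.06, 0.10, 0.20, 0.50):
    print(f"{strain:.2f} % -> category {damage_category(strain)}")
```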

The so-called Limiting Tensile Strain Method (LTSM) idealizes facades to beams and assesses damage risks employing bending and shear strains (Burland and Wroth 1974). With respect to a position in hogging or sagging the strains are generally computed from Eqs. 1-2 assuming green-field (gf) conditions. Therein, the resultant diagonal strains are split into bending (b) and shear (d) components.


The relative settlement Δ denotes the maximum vertical distance between the settlement trough and a straight horizontal line of length L connecting two reference points usually associated with the outermost building edges, the points of inflection or the maximum extent of the trough (cf. Fig. 3). A structure’s position regarding the trough is accounted for by means of an eccentricity e. It equals the distance from the strain maximum to the facade’s center of gravity. The height H from the basement to the eaves is limited to 12 m. Recalculations of practically relevant configurations above that limit have shown no significant stiffness gains for soft soils (Neugebauer et al. 2015). The ratio of Young’s to shear modulus (E/G-ratio) is usually set to 2.5 regarding the constitutive relationship of linear elastic materials (Burland and Wroth 1974). Table 2 lists a reduction factor for the effective stiffness at the foundation as a function of the opening-ratio in masonry structures. The opening-ratio is the proportion of openings (windows and doors) in the total facade area; it is limited to 50%, since frame constructions made from reinforced concrete elements would be typical for higher ratios. In general, linear interpolation between the tabulated opening-ratios is permitted. Schindler (2014) documents the relevance of the reduced effective stiffness due to openings in facades.

Table 2 Stiffness reduction factor f α according to opening ratio (Neugebauer et al. 2015)
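To make the definition of the opening-ratio concrete, the following sketch computes it from a list of opening rectangles and the facade dimensions. The function name, the rectangle format and the decision to clip the result at 50% are assumptions of this example. The stiffness reduction factor itself is not computed, since it depends on the values of Table 2.

```python
def opening_ratio(openings, facade_width, facade_height, cap=0.5):
    """Share of the facade area taken up by openings, clipped at `cap`.

    openings: list of (x, y, width, height) rectangles in facade coordinates.
              They are assumed not to overlap each other.
    """
    facade_area = facade_width * facade_height
    if facade_area <= 0:
        raise ValueError("facade dimensions must be positive")
    open_area = sum(w * h for _, _, w, h in openings)
    return min(open_area / facade_area, cap)

# Hypothetical facade of 10 m x 6 m with four windows of 1.2 m x 1.5 m.
windows = [(1.0, 1.0, 1.2, 1.5), (4.0, 1.0, 1.2, 1.5),
           (1.0, 3.5, 1.2, 1.5), (4.0, 3.5, 1.2, 1.5)]
print(f"opening-ratio: {opening_ratio(windows, 10.0, 6.0):.2f}")
```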

In the case of deep beam models, damage to facades is typically assessed employing plane finite elements, since details have to be considered to obtain a realistic prognosis of the total settlement-induced strain distribution. At the same time, this entails a higher numerical effort for modelling and computation, which is justified only for important structures with a significant damage potential, as is the case with low coverage.

Compared to the analytical approach employing beam models, significant differences occur regarding both the material model and the handling of soil-structure interaction (SSI). If required, the linear elastic material model can be enhanced to a nonlinear one accounting for isotropic damage, which delivers more accurate damage prognoses (Schindler 2014). The SSI is approximated with non-linear springs for the soil (stiffness C) that account for an initial slip by means of a gap (Schindler 2014). In addition to vertical springs, horizontal springs cover the contact of foundation and soil by friction with the coefficient μ. Figure 4 shows the principle exemplified by a linear spring’s stiffness with initial slip without bedding.

Fig. 4

Approach for simplified simulation of soil-structure interaction. Facade support with springs (a); applied discrete displacements (gap) on the springs (b)

Building reconstruction

Previous building reconstruction approaches can be roughly categorized by the kind of input data they require, which allows us to review their suitability for window detection. Surveys by Baltsavias (2004) and Brenner (2005) cover approaches using airborne photography and laser scans. Due to the perspective, data acquisition is easily feasible even for large areas compared to the collection of data from the ground. As airborne data usually only contains satisfactory information about large structures like buildings and streets, its potential for building reconstruction is limited. The reconstruction of simple building and roof shapes is possible, but because facades are highly distorted in the bird’s-eye view, the detection and reconstruction of their elements is impractical. In addition to airborne data acquisition, a survey by Haala and Kada (2010) mainly addresses terrestrial laser scans, which provide precise point clouds to generate 3D meshes of existing buildings. Besides approaches mapping textures to these meshes, the authors present methods that roughly detect facade elements. This is done by extracting areas displaced behind the facades’ planes or, as shown in Fig. 5, by determining no-measurement areas in the plane (Pu and Vosselman 2009) and categorizing them as windows, doors, etc. by their size and position on the facade. Although the use of terrestrial laser scans eases the detection process and some approaches have already been made to facilitate the data collection (Haala et al. 2008; Barber et al. 2008), such data is not yet available area-wide. Gathering and preprocessing data for all buildings along potential tunneling alignments in an urban area would therefore involve unreasonable effort.

Fig. 5

Laser scan of a facade. Windows can be identified by no-measurement areas

Musialski et al. (2013) provide the most encompassing survey of building reconstruction that also comprises methods based on terrestrial imagery. While pattern recognition, matching, and facade parsing approaches are discussed, the detection of facade elements in facade images taken from the ground perspective is only mentioned briefly. Since ground perspective imagery of facades can be gathered with low effort even for larger areas or can alternatively be obtained from web services like Google Street View, image-based window detection approaches qualify best for our purpose. Preliminary work by Neuhausen et al. (2016) focused on a juxtaposition of the most promising approaches concerning window detection in facade images taken from the ground. The discussed methods are divided into three categories: grammar-based, image processing only, and machine learning aided methods.

Grammar-based methods apply formal grammars to facade images, splitting these into increasingly smaller regions until the facade is decomposed into its elements. Ripperda (2008) and Ripperda and Brenner (2009) express a simple grammar based on symmetry and repetition to subdivide facade images. Teboul et al. (2010), on the contrary, develop a detailed shape grammar which also models semantic relationships between certain elements. For this purpose they introduce rules, amongst others, to substitute the ground floor by shops and doors or to split the attic into roof and windows. In general, defining an adequate set of rules is non-trivial and presumes prior knowledge about the expected architecture. Furthermore, grammars allow numerous possible decompositions of a facade. Sampling methods like Markov Chain Monte Carlo (Ripperda and Brenner 2006) or parsing algorithms (Riemenschneider et al. 2012) have to be applied to identify the most probable subdivision. The complexity of such methods increases drastically with the number of rules in the set. Accordingly, rule sets have to be kept small to remain applicable, which precludes a highly detailed modeling of relationships between facade elements.

Pattern recognition approaches divide into two further categories. Methods using only image processing rely, similar to grammar-based approaches, on prior knowledge and assumptions about the facades’ appearance. Assuming that windows are aligned in a grid, Lee and Nevatia (2004) superimpose histograms of horizontal and vertical edges in rectified facade images. As a result, peaks emerge at the windows’ locations. Meixner et al. (2011) took up their work and found that it works well on highly regular facades but fails for complex facades with asymmetric window patterns or extensions like balconies or awnings. This illustrates how assumptions and prior knowledge may narrow the field of application. The appearance of facades varies considerably between countries and may even vary between adjacent urban areas. Detection algorithms based on such assumptions have to be adjusted to the particular conditions. This involves high effort and raises the need for experts. It would be desirable to have a more general solution.

Machine learning techniques meet this requirement as they neither rely on assumptions on the windows’ alignment nor on prior knowledge about the architecture. Windows can be detected by image features which represent their inherent characteristics. In this context, Haugeard et al. (2009) proposed an approach classifying windows by their edges using a support vector machine with an inexact graph matching kernel. Such methods require an explicitly given feature vector. Alternatively, boosted classifiers as proposed by Viola and Jones (2004) avoid this by choosing a subset from a pool of features. Its practicability and limitations for window detection tasks are investigated by the work of Ali et al. (2007).


Previous approaches either address large facades with highly regular facade element patterns or provide low detection rates whereas in the context of urban tunneling the investigation of small houses with irregular facades is not uncommon. A sufficient approach, thus, has to cope with this issue. Furthermore, a high detection rate is desirable to derive reliable idealizations of structures for further analyses and a precise risk assessment.


There is a wide range of window shapes and sizes (see Fig. 6) and it is not directly evident which image features qualify best to characterize them all. For this reason, a boosted classifier seems to be promising. Since the best features are automatically chosen from a feature pool while training the classifier, it overcomes the necessity for a prior manual selection. Apart from that, boosted cascades of classifiers usually outperform most monolithic approaches (Lienhart et al. 2003).

Fig. 6

Window examples from facade images. Windows occur in very different shapes and sizes which complicates the composition of an adequate set of features describing the windows’ characteristics

A boosted cascade of classifiers has already been applied to the window detection task by Ali et al. (2007) via the Viola-Jones object detection framework (Viola and Jones 2004), albeit the reported detection rate is rather low. Bourdev and Brandt (2005) developed a soft cascaded classifier which, in general, improves the detection rate over the Viola-Jones framework and is more robust regarding a high variability of positive samples. Additionally, their classifier relies on fewer features than the one of Viola and Jones at similar detection rates. Based on these findings, we opt for a soft cascaded approach which is described in detail in section “Soft cascaded classifier”.

Both, the Viola-Jones object detection framework and the soft cascaded classifier, were originally developed for face detection and yield high accuracy in this field. Beyond this, these classifiers also proved to be successful in further application areas like traffic light detection (Michael and Schlipsing 2015). An application to other areas is, thus, generally possible. However, this requires the objects to be detected to possess a sufficient amount of well separating features as classification relies on many cascaded stages containing several image features. The fact that windows are poor in those features, hence, may complicate the detection.


Although the soft cascaded classifier is robust regarding high variability, a normalization of the windows’ appearance can significantly reduce the variability and, thus, simplify the classification. We rectify the facade images semi-automatically to eliminate unnecessary variation caused by perspective distortion. The vertices of the quadrilateral spanned by the facade are chosen as corresponding points, which allows us to estimate a homography that transforms the quadrilateral into a rectangle. For the purpose of our case study it was sufficient to determine the vertices manually. If a manual selection is too tedious because a multitude of facades has to be investigated, a repetitive pattern approach (Wendel et al. 2010) can be used instead to automatically infer the facades’ dimensions. Despite the rectification of the facades, the windows’ variation remains rather high. Hence, the soft cascaded approach is preferable to the classifier of Viola and Jones. Since the soft cascaded classifier is similar in speed to the classifier used in the Viola-Jones framework, there is no need to optimize the enclosing detection algorithm. Therefore, we keep their naive sliding window detector (see section “Sliding window detector”). Facade images which we rectified beforehand are scanned via a rectangular subwindow sliding across the entire image in various sizes. At each position, the underlying image patch is passed to the classifier, which is trained on a set of rectified window samples and non-windows. Positively classified patches are stored, merged into larger patches if appropriate, and finally returned as detections. An illustration of this procedure can be found in Fig. 7.

Fig. 7

Design of our detection system. Facade images are rectified before being passed to the detector. The detector slides a subwindow across the rectified image and cuts out underlying image patches which are passed to classification. A cascaded classifier which is trained and calibrated beforehand classifies each patch. Overlapping regions of interest (ROIs) of positive classifications are merged and returned as detected windows
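The rectification step can be sketched as follows. Given the four facade vertices in the photograph and the four corners of the target rectangle, the eight unknown entries of the homography follow from a linear system with eight equations. The code below is a plain-Python illustration of this standard four-point estimate. The helper names and the example coordinates are hypothetical, and the sketch is not meant to reproduce the implementation used in our system.

```python
def solve(a, b):
    """Solve a * x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate point configuration")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def homography(src, dst):
    """3x3 homography (with h33 = 1) mapping four src points onto four dst points."""
    a, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        a.append([x, y, 1.0, 0.0, 0.0, 0.0, -u * x, -u * y])
        b.append(u)
        a.append([0.0, 0.0, 0.0, x, y, 1.0, -v * x, -v * y])
        b.append(v)
    h = solve(a, b)
    return [h[0:3], h[3:6], [h[6], h[7], 1.0]]

def warp_point(h, x, y):
    """Apply the homography to a single point."""
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return ((h[0][0] * x + h[0][1] * y + h[0][2]) / w,
            (h[1][0] * x + h[1][1] * y + h[1][2]) / w)

# Hypothetical facade corners in the photo, clockwise from top-left ...
quad = [(120.0, 80.0), (610.0, 140.0), (600.0, 520.0), (130.0, 600.0)]
# ... mapped onto an upright rectangle of 500 x 400 pixels.
rect = [(0.0, 0.0), (500.0, 0.0), (500.0, 400.0), (0.0, 400.0)]
H = homography(quad, rect)
print(warp_point(H, 120.0, 80.0))
```

To rectify a whole image, every pixel of the target rectangle would be mapped back through the inverse homography and sampled from the photograph. Image processing libraries provide this warping step, so only the estimation is shown here.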

Soft cascaded classifier

The soft cascaded classifier as proposed by Bourdev and Brandt (2005) consists of a set c of weighted weak classifiers \(\alpha h(x)\) where \(\alpha\) is the weight and \(h(x)\) denotes the classification function of a weak classifier for a sample x. Each of these classifies at least slightly better than guessing. By connecting them in series a strong classifier with a high detection rate emerges. We deduce weak classifiers from the Haar-like features shown in Fig. 8. Each weak classifier consists of such a feature at a fixed position in the image x as illustrated in Fig. 9. The features’ responses are thresholded which enables a binary classification of the image region the feature is applied to. Those thresholds are trained by the algorithm given by Viola and Jones (2004) by means of a training dataset with positive and negative samples \(x_{i}\), their corresponding labels \(y_{i}\), and their weights \(w_{i}\) representing each sample’s importance.

Fig. 8

Haar-like features. Types of Haar-like features used for classification

Fig. 9

Two Haar-like features applied to image patch. Each weak classifier consists of a feature at a fixed position in the image x. Features’ responses are thresholded to classify the corresponding image regions. Connecting these in series results in a strong classifier
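A thresholded Haar-like feature can be written down compactly. The sketch below evaluates a two-rectangle edge feature with the help of an integral image, as is customary in the Viola-Jones framework, and thresholds its response with a polarity. The function names, the toy patch and the threshold value are assumptions made for illustration.

```python
def integral_image(img):
    """Summed-area table with an extra zero row and column."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii

def rect_sum(ii, x, y, w, h):
    """Sum of the pixels in the rectangle with top-left (x, y) and size w x h."""
    return ii[y + h][x + w] - ii[y][x + w] - ii[y + h][x] + ii[y][x]

def two_rect_vertical(ii, x, y, w, h):
    """Edge feature: sum of the upper half minus sum of the lower half (h even)."""
    half = h // 2
    return rect_sum(ii, x, y, w, half) - rect_sum(ii, x, y + half, w, half)

def weak_classify(response, threshold, polarity=1):
    """Return 1 if polarity * response < polarity * threshold, else 0."""
    return 1 if polarity * response < polarity * threshold else 0

# Toy 4 x 4 patch: bright upper half, dark lower half.
patch = [[200] * 4, [200] * 4, [50] * 4, [50] * 4]
ii = integral_image(patch)
response = two_rect_vertical(ii, 0, 0, 4, 4)
print(response, weak_classify(response, threshold=1000, polarity=-1))
```

With the integral image, each rectangle sum costs four table lookups regardless of the rectangle's size, which is what keeps the evaluation of many features per patch cheap.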

While training a strong classifier, the weak classifiers \(h(x)\) which provide the lowest training errors are consecutively drawn from a pool. According to their particular classification quality on the training dataset, a weight \(\alpha\) is assigned to them. Then they are added to a strong classifier c. Since the negative class of non-windows is infinitely large, it cannot be represented by a finite set of negative samples. For this reason, new negative samples which are misclassified by the current strong classifier are bootstrapped after each selection of a weak classifier to approximate the negative class. This reduces the variance and also makes overfitting less likely. Additionally, each sample weight \(w_{i}\) is adapted according to the sample’s classification result under the current strong classifier so that the priority of currently misclassified samples increases and vice versa. This shifts the focus of the selection of subsequent weak classifiers towards samples that have been misclassified so far. The emerging strong classifier \(c(x) = \sum_{t=1}^{T} \alpha_{t} h_{t}(x)\) can already be used for classification via weighted majority vote, where x is the sample in question and \(c_{t}(x) = \alpha_{t} h_{t}(x)\) denotes the contribution of the t-th weak classifier:

$$ \sum_{t=1}^{T} c_{t}(x) \geq \frac{1}{2} \sum_{t=1}^{T} \alpha_{t}. $$
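The majority vote can be expressed in a few lines. The following sketch assumes weak classifier outputs in {0, 1}. The function name and the example weights are illustrative.

```python
def strong_classify(alphas, votes):
    """Weighted majority vote over weak classifier outputs.

    alphas: weights alpha_t of the weak classifiers
    votes:  outputs h_t(x) of the weak classifiers, each 0 or 1
    """
    score = sum(a * h for a, h in zip(alphas, votes))
    return score >= 0.5 * sum(alphas)

alphas = [0.9, 0.4, 0.3]
print(strong_classify(alphas, [1, 0, 0]))  # one strong vote outweighs two weak ones
print(strong_classify(alphas, [0, 1, 1]))
```

In this example the single weak classifier with weight 0.9 reaches the required half of the total weight (0.8) on its own, whereas the other two together (0.7) do not.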

Since only a few weak classifiers are needed to reject most negative samples, it is unnecessary to evaluate all weak classifiers on every input image. Weak classifiers are, hence, arranged in cascading stages, and a sample is only passed to the succeeding stage if it is classified positively in the current one. This dramatically speeds up the classification of negative samples, which are much more likely to be found in an image. It also speeds up the detection process of the sliding window detector, since most cut-out image patches can be rejected after evaluating only a few stages, so that real-time detection is possible. In contrast to the Viola-Jones framework, in the soft cascaded approach each weak classifier is its own stage of the cascade. Whether a sample passes a stage and is, thus, further processed by subsequent stages depends not only on the current stage but on all previous stages. This is realized by a sample trace which accumulates the confidence of each weak classifier’s classification and is compared to a rejection threshold. The choice of this threshold influences the detection and false positive rates as well as the speed of the classifier. To set it properly, a calibration is carried out. Given a trained strong classifier c, its stages \(c_{t}\) are reordered within the cascade based on their performance on a calibration dataset. Additionally, the sample trace of each sample of this dataset is successively updated. The rejection threshold of each stage is determined in relation to the sample traces such that only a certain fraction of positive samples is rejected while as many negative samples as possible are rejected. To classify a sample x with the calibrated classifier, each stage \(c_{t}\) consecutively updates the sample trace. Once the trace falls below the rejection threshold, the sample is rejected and classified as negative. Otherwise, if the sample passes all stages, it is classified positively. This procedure is illustrated in Fig. 10.

Fig. 10

Schematic illustration of the cascaded classification procedure. The weak classifier function \(h_{i}(x)\) of a stage i is evaluated only if the current sample trace s is above the rejection threshold \(r_{i-1}\) of the preceding stage. If the sample passes all stages it is positively classified. Otherwise, it is prematurely declined and classified negatively
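The early rejection by means of the sample trace can be sketched as follows. The example assumes that each stage returns a signed confidence (plus or minus its weight) and that one rejection threshold per stage has already been obtained by calibration. The stage functions, weights and thresholds are made-up values for illustration.

```python
def soft_cascade_classify(stages, thresholds, x):
    """Run a sample through a soft cascade.

    stages:     functions c_t(x) returning a signed confidence
    thresholds: rejection threshold r_t checked after stage t
    Returns (is_positive, number_of_stages_evaluated).
    """
    trace = 0.0
    for t, (stage, r) in enumerate(zip(stages, thresholds), start=1):
        trace += stage(x)      # accumulate the confidence of all stages so far
        if trace < r:          # the trace fell below the rejection threshold
            return False, t
    return True, len(stages)

def make_stage(index, threshold, alpha):
    """Hypothetical stage: votes +alpha if feature `index` exceeds `threshold`."""
    return lambda x: alpha if x[index] > threshold else -alpha

stages = [make_stage(0, 0.5, 1.0), make_stage(1, 0.5, 0.8), make_stage(2, 0.5, 0.6)]
thresholds = [-0.5, 0.0, 0.5]

print(soft_cascade_classify(stages, thresholds, [0.9, 0.9, 0.9]))
print(soft_cascade_classify(stages, thresholds, [0.1, 0.9, 0.9]))
```

The second sample is rejected after a single stage, which is the behavior that makes the cascade fast on the many non-window patches of an image.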

Sliding window detector

The soft cascaded classifier only accepts image patches. For each patch, it determines whether the patch shows a window. A detector is therefore needed which passes relevant image patches to the classifier and further processes the returned classifications. Due to the cascading layout of our classifier, non-window patches can on average be rejected very fast. For this reason, an optimized detection algorithm is unnecessary.

We apply the detector proposed by Viola and Jones (2004) which shifts a rectangular subwindow of multiple scales across an image (see Fig. 11). For each position and size, the covered image patch is passed to the classifier. As we assume that the images forwarded to the detector are filled by a complete facade, we can approximate the expected smallest window dimensions from the image dimensions. The scanning process starts at a subwindow scale of \(s = 1.0\), which we define as 0.5 times the inferred window dimensions; the subwindow is scaled by a factor of 1.25 after each run across the image. The step size for shifting the subwindow depends on its current scale s and the starting step size \(\Delta = 1.0\). The respective step size of each run is given by \([s\Delta]\), where \([\cdot]\) denotes a rounding operation. Since the classifier is insensitive to small translations of the objects within the image patch, multiple overlapping detections may occur. These are afterwards merged by averaging their positions. We allow the merging of detections only if their areas overlap by a factor of 0.7 or above.

Fig. 11

Scaling and shifting of the sliding subwindow. After each run the subwindow is scaled by s. The step size in which the subwindow is shifted across the image depends on the current scaling factor
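The scanning and merging scheme can be sketched in Python as follows. The scale factor of 1.25, the starting step size of 1.0 and the overlap factor of 0.7 are taken from the text. Everything else is an assumption of this example: the function names, the use of intersection over union as the overlap measure, the greedy grouping against the first member of each group, and Python's built-in `round` as the rounding operation.

```python
def scan_positions(img_w, img_h, base_w, base_h, scale_factor=1.25, delta=1.0):
    """Yield (x, y, w, h) subwindows over all scales and positions."""
    s = 1.0
    while True:
        w, h = round(base_w * s), round(base_h * s)
        if w > img_w or h > img_h:
            break
        step = max(1, round(s * delta))  # step size grows with the scale
        for y in range(0, img_h - h + 1, step):
            for x in range(0, img_w - w + 1, step):
                yield x, y, w, h
        s *= scale_factor

def iou(a, b):
    """Intersection over union of two (x, y, w, h) rectangles."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    iw = min(ax + aw, bx + bw) - max(ax, bx)
    ih = min(ay + ah, by + bh) - max(ay, by)
    if iw <= 0 or ih <= 0:
        return 0.0
    inter = iw * ih
    return inter / (aw * ah + bw * bh - inter)

def merge_detections(boxes, min_overlap=0.7):
    """Group boxes that overlap the first box of a group; average each group."""
    groups = []
    for box in boxes:
        for group in groups:
            if iou(box, group[0]) >= min_overlap:
                group.append(box)
                break
        else:
            groups.append([box])
    return [tuple(sum(v) / len(g) for v in zip(*g)) for g in groups]

detections = [(10, 10, 20, 20), (11, 10, 20, 20), (100, 100, 20, 20)]
print(merge_detections(detections))
print(sum(1 for _ in scan_positions(10, 10, 4, 4)))
```

In a full detector, each subwindow produced by `scan_positions` would be cut out of the rectified image and passed to the classifier, and only the positively classified subwindows would be handed to `merge_detections`.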


We split the evaluation of our detection system into two experiments. In section “Detection quality” we examine the quality of the detection results by means of a calibrated classifier. Following the setup of Ali et al. (2007) we investigate the accuracy of our detections by determining the overlap of the ground truth data with our detection results. As the calibrated classifier makes a compromise such that the detection rate decreases to the benefit of a low false positive rate, we additionally evaluate the trained classifier in a second experiment (see section “Detection distribution”). For that, we steadily increase the classification threshold and observe the resulting distribution of the detections.

To provide comparability, we proceed analogously to the single window method of the evaluation framework proposed by Ali et al. (2007) for both experiments. According to this, a detection is only marked as a true positive if it lies inside the ground truth label’s rectangle or marginally exceeds its boundaries and covers the label’s area to at least a certain extent. In contrast, a detection is marked as a false positive if it covers less than 5% of the ground truth label. Since detections have to be as exact as possible to guarantee accurate risk analyses, we, unlike Ali et al. (2007), require an overlap of at least 75% between detection and ground truth for true positives.
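The overlap criterion can be made explicit with a short sketch. It measures which fraction of a ground truth rectangle is covered by a detection and applies the 75% and 5% limits stated above. The additional condition that a detection may exceed the label's boundaries only marginally is omitted here, and the label returned for coverages between the two limits is a placeholder of this example, since that case is not specified in the text.

```python
def coverage(det, gt):
    """Fraction of the ground truth rectangle covered by the detection.

    Both rectangles are given as (x, y, width, height).
    """
    dx, dy, dw, dh = det
    gx, gy, gw, gh = gt
    iw = min(dx + dw, gx + gw) - max(dx, gx)
    ih = min(dy + dh, gy + gh) - max(dy, gy)
    if iw <= 0 or ih <= 0:
        return 0.0
    return (iw * ih) / (gw * gh)

def label_detection(det, gt, tp_min=0.75, fp_max=0.05):
    """Label a detection with respect to a single ground truth rectangle."""
    c = coverage(det, gt)
    if c >= tp_min:
        return "true positive"
    if c < fp_max:
        return "false positive"
    return "undecided"  # placeholder label, not defined in the text

ground_truth = (0, 0, 10, 10)
print(label_detection((0, 0, 10, 8), ground_truth))
print(label_detection((9, 9, 10, 10), ground_truth))
```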


In previous experiments (Neuhausen et al. 2017), we found that the classification results improve if samples are constrained to be rectified. For training and calibration we use the CMP-base facade database (Radim Tyleček 2013) consisting of 387 rectified images of planar facades without substantial occlusion by vegetation or man-made objects. The images were taken in various countries, which is necessary as the windows’ appearance differs between countries and it is desirable to obtain a universal classifier that can be applied in tunneling projects around the world. For the training phase we randomly choose 5000 windows and initially provide as many negative samples generated from parts of the images containing no windows. Similarly, we randomly choose 2000 positive samples and as many non-windows for the calibration phase.

We evaluate our detection system on the Ecole Centrale Paris Facades Database (Teboul et al. 2010). It contains 487 rectified facade images of characteristic architectural styles of the six countries the images were taken in. To assess the performance and general applicability of our detection system, we choose images of each country. In many images a major part of the windows is considerably occluded by either objects in the foreground or window shutters. To ensure a meaningful evaluation, we choose a set of images with a minimum of occlusions for each country. For ground truth, windows were labeled manually in these images. We explicitly exclude shop windows from our evaluation since these are mostly occluded by cars, pedestrians or other objects. The particular numbers of images and windows per set are given in Table 3.

Table 3 Number of facade images per country we use to evaluate our detection system

Detection quality

As outlined in Table 4, the system proposed in this paper yields a detection rate of 85% on average over all countries while only 2% of the detections are falsely classified as windows. As can be seen from Table 2, it is sufficient to determine the opening-ratio with an accuracy of 10% to 20%. Based on the achieved rates, the real opening-ratio of a facade can thus be approximated well enough to serve for further risk analyses. Per country, the detection rate ranges from 82.3% to 90.7% with less than 5% false positives. Hence, an adequate estimation of the opening-ratio is ensured in all cases. This highlights the general applicability of our approach to facades in countries around the world.

Table 4 Detection results of the calibrated classifier on facades of diverse countries

Furthermore, as shown in Fig. 12 (a) and (c), our detection system is capable of dealing with partial occlusions caused by, e.g., balcony rails, vegetation or flag poles as long as enough prominent features are visible. However, it also responds to other facade elements which appear similar to windows. Especially for the French test set this led to a marked increase in false positives, as can be seen in Fig. 12 (b). Although it does not affect our evaluation results, it is worth mentioning that our system cannot reliably detect shop windows due to their large differences in size and the lack of distinctive image features, as they are mostly frameless (see Fig. 12 (d)).

Fig. 12

Exemplary detection results of our system. Green rectangles indicate true positive window detections whereas red rectangles indicate false detections. Windows are detected despite slightly occluding balcony rails (a) or partial occlusions due to flags or vegetation (c). Facade elements shaped similarly to windows may lead to misclassifications (b). No detection of store windows (b), (d)

Detection distribution

We showed that our detector yields good results on facade images of various countries. To further discuss the performance and general applicability of our detector, we investigate the distributions of detections and false positives with regard to the classification threshold of the trained classifier as shown in Fig. 13.

Fig. 13

Distributions of detection rates and false positive rates. Rates are plotted with regard to the classification threshold of the trained classifier. Solid lines indicate detection rates while dashed lines indicate false positive rates

The bell-like shapes of the distributions emerge from the subsequent merging of the positively classified regions. The smaller the threshold, the more regions are positively classified. Consequently, more regions overlap, which leaves considerably fewer regions after merging. Especially due to a higher scale and translation invariance at lower classification thresholds, nearby correctly classified regions often overlap heavily. Owing to the averaging strategy of the merging process, the merged region is then shifted to a position between them. This reduces the detection rate for small thresholds while simultaneously increasing the false positive rate. Reliable results can be expected only for thresholds of 0.54 and above.
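
The effect of the averaging strategy can be illustrated with a small sketch. The grouping rule below (transitive overlap of axis-aligned boxes) and the function names are assumptions chosen for illustration; the actual merging procedure of the detector may differ in detail.

```python
def overlaps(a, b):
    """True if two (x, y, w, h) boxes share some area."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    return ax < bx + bw and bx < ax + aw and ay < by + bh and by < ay + ah


def merge_regions(regions):
    """Group transitively overlapping boxes and average each group."""
    groups = []
    for box in regions:
        merged, rest = [box], []
        for group in groups:
            if any(overlaps(box, other) for other in group):
                merged.extend(group)  # box links this group to the new one
            else:
                rest.append(group)
        groups = rest + [merged]
    # Averaging x, y, w and h places the merged box between its members.
    return [tuple(sum(c) / len(g) for c in zip(*g)) for g in groups]
```

For two overlapping regions at x = 0 and x = 4 of equal size, the merged region ends up at x = 2. If the two regions belonged to two neighboring windows, the single merged box matches neither of them well, which is the mechanism described above.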

Except for a few outliers of the Greek image set at thresholds higher than 0.6, the distributions for the individual countries’ image sets closely resemble each other. The outliers can be disregarded as they are due to the small number of images within the Greek set. The similarity of the distributions indicates that the window concept was learned properly so that the classifier relies on general image features which recur in image patches of windows regardless of the country. The detection rates of the calibrated classifier (see Table 4) are close to the maximum detection rates of the trained classifier but often shifted towards a slightly higher threshold. This reduces the detection rate but results in a lower false detection rate.

Case study

In the context of our reference subway project Wehrhahnlinie (WHL) in Düsseldorf, Germany, we examine three representative structures. These are typical inner-urban masonry houses with different facades and individual opening-ratios. They are in mixed use, with shops and restaurants on the ground and first floors while offices or residential use predominate on the upper floors. For comparability of results, equivalent material parameters are used throughout.

Providing opening-ratios

We apply our window detection system to rectified images of the chosen structures’ facades to determine their opening-ratios. Since we presuppose that a facade completely fills the image, we infer the opening-ratio of a facade from the ratio of detection areas to the image area. Similar to the evaluation in section “Results”, for each facade we provide a distribution of opening-ratios over the classification threshold in addition to the response of our system with a calibrated classifier. As the calibrated classifier misses some windows but yields reasonably low false positive rates, its results can be interpreted as the minimum opening-ratio a facade definitely possesses. By means of the distribution we identify a range within which the actual opening-ratio lies. This facilitates the categorization of facades into their most probable damage class.
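
As a minimal sketch of this computation, the following function sums the areas of the detections and divides by the image area. The function name and the box format `(x, y, w, h)` in pixels are assumptions for illustration, and the sketch presumes that detections no longer overlap after merging.

```python
def opening_ratio(detections, image_width, image_height):
    """Ratio of summed detection areas to the image area.

    detections: iterable of (x, y, w, h) boxes in pixels, assumed to be
    non-overlapping; the facade is assumed to fill the whole image.
    """
    if image_width <= 0 or image_height <= 0:
        raise ValueError("image dimensions must be positive")
    window_area = sum(w * h for _, _, w, h in detections)
    return window_area / (image_width * image_height)
```

Two detections of 100 × 150 pixels in an image of 1000 × 600 pixels, for example, give an opening-ratio of 0.05.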

Figures 14, 15 and 16 show the detection results of our calibrated system on the chosen structures. It can be seen that a few windows are missed or detected too small, but no false positive detections occur, which supports our interpretation of the calibrated system’s results. We obtain opening-ratios of 0.135, 0.186 and 0.292 for the structures I, II and III, respectively. As already stated in section “Detection quality”, the figures show that our detector does not manage the detection of shop windows, resulting in lower detected opening-ratios compared to the actual ratios. This is satisfactory for a first risk assessment while planning a tunneling route, which is what we focus on in this paper, though it has to be taken into account for a more precise risk assessment.

Fig. 14

Facade of building I. The window detections by our system using the calibrated classifier are marked in green

Fig. 15

Facade of building II. We applied our calibrated system to the front (a), left (b), and right (c) facade of the building. The detection results are marked in green

Fig. 16

Facade of building III. The window detections by our calibrated system are marked in green

Figure 17 shows the distributions of the opening-ratios as a function of the classification threshold for each of the relevant structures. The solid lines illustrate the distributions of opening-ratios whereas the dashed lines indicate the ground truth. For the latter, we consider the opening-ratios derived from both actual construction plans and manually labeled facade images. As displaced stories may be occluded by lower parts of the facade due to the ground view perspective, the labeled ground truth may deviate from the actual opening-ratio. The opening-ratios derived from the calibrated system’s results are slightly lower than the distributions’ peaks, which may contain some false positive detections. The best approximation of the actual opening-ratio based on the detection system can thus be expected to lie between these two values. Therefore, we pass both results to the risk assessment analyses. The gap between the detection distributions and the respective ground truth again illustrates the system’s deficiency regarding shop windows.
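
How the two values bracket the opening-ratio can be sketched as follows. The function name and the sample distribution in the usage note are hypothetical; only the calibrated value of 0.135 for structure I is taken from the text. The mean of both values is returned as well because it is used later in the damage risk assessment.

```python
def opening_ratio_bounds(ratio_by_threshold, calibrated_ratio):
    """Return (lower, upper, mean) of the detected opening-ratio.

    ratio_by_threshold: dict mapping classification threshold to the
    opening-ratio detected at that threshold (the distribution).
    calibrated_ratio: opening-ratio obtained with the calibrated classifier.
    """
    peak = max(ratio_by_threshold.values())
    lower, upper = sorted((calibrated_ratio, peak))
    return lower, upper, (lower + upper) / 2
```

With a hypothetical distribution `{0.50: 0.10, 0.54: 0.15, 0.58: 0.16, 0.62: 0.12}` and the calibrated value 0.135, the bounds are 0.135 and 0.16, and their mean is 0.1475.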

Fig. 17

Distributions of the detected opening-ratio per building. Solid blue lines indicate the distributions of opening-ratios based on the detections of our system corresponding to its classification threshold. Their peaks are highlighted by a purple triangle. The opening-ratios inferred from the system with a calibrated classifier are marked with a green asterisk. Dashed lines indicate manually labeled (red) and actual (yellow) ground truth

Influence of detected opening layout

Since detection only delivers a scalar opening-ratio for a facade and lacks information on individual window sizes and positions, the impact of randomly generated opening layouts on the maximum strains is analyzed in numerical simulations. For this purpose, the opening-ratio is kept constant while the distribution and size of windows in the facade are varied. As a representative value, an opening-ratio of about 23% is assumed, which equals the global mean of the two values gained by detection for each of our three reference structures. For every structure, a lower and an upper value of the opening-ratio are detected: the upper one is the distribution’s maximum while the lower one corresponds to the calibrated system (cf. Fig. 17).

Geometry and material parameters of a masonry facade resting on a reinforced concrete foundation are presented in Fig. 18 (a). While the height of the foundation $H_F$ is 0.5 m, the wall’s height $H_W$ is 7.5 m. Both have a length of 10 m and a depth D of 0.3 m. Strains are computed linear-elastically considering a single prescribed displacement u(x) at the lower edge of the foundation, which is limited to $u_{\max} = 10$ mm. In sagging, the expected deformation of the facade is affine to that of a simply supported beam subjected to a distributed load, while in hogging it can be idealized as a clamped beam. For both cases, Fig. 18 (b, c) displays the principal strains in a homogeneous deep beam.
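
For the sagging case, one possible reading of this idealization is to take the deflection line of a simply supported beam under a uniformly distributed load and scale it such that its midspan value equals the displacement limit. The scaling to the limit and the symbol $L$ for the facade length of 10 m are assumptions made here for illustration:

```latex
u(x) = u_{\max}\,\frac{16}{5}\left(\xi - 2\xi^{3} + \xi^{4}\right),
\qquad \xi = \frac{x}{L}, \quad 0 \le x \le L .
```

The expression follows from the classical deflection line $w(x) = \frac{q\,x}{24\,EI}\left(L^{3} - 2Lx^{2} + x^{3}\right)$ divided by its midspan value $\frac{5\,q\,L^{4}}{384\,EI}$. At $\xi = 1/2$ the bracket equals $5/16$, so that $u(L/2) = u_{\max}$, while $u(0) = u(L) = 0$.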

Fig. 18

Assessment of the distribution of forces in deep beams. Structural system with corresponding boundary condition u(x) (a); idealized displacements for the sagging (b) and hogging case (c) and related principal stresses in deep beams

Separately for hogging and sagging, Fig. 19 contrasts the number of openings with the maximum tensile stresses. Black solid lines serve as reference. They are obtained by starting from a single square window in the facade, sized according to the opening-ratio, and dividing it stepwise into four smaller squares each. That way, 64 equally sized openings are obtained in step four. Consequently, the facade has a constant opening-ratio and a grid of square windows in all steps. While for sagging the general tendency shows lower strains with a rising number of openings, the opposite holds for hogging. This is plausible in view of the different stress distributions in both cases. While an opening at the center, or its subdivision into smaller regular parts, does not disturb the strain trajectories much in sagging (cf. Fig. 18 (b)), the situation is entirely different in the case of hogging. Comparatively large tensile stresses are then located at the edges of the deep beam, where little material remains in the case of numerous openings.
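
The arithmetic behind the reference configurations can be reproduced with a short sketch. The function name is hypothetical, and relating the opening-ratio to the wall area of 10 m × 7.5 m (excluding the foundation) is an assumption of this illustration.

```python
import math


def regular_grid_steps(facade_length, facade_height, opening_ratio, steps=4):
    """List (step, number of windows, window side length) per subdivision step.

    Each step quarters every square window of the previous step, so the
    total opening area and hence the opening-ratio stay constant.
    """
    total_opening = opening_ratio * facade_length * facade_height
    rows = []
    for step in range(1, steps + 1):
        count = 4 ** (step - 1)                    # 1, 4, 16, 64, ...
        side = math.sqrt(total_opening / count)    # side of one square window
        rows.append((step, count, side))
    return rows
```

For an opening-ratio of 23%, the total opening area is 17.25 m². The side length of the windows halves with each step, from roughly 4.15 m for the single window in step one to roughly 0.52 m for the 64 windows in step four.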

Fig. 19

Analysis of the influence of varying openings. Facades with different opening layouts (a); relation between number of openings and maximum strains in sagging and hogging cases (b)

Admittedly, these results give insight into the mechanics of deep beams with a variable number of regular openings but still lack practical relevance. Thus, several more realistic configurations have been simulated in a second step. These configurations are characterized by irregular grids of windows and similar or equivalent opening-ratios, and they possess two or three floors as is typical for houses of 7.5 m height. Here, the openings are distributed randomly while preserving minimum vertical and horizontal distances of 0.30 m to their neighbors. The results, indicated by x in Fig. 19 (b), deviate slightly from the general tendency lines and scatter due to the random properties. They can be interpreted as follows: if only the opening-ratio is available but information on the exact sizes and locations is lacking, the distribution of openings may be idealized to obtain the maximum numerical strains with a deviation of about ±25%, calculated for an opening-ratio of 23%.
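
One simple way to generate such layouts is rejection sampling, sketched below. This is not the generator used for the simulations: the equally sized square windows, the free placement without floor levels, the distance of 0.30 m kept to the facade edges, and all names and default parameters are assumptions made for illustration.

```python
import math
import random


def gap_ok(a, b, min_gap):
    """True if boxes (x, y, w, h) are min_gap apart horizontally or vertically."""
    gap_x = max(a[0] - (b[0] + b[2]), b[0] - (a[0] + a[2]))
    gap_y = max(a[1] - (b[1] + b[3]), b[1] - (a[1] + a[3]))
    return gap_x >= min_gap or gap_y >= min_gap


def random_layout(length, height, opening_ratio, count, min_gap=0.30,
                  seed=0, restarts=50, tries_per_restart=2000):
    """Place `count` equal square windows at random with a fixed opening-ratio."""
    side = math.sqrt(opening_ratio * length * height / count)
    if side > min(length, height) - 2 * min_gap:
        raise ValueError("windows do not fit into the facade")
    rng = random.Random(seed)
    for _ in range(restarts):
        boxes = []
        for _ in range(tries_per_restart):
            # Draw a lower-left corner that keeps min_gap to the facade edges.
            x = rng.uniform(min_gap, length - min_gap - side)
            y = rng.uniform(min_gap, height - min_gap - side)
            candidate = (x, y, side, side)
            if all(gap_ok(candidate, other, min_gap) for other in boxes):
                boxes.append(candidate)
                if len(boxes) == count:
                    return boxes
        # This attempt got stuck; start again with an empty facade.
    raise RuntimeError("no valid layout found; reduce count or opening_ratio")
```

Calling `random_layout(10.0, 7.5, 0.23, 8)` yields eight windows whose summed area corresponds to an opening-ratio of 23% for a wall of 10 m × 7.5 m. Different seeds give different layouts with the same opening-ratio, which is the kind of variation examined above.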

Damage risk assessment

Subsequently, the impact of the opening-ratio (OR) and exact window positions on the expected structural damages of buildings subjected to tunneling induced settlements is analyzed. For this purpose, the maximum strains of our three reference structures (I-III) are determined analytically (LTSM) and contrasted with numerical simulation results. Table 5 summarizes geometrical and material parameters as well as the individual eccentricities of the structures to the tunnel axis and lists the detected and exact opening-ratios. Throughout the analysis, the detected opening-ratio is the mean of the distribution’s maximum and the calibrated system’s value for each structure in Fig. 17. The finite element analysis employs non-linear material behavior of concrete according to Schindler (2014) while the settlements are pre-calculated according to Peck (1969) (cf. Fig. 3, bottom). To cover the unknown pattern of openings in the buildings’ facades, a maximum deviation of 25% is applied as well, but it does not change the results much (cf. whiskers in Fig. 20). The results are contrasted in Fig. 20 and are similar for all reference structures:

Fig. 20

Results of the two different methods for damage estimation. The analyses were carried out with detected and exact opening-ratios for three reference structures

Table 5 Geometry and parameters of the structures
  • Evidently, the strains obtained from numerical simulation are much smaller than the analytical ones, and thus lower damage categories are predicted, too. Generally, this can be traced back to the more realistic representation of material properties and the soil-structure interaction considered in the numerical simulation.

  • Analytical results employing detected or exact opening-ratios with the LTSM are close and differ by one damage category at most. The same holds true for the numerical results when the scatter due to unknown window locations is consistently included.

  • Due to the correlation of opening-ratio and strains, the generally smaller opening-ratios from detection always deliver smaller strains and hence lower assigned damage categories. When applying the LTSM, a small opening-ratio reduces the stiffness only slightly (cf. Table 2), just as it changes the distribution of strains only slightly in the numerical simulation.
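
How a strain deviation of ±25% translates into damage categories can be sketched as follows. The strain limits below are the commonly cited limiting tensile strain bounds after Boscardin and Cording (1989); whether exactly these bounds underlie Fig. 20 is an assumption of this sketch, as are the function names. Values lying exactly on a limit are assigned to the higher category.

```python
import bisect

# Assumed upper limits of the tensile strain in percent for categories 0 to 3;
# strains above the last limit fall into the highest class.
LIMITS = [0.05, 0.075, 0.15, 0.3]
NAMES = ["negligible", "very slight", "slight", "moderate",
         "severe to very severe"]


def damage_category(strain_percent):
    """Index into NAMES for a maximum tensile strain given in percent."""
    return bisect.bisect_right(LIMITS, strain_percent)


def category_range(strain_percent, deviation=0.25):
    """Lowest and highest category when the strain scatters by +/- deviation."""
    low = damage_category(strain_percent * (1 - deviation))
    high = damage_category(strain_percent * (1 + deviation))
    return low, high
```

A strain of 0.13%, for instance, falls into the category "slight", and with a scatter of ±25% it ranges from "slight" to "moderate", i.e. it differs by one category at most.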


Risk assessment of tunneling induced damages to existing structures is essential for planning an optimal tunnel route in urban areas. In this way, damages and the resulting costs of subsequent maintenance are minimized. Besides the dimensions of a structure, the opening-ratio plays a major role for a precise risk assessment as it directly affects a structure’s stiffness. Deriving it manually from construction plans or by inspections conducted by surveyors is highly cost- and time-consuming and not even feasible for the vast number of structures along potential tunneling routes. Virtual building models which could automate this process are already publicly available but usually lack information about the relevant openings. As windows account for the major share of openings in facades, in this paper we proposed a system to detect windows in facade images. We trained and calibrated a soft cascaded classifier using rectified facade images gathered from different countries to avoid constraining our system to a specific one. A naïve sliding window detector which passes image patches to the classifier and merges overlapping detections is used to scan facade images. We showed that our system achieves detection rates of over 82% in various countries while only exhibiting a false positive rate of 2% on average. For risk assessment, these rates are satisfactory to reliably estimate the damage class of a building. Another experiment reveals that the detection rate can be slightly increased at the expense of a higher false positive rate. In our case study we exploit this to define a lower and an upper bound for the opening-ratio of a given facade.

With respect to damage assessment, the LTSM delivers more conservative results than numerical simulation by finite elements. The two approaches may differ by up to three damage categories. Our detected opening-ratios are well-suited for a quick pre-assessment of settlement induced damages since they deviate from results employing true opening-ratios by one category at most. Exact positions of openings in the facade are of minor interest. It is sufficient to idealize the openings in regular grids over the facade, respecting distances to neighbors of about 0.30 m, while the number of floors is estimated from the building’s total height. Automated detection of opening-ratios then delivers adequately precise results for damage assessment. Much effort might be saved by neglecting exact positions and sizes of individual openings.


As the case study demonstrates, our approach yields promising results and is already applicable to aid the risk assessment of tunneling projects in terms of a pre-assessment. This excludes irrelevant structures from further time-consuming analyses. The provided opening-ratios, however, do not satisfy the demands of a precise assessment. To obtain more accurate opening-ratios, the system’s detection rate and accuracy have to be increased. A postprocessing step based on already detected windows would be desirable to enhance the detection results. The dimensions of present detections could be refined towards actual window edges in the image to improve accuracy. To increase the detection rate, windows with less image evidence and, hence, lower feature responses have to be taken into account. Decreasing the detection threshold of the classifier would allow it to detect such windows but would also dramatically increase the number of false positives if applied to the entire image. Such a classifier should preferably only be applied to image regions which most likely contain windows in order to keep the false positive rate low. Positions of further potential windows could be derived from the alignment of present detections. A less strictly calibrated classifier could then be applied solely to these positions, revealing further windows. Moreover, as mentioned before, our system is not capable of detecting store windows. Further research has to be done to identify image features which reliably characterize such windows.


  • Attewell, PB, Yeats, J, Selby, AR. (1986). Soil Movements Induced by Tunneling and Their Effects on Pipelines and Structures. Glasgow: Blackie.

  • Ali, H, Seifert, C, Jindal, N, Paletta, L, Paar, G (2007). Window Detection in Facades. In Proceedings of the 14th Conference on Image Analysis and Processing (pp. 837–842). IEEE Computer Society.

  • Baltsavias, EP (2004). Object extraction and revision by image analysis using existing geodata and knowledge: current status and steps towards operational systems. ISPRS Journal of Photogrammetry and Remote Sensing, 58, 129–151.

  • Barber, D, Mills, J, Smith-Voysey, S (2008). Geometric validation of a ground-based mobile laser scanning system. ISPRS Journal of Photogrammetry and Remote Sensing, 61(1), 128–141.

  • Boscardin, MD, & Cording, EJ (1989). Building Response to Excavation Induced Settlement. Journal of Geotechnical Engineering, 115, 1–21.

  • Bourdev, L, & Brandt, J (2005). Robust Object Detection Via Soft Cascade. In Proceedings of the Conference on Computer Vision and Pattern Recognition. IEEE Computer Society, (pp. 236–243).

  • Burland, JB, & Wroth, CP (1974). Settlement of buildings and associated damage. In Proceedings of the Conference on Settlement of Structures. Pentech Press Ltd., London, Cambridge, (pp. 611–654).

  • Brenner, C (2005). Building reconstruction from images and laser scanning. International Journal of Applied Earth Observation and Geoinformation, 6(3-4), 187–198.

  • Haala, N, Peter, M, Kremer, J, Hunter, G (2008). Mobile LiDAR mapping for 3D point cloud collection in urban areas – a performance test. The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, 37, 1119–1124.

  • Haala, N, & Kada, M (2010). An update on automatic 3D building reconstruction. ISPRS Journal of Photogrammetry and Remote Sensing, 65, 570–580.

  • Haugeard, J, Philipp-Foliguet, S, Precioso, F, Lebrun, J (2009). Extraction of Windows in facade using Kernel on Graph of Contours. In Proceedings of the 16th Scandinavian Conference on Image Analysis. Springer, Berlin, (pp. 646–656).

  • Kämper, C, Putke, T, Zhao, C, Lavasan, AA, Barciaga, T, Mark, P, Schanz, T (2016). Vergleichsrechnungen zu Modellierungsvarianten für Tunnel mit Tübbingauskleidung. Bautechnik, 93(7), 421–432.

  • Lee, SC, & Nevatia, R (2004). Extraction and Integration of Window in a 3D Building Model from Ground View images. In Proceedings of the Conference on Computer Vision and Pattern Recognition. IEEE Computer Society, (pp. 106–113).

  • Lienhart, R, Kuranov, A, Pisarevsky, V (2003). Empirical Analysis of Detection Cascades of Boosted Classifiers for Rapid Object Detection. In: Michaelis, B, & Krell, G (Eds.) Pattern Recognition. DAGM 2003. Lecture Notes in Computer Science, vol. 2781, (pp. 297–304). Berlin, Heidelberg: Springer.

  • Mark, P, & Schütgen, B (2001). Grenzen elastischen Materialverhaltens von Beton. Beton- und Stahlbetonbau, 96(5), 373–378.

  • Mark, P, Niemeier, W, Schindler, S, Blome, A, Heek, P, Krivenko, A, Ziem, E (2012). Radarinterferometrie zum Setzungsmonitoring beim Tunnelbau: Anwendung am Beispiel der Wehrhahn-Linie in Düsseldorf. Bautechnik, 89(11), 764–776.

  • Musialski, P, Wonka, P, Aliaga, DG, Wimmer, M, van Gool, L, Purgathofer, W (2013). A survey of urban reconstruction. Computer Graphics Forum, 32(6), 146–177.

  • Meixner, P, Leberl, F, Brédif, M (2011). Interpretation of 2D and 3D Building Details on Facades and Roofs. In Proceedings of the Conference on Photogrammetric Image Analysis. ISPRS, (pp. 137–142).

  • Michael, M, & Schlipsing, M (2015). Extending Traffic Light Recognition: Efficient Classification of Phase and Pictogram. In Proceedings of the International Joint Conference on Neural Networks. IEEE.

  • Neugebauer, P, Schindler, S, Pähler, I, Blome, A, Mark, P (2015). Präventives Schädigungsmanagement im Tunnelbau – Schutz der oberirdischen Bebauung. In Taschenbuch Für Den Tunnelbau 2015 (pp. 318–361). Berlin: Ernst und Sohn.

  • Neuhausen, M, Koch, C, König, M (2016). Image-Based Window Detection - An Overview. In Proceedings of Workshop of the European Group for Intelligent Computing in Engineering.

  • Neuhausen, M, Martin, A, Obel, M, Mark, P, König, M (2017). A Cascaded Classifier Approach to Window Detection in Facade Images. In Proceedings of the International Symposium on Automation and Robotics in Construction, Taipei, Taiwan. IAARC.

  • Obel, M, Ahrens, MA, Mark, P (2017). Settlement risk assessment for existent structures during mechanized tunneling based on uncertain data. In Proceeding of the 4th International Conference on Computational Methods in Tunneling and Subsurface Engineering. Studia, Innsbruck.

  • Peck, RB (1969). Deep excavation and tunneling in soft ground. In Proceedings of the 7th International Conference on Soil Mechanics and Foundation Engineering. Sociedad Mexicana de Mecanica, Mexico City, (pp. 225–260).

  • Pu, S, & Vosselman, G (2009). Knowledge based reconstruction of building models from terrestrial laser scanning data. ISPRS Journal of Photogrammetry and Remote Sensing, 64(6), 575–584.

  • Radim Tyleček, RŠ (2013). Spatial Pattern Templates for Recognition of Objects with Regular Structure. In Proceedings of the German Conference on Pattern Recognition. Springer, Berlin.

  • Ripperda, N (2008). Grammar based facade reconstruction using rjMCMC. Photogrammetrie Fernerkundung Geoinformation, 2, 83–92.

  • Ripperda, N, & Brenner, C (2006). Reconstruction of Facade Structures Using a Formal Grammar and RjMCMC. In: Franke, K, Müller, K-R, Nickolay, B, Schäfer, R (Eds.) In Pattern Recognition: Proceedings of the 28th DAGM Symposium. Springer, Berlin, Germany, (pp. 750–759).

  • Ripperda, N, & Brenner, C (2009). Application of a Formal Grammar to Facade Reconstruction in Semiautomatic and Automatic Environments. In Proceedings of the AGILE Conference on Geographic Information Science.

  • Riemenschneider, H, Krispel, U, Thaller, W, Donoser, M, Havemann, S, Fellner, D, Bischof, H (2012). Irregular lattices for complex shape grammar facade parsing. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition. IEEE, (pp. 1640–1647).

  • Schindler, S (2014). Monitoringbasierte strukturmechanische Schadensanalyse von Bauwerken beim Tunnelbau. Dissertation, Ruhr-University Bochum.

  • Schindler, S, Hegemann, F, Koch, C, König, M, Mark, P (2016). Radar interferometry based settlement monitoring in tunnelling: Visualisation and accuracy analyses. Visualization in Engineering, 4(7), 1–16.

  • Teboul, O, Simon, L, Koutsourakis, P, Paragios, N (2010). Segmentation of Building Facades Using Procedural Shape Priors. In Proceedings of the Conference on Computer Vision and Pattern Recognition. IEEE Computer Society, (pp. 3105–3112).

  • Viola, P, & Jones, MJ (2004). Robust real-time face detection. International Journal of Computer Vision, 57(2), 137–154.

  • Wendel, A, Donoser, M, Bischof, H (2010). Unsupervised facade segmentation using repetitive patterns. In Proceedings of the Joint Pattern Recognition Symposium. Springer, Berlin, (pp. 51–60).


Financial support was provided by the German Research Foundation (DFG) in the framework of the subproject D3 of the Collaborative Research Center SFB 837 “Interaction Modeling in Mechanized Tunneling”.


The authors declare that they have not received any external funding.

Availability of data and material

All data sets used in the context of this paper were preexisting. Those are available from the websites of their creators. For CMP-base façade database this is, for Ecole Central Paris facades Database this is

Author information

All authors contributed extensively to the work presented in this paper. Neuhausen reviewed the computer vision related literature and developed the methodology of the detection system. Obel did the literature review and methodology for risk assessment. Both elaborated the case study and concluded their findings. Mark and König supervised the entire process of this study. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Marcel Neuhausen.

Ethics declarations

Competing interests

The authors declare that they have no competing interests.

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

About this article

Cite this article

Neuhausen, M., Obel, M., Martin, A. et al. Window detection in facade images for risk assessment in tunneling. Vis. in Eng. 6, 1 (2018).
