Bridge vibration measurements using different camera placements and techniques of computer vision and deep learning

Abstract

In this paper, a new framework is proposed for monitoring the dynamic performance of bridges using three different camera placements and several visual data processing techniques at low cost and high efficiency. A deep learning method, validated by an optical flow approach for motion tracking, is included in the framework. To verify the framework, videos of two shaking table tests taken by stationary cameras were processed first. Then, the vibrations of six pedestrian bridges were measured using structure-mounted, remote, and drone-mounted cameras, respectively. Two techniques, displacement and frequency subtraction, were applied to remove systematic motions of the cameras and to capture the natural frequencies of the tested structures. Measurements on these bridges were compared with data from wireless accelerometers and structural analysis. The influences of critical parameters for camera settings and data processing, such as video frame rates, data window size, and data sampling rates, were also studied carefully. The results show that the vibrations and frequencies of structures on the shaking tables and of existing bridges can be captured accurately with the proposed framework, and that these camera placements and data processing techniques can be successfully used for monitoring their dynamic performance.

Introduction

Most bridges and buildings are large in size and volume, so measuring their response to external excitations and assessing their performance at full scale for Structural Health Monitoring (SHM) missions is difficult and challenging. In practice, SHM can be implemented at a local scale, with a limited number of sensors placed only at critical locations of important infrastructure. Vibration- and vision-based sensors are commonly used for deformation and vibration measurements of existing structures. The former record the acceleration or dynamic response of the structures; inherent dynamic characteristics such as structural frequencies and vibration modes can then be inferred from the measurements and used to assess the performance of the structures. Vision-based systems in civil engineering have been used not only for detecting visual damage (Bai et al. 2023), but also for measuring deformations and vibrations (Dong and Catbas 2021). In recent years, the cost of cameras and Unmanned Aerial Vehicles (UAVs) has decreased dramatically, making them an attractive option for SHM.

Computer vision has been used to monitor accurately the movement of objects, motion that the human vision system can notice but cannot quantify. The objective of this research is to apply vision-based technologies to quantitatively capture and analyze the motion of structures. We were motivated by studies using computer vision methods to measure the movement of structures (Feng et al. 2015; Chen et al. 2017). Also, some practices (Bai et al. 2021b, 2023) provide insights into how current knowledge of computer vision and deep learning can be leveraged to monitor and measure structural response to excitations in laboratory and field experiments.

In this research, three practical camera placements, illustrated in Figs. 4, 5, and 6, are utilized to capture the motions of structures: 1) stationary cameras placed remotely and focused on a bridge to track one or multiple targets simultaneously, 2) structure-mounted cameras fixed on a bridge, like contact sensors (e.g., accelerometers, see Section 2), to measure the bridge’s vibrations, and 3) UAVs deployed to record the motions of both the bridge and the drone, with nearby stationary objects used as references. The first camera placement was tested in laboratory experiments and achieved subpixel accuracy (Bai et al. 2021a). A more comprehensive study is implemented here using data from field experiments and other laboratory tests. A stationary camera can itself serve as the reference for the moving bridge. Alternatively, nearby buildings or other motionless surroundings can be treated as a stationary reference to eliminate the movement of cameras caused by ambient motions, including wind, ground motion induced by surrounding traffic, and the UAV’s movement during flight. We also believe that combined experience and expertise in structural engineering and computer vision help achieve good performance with the proposed framework when appropriate targets and references are selected. For example, in the bridge vibration experiments, the mid-spans, where the most significant motion occurs, and the joint regions, which contain richer textures, are good candidates for tracking targets. The cameras used in the field experiments cost 60 to 500 US dollars each and were operated at low speeds (i.e., 30 to 84 frames/second). Markers were not needed for the tests.
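The idea of using a stationary reference to remove camera motion can be sketched in a few lines of Python. The trajectories, drift model, and scale factor below are illustrative assumptions, not the paper's actual data: the apparent displacement of a motionless reference object is subtracted from the tracked target before converting pixels to engineering units.

```python
import numpy as np

def subtract_camera_motion(target_px, reference_px, mm_per_px):
    """Remove systematic camera motion by subtracting the apparent
    displacement of a stationary reference from the tracked target,
    then convert pixels to millimetres with a known scale factor."""
    target_px = np.asarray(target_px, dtype=float)
    reference_px = np.asarray(reference_px, dtype=float)
    # Zero each trajectory at its first frame, then subtract.
    relative = (target_px - target_px[0]) - (reference_px - reference_px[0])
    return relative * mm_per_px

# Synthetic example: bridge vibrates at 2 Hz while the camera drifts slowly.
t = np.linspace(0.0, 4.0, 241)
bridge = 3.0 * np.sin(2 * np.pi * 2.0 * t)      # true bridge motion, pixels
drift = 0.5 * t                                 # camera drift, pixels
target = bridge + drift                         # what the camera sees
reference = drift                               # a stationary object appears to drift
recovered = subtract_camera_motion(target, reference, mm_per_px=1.2)
print(np.max(np.abs(recovered - 1.2 * bridge)))  # ~0: drift removed
```

The same subtraction applies unchanged whether the spurious motion comes from wind, traffic-induced ground motion, or UAV hover drift, as long as the reference is genuinely stationary.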

We have three main objectives for using camera-based technologies easily and effectively to monitor and assess the dynamic performance of in-service bridges: 1) integrating three possible camera placements, mounted on different platforms, to measure the vibrations of bridges using a deep learning framework, 2) validating the proposed framework for processing visual data with two shaking table tests and field experiments on pedestrian bridges, and 3) applying both displacement and frequency subtraction to remove camera motion from these measurements. These camera placements and techniques are addressed in Section 3, and their limitations are discussed in Sections 4, 5, and 6. Because pedestrian bridges are easily excited and the experiments can be conveniently repeated in the field, six of them were selected as demonstrations for the experimental studies in this paper. With the same framework, tests on traffic and railway bridges and on a building in a progressive collapse study were also conducted and showed promising results for deflection and vibration measurements (Bai 2022). In all these field experiments, accelerations measured by the wireless accelerometers and displacements measured by the various cameras were not compared directly. Rather, the structural frequencies obtained from the measured accelerations and displacements (e.g., Figs. 15, 16, and 17 and Table 1) were used to validate the method proposed in this paper.
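Frequency subtraction, the third objective above, can be sketched as a comparison of spectra: peaks that appear in the camera's signal but not in the reference (camera-motion) signal are attributed to the structure. The signals and frequencies below are illustrative assumptions on synthetic data:

```python
import numpy as np

def dominant_freqs(signal, fs, n_peaks=2):
    """Return the strongest frequency components (Hz) of a signal,
    from the one-sided FFT magnitude spectrum."""
    spec = np.abs(np.fft.rfft(signal - np.mean(signal)))
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fs)
    top = np.argsort(spec)[-n_peaks:]
    return {float(round(f, 1)) for f in freqs[top]}

fs = 100.0
t = np.arange(0, 20, 1 / fs)
# The drone-mounted camera sees bridge vibration plus its own sway.
camera_view = np.sin(2 * np.pi * 2.3 * t) + 0.8 * np.sin(2 * np.pi * 0.4 * t)
reference = 0.8 * np.sin(2 * np.pi * 0.4 * t)   # sway only, from a stationary object
bridge_freqs = dominant_freqs(camera_view, fs) - dominant_freqs(reference, fs, n_peaks=1)
print(bridge_freqs)  # {2.3}: only the bridge frequency remains
```

Unlike displacement subtraction, this works in the frequency domain, so it does not require the target and reference trajectories to be sampled at identical pixel scales.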

Related work

To put our work in context, this section reviews research and applications on displacement and vibration measurement with computer vision and deep learning methods, which inspired the new framework in this paper. Studies of UAVs and wireless accelerometers are also reviewed, since both were used in our research.

Conventional computer vision techniques for measuring displacement or vibration can achieve high accuracy in practice. Feng et al. (2015) proposed an upsampled cross-correlation template matching algorithm to measure displacement and vibration in a shaking table test and performed two field tests, on a railway bridge and a pedestrian bridge. Chen et al. (2021) proposed a Digital Image Correlation method to track the displacement or vibration of individual points on a model bridge. Dong et al. (2019) utilized optical flow estimation to track non-target objects on grandstand structures and implemented modal identification. Other researchers employed various template-matching algorithms to track and measure the displacement or vibration of buildings (Lee et al. 2017; Yin et al. 2014; Liu et al. 2016; Rajaram et al. 2017; Chen et al. 2017). Guo and Zhu (2016) applied the Lucas-Kanade template tracking algorithm to displacement measurement in an experiment; this algorithm is used as a second method in our research. These studies addressed how to obtain subpixel accuracy for the measurements. Accuracies of 0.016 to 0.25 mm and 0.64 to 3.5 mm were reported for camera-based displacement measurements in the laboratory and field experiments, respectively.
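In its simplest single-window form, the Lucas-Kanade tracking mentioned above reduces to a 2x2 least-squares solve over image gradients. The following self-contained sketch recovers a known sub-pixel shift of a synthetic Gaussian blob; the image, shift, and window are illustrative assumptions, and a practical tracker would add pyramids and iterative refinement:

```python
import numpy as np

def lucas_kanade_shift(frame1, frame2):
    """Estimate a single sub-pixel translation between two frames by
    solving the Lucas-Kanade least-squares system over the whole window."""
    Iy, Ix = np.gradient(frame1)            # spatial gradients (rows=y, cols=x)
    It = frame2 - frame1                    # temporal gradient
    A = np.array([[np.sum(Ix * Ix), np.sum(Ix * Iy)],
                  [np.sum(Ix * Iy), np.sum(Iy * Iy)]])
    b = -np.array([np.sum(Ix * It), np.sum(Iy * It)])
    return np.linalg.solve(A, b)            # (dx, dy) in pixels

# Synthetic Gaussian blob shifted by a known sub-pixel amount.
y, x = np.mgrid[0:64, 0:64].astype(float)
blob = lambda cx, cy: np.exp(-((x - cx) ** 2 + (y - cy) ** 2) / (2 * 8.0 ** 2))
dx, dy = lucas_kanade_shift(blob(32.0, 32.0), blob(32.3, 31.8))
print(dx, dy)  # close to (0.3, -0.2)
```

The sub-pixel result comes directly from the least-squares fit over gradients, which is why such trackers can report accuracies well below one pixel.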

Image-based deep learning methods have achieved better performance for vibration measurements because they can extract more useful features, as the following studies show. Dong et al. (2020) implemented FlowNet2 for displacement and vibration measurements of various structures; their approach to eliminating camera movement during field tests was instructive to us in using displacement subtraction to recover true vibrations. Xiao et al. (2020) investigated an SHM system using deep learning algorithms to evaluate structural responses from visual data fused with data from conventional sensors. Dong and Catbas (2019) applied the Visual Geometry Group (VGG) network to extract features on the target and performed a field test on a two-span bridge. These papers reported laboratory accuracies from 0.0087 to 0.08 mm. Bai et al. (2021a) proposed a High-resolution Mask Regional Convolutional Neural Network (HR Mask R-CNN) to track and accurately measure the deflection of three concrete beams and the vibrations of three masses on a shaking table in laboratory tests. The deep learning method was trained following standard data annotation, loss regulation, and parameter settings. Moreover, a measurement-smoothing technique based on the Scale-Invariant Feature Transform (SIFT) was introduced for high-accuracy measurements. The average error of deflection measurements from HR Mask R-CNN + SIFT on the three test beams was 0.13 mm, and all intended frequencies were identified, with differences between the extracted and input frequencies of less than 9%. This paper builds on our previous study but is more comprehensive.

Recently, UAVs have been widely deployed for vibration measurements because of their high mobility and efficiency. These studies (Yoon et al. 2018; Chen et al. 2021; Hoskere et al. 2019; Ribeiro et al. 2021; Perry and Guo 2021; Khuc et al. 2020) provided good examples for researchers to follow, and their methods for processing visual data from drones are helpful to this research. For example, Khuc et al. (2020) utilized UAVs to measure the swaying displacement of small-scale structures: between two consecutive frames in a video, the keypoints on a target were located and matched so that their average movement represents the displacement. This method of eliminating camera motion is similar to ours; in our study, however, all frames in a video are aligned to the first frame by an affine transformation (Szeliski 2010).
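Aligning frames via an affine transformation amounts to estimating a 2x3 matrix from matched keypoints and warping each frame back to the first. A minimal NumPy sketch of the estimation step follows; the keypoints and camera motion are synthetic assumptions (in practice a library routine such as OpenCV's affine estimator with outlier rejection would be used on real matches):

```python
import numpy as np

def estimate_affine(src, dst):
    """Least-squares fit of a 2x3 affine transform mapping src -> dst,
    from matched keypoint coordinates (N x 2 arrays, N >= 3)."""
    n = len(src)
    X = np.hstack([src, np.ones((n, 1))])   # homogeneous coordinates [x, y, 1]
    M, *_ = np.linalg.lstsq(X, dst, rcond=None)  # solve X @ M = dst
    return M.T                               # 2x3 affine matrix

# Keypoints in the first frame, and the same points seen by a drifted camera
# (a small rotation plus translation, as from UAV hover motion).
rng = np.random.default_rng(0)
pts = rng.uniform(0, 640, size=(20, 2))
theta = np.deg2rad(1.5)
M_true = np.array([[np.cos(theta), -np.sin(theta), 4.0],
                   [np.sin(theta),  np.cos(theta), -2.5]])
moved = pts @ M_true[:, :2].T + M_true[:, 2]
M_est = estimate_affine(pts, moved)
print(np.max(np.abs(M_est - M_true)))  # ~0: camera motion recovered
```

Inverting the estimated transform and applying it to every frame cancels the camera's rigid motion, leaving only the structure's movement relative to the first frame.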

Wireless accelerometers have been used as contact sensors to detect and capture the dynamic response of various bridges (Gheitasi et al. 2016; Gibbs et al. 2019; Baisthakur and Chakraborty 2020; White et al. 2020), and their data can validate the vibration measurements from cameras. In our study, several G-Link-200-8G wireless accelerometers (LORD 2022) were employed as ground-truth sensors. These accelerometers provide high-sensitivity measurement along three axes with wide bandwidth, low noise, a long wireless range, and high sampling rates. Programmable high- and low-pass digital filters are provided by the built-in firmware. The sensor works continuously until its batteries are depleted. In addition, multiple accelerometers can be used simultaneously with lossless data collection, scalable network sizes, and node synchronization. Two or three of them were placed on the testing bridges as a second vibration data source in our field experiments.

Methodologies

In this section, how a camera performs displacement or vibration measurements is introduced first; then three different camera placements, filters to suppress noise, and two techniques to remove camera motion are discussed. The experiments conducted on the pedestrian bridges involve damped free vibration, in which a bridge is subjected to an initial excitation (e.g., a jump on the bridge’s deck) and can vibrate at one or more frequencies. This oscillation diminishes from its peak to standstill because of damping (Chopra 2019). In contrast, simulated structures on shaking tables are forced to vibrate by the tables and reach their resonant states when their frequencies equal the input frequencies from the tables.
