Bibliographic record (MARC21):

Leader: nam Ka
001 (control number): 001917494
007: cr mnu|||uuuuu
008: 071121s2007 flu sbm 000 0 eng d
024 (ind1 8), $a: E14-SFE0002045
Author: Randeniya, Duminda I. B.
Title: Automatic geo-referencing by integrating camera vision and inertial measurements [electronic resource] / by Duminda I. B. Randeniya.
Published: [Tampa, Fla.] : University of South Florida,
ABSTRACT: An alternative sensor system to the inertial measurement unit (IMU) is essential for intelligent land navigation systems when the vehicle travels in a GPS-deprived environment. The sensor system used to update the IMU for a reliable navigation solution must be a passive system that does not depend on any outside signal. This dissertation presents the results of an effort in which position and orientation data from vision and inertial sensors are integrated. Information from a sequence of images captured by a monocular camera attached to a survey vehicle, at a maximum frequency of 3 frames per second, was used in updating the inertial system installed in the same vehicle to correct its inherent error accumulation. Specifically, the rotations and translations estimated from point correspondences tracked through a sequence of images were used in the integration. However, for such an effort, two types of tasks need to be performed. The first task is calibration: estimating the intrinsic properties of the vision sensors (cameras), such as the focal length and lens distortion parameters, and determining the transformation between the camera and the inertial systems. Calibration of a two-sensor system under indoor conditions does not provide an appropriate and practical transformation for use in outdoor maneuvers due to inevitable differences between outdoor and indoor conditions. Also, the use of custom calibration objects under outdoor operational conditions is not feasible because the larger field of view requires relatively large calibration objects. Hence calibration becomes one of the critical issues, particularly if the integrated system is used in Intelligent Transportation Systems applications. In order to successfully estimate the rotations and translations from the vision system, the calibration has to be performed prior to the integration process. The second task is the effective fusion of the inertial and vision sensor systems. The automated algorithm that identifies point correspondences in images enables its use in real-time autonomous driving maneuvers. To verify the accuracy of the established correspondences, independent constraints such as epipolar lines and correspondence flow directions were used. Also, a pre-filter was utilized to smooth out the noise associated with the vision sensor (camera) measurements. A novel approach was used to obtain the geodetic coordinates, i.e. latitude, longitude and altitude, from the normalized translations determined from the vision sensor. Finally, the position estimates based on the vision sensor were integrated with those of the inertial system in a decentralized format using a Kalman filter. The vision/inertial integrated position estimates compare successfully with those from 1) the inertial/GPS system output and 2) an actual survey performed on the same roadway. This comparison demonstrates that vision can in fact be used successfully to supplement the inertial measurements during potential GPS outages. The derived intrinsic properties and the transformation between individual sensors are also verified during two separate test runs on an actual roadway section.
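The correspondence-verification step described above (checking candidate matches against epipolar lines) can be sketched minimally: given a fundamental matrix F relating two views, a candidate match (x1, x2) of homogeneous image points is accepted only when the algebraic epipolar residual x2ᵀ·F·x1 is near zero. The fundamental matrix, the points, and the tolerance below are illustrative values chosen for the sketch, not data or thresholds from the dissertation.

```python
def epipolar_residual(F, x1, x2):
    """Algebraic epipolar residual x2^T F x1 for homogeneous image points."""
    Fx1 = [sum(F[i][j] * x1[j] for j in range(3)) for i in range(3)]
    return sum(x2[i] * Fx1[i] for i in range(3))

def filter_correspondences(F, pairs, tol=1e-3):
    """Keep only point pairs whose epipolar residual is within tol."""
    return [(x1, x2) for (x1, x2) in pairs if abs(epipolar_residual(F, x1, x2)) < tol]

# Illustrative fundamental matrix for a pure horizontal camera translation:
# corresponding points must then lie on the same image row (y1 == y2).
F = [[0.0, 0.0,  0.0],
     [0.0, 0.0, -1.0],
     [0.0, 1.0,  0.0]]

pairs = [
    ([1.0, 2.0, 1.0], [3.0, 2.0, 1.0]),  # same row: consistent, kept
    ([1.0, 2.0, 1.0], [3.0, 5.0, 1.0]),  # row jump: inconsistent, rejected
]
good = filter_correspondences(F, pairs)
```

A real pipeline would estimate F from the tracked points themselves (e.g. with a robust eight-point method) before applying a check of this kind; here F is simply assumed known.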
Dissertation (Ph.D.)--University of South Florida, 2007.
Includes bibliographical references.
Text (Electronic dissertation) in PDF format.
System requirements: World Wide Web browser and PDF reader.
Mode of access: World Wide Web.
Title from PDF of title page.
Document formatted into pages; contains 162 pages.
Advisor: Manjriker Gunaratne, Ph.D.
Subjects: Intelligent vehicular systems; Civil Engineering.
Series: USF Electronic Theses and Dissertations.
Automatic Geo-Referencing by Integrating Camera Vision and Inertial Measurements

by

Duminda I. B. Randeniya

A dissertation submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy
Department of Civil and Environmental Engineering
College of Engineering
University of South Florida

Co-Major Professor: Manjriker Gunaratne, Ph.D.
Co-Major Professor: Sudeep Sarkar, Ph.D.
Ram Pendyala, Ph.D.
Auroop Ganguly, Ph.D.
George Yanev, Ph.D.

Date of Approval: June 22, 2007

Keywords: Multi-Sensor Fusion, Vision/INS Integration, Intelligent Vehicular Systems, Computer Vision, Inertial Navigation

Copyright 2007, Duminda I. B. Randeniya
DEDICATION

This work is dedicated to my parents Tissa and Hema Randeniya and to my two sisters Dinishi and Sajini. Without their support and encouragement I would never have achieved this.
ACKNOWLEDGEMENTS

I would like to thank my co-advisors, Dr. Manjriker Gunaratne and Dr. Sudeep Sarkar, for providing me with the opportunity to work on this project. Your guidance, patience and mentorship are greatly appreciated. I truly could not have asked for better people to have worked under. Also, the Florida Department of Transportation should be thanked for supporting my research efforts through a research grant to the University of South Florida. Additionally, I would like to extend my thanks to the members of my examining committee, Dr. Auroop Ganguly, Dr. Ram Pendyala, Dr. George Yanev and Dr. John Lu, for all the support and courage given to me throughout my stay at USF. Also I thank Dr. Elaine Chang for her support. Further, I am very grateful to my colleagues, especially M. Jeyishanker, K. Jeyishanker and L. Fuentes, for helping me on different occasions surveying the roadways under the scorching summer sun. I would also like to express my sincere appreciation to the State Materials Office personnel in Gainesville for their continuous support throughout the course of the project and for allowing me to collect vital data for my dissertation. In particular, the help of Mr. Abdenour Nazef, Mr. Stacy Scott and Mr. Glenn Salvo is acknowledged. Last but not least, I would like to give my warmest gratitude to my parents and two sisters for their unconditional support and love. I wouldn't have achieved this without their guidance and support.
TABLE OF CONTENTS

LIST OF TABLES ..... iv
LIST OF FIGURES ..... v
ABSTRACT ..... ix

CHAPTER 1 INTRODUCTION ..... 1
1.1 Background ..... 1
1.2 Scope of Work ..... 6
1.3 Multi-Purpose Survey Vehicle ..... 9
1.3.1 Inertial Measuring Unit (IMU) ..... 10
1.3.2 Lever Arm Effect on MPSV ..... 11
1.3.3 Forward View and Side View Cameras ..... 13
1.3.4 Laser Profile System ..... 15
1.3.5 Pavement Imaging System ..... 16
1.4 Motivation ..... 17
1.5 Organization of Dissertation ..... 20

CHAPTER 2 FUNDAMENTALS OF INERTIAL NAVIGATION ..... 22
2.1 Coordinate Frames Used in This Research ..... 22
2.1.1 Inertial Frame (i-frame) ..... 22
2.1.2 Earth Centered Earth Fixed Frame (e-frame) ..... 23
2.1.3 Navigation Frame (n-frame) ..... 25
2.1.4 Wander Angle Frame (w-frame) ..... 26
2.1.5 Body Coordinate Frame (b-frame) ..... 26
2.1.6 Camera Coordinate Frame (c-frame) ..... 28
2.2 Representing Rotation ..... 28
2.2.1 Euler Angles ..... 29
2.2.2 Tate-Bryant Rotation Sequence ..... 31
2.2.3 Quaternions ..... 32
2.3 Coordinate Transformations ..... 34
2.3.1 Transformation from i-frame to e-frame ..... 34
2.3.2 Transformation from n-frame to e-frame ..... 35
2.3.3 Transformation from b-frame to n-frame ..... 36
2.4 Introduction to Inertial Measuring Unit ..... 37
2.4.1 Gimbaled IMU ..... 38
2.4.2 Strapdown IMU ..... 39
2.4.3 Fiber Optic Gyroscopes ..... 41
2.4.4 Accelerometers ..... 43
2.5 Estimating Navigation Equations ..... 43
2.6 IMU Error Model ..... 51

CHAPTER 3 ESTIMATING THE ROADWAY GEOMETRY FROM INERTIAL MEASUREMENTS ..... 53
3.1 Scope of Investigation ..... 54
3.2 Evaluation of Cross Slope ..... 54
3.2.1 Linear L.S.A. Fit (The Existing Algorithm) ..... 58
3.2.2 Least Squares Parabolic Fit (The Modified Algorithm) ..... 58
3.3 Radius of Curvature ..... 58
3.3.1 Determination of Curvature Using Kinematics ..... 60
3.3.2 Determination of Curvature Using Geometry ..... 62
3.4 Estimation of Curvature in the Field ..... 63
3.4.1 Chord-Offset Method ..... 63
3.4.2 Compass Method ..... 64
3.4.3 Manual Estimation of Radius of Curvature ..... 66

CHAPTER 4 FUNDAMENTALS OF VISION NAVIGATION ..... 67
4.1 Introduction ..... 67
4.1.1 Homogeneous Coordinate System ..... 69
4.2 Estimating Intrinsic Parameters of the Camera ..... 71
4.3 Feature Correspondence in Images ..... 74
4.4 Estimating Motion from Point Correspondences ..... 78
4.5 Estimating Motion from Line Correspondences ..... 86

CHAPTER 5 KALMAN FILTERING ..... 89
5.1 Introduction ..... 89
5.2 Standard Kalman Filter Equations ..... 92
5.3 Design of Vision Only Kalman (Local) Filter ..... 94
5.4 Design of Master Kalman Filter ..... 96

CHAPTER 6 MATHEMATICAL FORMULATIONS FOR SYSTEM CALIBRATION ..... 101
6.1 Introduction ..... 101
6.2 Determination of the Unique Mathematical Transformation between the Inertial and Vision Sensor Measurements ..... 101
6.2.1 Determination of the Vision-Inertial Transformation ..... 104
6.2.2 Optimization of the Vision-Inertial Transformation ..... 107
6.2.3 Verification with the Survey Measurements ..... 109
6.3 Determination of Position Residual from the Two Sensor Systems ..... 110

CHAPTER 7 EXPERIMENTAL SETUP AND RESULTS ..... 114
7.1 Experimental Setup ..... 114
7.2 Experimental Verification of Highway Geometric Data ..... 114
7.2.1 Cross Slope Data ..... 114
7.2.2 Radius of Curvature Data ..... 117
7.3 Calibration for Intrinsic Properties of the Vision System ..... 124
7.3.1 Estimating the Transformation between Inertial Sensor and Vision Sensor ..... 126
7.3.2 Validation of Results ..... 128
7.3.3 Verification with Ground Truth ..... 130
7.4 Comparison of the Two Vision Pose Measurement Systems ..... 131
7.5 Results of the IMU/Vision Integration ..... 137
7.6 Validation of IMU/Vision Orientation Results with Ground Truth ..... 143

CHAPTER 8 CONCLUSIONS ..... 146
8.1 Future Research ..... 148

REFERENCES ..... 149

APPENDICES ..... 155
Appendix A Structure from Motion Algorithms ..... 156
A.1 Motion Estimation from Line Correspondences (Taylor et al., 1995) ..... 156
Appendix B Multi-Purpose Survey Vehicle's IMU and Camera Specifications ..... 159
B.1 IMU Specifications (Northrop Grumman LN 200) ..... 159
B.2 Forward View Camera Specifications (DVC company) ..... 161

ABOUT THE AUTHOR ..... END PAGE
LIST OF TABLES

Table 1: Current research performed in IMU/Vision integration (x no and yes) ..... 7
Table 2: Comparison of different grades of IMU (Shin 2005, Grewal 2001) ..... 40
Table 3: Data obtained from the manual survey ..... 66
Table 4: Comparison of cross-slopes for first run on I-10 ..... 115
Table 5: Comparison of cross-slopes for second run on I-10 ..... 116
Table 6: Comparison of radius values obtained for the two curves ..... 121
Table 7: Intrinsic parameters of the vision system ..... 125
Table 8: Orientation difference between two sensor systems estimated at four locations ..... 128
Table 9: Maximum and minimum errors between IMU/GPS, IMU-only and IMU/Vision systems ..... 141
Table 10: Maximum error percentages between the IMU/Vision and IMU/GPS system estimates and the actual survey ..... 145
LIST OF FIGURES

Figure 1: FDOT multi-purpose survey vehicle ..... 9
Figure 2: Block diagram of the GPS/INS system in the MPSV ..... 11
Figure 3: Illustration of lever arms of sensors ..... 12
Figure 4: Forward and side view cameras ..... 13
Figure 5: An image from forward view camera ..... 14
Figure 6: An image from side view camera ..... 14
Figure 7: Laser installation ..... 16
Figure 8: Pavement lighting system and camera ..... 17
Figure 9: Test section drawn at I-4 when the GPS signal is available ..... 18
Figure 10: Test section drawn at I-4 when the GPS signal is not available ..... 19
Figure 11: Illustration of ECEF coordinate frame ..... 24
Figure 12: Illustration of local coordinate frames (a) body coordinate frame, (b) camera coordinate frame ..... 27
Figure 13: Illustration of roll, pitch and yaw ..... 30
Figure 14: (a) Strapdown IMU and (b) Gimbaled IMU (Grewal 2001) ..... 38
Figure 15: Schematics of IFOG ..... 42
Figure 16: Estimation of cross slope from laser measurement height and IMU roll angle ..... 56
Figure 17: Geometry of the first curve of the reverse curve section ..... 60
Figure 18: Illustration of notation used in Table 4 ..... 66
Figure 19: Image formation in a pinhole camera ..... 68
Figure 20: Established point correspondences from KLT tracker ..... 76
Figure 21: (a) Image after edge detection, (b) Establishing correspondences ..... 77
Figure 22: Rigidity constraint for estimating pose from point correspondences ..... 79
Figure 23: Epipolar lines drawn for the vision system ..... 80
Figure 24: Epipolar lines (a) estimated before removing false correspondences, (b) after removing false correspondences ..... 83
Figure 25: Filtering point correspondences using motion tracking (a) before (b) after ..... 85
Figure 26: Rigidity constraint for the motion estimation using line correspondences ..... 87
Figure 27: Illustration of processes involved in the fusion algorithm ..... 90
Figure 28: Variability of vision translations due to high noise ..... 91
Figure 29: The final fusion architecture of the IMU/Vision system ..... 92
Figure 30: Illustration of master Kalman filter ..... 97
Figure 31: Steps involved in estimating the transformation between the two sensor systems ..... 102
Figure 32: Measurement points marked on the pavement by reflective tapes ..... 103
Figure 33: Illustration of the coincidence of IMU and survey measurement points and the offset of image measurements ..... 104
Figure 34: Three coordinate systems associated with the alignment procedure and the respective transformations ..... 106
Figure 35: Illustration of measurement residual estimation ..... 111
Figure 36: Parabolic fit for a point along the test section, 50 ft away from the starting point of the test section ..... 115
Figure 37: Radius calculated by the geometric method for a data acquisition frequency of 200 Hz (a) without averaging, (b) with averaging ..... 117
Figure 38: Variation of radius with time evaluated from Eqn (3-12) (frequency 200 Hz) ..... 118
Figure 39: Plot of body acceleration in the lateral (Y) direction with time for the total section ..... 119
Figure 40: Curved sections separated from the tangent sections, using upper and lower bounds ..... 120
Figure 41: Comparison of radii values calculated by (1) using kinematic method, (2) using geometric method, (3) using modified kinematic method ..... 120
Figure 42: Radius values obtained by compass method for two curves ..... 122
Figure 43: Comparison of radius values obtained from 1) compass method, 2) geometric method, 3) kinematic method and 4) modified kinematic method for (a) curve 1 and (b) curve 2 including transition curves ..... 123
Figure 44: Calibration target with circular control points ..... 124
Figure 45: Comparison of original and the distorted data points from the model on the calibration target ..... 126
Figure 46: Comparison of raw inertial data with transformed vision data for (a) roll, (b) pitch, and (c) yaw for the straight section (longer run) ..... 129
Figure 47: Comparison of raw inertial data with transformed vision data for (a) roll, (b) pitch, and (c) yaw for the horizontal curve (longer run) ..... 129
Figure 48: Comparison of transformed inertial and transformed vision data with survey data for (a) roll, (b) pitch, and (c) yaw on the short section ..... 131
Figure 49: Comparison of orientations on the (A) straight section, (B) horizontal curve for (a) roll, (b) pitch and (c) yaw ..... 133
Figure 50: Comparison of orientation after moving median and moving average filtering for (A) straight section, (B) horizontal curve (a) roll, (b) pitch and (c) yaw ..... 135
Figure 51: Comparison of normalized translations for the straight run before and after Kalman filtering (a) x-direction, (b) y-direction and (c) z-direction ..... 136
Figure 52: Comparison of normalized translations for the horizontal curve before and after Kalman filtering (a) x-direction, (b) y-direction and (c) z-direction ..... 137
Figure 53: Comparison of (a) roll, (b) pitch and (c) yaw of IMU and filtered Vision ..... 138
Figure 54: Comparison of translations (a) x-direction, (b) y-direction and (c) z-direction ..... 139
Figure 55: Comparison of (a) Latitude and (b) Longitude ..... 140
Figure 56: Error associated with (a) Latitude and (b) Longitude ..... 142
Figure 57: Comparison of IMU/Vision orientations with survey data ..... 144
AUTOMATIC GEO-REFERENCING BY INTEGRATING CAMERA VISION AND INERTIAL MEASUREMENTS

D. I. B. RANDENIYA

ABSTRACT

An alternative sensor system to the inertial measurement unit (IMU) is essential for intelligent land navigation systems when the vehicle travels in a GPS-deprived environment. The sensor system used to update the IMU for a reliable navigation solution must be a passive system that does not depend on any outside signal. This dissertation presents the results of an effort in which position and orientation data from vision and inertial sensors are integrated. Information from a sequence of images captured by a monocular camera attached to a survey vehicle, at a maximum frequency of 3 frames per second, was used in updating the inertial system installed in the same vehicle to correct its inherent error accumulation. Specifically, the rotations and translations estimated from point correspondences tracked through a sequence of images were used in the integration. However, for such an effort, two types of tasks need to be performed.

The first task is calibration: estimating the intrinsic properties of the vision sensors (cameras), such as the focal length and lens distortion parameters, and determining the transformation between the camera and the inertial systems. Calibration of a two-sensor system under indoor conditions does not provide an appropriate and practical transformation for use in outdoor maneuvers due to inevitable differences between outdoor and indoor conditions. Also, the use of custom calibration objects under outdoor operational conditions is not feasible because the larger field of view requires relatively large calibration objects. Hence calibration becomes one of the critical issues, particularly if the integrated system is used in Intelligent Transportation Systems applications. In order to successfully estimate the rotations and translations from the vision system, the calibration has to be performed prior to the integration process.

The second task is the effective fusion of the inertial and vision sensor systems. The automated algorithm that identifies point correspondences in images enables its use in real-time autonomous driving maneuvers. To verify the accuracy of the established correspondences, independent constraints such as epipolar lines and correspondence flow directions were used. Also, a pre-filter was utilized to smooth out the noise associated with the vision sensor (camera) measurements. A novel approach was used to obtain the geodetic coordinates, i.e. latitude, longitude and altitude, from the normalized translations determined from the vision sensor. Finally, the position estimates based on the vision sensor were integrated with those of the inertial system in a decentralized format using a Kalman filter. The vision/inertial integrated position estimates compare successfully with those from 1) the inertial/GPS system output and 2) an actual survey performed on the same roadway. This comparison demonstrates that vision can in fact be used successfully to supplement the inertial measurements during potential GPS outages. The derived intrinsic properties and the transformation between individual sensors are also verified during two separate test runs on an actual roadway section.
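The decentralized Kalman fusion described above reduces, in its simplest scalar form, to the standard measurement update: an inertially predicted position (with growing variance) is corrected by an independent position fix, weighted by the two variances. The sketch below is a deliberately simplified one-dimensional illustration with made-up numbers, not the local/master filter design used in the dissertation.

```python
def kalman_update(x_pred, p_pred, z, r):
    """Scalar Kalman measurement update.

    x_pred, p_pred: predicted state (e.g. dead-reckoned position) and its variance
    z, r:           independent measurement (e.g. a vision or GPS fix) and its variance
    Returns the corrected state, corrected variance, and the Kalman gain.
    """
    k = p_pred / (p_pred + r)        # gain: how much to trust the measurement
    x = x_pred + k * (z - x_pred)    # corrected state estimate
    p = (1.0 - k) * p_pred           # corrected (reduced) variance
    return x, p, k

# A drifting inertial prediction (variance 4.0) corrected by an external
# position fix (variance 1.0): the result is pulled 80% toward the fix.
x, p, k = kalman_update(10.0, 4.0, 8.0, 1.0)  # x = 8.4, p = 0.8, k = 0.8
```

Because the update needs only the external fix and its variance, the same correction step can accept a vision-derived position in place of a GPS one, which is the substitution the dissertation exploits during GPS outages.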
CHAPTER 1

INTRODUCTION

1.1 Background

Error growth in the measurements of inertial systems is a major issue that limits the accuracy of inertial navigational systems. However, because of the high accuracy of inertial systems in short-term applications, many techniques, such as Differential Global Positioning Systems (DGPS) and camera (vision) sensors, have been investigated by researchers for use in conjunction with inertial systems to overcome the error growth in long-term applications. A breakthrough in Intelligent Transportation Systems (ITS) and Mobile Mapping Technology (Cramer, 2005) was made when the Global Positioning System (GPS) and the Inertial Navigation System (INS) were successfully integrated using a Kalman filter. This integration has been achieved in different integration architectures (Wei and Schwarz, 1990) and applied widely since it addresses the above-mentioned error accumulation in the Inertial Measurement Unit (IMU). With the improved accuracies achieved from the differential correction of the GPS signal and the elimination of Selective Availability, a variety of promising research endeavors in INS/GPS integration technology have sprung up. Yet a vulnerability of the INS/GPS system lies in the fact that it relies solely on the GPS signal to correct the position and attitude estimation of the vehicle in the longer run. Therefore, an INS/GPS integration system fails to provide a sufficiently accurate continuous output in
the presence of GPS outages, either in urban canyons or at locations where communication is hindered (Feng and Law, 2002).

To overcome this issue in real-time navigation, researchers have attempted various alternatives: development of highway infrastructure and Automated Highway Systems (AHS) with embedded magnetic markers (Farrell and Barth, 2002) and vehicles with devices that can locate these markers. Although the combination of AHS with ITS would provide more reliable measurements, improvement of infrastructure is costly. Hence, alternative means of position location with wireless or mobile phone networks (Zhao, 2000), which preclude the need for upgrading the highway infrastructure, have also been explored. At the same time, the use of mobile phone networks or any signal-based locating system has the disadvantage of possible signal outages and blockages, which cause system failure. Even when the signal is available, the vehicle location estimation accuracy is limited since mobile network base stations have a 50-100 m error band. Therefore, the use of primary vehicle-based and external signal-independent systems is vital to reliable position and orientation estimation in all types of terrain.

On the other hand, due to advances in computer vision, potentially promising studies that involve vision sensing are being carried out in the areas of Intelligent Transport Systems (ITS) and Automatic Highway Systems (AHS). The most common applications of vision systems include detection of obstacles such as pedestrians crossing a roadway or sudden movements of other vehicles (Kumar et al., 2005; Sun et al., 2004; Franke and Heinrich, 2002), and vehicle maneuvers such as lane changing (Dellaert and Thorpe, 1997;
Sotelo et al., 2004; Li and Leung, 2004), and use of vision sensors to aid autonomous navigation without fusing the vision sensor with other sensor systems that measure absolute position, such as GPS (Bertozzi et al., 1998; Bertozzi et al., 2000; McCall et al., 2006). In detecting moving objects in these applications, researchers first scan the surrounding area for moving objects or vehicles by processing images with computer vision techniques. Once a moving object is located, its movement is tracked using a tracking algorithm in order to avoid collision.

In addition, researchers (Sotelo et al., 2004; Bertozzi et al., 1998; Bertozzi et al., 2000; McCall et al., 2006) have used vision measurements to track lane markings and other vehicles traveling in the vicinity. For this purpose, they used image processing techniques such as segmentation, with a second-order polynomial function, to approximate the road geometry. This development made it possible to identify the position of the vehicle with respect to the pavement boundaries. These authors also combined GPS measurements with the vehicle positions obtained from the vision system, relative to the tracked lane markings, in navigating the vehicle. However, these methods do not use inertial navigation system measurements in estimating the position of the vehicle.

Researchers have also experimented with combining inertial sensors and vision sensors in navigation. The most common application of this type of combination has been in indoor environments (Diel, 2005; Foxlin, 2003), where a vision sensor is used to identify a landmark and the navigation task is performed accordingly. In such situations, artificial landmarks have been placed for the vision sensor to obtain feature correspondences
conveniently (Foxlin, 2003; Hu, 2004). Obviously, one cannot expect the same indoor algorithms to perform with the same degree of accuracy in an uncontrolled environment. Therefore, it is essential that algorithms be designed for uncontrolled (outdoor) environments that also incorporate tools to filter out erroneous correspondences. For a vision algorithm to produce accurate rotations and translations, either the selected feature correspondences or the camera must be stationary (Faugeras, 1996). However, this condition can be violated easily when correspondence techniques developed for indoor conditions are extended outdoors, because the camera, which itself moves with the vehicle, may select correspondences from moving objects in an uncontrolled manner.

As more and more studies are conducted on integrating inertial and vision sensors, one of the crucial issues faced by researchers is the appropriate calibration of the inertial sensor with the vision sensor. Whether the integrated system is intended for indoor use (e.g., indoor robot navigation) or outdoor use (e.g., vision-aided inertial navigation of a vehicle), accurate calibration is critical. This is because, for successful integration of vision and inertial data, the mathematical relationship (transformation) between the two measurement systems must be determined accurately. In spite of its vitality to implementing the integration mechanism, only limited literature is available reporting successful calibration. The initial groundwork for estimating the relative position and orientation (pose) of two independent systems with separate coordinate frames was laid, in the form of a mathematical formulation, in (Horn, 1987). Horn (1987) used unit quaternions to obtain a closed-form solution for the
absolute orientation between two different coordinate frames using three non-collinear points. Extending this work to the calibration of integrated inertial and vision sensor systems, (Alves et al., 2003) describe a procedure that can be used in indoor settings where both inertial and vision sensors are fixed to a pendulum. The approach used in (Alves et al., 2003) is applicable for determining the relative pose between the two sensors when both of them sense vertical quantities, i.e., when the inertial sensor measures accelerations due to gravity and the vision sensor captures images with predominantly vertical edges. This idea was further extended in (Lang and Pinz, 2005) to incorporate measurements in any desired direction, without constraining the measurements to the vertical. The relevant calibration was also performed in an indoor setting, in a controlled environment, and simulated using MATLAB. In addition, (Foxlin et al., 2003) used a specially constructed calibration rig and a stand to estimate the relative pose between inertial and vision sensors.

Roumeliotis et al. (2002) designed a vision-inertial fusion system for use in landing a space vehicle using aerial photographs and an inertial measurement unit (IMU). This system was designed using an indirect Kalman filter, which incorporates the errors in the estimated position, with the pose estimates from the camera and IMU systems as input. However, the fusion was performed on the relative pose estimated from the two sensor systems, and for this reason a much simpler inertial navigation model was used compared to that of an absolute pose estimation system. Testing was performed on a gantry system designed in the laboratory. Chen et al. (2004) investigated the estimation of the structure of a scene and the motion of the camera by integrating a camera
system and an inertial system. However, the main task of this fusion was to estimate an accurate and robust pose of the camera. Foxlin et al. (2003) used an inertial-vision integration strategy to develop a miniature self-tracker that uses artificial fiducials. Fusion was performed using a bank of Kalman filters designed for acquisition, tracking, and hybrid tracking of these fiducials. The IMU data were used to predict the vicinity of the fiducials in the next image. On the other hand, You et al. (2001) developed an integration system for Augmented Reality (AR) applications. This system used a vision sensor to estimate the relative position, whereas the rotation was estimated using gyroscopes; no accelerometers were used in the fusion. Diel et al. (2005) used an IMU and vision integration system to navigate a robot under indoor conditions. Gyroscopes were used to obtain the rotation of the cameras, and the main goal of the fusion was to interpret the visual measurements. Finally, Huster et al. (2003) used vision-inertial fusion to position an autonomous underwater vehicle (AUV) relative to a fixed landmark. Only one landmark was used in this process, making it impossible to estimate the pose of the AUV using the camera alone, so the IMU system was used to fulfill this task.

1.2 Scope of Work

Table 1 summarizes research that uses IMU/vision integrated systems. Although most IMU/vision integration work has been performed in augmented reality, a few other works involve actual field conditions. Furthermore, most of the systems have been tested and used in indoor, controlled environments, with only a few tested in outdoor, uncontrolled settings. The above are some of the critical problems one has to consider when designing an IMU/vision system for a land navigation
system. Hence the contents of Table 1 are of immense significance in illustrating how these vital factors have been addressed in IMU/vision fusion systems and in highlighting the work to be performed.

Table 1: Current research performed in IMU/vision integration (x = no, blank = yes). Entries are listed for each reference in the order: Augmented Reality; IMU for camera pose; pose from camera; INS/vision fusion filter; IMU nav. equations; outdoor testing; camera speed; control data; results compared.

Roumeliotis et al. (2002): x, x, KF, x, x, x
Chen et al. (2004): x, x, KF, x
Foxlin et al. (2003): x, KF, x, 24 Hz, real data
You et al. (2001): x, EKF, x, 30 Hz
Diel et al. (2005): x, rotation only, translation only, MLKF, x, 10 Hz, x, raw (unaided) data
Huster et al. (2003): x, x, EKF, x
Hu et al. (2004): x, x, PMM, x, x, back projection
This dissertation: x, x, KF, 3 Hz, x, real and survey data

The abbreviations KF, EKF, MLKF and PMM stand for Kalman filter, Extended Kalman filter, Modified Linear Kalman filter and Parameterized Model Matching, respectively.

The fusion approach presented in this dissertation differs from the above-mentioned work in many respects. One key difference is that the vision system used here has a much slower frame rate, which introduces additional challenges to the autonomous navigation task. In addition, the goal of this work is to investigate a fusion technique that utilizes the pose estimates of the vision system to correct the inherent error growth of the IMU in a GPS-deprived environment. Therefore, this system will act as an alternative navigation system until GPS signal reception is recovered. It is obvious from this objective that this system must incorporate the absolute position in the
fusion algorithm rather than the relative position of the two sensor systems. However, estimating the absolute position from camera pose is impossible unless the absolute position of the initial camera position is known. Achieving this also requires executing a more complex IMU navigation algorithm and error model. These developments differentiate the work presented here from previously published work.

In addition, an outdoor calibration methodology is introduced which can be used to estimate the unique transformation that exists between the inertial sensor and the vision sensor. This is very important in Intelligent Transportation System (ITS) applications, i.e., the calibration must be performed outdoors, because of the widely varying and more rigorous geo-positioning demands of outdoor conditions. These include fluctuating ambient lighting and temperature conditions, uncontrollable vibrations due to vehicle dynamics and roadway roughness, humidity changes, etc. Furthermore, to address the problem of uncontrolled selection of correspondences, a special validation criterion is employed to establish correct correspondences using epipolar geometry and correspondence motion fields. In this dissertation, a test run performed in an actual roadway setting is also used to validate the presented fusion algorithm. In addition, to minimize the noise from the vision system, a pre-filter (Grewal, 2001) is used to process the vision estimates prior to fusion with IMU data. The final results obtained from the fusion are successfully compared with positions obtained from a DGPS/INS integrated system installed in the same survey vehicle. Similarly, the rotations are confirmed with actual survey measurements.
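The pre-filtering and fusion steps mentioned above rest on the standard Kalman measurement update. The following is a minimal scalar sketch of that update, for illustration only; it is not the decentralized filter implemented in this dissertation, and the numbers are hypothetical:

```python
def kalman_update(x_pred, p_pred, z, r):
    """One scalar Kalman measurement update: blend a predicted state x_pred
    (with variance p_pred) against a measurement z (with variance r)."""
    k = p_pred / (p_pred + r)          # Kalman gain: 0 (ignore z) .. 1 (trust z)
    x_new = x_pred + k * (z - x_pred)  # corrected state estimate
    p_new = (1.0 - k) * p_pred         # uncertainty shrinks after the update
    return x_new, p_new

# Equal confidence in prediction and measurement splits the difference:
x, p = kalman_update(0.0, 1.0, 2.0, 1.0)  # -> x = 1.0, p = 0.5
```

In the vision-aided setting, the prediction would come from integrating the IMU and the measurement from the (pre-filtered) vision pose estimate; the gain automatically weights the two by their uncertainties.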
1.3 Multi-Purpose Survey Vehicle

The sensor data for the experiments performed in this research were collected using a survey vehicle owned by the Florida Department of Transportation (FDOT) (Fig. 1), which is equipped with a cluster of sensors. The sensors included in this vehicle are:

1) Navigational-grade IMU
2) Two DVC1030C monocular vision sensors
3) A Basler L-103 camera
4) Laser profile system
5) Two Global Positioning System (GPS) receivers
6) A Distance Measuring Instrument (DMI)
7) Pavement illumination system
8) Navigational computer

The original installation of sensors in this vehicle allows almost no freedom for adjustment of the sensors, which underscores the need for an initial calibration.

Figure 1: FDOT multi-purpose survey vehicle.
1.3.1 Inertial Measuring Unit (IMU)

The navigational-grade IMU installed in the above vehicle (Fig. 1) was assembled by Applanix Corp., Toronto, Canada and installed by International Cybernetics Corporation (ICC) of Largo, FL. It contains three solid-state fiber-optic gyroscopes and three solid-state silicon accelerometers that measure instantaneous accelerations and rates of rotation in three perpendicular directions. The IMU data can be logged at any frequency in the range of 1 Hz-200 Hz. Due to its high frequency and high accuracy, especially over short time intervals, the IMU acts as the base sensor for acquiring navigational data during data collection. Moreover, the IMU is a passive sensor that does not depend on any outside signal in measuring the accelerations and angular rates of the vehicle, which enhances its role as the base data-gathering sensor in the vehicle. However, due to unavoidable accelerometer biases and gyroscope drifts, the IMU measurements diverge. This is especially the case with the accelerometer on the vertical axis. Therefore, in order for the IMU to produce reliable navigation solutions, its errors have to be corrected frequently. The positions and velocities obtained from GPS are generally used for this purpose. Although the idea of fusion is quite intuitive, the actual fusion process is intensive and has to be performed at the system level, since the GPS and IMU systems have different errors that need to be minimized before integrating the two systems. The FDOT multi-purpose survey vehicle (MPSV) is also equipped with an INS/DGPS integrated system designed to overcome the aforementioned issue of error growth and to measure the vehicle's spatial location and orientation at any given instant. The integration of the INS and the DGPS system is achieved using a Kalman-filter-based statistical
optimization algorithm that minimizes the long-term error involved in the position prediction (Scherzinger, 2000). It also includes a computer that facilitates the navigation solution through its interaction with the DGPS unit. The block configuration of the GPS/IMU system is shown in Fig. 2.

Figure 2: Block diagram of the GPS/INS system in the MPSV.

1.3.2 Lever Arm Effect on MPSV

The MPSV contains multiple sensor systems, each having its coordinate origin at a different position. Of these different sensor units, selected sensors are used in the aided navigation system: the GPS, the IMU and the camera system. When two or more systems with different coordinate origins are used in an aided navigation architecture, the translations measured by each sensor have to be brought to one uniform origin in order to achieve accurate fusion of the sensors. Therefore, the system must know the difference between the two sensor units in vector form for proper combination of the measured translations. This difference vector between two
sensor units is called the lever arm, and the corrections that have to be performed on the accelerations, velocities and distances due to this difference are called the lever arm effect.

Figure 3: Illustration of lever arms of sensors.

In Fig. 3, C is the origin of the base coordinate frame while i and g represent two separate sensor units. Vectors CCi and CCg are the lever arms between the two sensor units and the base coordinate system.
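The velocity part of the lever arm effect follows directly from rigid-body kinematics: a velocity measured at one sensor origin translates to a point offset by the lever-arm vector r through v_b = v_a + ω × r. A minimal sketch follows (illustrative only, not the MPSV implementation; the numbers are hypothetical):

```python
import numpy as np

def lever_arm_velocity(v_sensor, omega_body, lever_arm):
    """Velocity at a point offset from the measuring sensor by `lever_arm`,
    on a rigid body rotating at angular rate `omega_body` (rad/s):
    v_point = v_sensor + omega x r."""
    return np.asarray(v_sensor, float) + np.cross(omega_body, lever_arm)

# A 1 rad/s yaw rate with a 2 m forward lever arm adds 2 m/s of lateral velocity:
v = lever_arm_velocity([10.0, 0.0, 0.0], [0.0, 0.0, 1.0], [2.0, 0.0, 0.0])
# -> [10.0, 2.0, 0.0]
```

The same cross-product structure (with an extra angular-acceleration term) appears when translating measured accelerations between sensor origins.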
1.3.3 Forward View and Side View Cameras

The FDOT survey vehicle also uses two high-resolution (1300 x 1030) digital area-scan cameras for front-view and side-view imaging (Fig. 4), at rates of up to 11 frames per second. This enables the capture of digital images at operating speeds of up to 60 mph.

Figure 4: Forward and side view cameras.

The front-view camera, with a 16.5 mm nominal focal length lens, captures the panoramic view, which includes pavement markings, number of lanes, roadway signing, work zones, traffic control and monitoring devices, and other structures (Fig. 5).
Figure 5: An image from the forward view camera.

Meanwhile, the side-view camera, which uses a 25 mm nominal focal length lens, is used to record right-of-way signs, street signs, bridge identification and miscellaneous safety features (Fig. 6).

Figure 6: An image from the side view camera.

Both the forward view and side view cameras also utilize a 2/3-inch progressive-scan interline Charge Coupled Device (CCD) sensor with a Bayer filter to minimize the
resolution loss. The CCD is an integrated circuit that contains many layers of semiconductor material, mostly silicon, etched to perform different tasks such as capturing the light coming from the lens, storing the resulting charge, and providing the various connections between each layer and out of the CCD chip. The light-capturing layer consists of light-sensitive capacitors coupled together as an array. These arrayed capacitors are also called picture elements (1,339,000 of them in the forward and side view cameras), abbreviated as pixels. Pixels are the smallest elements used to make up an image, and the more pixels an image has, the more detailed it is. Due to this architecture, the CCD chip captures more than 70% of the light incident on the chip, making it far more efficient than photographic film, which captures less than 5% of the incident light. Additionally, in these cameras a Bayer filter is used to filter the color of the incident light. This filter is close to the human eye in that, of every four pixels, two are green-sensitive while the other two are red and blue.

1.3.4 Laser Profile System

One of the important sensor systems for measuring the cross-slope and grade of a roadway and the rutting of the pavement is the laser profiler system installed in the FDOT Multi-Purpose Survey Vehicle (MPSV). The lasers in the MPSV project a laser beam onto the pavement and measure the distance between the pavement and the laser using the reflected beam. There are five lasers strategically placed on the MPSV in order to measure the roadway profile parameters mentioned earlier. Of these, four lasers are on the front bumper (Fig. 7) of the vehicle, allowing them to measure the height of the bumper from the pavement, and the other is at the rear bumper (Fig. 8), aligned with the middle laser,
enabling grade measurement of the roadway. Of the four bumper lasers, one is at the right corner of the bumper while the others are located at distances of 33.25 in, 54.25 in and 68 in, respectively, from the right corner of the bumper.

Figure 7: Laser installation.

1.3.5 Pavement Imaging System

The pavement imaging system consists of a Basler L-103 line-scan camera and a pavement lighting system. The camera has a focal length of 15 mm and a resolution of 2048x2942. The camera is mounted 9.25 ft above the pavement surface to capture an image 14 ft wide by 20 ft long. In addition, the camera has two preset exposure values, 1/19,000 and 1/40,000. The pavement lighting system is used to ensure that the captured pavement images are of excellent quality, without any shadows. This system consists of ten 150 W lamps.
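The relationship between mounting height, lens and imaged ground span above follows the simple pinhole model, in which ground coverage scales as height × (sensor extent) / (focal length). A small sketch follows; the sensor extent used in the example is a hypothetical value chosen for illustration, not a figure from the Basler data sheet:

```python
def ground_coverage_m(height_m, sensor_extent_m, focal_length_m):
    """Pinhole model: ground span imaged by a camera looking straight down,
    by similar triangles through the lens center."""
    return height_m * sensor_extent_m / focal_length_m

# e.g. a (hypothetical) 20 mm sensor line at 3 m height with a 15 mm lens
# images a 4 m ground span:
span = ground_coverage_m(3.0, 0.020, 0.015)  # -> 4.0
```

Plugging in the stated 9.25 ft mounting height and 15 mm focal length lets one back out the sensor extent implied by the 14 ft image width in the same way.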
Figure 8: Pavement lighting system and camera.

1.4 Motivation

Figs. 9 and 10 show two different runs performed by the Florida Department of Transportation (FDOT) Multi-Purpose Survey Vehicle (MPSV). The MPSV is equipped with an INS/DGPS system to measure roadway characteristics using both the inertial system and the differential GPS system. Two runs were made at different times on the same day on Interstate 4 in Volusia County. The first run was made with the GPS signal on, while the GPS signal was not available throughout the entire second run. Even though the manufacturer claims that the system would be stable even without the GPS signal, the obtained data refute that. As seen from Fig. 9, when the GPS signal was available, the data obtained by the system correctly trace I-4 in Volusia County. However, Fig. 10 shows that navigation is meaningless when the GPS signal is not available: the reported location shows drastic errors. Therefore, finding an alternative error correction method for the INS is essential.
Figure 9: Test section drawn at I-4 when the GPS signal is available.
Figure 10: Test section drawn at I-4 when the GPS signal is not available.
Therefore, the specific objectives of this study are as follows:

1) Perform the intrinsic calibration of the vision system of the MPSV and estimate the unique transformation that exists between the vision system and the Inertial Measurement Unit (IMU) installed in the vehicle.
2) Design a pre-filter for the noisy vision measurements so that they can be used in the fusion process.
3) Fuse the vision and IMU sensor systems using a decentralized Kalman filter architecture and evaluate the results against the current IMU/GPS system and the ground truth.

1.5 Organization of Dissertation

This dissertation is organized as follows. The basic concepts of the different coordinate frames used in inertial navigation systems and camera systems are introduced in Chapter 2, and the transformations involved with the different coordinate frames are included at the end of that chapter. This is followed by the inertial navigation fundamentals, from formulating the navigation equations to inertial error propagation. Chapter 3 is devoted to the criteria developed by USF researchers for estimating and verifying geometric data obtained from the inertial readings in an actual roadway setting. A detailed description of the fundamental computer vision theories used in this work is provided in Chapter 4. The fusion algorithm used in this work, i.e., the Kalman filter, is introduced in Chapter 5, with detailed explanations of two particular filters developed in this research. The necessary mathematical formulations that form a critical part of the fusion and calibration algorithms are given in Chapter 6. Chapter 7 is devoted to
describing the experiments performed during the data collection process and all the results obtained during the experimentation. Finally, the conclusions related to the calibration and fusion algorithms are given in Chapter 8, followed by the appendices, which elaborate the relevant theoretical formulations.
CHAPTER 2 FUNDAMENTALS OF INERTIAL NAVIGATION

2.1 Coordinate Frames Used in This Research

In Dead Reckoning (DR) navigation, which is the estimation of the present global position of an object from the previous position, velocity, time and distance traveled by the vehicle, defining a coordinate system is very important. The obvious reason for this is the need for a proper reference system to describe a location on the Earth. In this chapter all the coordinate frames used in this research are elaborated. For completeness, the inertial frame, which is the fundamental coordinate frame, is also explained.

2.1.1 Inertial Frame (i-frame)

An inertial frame can be defined as a right-handed coordinate frame in which Newton's laws of motion hold. Therefore, in an inertial frame, an object at rest will remain at rest and an object in motion will remain in motion with no acceleration in the absence of external forces. The object starts accelerating only when an external physical force is applied to it. The dynamics of the motion of an object in an inertial frame can be formulated using Newton's laws of motion. In contrast, in non-inertial reference frames objects experience fictitious forces, which are forces that act on the object due to the angular or linear accelerations of the reference frame itself, and not because of any external force directly acting on the object.
Two examples of fictitious forces are the centrifugal force and the Coriolis force. Since these fictitious forces act on objects in non-inertial reference frames, the dynamics of the motion cannot be modeled using Newton's laws of motion alone.

Defining a global inertial frame is an abstraction, since any reference frame defined in the vicinity of the Earth, sun, moon and the other planets is subjected to gravitational attractions of varying magnitude according to its spatial location, making the reference frame non-inertial. Because of this fact, the classical definition of the inertial reference frame places the origin of the reference frame at the center of the Earth, with the 3rd axis parallel to the polar axis of the Earth, the 1st axis realized by the distant quasars, which are extremely distant celestial objects of constant relative orientation, and the 2nd axis completing a right-handed coordinate system [Jekeli, 2000]. In addition, there are various other definitions of the inertial frame, obtained by modifying Newton's laws of motion, which are used in specific applications. In this research the inertial frame is taken as defined above and will be denoted as the i-frame hereafter.

2.1.2 Earth Centered Earth Fixed Frame (e-frame)

The Earth Centered Earth Fixed (ECEF) frame is a right-handed coordinate frame that is centered at the Earth's center of mass, with a set of axes fixed with respect to the Earth. The coordinate axes of the ECEF frame are defined as follows: the 3rd axis is parallel to
the mean and fixed polar axis; the 1st axis connects the center of mass of the Earth with the intersection of the prime meridian (zero longitude) and the equator; and the 2nd axis completes a right-handed coordinate frame. This is illustrated in Fig. 11. It is clear from the definitions of the i-frame and the e-frame that the difference between the two reference frames lies in the fact that the e-frame rotates with a constant angular velocity about its 3rd axis, which will be denoted as ω_e.

Figure 11: Illustration of ECEF coordinate frame.
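A common companion to the e-frame definition is the conversion from geodetic coordinates (latitude, longitude, height above the ellipsoid) to ECEF coordinates. The sketch below uses the standard WGS-84 closed-form conversion; it is standard geodesy rather than a formula stated in this chapter:

```python
import math

A = 6378137.0           # WGS-84 semi-major axis (m)
E2 = 6.69437999014e-3   # WGS-84 first eccentricity squared

def geodetic_to_ecef(lat_deg, lon_deg, h_m):
    """Convert geodetic latitude/longitude (degrees) and ellipsoidal
    height (m) to ECEF x, y, z (m)."""
    lat, lon = math.radians(lat_deg), math.radians(lon_deg)
    n = A / math.sqrt(1.0 - E2 * math.sin(lat) ** 2)  # prime-vertical radius
    x = (n + h_m) * math.cos(lat) * math.cos(lon)
    y = (n + h_m) * math.cos(lat) * math.sin(lon)
    z = (n * (1.0 - E2) + h_m) * math.sin(lat)
    return x, y, z

# At the intersection of the equator and the prime meridian, the 1st-axis
# coordinate equals the semi-major axis:
x, y, z = geodetic_to_ecef(0.0, 0.0, 0.0)  # -> (6378137.0, 0.0, 0.0)
```

This is the direction in which GPS positions (reported geodetically) are brought into the e-frame before any further frame transformations.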
2.1.3 Navigation Frame (n-frame)

The navigation coordinate frame is a locally leveled, right-handed reference frame, i.e., it changes from place to place. In other words, this is a reference frame that moves with the object. It is the frame most commonly used in land navigation systems for providing the velocities along the directions of its coordinate axes. Even though the velocities are provided along the coordinate axes of the navigation frame, this frame is not used to coordinatize the position of the vehicle, since the reference frame moves with the object. Therefore, the primary use of the navigation frame is to provide local directions to the object. The most commonly used navigation frame, or n-frame, in land navigation and geodetic applications is defined with the 3rd axis aligned with the normal to the Earth's surface at the object's location and pointing in the down direction, i.e., the same direction as gravity. The 1st axis points North, which is the direction parallel to the tangent to the meridian at the object's location, and the 2nd axis points East, making a right-handed frame. This frame is also called the North-East-Down (NED) reference frame. Various other locally defined navigation frames are available according to the application. However, throughout this dissertation the NED coordinate frame is taken as the navigation frame of the object.
2.1.4 Wander Angle Frame (w-frame)

There are instances in which the NED frame is not suitable for navigation, due to singularities that occur if the navigation solution is given in the NED frame. This happens at the poles, where the meridians converge rapidly, making the longitude rate, which is used to obtain the velocity along the 2nd axis of the reference frame, become infinite. This problem can be overcome if the platform is rotated about the 3rd axis by an angle different from the longitude rate. The resulting reference frame is called the wander angle frame, or the w-frame. Since the w-frame differs from the n-frame only by this rotation about the 3rd axis, the transformation between the two frames is achieved simply by the rotation matrix

C_n^w = \begin{bmatrix} \cos\alpha & \sin\alpha & 0 \\ -\sin\alpha & \cos\alpha & 0 \\ 0 & 0 & 1 \end{bmatrix}   (2-1)

where C_n^w is the transformation matrix from the n-frame to the w-frame and α is the wander angle.

2.1.5 Body Coordinate Frame (b-frame)

This is the reference frame in which the IMU measurements are made in a navigation system. This frame has its origin at a predefined location on the sensor, with its 1st axis pointing towards the front, or forward movement direction, of the system, its 3rd axis pointing in the gravity direction, and its 2nd axis pointing toward the right side of the navigation system, making a right-handed coordinate frame (Fig. 12a). In this dissertation the body coordinate frame will be called the b-frame.
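The n-frame to w-frame transformation of Eqn (2-1) is a single rotation about the down axis. A minimal numerical sketch follows; the sign convention used is the standard one and is an assumption here, since conventions vary across texts:

```python
import numpy as np

def c_n_to_w(alpha):
    """Transformation from the n-frame to the w-frame: a rotation by the
    wander angle alpha about the 3rd (down) axis, per Eqn (2-1)."""
    c, s = np.cos(alpha), np.sin(alpha)
    return np.array([[c,    s,   0.0],
                     [-s,   c,   0.0],
                     [0.0,  0.0, 1.0]])

# A zero wander angle means the two frames coincide (identity transform):
m0 = c_n_to_w(0.0)
```

Because the matrix is a pure rotation, its inverse (the w-frame to n-frame transform) is simply its transpose.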
Figure 12: Illustration of local coordinate frames: (a) body coordinate frame, (b) camera coordinate frame.
2.1.6 Camera Coordinate Frame (c-frame)

The reference frame in which the vision measurements are made is called the camera coordinate frame, or the c-frame. This reference frame has its 3rd axis along the principal axis of the camera system, its 1st axis towards the right side of the image plane, and its 2nd axis set up to make the c-frame right-handed (Fig. 12b).

2.2 Representing Rotation

In the 3D navigation problem, the most commonly encountered measurements are the translations of the object and its orientation, or rotation, from the initial state to the next state. Of these, translations can be represented by a simple arithmetic summation of coordinate differences given in vector form. The orientation, however, cannot be incorporated as easily as translations, due to the mutual dependency of a rotation angle about one axis with respect to the other axes in 3D space. But since the reference frames dealt with here are orthogonal, right-handed reference frames, only three angles are needed to describe relative orientation, and the transformation that exists between two such reference frames is also orthogonal. The transformation matrices defined for orthogonal, right-handed frames have the property

C_b^a = (C_a^b)^{-1} = (C_a^b)^T, \qquad C_a^b (C_a^b)^T = I   (2-2)

where a and b are two arbitrary frames and C denotes the transformation matrix between the two frames.
Therefore, it is clear that as long as the reference frames are orthogonal, the transformation between the frames lies in the SO(3) space; orthogonality imposes six independent constraints on the nine elements of C, implying that the relative orientation is governed by only three degrees of freedom. There are various ways of representing the transformation between two reference frames. In this dissertation, the two most utilized methods are described. Further information about transformations can be found in [Jekeli, 2000; Titterton, 1997].

2.2.1 Euler Angles

The most common method of representing the relative orientation between two frames is by a sequence of rotations about three perpendicular axes, for instance the three Cartesian axes. This was first introduced in the pioneering work of Leonhard Euler describing rigid body rotation in 3D Euclidean space. For this reason, the angles about each axis of the coordinate system are called Euler angles. However, according to the field of application, these Euler angles are given special nomenclature. For example, in aerospace engineering they are called roll, pitch and yaw. In land navigation, they are called bank, attitude and heading. In this dissertation the Euler angles are called roll (Euler angle about the x-axis), pitch (Euler angle about the y-axis) and yaw (Euler angle about the z-axis). This is illustrated in Fig. 13.
Figure 13: Illustration of roll, pitch and yaw.

These rotations can be written as rotation matrices about each of the axes:

R_x = \begin{bmatrix} 1 & 0 & 0 \\ 0 & \cos\phi & \sin\phi \\ 0 & -\sin\phi & \cos\phi \end{bmatrix}   (2-3a)

R_y = \begin{bmatrix} \cos\theta & 0 & -\sin\theta \\ 0 & 1 & 0 \\ \sin\theta & 0 & \cos\theta \end{bmatrix}   (2-3b)

R_z = \begin{bmatrix} \cos\psi & \sin\psi & 0 \\ -\sin\psi & \cos\psi & 0 \\ 0 & 0 & 1 \end{bmatrix}   (2-3c)
where φ, θ and ψ are respectively the roll, pitch and yaw Euler angles, and R_x, R_y and R_z are the rotation matrices about the x, y and z axes respectively. Using Eqns (2-3a-c), one can obtain a single rotation matrix R as

R = R_x R_y R_z   (2-4)

The Euler angles lie on SO(3), also known as the Special Orthogonal group, which is the mathematical structure for rotations in 3D space, representing the orthonormal matrices that preserve both lengths and the angles between vectors (i.e., right-handedness) under rotation, while keeping the determinant equal to +1. But as can be seen from Eqn (2-4), the order in which the system rotates defines the final rotation of the system, which makes Euler angle rotations a complicated method of representing rotation in some situations. Additionally, when used in gyroscopes to measure rotation, Euler angle rotations suffer from the gimbal lock phenomenon, in which one of the rotation references becomes inactive, reducing the 3D rotation to a 2D rotation problem.

2.2.2 Tate-Bryant Rotation Sequence

In order to avoid the confusion with different orders of rotation and to define the Euler rotations in one fixed sequence, the INS in the MPSV [POSPac manual, POS/SV manual] uses a fixed rotation sequence called the Tate-Bryant sequence. The three rotations are one about the x-axis, one about the y-axis and one about the z-axis:

R_{T-B} = R_z R_y R_x   (2-5)
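Eqns (2-3a-c) and (2-5) can be sketched numerically as follows. The sign conventions below are the standard navigation DCM forms and are assumed here; the orthogonality property of Eqn (2-2) serves as a built-in sanity check:

```python
import numpy as np

def rx(a):  # rotation about the x-axis, cf. Eqn (2-3a)
    c, s = np.cos(a), np.sin(a)
    return np.array([[1.0, 0.0, 0.0], [0.0, c, s], [0.0, -s, c]])

def ry(a):  # rotation about the y-axis, cf. Eqn (2-3b)
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, 0.0, -s], [0.0, 1.0, 0.0], [s, 0.0, c]])

def rz(a):  # rotation about the z-axis, cf. Eqn (2-3c)
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]])

def tate_bryant(roll, pitch, yaw):
    """Compose the fixed rotation sequence of Eqn (2-5)."""
    return rz(yaw) @ ry(pitch) @ rx(roll)

# Any such composition stays on SO(3): orthogonal with determinant +1.
R = tate_bryant(0.1, -0.2, 0.3)
```

Composing the three single-axis matrices in a fixed order is exactly what removes the ambiguity discussed above: the same three angles always produce the same final rotation.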
The Applanix system [Pospac manual, POS/SV manual] uses this rotation sequence to define the Euler angle rotations.

2.2.3 Quaternions

A quaternion represents the rotation of an object using four parameters, three of them associated with a regular position vector and the fourth representing the angle of rotation:

q = q_w + q_x\,i + q_y\,j + q_z\,k  (2-6)

where q_w defines the angle of rotation and q_x, q_y and q_z are the projections of the position vector along the i, j and k directions respectively. Due to the extra dimensionality of the quaternion, four elements compared to the three Euler angles, the expression for the orientation becomes much simpler and easier to handle than with Euler angles. One must note that not all quaternions can be used to denote an orientation: only a unit quaternion, i.e. one satisfying q_w^2 + q_x^2 + q_y^2 + q_z^2 = 1, can represent a rotation [Jekeli, 2000]. Since both Euler angles and quaternions are used throughout this work, depending on the need, it is important to be able to convert Euler angles to quaternions and vice versa. The conventional orientation matrix between two arbitrary reference frames a and b can be derived in the quaternion space as follows (Titterton and Weston, 1997):
C_a^b = \begin{bmatrix}
q_w^2 + q_x^2 - q_y^2 - q_z^2 & 2(q_x q_y - q_z q_w) & 2(q_x q_z + q_y q_w) \\
2(q_x q_y + q_z q_w) & q_w^2 - q_x^2 + q_y^2 - q_z^2 & 2(q_y q_z - q_x q_w) \\
2(q_x q_z - q_y q_w) & 2(q_y q_z + q_x q_w) & q_w^2 - q_x^2 - q_y^2 + q_z^2
\end{bmatrix}  (2-7)

Formulation of the quaternion-based orientation matrix (2-7) using known Euler angles, and the inverse operation of obtaining the respective Euler angles from a known quaternion orientation matrix, can both be performed conveniently (Titterton and Weston, 1997; Shoemake, 1985). For instance, if the orientation between two reference frames is given in Euler angles, it can be converted to a quaternion using Eqns (2-4) and (2-8). Considering the Euler rotation order R_z R_y R_x, the quaternion can be given as:

q_w = \cos(\phi/2)\cos(\theta/2)\cos(\psi/2) + \sin(\phi/2)\sin(\theta/2)\sin(\psi/2)
q_x = \sin(\phi/2)\cos(\theta/2)\cos(\psi/2) - \cos(\phi/2)\sin(\theta/2)\sin(\psi/2)
q_y = \cos(\phi/2)\sin(\theta/2)\cos(\psi/2) + \sin(\phi/2)\cos(\theta/2)\sin(\psi/2)
q_z = \cos(\phi/2)\cos(\theta/2)\sin(\psi/2) - \sin(\phi/2)\sin(\theta/2)\cos(\psi/2)  (2-8)

where \phi, \theta and \psi are respectively the roll, pitch and yaw Euler angles. Similarly, it is possible to convert a given quaternion to Euler angles; the procedure followed here is to first form the transformation matrix given by Eqn (2-4) and then extract the Euler angles from it. In this work the quaternion representation is used extensively for various applications, such as interpolating between two rotations and estimating the orientation from a sequence of images (Taylor and Kriegman, 1995). The main reasons for this choice are the complexities introduced by Euler angles in interpolation and the ability of the
quaternion method to avoid gimbal lock, a phenomenon which arises from the misalignment of gimbals.

2.3 Coordinate Transformations

When designing a navigation system, dealing with different reference frames is unavoidable. This is due to the fact that the measurements are obtained in one reference frame, the b-frame, while the navigation solution must be given in another reference frame, the n-frame or the e-frame. Therefore, it is essential that one be able to transform measurements or data from one reference frame to any other reference frame as required. In this section the transformations between the important reference frames are considered.

2.3.1 Transformation from i-frame to e-frame

Since the origins of the i-frame and the e-frame are both at the Earth's center of mass, this particular transformation is relatively simple. The difference between the two reference frames is a simple rotation about the 3rd axis that compensates for the Earth's rotation. This can be written as:

C_i^e = \begin{bmatrix} \cos(\omega_e t) & \sin(\omega_e t) & 0 \\ -\sin(\omega_e t) & \cos(\omega_e t) & 0 \\ 0 & 0 & 1 \end{bmatrix}  (2-9)

where \omega_e denotes the rotation rate of the Earth. It is clear that the angular velocity vector between the two reference frames can be given as:
\omega_{ie}^e = \begin{bmatrix} 0 & 0 & \omega_e \end{bmatrix}^T  (2-10)

In Eqn (2-10), \omega_{ie}^e denotes the angular velocity vector of the e-frame with respect to the i-frame, coordinatized in the e-frame.

2.3.2 Transformation from n-frame to e-frame

The e-frame and the n-frame have two different origins, i.e. they are not concentric. For this reason the transformation between these two frames is relatively more complex than the transformation between the i-frame and the e-frame. Since both reference frames are right-handed and the selected n-frame is given in NED directions, this transformation can be derived using Euler angles, first by rotating the coordinate system about the 2nd axis by an angle of (\phi + \pi/2) and then rotating it about the 3rd axis by (-\lambda). The overall transformation matrix can be given as:

C_n^e = \begin{bmatrix} -\sin\phi\cos\lambda & -\sin\lambda & -\cos\phi\cos\lambda \\ -\sin\phi\sin\lambda & \cos\lambda & -\cos\phi\sin\lambda \\ \cos\phi & 0 & -\sin\phi \end{bmatrix}  (2-11)

where \phi and \lambda represent the geodetic latitude and the longitude respectively. The angular rate vector is given [Jekeli, 2000] as:

\omega_{en}^n = \begin{bmatrix} \dot\lambda\cos\phi & -\dot\phi & -\dot\lambda\sin\phi \end{bmatrix}^T  (2-12)
where \dot\phi and \dot\lambda denote the first time derivatives of the latitude and the longitude. One can also find the angular rotation rate of the n-frame with respect to the i-frame by simple vector manipulation of the known vectors:

\omega_{in}^n = \omega_{ie}^n + \omega_{en}^n = \begin{bmatrix} (\dot\lambda + \omega_e)\cos\phi & -\dot\phi & -(\dot\lambda + \omega_e)\sin\phi \end{bmatrix}^T  (2-13)

2.3.3 Transformation from b-frame to n-frame

The transformation between the b-frame and the n-frame can also be obtained as a sequence of rotations in Euler angles. In [Titterton, 1997; Jekeli, 2000] this transformation is given as:

C_b^n = R_3(-\psi)\,R_2(-\theta)\,R_1(-\phi)  (2-14)

where \phi, \theta and \psi are respectively the roll, pitch and heading of the vehicle measured by the IMU. Although it is relatively easy to obtain the transformation between the b-frame and the n-frame, in practice it must be noted that the result depends on the mechanization of the configuration. In matrix form:

C_b^n = \begin{bmatrix}
c(\theta)c(\psi) & -c(\phi)s(\psi) + s(\phi)s(\theta)c(\psi) & s(\phi)s(\psi) + c(\phi)s(\theta)c(\psi) \\
c(\theta)s(\psi) & c(\phi)c(\psi) + s(\phi)s(\theta)s(\psi) & -s(\phi)c(\psi) + c(\phi)s(\theta)s(\psi) \\
-s(\theta) & s(\phi)c(\theta) & c(\phi)c(\theta)
\end{bmatrix}  (2-15)

In Eqn (2-15), c(\cdot) and s(\cdot) represent cosine and sine respectively.
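The two direction cosine matrices used most often in the following chapters, C_n^e of Eqn (2-11) and C_b^n of Eqn (2-15), can be sketched directly from the reconstructed matrices above. This is a minimal pure-Python sketch; angles are in radians and the sign conventions follow the equations as given here.

```python
import math

def c_n_to_e(lat, lon):
    # Eqn (2-11): DCM from the NED n-frame to the e-frame, for
    # geodetic latitude `lat` and longitude `lon`
    sp, cp = math.sin(lat), math.cos(lat)
    sl, cl = math.sin(lon), math.cos(lon)
    return [[-sp * cl, -sl, -cp * cl],
            [-sp * sl,  cl, -cp * sl],
            [ cp,      0.0, -sp     ]]

def c_b_to_n(roll, pitch, heading):
    # Eqn (2-15): DCM from the b-frame to the n-frame
    cph, sph = math.cos(roll), math.sin(roll)
    cth, sth = math.cos(pitch), math.sin(pitch)
    cps, sps = math.cos(heading), math.sin(heading)
    return [[cth * cps, -cph * sps + sph * sth * cps,  sph * sps + cph * sth * cps],
            [cth * sps,  cph * cps + sph * sth * sps, -sph * cps + cph * sth * sps],
            [-sth,       sph * cth,                    cph * cth]]
```

As a sanity check, c_b_to_n(0, 0, 0) is the identity, and at zero latitude and longitude the n-frame Down axis maps to the negative of the e-frame x-axis, i.e. toward the Earth's center.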
2.4 Introduction to the Inertial Measurement Unit

Navigation can be defined as the estimation of the current position of an object with respect to a pre-specified final destination. There are many forms of navigation available [Grewal 2001]. Of these, one of the most commonly used methods is dead reckoning, which consists of knowing the initial position and orientation (pose) of the vehicle and obtaining the current position by estimating the heading and acceleration information. One of the key instruments used in dead reckoning is the Inertial Measurement Unit (IMU), which contains gyroscopes and accelerometers. The gyroscopes measure rotation while the accelerometers measure the acceleration of the object, thereby yielding the velocity and the displacement. Typically, an IMU contains three accelerometers and three gyroscopes fixed in three perpendicular directions, allowing measurements to be made in six degrees of freedom. The first use of gyroscopes for navigation dates to 1911, when they were used in iron ships for automatic steering, and they have evolved rapidly since then. The first IMU was used as a navigational sensor in the 1950s. Since then, inertial measurement units have been used in almost all navigation systems due to their high sensitivity and accuracy under certain conditions. There are two common types of IMU used in navigation systems:

1) Strapdown IMU (Fig. 14 a)
2) Gimbaled IMU (Fig. 14 b)
Figure 14: (a) Strapdown IMU and (b) Gimbaled IMU (Grewal 2001).

2.4.1 Gimbaled IMU

As shown in Fig. 14(b), a gimbaled system consists of three gimbals pivoted to each other so that the input axes of the three rings form the three perpendicular directions in which one wishes to measure rotations. When the object to which the gimbaled IMU is attached is subjected to rotations about any axis, the three gimbals isolate these rotations, keeping an element placed between the gimbals stable at all times. This isolation from rotation is achieved not only by the gimbals but also by the inertia of the stable element. Ideally these two alone can isolate the rotation, but in practice friction in the bearings can cause a drift. Therefore, a servo motor is used in many cases to isolate the rotations of the body from the stable element. The gyroscopes measure the angular rate, i.e. their output is small angles, and each angle is converted to a voltage by an electric pickoff device. This
voltage generates a current that drives a servo motor, which brings the stable element back to the same frame, isolating the rotations. Due to this arrangement of the stable element, if one places the accelerometers on the stable platform between the gimbals, the accelerometers are always oriented in the same coordinate frame in which they were oriented at the start of the navigation process.

2.4.2 Strapdown IMU

As the name suggests, in the strapdown case the IMU is rigidly fixed to the platform, so the IMU undergoes all the translations and rotations that the platform experiences during navigation. Since the accelerometers and gyroscopes are subjected to all the dynamics the vehicle experiences, their performance tends to degrade compared to the gimbaled counterpart. The typical errors that cause this degradation are errors in gyroscopes due to cross-coupling of angular rates and angular accelerations, errors in accelerometers due to the lever-arm effect arising from vehicle rotations, and errors due to integrating gyroscope and accelerometer readings in a rotating frame; since these readings should be integrated in a non-rotating frame, coning and sculling errors arise in the gyroscopes and accelerometers respectively. Therefore, a navigational computer is crucial in a strapdown IMU in order to correct for these errors and provide accurate angular rates and accelerations. Additionally, the navigational computer is used to transform the angular rates and accelerations measured in body coordinates to the navigation frame.
Since strapdown IMU measurements are made in the body coordinate frame, the highest accuracy that could be achieved by early strapdown systems was worse than that of the gimbaled IMU. As the technology behind gyroscopes, such as ring laser gyros and fiber optic gyros, and accelerometers, such as silicon accelerometers, advanced, strapdown IMU measurements approached gimbaled IMU measurements in accuracy. Moreover, its robustness with no moving parts, compactness, light weight, low power consumption, low maintenance and low cost make the strapdown IMU a better choice than the gimbaled IMU in most navigational applications. As the technology developed, the performance and accuracy of the strapdown IMU were also enhanced. There are four major grades of strapdown IMU available according to the intended usage; they are described in Table 2.

Table 2: Comparison of different grades of IMU (Shin 2005, Grewal 2001).
2.4.3 Fiber Optic Gyroscopes

Fiber optic gyroscopes (FOG) are based on kinematics rather than dynamics in measuring angular rates. Instead of the spinning mass of mechanical gyroscopes, in a FOG light acts as the sensing element. A FOG is a closed-path optical fiber combined with a coupler, a polarizer, a photo-detector and some digital signal processing equipment used to estimate the phase shift between two light beams after they travel in opposite directions through the loop (Fig. 15). When a broad spectrum of light is passed through the coupler, the coupler directs the light from one fiber to another optical fiber. The light beam then enters the polarizer, which passes only light of the proper polarization to the optical fiber; this is useful to assure totally reciprocal optical path lengths for the counter-propagating light beams. The light is then split into two waves that propagate in opposite directions. Once the two beams exit the loop, they are combined and sent to the photo-detector. The photo-detector converts the output light intensity into an electric signal which can be related to the phase difference of the two light beams. The phase difference can be related to the angular rate by:

\Delta\phi = \frac{4\pi A}{\lambda c}\,\Omega  (2-16)

where \Delta\phi is the phase difference of the light beams, A is the area of the loop, \lambda is the wavelength of the light, c is the speed of light and \Omega is the angular rate of the loop. It is clear from Eqn (2-16) that the rotational sensitivity of the FOG can be increased by increasing the length of the fiber optic cable. Eqn (2-16), also called the rotational sensitivity of the FOG, is due to a phenomenon
called the Sagnac effect, on which not only the FOG but all optical gyroscopes are based. The Sagnac effect is the lengthening (or shortening) of the distance that a light beam has to propagate around a closed path to reach the emitter in a frame that rotates with respect to the inertial frame. Since the phase difference between the clockwise and counter-clockwise waves is measured, these types of FOG are also called interferometric FOGs (IFOG).

Figure 15: Schematics of IFOG.

By definition the FOG has a single degree of freedom; to measure the angular rates in three perpendicular directions, the IMU needs three separate FOGs placed along the three axes.
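The sensitivity relation of Eqn (2-16) can be sketched numerically. The scale factor below follows the equation as reconstructed above, and the 850 nm source wavelength is an illustrative assumption, not a value taken from the MPSV hardware.

```python
import math

def sagnac_phase(area_m2, omega_rad_s, wavelength_m=850e-9, c_m_s=299792458.0):
    # Eqn (2-16): interferometric phase shift between the two
    # counter-propagating beams for a loop of enclosed area `area_m2`
    # rotating at `omega_rad_s`; 850 nm wavelength is an assumed value
    return 4.0 * math.pi * area_m2 * omega_rad_s / (wavelength_m * c_m_s)
```

Doubling the enclosed loop area (e.g. by adding fiber turns) doubles the phase shift, which is the sensitivity argument made in the text.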
2.4.4 Accelerometers

An accelerometer is an instrument used to measure specific force, and hence it is one of the prominent sensors in terrestrial navigation. The specific force can be defined as the acceleration caused by the real forces applied on a unit mass. The gravitational field exerts no real force on the system; therefore, an accelerometer cannot sense gravitational acceleration even though the system is accelerating. So an accelerometer can be defined as a device that senses the specific force on a known mass for the purpose of measuring acceleration. Since the main task of the accelerometer is to measure the specific force on the proof mass, different techniques can be used to measure it, including Micro Electro-Mechanical Systems (MEMS) accelerometers; this latter type is used in the FDOT MPSV. The construction of accelerometers is beyond the scope of this dissertation; the interested reader can find a more elaborate discussion of different types of accelerometers and the dynamics behind them in (Jekeli, 1999).

2.5 Estimating Navigation Equations

The inertial measurement unit in the MPSV is a strapdown IMU with three single-degree-of-freedom (SDF) silicon MEMS accelerometers and three fiber optic gyroscopes aligned along three mutually perpendicular axes. When the vehicle is in motion, the accelerometers measure the specific force and the gyroscopes measure the rate of change of angle of the vehicle. Therefore, it is clear that in order to navigate the MPSV from a known initial position, i.e. to dead reckon, one has to integrate the outputs of the accelerometers and gyroscopes. This means that one has to solve a second order differential equation for the
accelerometers and a first order differential equation for the gyroscopes, both with respect to time, in order to obtain the necessary position and attitude parameters.

Although the final navigation solution for the velocity is given in the n-frame, and for the position in the e-frame, the measurements are made in the b-frame. Hence, one needs to find the transformation between the n-frame and the b-frame at each particular instant. In other words, the measurements from the accelerometers and the gyroscopes are in the body coordinate frame but the final position and orientation have to be given in ECEF. The necessary transformation can be separated into two steps:

1) Transformation between the b-frame and the n-frame.
2) Transformation between the n-frame and the ECEF frame.

Of these, the latter is easily obtained using Eqn (2-11), but estimating the transformation between the b-frame and the n-frame is not straightforward. Therefore, one of the important tasks is to find the transformation between the body frame and the navigation frame, which varies according to the vehicle movement. In order to obtain the transformation matrix between the b-frame and the n-frame, one has to set up a differential equation in the Euler angles measured by the gyroscopes. But due to the inherent problems of Euler angles, such as singularities as the pitch angle approaches \pm\pi/2 and the complexity introduced by the trigonometric functions, quaternions are preferred in deriving this differential equation. Therefore, in this work quaternions were used in deriving this transformation, as illustrated below.
The time propagation equation in terms of quaternions can be given as:

\dot{q} = \tfrac{1}{2}\,A\,q  (2-17)

where q is defined as in Eqn (2-6) and the skew-symmetric matrix A is given by:

A = \begin{bmatrix}
0 & -\omega_1 & -\omega_2 & -\omega_3 \\
\omega_1 & 0 & \omega_3 & -\omega_2 \\
\omega_2 & -\omega_3 & 0 & \omega_1 \\
\omega_3 & \omega_2 & -\omega_1 & 0
\end{bmatrix}  (2-18)

where \omega_{nb}^b = [\omega_1\ \omega_2\ \omega_3]^T is the angular rate of the b-frame with respect to the n-frame given in the b-frame, and \omega_j represents the angular rate about the j-th axis (j = 1, 2, 3). This \omega_{nb}^b can be estimated at each instant by (vector) addition of the gyroscope readings, Eqn (2-19), and the rate of change of the n-frame with respect to the i-frame. The angular rates measured by the gyroscopes in the IMU are the rates of change of the angles of the b-frame with respect to the i-frame, i.e. \omega_{ib}^b. Hence the transformation between the two frames can be given (Jekeli 1999) as:

\omega_{nb}^b = \omega_{ib}^b - C_n^b\,\omega_{in}^n  (2-19)

The measurements from the IMU gyroscopes are \omega_{ib}^b, while \omega_{in}^n, the angular rate of the n-frame, can be estimated by the transformation described in Eqn (2-13) if the positions are given in geodetic coordinates. Once \omega_{nb}^b is found using Eqn (2-19), one can solve the differential equation given in Eqn (2-17) to estimate the quaternion. This
quaternion can be used to estimate the transformation matrix between the n-frame and the b-frame from Eqn (2-7) in the usual matrix notation.

The numerical integration used in estimating C_b^n must take into consideration the errors that can originate from integrating angular rates in a rotating frame; this is the coning error of the gyroscopes. The error can be reduced by shortening the time interval of the numerical integration, although doing so adds complexity to the computation. Therefore, a suitable time interval has to be chosen according to the level of accuracy needed for the particular application.

On the other hand, the accelerometers in the IMU measure the specific force, which can be given as:

\ddot{x}^i = a^i + g^i(x^i)  (2-20)

where a^i is the specific force measured by the accelerometers in the inertial frame and g^i(x^i) is the acceleration due to the gravitational field, which is a function of the position x^i. From Eqn (2-17) and Eqn (2-20) one can deduce the navigation equations of the vehicle in any frame by taking into consideration the different transformations between frames introduced in Section 2.3. In this work all navigation solutions are given along the directions of the n-frame. To clarify, what is desired in terrestrial navigation is the final position, velocity and orientation provided in the local North-East-Down frame, although the measurements are made in another local frame, the b-frame. Integrating directly in the n-frame is not possible, since the n-frame also moves with the vehicle, making the vehicle stationary in
the horizontal direction in this local coordinate frame. Therefore, the desired coordinate frame is the fixed ECEF frame, where the integration can be performed. To provide the output along the local n-frame, the ECEF frame can be coordinatized along the NED (n-frame) directions. Hereafter in this dissertation any parameter with the superscript n denotes the value of that parameter in the ECEF frame but coordinatized along the n-frame.

Since both frames considered here are non-inertial frames, i.e. frames that are rotating and accelerating, one has to take into consideration the Coriolis force that affects the measurements. The Coriolis acceleration is the change of acceleration due to the rotation of the measuring coordinate frame with respect to the inertial frame. Additionally, since any object on the surface of the Earth is subjected to gravitational forces from the Earth itself as well as the other bodies in the solar system, one has to consider the gravitational force exerted on the vehicle when deriving the system equations. Once the effects of these forces are considered, the navigation equations take the following form:

\frac{d}{dt}\,v^n = a^n - (\omega_{in}^n + \omega_{ie}^n)\times v^n + g^n  (2-21a)

\frac{d}{dt}\,x^n = v^n  (2-21b)

where Eqn (2-21a) gives the acceleration of the vehicle in terms of the velocity and Eqn (2-21b) gives the velocity in terms of the position. The second and third terms of Eqn (2-21a) are respectively the Coriolis acceleration and the gravitational acceleration acting on the vehicle.
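The attitude propagation of Eqns (2-17)-(2-18) can be sketched in pure Python. The explicit Euler step with renormalization used below is an assumed, simplified integration scheme chosen for illustration; as the text notes, the actual scheme should be chosen according to the accuracy required.

```python
import math

def quat_rate_matrix(w):
    # Eqn (2-18): 4x4 skew-symmetric matrix A built from
    # w = (w1, w2, w3), the b-frame angular rate w.r.t. the n-frame
    w1, w2, w3 = w
    return [[0.0, -w1, -w2, -w3],
            [w1,  0.0,  w3, -w2],
            [w2, -w3,  0.0,  w1],
            [w3,  w2, -w1,  0.0]]

def propagate_quaternion(q, w, dt):
    # One explicit-Euler step of Eqn (2-17), q_dot = 0.5 * A * q,
    # renormalized so q stays a unit quaternion
    A = quat_rate_matrix(w)
    qdot = [0.5 * sum(A[i][j] * q[j] for j in range(4)) for i in range(4)]
    qn = [q[i] + dt * qdot[i] for i in range(4)]
    norm = math.sqrt(sum(c * c for c in qn))
    return [c / norm for c in qn]

def quat_to_dcm(q):
    # Eqn (2-7): orientation matrix recovered from a unit quaternion
    qw, qx, qy, qz = q
    return [[qw*qw + qx*qx - qy*qy - qz*qz, 2*(qx*qy - qz*qw), 2*(qx*qz + qy*qw)],
            [2*(qx*qy + qz*qw), qw*qw - qx*qx + qy*qy - qz*qz, 2*(qy*qz - qx*qw)],
            [2*(qx*qz - qy*qw), 2*(qy*qz + qx*qw), qw*qw - qx*qx - qy*qy + qz*qz]]
```

For a constant rate about one axis the analytic solution is q(t) = [cos(ωt/2), sin(ωt/2), 0, 0], which the small-step integration reproduces closely; the recovered matrix from Eqn (2-7) stays orthonormal because the quaternion is renormalized every step.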
The vector multiplication by the angular rate is denoted by (\omega\times). This represents the cross product with the measured angular rates and is also called an axial vector [Jekeli, 1999], having the form:

\omega\times v = (\omega\times)\,v  (2-22a)

(\omega\times) = \begin{bmatrix} 0 & -\omega_3 & \omega_2 \\ \omega_3 & 0 & -\omega_1 \\ -\omega_2 & \omega_1 & 0 \end{bmatrix}  (2-22b)

Although Eqns (2-21) describe the complete vehicle dynamics, this navigation solution provides only the position of the vehicle, not its orientation. For the automatic navigation of land vehicles, however, the orientation of the vehicle is also vital in estimating the true location of the vehicle. The orientation propagation can be given as (Jekeli 1999, Titterton 1997):

\frac{d}{dt}\,C_b^n = C_b^n\,(\omega_{nb}^b\times)  (2-23)

In Eqn (2-23) the skew-symmetric matrix (\omega_{nb}^b\times) is formed from \omega_{nb}^b of Eqn (2-19) using Eqn (2-22). Therefore, once the gyroscope and accelerometer measurements are obtained, one can set up the complete set of navigation equations using Eqns (2-21)-(2-23). From the navigation equations given in Eqn (2-21) one can obtain the position of the vehicle and the traveling velocity by integrating the differential equations with respect to time. As in the case of the integration of the gyroscopic readings, the complexity of the integration algorithm used in obtaining the velocities and distances depends on the needed accuracy. In general, both the second and third terms of
Eqn (2-21a) are smaller and less variable relative to the acceleration term. This is in fact true for most terrestrial navigation tasks because of the relatively slow rate of rotation of the n-frame with respect to the i-frame. Therefore, in obtaining the velocity one can use a higher order algorithm only for the acceleration term, while the Coriolis and gravitational terms can be integrated using a first order algorithm. This can be expressed as:

\Delta v^n = \int_t C_b^n a^b\,dt + \left[ g^n - (\omega_{in}^n + \omega_{ie}^n)\times v^n \right]\Delta t  (2-24)

The integration of the first term in Eqn (2-24) can be performed as:

\int_t C_b^n a^b\,dt = C_b^n \begin{bmatrix} 1 & -0.5\,\theta_3 & 0.5\,\theta_2 \\ 0.5\,\theta_3 & 1 & -0.5\,\theta_1 \\ -0.5\,\theta_2 & 0.5\,\theta_1 & 1 \end{bmatrix} a^b\,\Delta t  (2-25)

where \theta = [\theta_1\ \theta_2\ \theta_3]^T is the angular increment corresponding to \omega_{nb}^b, which has been estimated earlier. The gravitational acceleration can be estimated using the definition of the geoid given in the WGS 1984 definition [Shin 2005]. Then the velocity of the vehicle at time step (k+1) can be given as:

v_{(k+1)}^n = v_{(k)}^n + \Delta v^n  (2-26)

Once the velocities are obtained from Eqn (2-26), the positions can be deduced with ease. But one important factor that has to be taken into consideration when obtaining the
positions is that they have to be in geodetic coordinates, i.e. in terms of latitude, longitude and height. It is essential to express the positions in geodetic coordinates not only because positions provided in latitude, longitude and height are more appropriate in terrestrial navigation, but also for estimating the transformation between the ECEF frame and the n-frame that was utilized in Section 2.3. The positions can be obtained by integrating Eqn (2-26) and can be converted to the geodetic coordinate frame as:

\phi_{(k+1)} = \phi_{(k)} + \frac{v_N\,\Delta t}{M_{(k)} + h_{(k)}}  (2-27)

\lambda_{(k+1)} = \lambda_{(k)} + \frac{v_E\,\Delta t}{\left(N_{(k)} + h_{(k)}\right)\cos(\phi_{(k)})}  (2-28)

h_{(k+1)} = h_{(k)} - v_D\,\Delta t  (2-29)

where v_N, v_E and v_D are the velocities estimated in Eqn (2-26) coordinatized along the local North, East and Down directions, \phi is the latitude, \lambda is the longitude and h is the height. M and N are respectively the radius of curvature of the Earth along the meridian and the radius of curvature of the Earth along the prime vertical at the location of the vehicle. They are given as follows:

N = \frac{p}{\left(1 - e^2\sin^2\phi\right)^{1/2}}  (2-30)

M = \frac{p\,(1 - e^2)}{\left(1 - e^2\sin^2\phi\right)^{3/2}}  (2-31)

where p is the semi-major axis of the Earth and e is the first eccentricity of the ellipsoid.
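One step of the geodetic position update in Eqns (2-27)-(2-31) can be sketched as follows. The semi-major axis and first eccentricity values below are the commonly quoted WGS 84 parameters and are assumed here for illustration.

```python
import math

# assumed WGS 84 values for the semi-major axis p and first eccentricity e
P_AXIS = 6378137.0
ECC1 = 0.0818191908426

def radii_of_curvature(lat):
    # Eqns (2-30)-(2-31): prime-vertical (N) and meridian (M)
    # radii of curvature at geodetic latitude `lat` (radians)
    t = 1.0 - (ECC1 * math.sin(lat)) ** 2
    N = P_AXIS / math.sqrt(t)
    M = P_AXIS * (1.0 - ECC1 ** 2) / t ** 1.5
    return M, N

def update_position(lat, lon, h, v_ned, dt):
    # Eqns (2-27)-(2-29): advance latitude, longitude and height
    # by one time step, given NED velocities from Eqn (2-26)
    vN, vE, vD = v_ned
    M, N = radii_of_curvature(lat)
    return (lat + vN * dt / (M + h),
            lon + vE * dt / ((N + h) * math.cos(lat)),
            h - vD * dt)
```

At the equator N equals the semi-major axis while M is smaller by the factor (1 - e^2), and a positive Down velocity reduces the height, as Eqn (2-29) requires.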
2.6 IMU Error Model

Since the navigation solution is derived from measurements obtained from gyroscopes and accelerometers, which are complex sensor units with different characteristics, errors caused by these sensors, such as measurement, manufacturing and bias errors, will propagate into the navigation solution and render it erroneous. Therefore, it is important to develop proper dynamics and appropriate error models of the system error characteristics to minimize the effects of these errors on the navigation solution. The error could contain higher order terms, but in this work only the first order error terms are considered, implying that the higher order terms contribute only a small portion of the error compared to the first order terms. In addition, by selecting only the first order terms, the error dynamics of the navigation solution become linear with respect to the errors (Jekeli 1999, Titterton 1997).

The error dynamics used in this work were derived by perturbing the navigation solution by small differentials and then retaining only the first order terms of the perturbed navigation solution. As in the derivation of the navigation solutions (Section 2.5), the perturbation equations can be given in any coordinate frame; here the perturbed equations are given in the e-frame and resolved in the n-frame. Therefore, by perturbing Eqns (2-21) one can obtain the linear error dynamics of the IMU in the ECEF frame coordinatized along the n-frame. These error difference equations take the following form (Jekeli 1999, Titterton 1997):

\delta\dot{x}^n = -\,\omega_{en}^n\times\delta x^n + \delta v^n  (2-32)
where \delta denotes the small perturbation applied to the position differential equation, Eqn (2-21b); the position error can equivalently be expressed through a rotation vector. The vector multiplication denoted by \times is the same as that given in Eqn (2-22b). Similarly, perturbing Eqn (2-21a) yields the following first order error dynamic equation:

\delta\dot{v}^n = C_b^n\,\delta a^b - \psi\times C_b^n a^b - (\omega_{ie}^n + \omega_{in}^n)\times\delta v^n - \delta(\omega_{ie}^n + \omega_{in}^n)\times v^n + \delta g^n  (2-33)

where \psi denotes the rotation vector for the error in the transformation between the n-frame and the b-frame. The first two terms on the right hand side of Eqn (2-33) are respectively due to the errors in the specific force measurement and the errors in the transformation between the two frames, i.e. errors in the gyroscope measurements. In other words, when perturbing the navigation equations, the basic procedure is to add small differentials to the quantities that are obtained or derived from the accelerometer and gyroscope measurements. When Eqn (2-23) is perturbed, the following equation is obtained:

\dot{\psi} = -\,\omega_{in}^n\times\psi + \delta\omega_{in}^n - C_b^n\,\delta\omega_{ib}^b  (2-34)

Eqns (2-32)-(2-34) are linear with respect to the errors of the navigation equations. Therefore, they can be used in a linear Kalman filter to statistically optimize the error propagation. This will be discussed further in Chapter 6.
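The linearity of the error dynamics can be illustrated by stepping the position-error equation, Eqn (2-32), forward in time with the axial-vector matrix of Eqn (2-22b). The explicit Euler step is an assumed, simplified scheme; the sign conventions follow the equations as reconstructed in this chapter.

```python
def skew(w):
    # Eqn (2-22b): axial-vector (skew-symmetric) matrix of w = (w1, w2, w3)
    w1, w2, w3 = w
    return [[0.0, -w3,  w2],
            [w3,  0.0, -w1],
            [-w2,  w1, 0.0]]

def position_error_step(dx, dv, w_en, dt):
    # One explicit-Euler step of Eqn (2-32):
    # d(dx)/dt = -w_en x dx + dv
    S = skew(w_en)
    dxdot = [-(S[i][0] * dx[0] + S[i][1] * dx[1] + S[i][2] * dx[2]) + dv[i]
             for i in range(3)]
    return [dx[i] + dt * dxdot[i] for i in range(3)]
```

With a zero transport rate the position error simply accumulates the velocity error, which is the behavior a linear Kalman filter exploits when these equations are stacked into a state-transition model.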
CHAPTER 3
ESTIMATING THE ROADWAY GEOMETRY FROM INERTIAL MEASUREMENTS

Position and orientation location systems equipped with Inertial Navigation Systems (INS) coupled with a Differential Global Positioning System (DGPS) have become quite popular in a variety of applications, including highway evaluation operations. DGPS navigation on its own can accurately identify the location of a moving vehicle using two antennas that receive signals transmitted by four satellites orbiting the Earth. On the other hand, a typical INS contains an Inertial Measurement Unit (IMU) with three gyroscopes and three linear, single-degree-of-freedom accelerometers installed along three orthogonal axes to measure, respectively, the 3-D orientation and the acceleration components. It also includes a navigational computer that facilitates the navigation solution through its interaction with the DGPS unit. INS/DGPS integrated systems are widely used in navigation due to the advantages they offer when used as an integrated system as opposed to separate units (Wei 1990, Scherzinger 2000).

The MPSV is also equipped with an INS/DGPS coupled system that can measure the vehicle's spatial location and orientation at any given instant during the evaluation operation. To accomplish this task the INS/DGPS system uses state-of-the-art fiber optic gyroscopes and silicon accelerometers in its IMU. The integration of the INS and the DGPS is achieved using a statistical algorithm that minimizes the error involved in the
position prediction. This algorithm and many other relevant programs are executed on the POS Computer System (PCS) that controls the above system.

3.1 Scope of Investigation

The roadway geometric data typically obtained by the INS/DGPS coupled system are (1) cross-slopes, (2) grades of vertical curves, (3) radii of curvature of horizontal curves, and (4) GPS data. The high sensitivity of the instruments in the IMU and the relatively high frequency of data gathering (i.e. 200 Hz), combined with even small-scale variability caused by the roadway, the operator or the instruments themselves, can introduce significant errors into the INS/DGPS readings. This chapter reports the results of an experimental program executed to compare the INS/DGPS geometric data with the corresponding manual measurements.

3.2 Evaluation of Cross-slope

The function of the cross-slope of a roadway is to accelerate the draining of rain water and reduce the hydroplaning potential. The typical cross-slope requirement of a roadway is 2% in order to avoid ponding of water and facilitate runoff from the roadway. In addition, roads are sloped in the lateral direction to control the lateral wandering of vehicles traveling on a bend by meeting the centrifugal force requirement of the vehicle. This construction feature is commonly known as the super-elevation at the bend; it differs from the roadway cross-slope in that it is provided only in the direction of the curve, in all of the lanes. For a curve, the super-elevation has to be estimated based on the speed limit of the roadway, the radius of the curve, and the friction characteristics (Roadway design manual, MnDOT):
e + f = \frac{v^2}{127\,R}  (3-1)

where e is the super-elevation of the roadway in (m/m), f is the side friction factor, v is the travel velocity in (km/h) and R is the radius of curvature of the curve in (m).

When measuring the cross-slope, the standard procedure is to measure the height at two different locations on the lane, typically the edge near the shoulder of the roadway and the crown. Dividing this height difference by the lane width gives the cross-slope of the lane (Figure 16):

\text{Cross-slope} = \frac{H}{L} = \frac{y_2 - y_1}{L}  (3-2)

In the FDOT pavement evaluation vehicle, the instruments used in measuring the cross-slope include four laser sensors attached to the front bumper and the Inertial Measurement Unit (IMU). The four laser sensors provide the distance from the front bumper of the vehicle to the roadway surface. The best-fit straight line through the heights measured by the laser sensors provides the road slope with respect to the vehicle bumper. In addition, the tilt of the vehicle body and the bumper in the lateral direction can be obtained from the IMU, if the IMU is considered to be firmly fixed to the floorboard of the van. Then, by considering the difference between the above two readings, (1) the tilt of the vehicle floor and (2) the slope of the road visualized by the lasers, one can approximately determine the cross-slope of the roadway. Figure 16 elaborates the above measurements.
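Eqns (3-1) and (3-2) are simple enough to sketch directly. The example values in the usage note below are illustrative, not measurements from the study.

```python
def required_superelevation(v_kmh, radius_m, f_side):
    # Eqn (3-1): e + f = v^2 / (127 R), solved for the super-elevation e
    return v_kmh ** 2 / (127.0 * radius_m) - f_side

def cross_slope(y1, y2, lane_width):
    # Eqn (3-2): height difference across the lane divided by lane width
    return (y2 - y1) / lane_width
```

For example, at 100 km/h on a 500 m curve with a side friction factor of 0.10, Eqn (3-1) gives a super-elevation of about 0.057 m/m, and a 72 mm height difference over a 3.6 m lane gives the 2% cross-slope cited in the text.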
Figure 16: Estimation of cross slope from laser measurement height and IMU roll angle.

Where,
- the slope measured by the laser sensors with respect to AB, the slope of the best-fit line AB, and the roll angle measured by the IMU are the angles marked in the figure;
- AB: best-fit line for the laser sensor readings;
- A′B′: line parallel to AB drawn at B′;
- RL: laser sensor located at the right corner of the bumper;
- ML: laser sensor located in the middle of the bumper;
- LL: laser sensor located at the left corner of the bumper;
- LL′: laser sensor located in between LL and ML;
- l: width of the vehicle bumper (in inches);
- L: width of the lane (in feet);
- (xi, yi): coordinates with respect to the bumper.
y2, y1: measured heights at the crown and the shoulder of the lane respectively, expressed in the horizontal-vertical (x, y) coordinate system.

The approximate cross slope of the roadway is given by: (3-3)

In this study, alternative methods were also explored in determining the cross-slope of the roadway from the available test data. The basis for the second method of estimating cross-slope was the Least Squares Approximation (LSA) of the laser sensor readings using a parabolic fit. All of the laser measurements obtained from the vehicle are specified with respect to the coordinate system originating from the right corner of the vehicle bumper (Fig 16). Therefore, Eqn (3-4) is used to convert the laser coordinates from the vehicle bumper frame to the vertical and horizontal (x, y) coordinate system shown in Figure 16. The transformation matrix for the coordinate transformation can be expressed as:

$$t = \begin{bmatrix} \cos(\theta) & -\sin(\theta) \\ \sin(\theta) & \cos(\theta) \end{bmatrix} \qquad (3\text{-}4)$$

and the (x, y) coordinates are obtained as:

$$\begin{bmatrix} x \\ y \end{bmatrix} = t \begin{bmatrix} x_i \\ y_i \end{bmatrix} \qquad (3\text{-}5)$$
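The bumper-to-horizontal conversion of Eqns (3-4) and (3-5) can be sketched as below. The laser readings and the sign convention of the roll angle are assumptions for illustration only.

```python
import numpy as np

def bumper_to_horizontal(xi, yi, roll_rad):
    """Rotate bumper-frame laser coordinates (xi, yi) by the IMU roll angle
    into the horizontal-vertical (x, y) frame, per Eqns (3-4) and (3-5).
    The sign convention of the roll angle is an assumption here."""
    t = np.array([[np.cos(roll_rad), -np.sin(roll_rad)],
                  [np.sin(roll_rad),  np.cos(roll_rad)]])
    return t @ np.vstack([xi, yi])

# Four illustrative laser readings along a 72 in bumper (not FDOT data)
xi = np.array([0.0, 24.0, 48.0, 72.0])       # positions along the bumper
yi = np.array([-12.0, -12.1, -12.2, -12.3])  # distances to the road surface
x, y = bumper_to_horizontal(xi, yi, np.radians(2.0))
```

With a zero roll angle the transformation is the identity, which is a quick sanity check of the matrix in Eqn (3-4).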
3.2.1 Linear L.S.A. Fit (The Existing Algorithm)

Thus the equation of the LSA linear fit could be given as:

$$y = a_m x + b_m \qquad (3\text{-}6)$$

3.2.2 Least Squares Parabolic Fit (The Modified Algorithm)

Since the road profile can be better approximated by a second-order curve, a more accurate estimate can be obtained by using a second-order (parabolic) fit. In this case, one can use a second-order fit as:

$$y = a_m x^2 + b_m x + c_m \qquad (3\text{-}7)$$

Once the least squares linear or parabolic fit is obtained, the two curves are then extended to the total length of the road to determine the road cross-slope as measured by the evaluation van. To achieve this, it is required to make the assumption that the vehicle travels in the middle of the lane. After the height difference is estimated, one can obtain the cross slope from Eqn (3-2).

3.3 Radius of Curvature

The proper design of the radius of curvature of the roadway at a horizontal curve is vital in making the roadway safe for its users at the design speed. Therefore, an important task of pavement infrastructure evaluations is the verification that the radius of curvature is within the acceptable limits at all horizontal curves of a roadway. This especially applies
to rural roads and minor arterials that have been constructed without adhering to appropriate standards. In addition, due to the relationship between the super-elevation and the radius of curvature (Eqn 3-1), knowledge of the radius of curvature at a location is essential in checking the adequacy of the super-elevation at that location. This becomes a critical issue when pavement evaluators are called upon to perform safety inspections of specific locations that are known to present safety hazards. Automated and rapid curvature information provided by the IMU system would be immensely helpful in such situations, especially from the perspective of the safety of survey crews.

Roadways are designed in such a way that when a vehicle moves from a straight section of a road onto a curved section, the vehicle experiences a relatively smooth transition in curvature. This is achieved by constructing transition curves at the extremes of the actual curve. Typically, this transition can be provided by two spiral curves: (1) a lead-in, which is the roadway section that starts asymptotic to the straight line segment and merges with the circular curve, and (2) a lead-out, the roadway section which diverges from the circular curve and merges with the second linear segment (Fig 17). In order to achieve this spiral curve, designers use a gradually varying radius of curvature from the straight run to the design circular curvature of the horizontal curve. The lead-in and lead-out also play a major role in gradually adjusting the super-elevation from the regular cross slope on a straight roadway to the maximum super-elevation at the curve. This is also known as the super-elevation runoff.
Figure 17: Geometry of the first curve of the reverse curve section (labels: lead-in, transition curve, circular curve, tangent).

Generally, two methods can be employed to determine the radius of curvature of a horizontal curve using the IMU readings. They are:

1. The kinematic method, using the velocity in the longitudinal direction and the centrifugal acceleration.
2. The geometric method, plotting the actual vehicular horizontal trajectory using the velocity vector and then determining the radius.

3.3.1 Determination of Curvature Using Kinematics

The radius of curvature of the circular section of the horizontal curve can be determined using the following equation of circular motion:

$$r = \frac{v^2}{a} \qquad (3\text{-}8)$$
where $v$ is the velocity of the vehicle in the longitudinal (body X) direction and $a$ is the centrifugal acceleration of the vehicle on a horizontal plane. Since both $v$ and $a$ are obtained by the IMU, the algorithm used in the FDOT pavement evaluation vehicle uses Eqn (3-8) to determine the instantaneous radius of the horizontal curve.

Since the entire horizontal curve is not an exact circular curve, due to the spiral-shaped transition curves (lead-in and lead-out) where there is a gradual change of curvature, strictly, the generalized kinematic theory must be used in the determination of the instantaneous radius. Hence, on a non-circular curve, the following relationships hold among the kinematic properties of the vehicle and geometric parameters of the roadway (Beer 1977).

Velocities:
$$v_r = \dot{r} \qquad (3\text{-}9a)$$
$$v_\theta = r\dot{\theta} \qquad (3\text{-}9b)$$

Accelerations:
$$a_r = \ddot{r} - r\dot{\theta}^2 \qquad (3\text{-}10a)$$
$$a_\theta = r\ddot{\theta} + 2\dot{r}\dot{\theta} \qquad (3\text{-}10b)$$

where $r$ and $\theta$ are respectively the radial and angular coordinates of the vehicle in a horizontal plane. Within the circular curve, since $r$ is a constant, $\dot{r} = \ddot{r} = 0$. Then it can be shown that the expression for the radius of curvature ($r$) reduces to Eqn (3-8).
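The reduction from the general polar relations (3-9)/(3-10) to Eqn (3-8) on a constant-radius path can be checked numerically; the radius and angular rate below are assumed values.

```python
# On a circular path (r constant, rdot = rddot = 0), Eqns (3-9b) and (3-10a)
# give v = r*thetadot and |a_r| = r*thetadot^2, so r = v^2 / |a| (Eqn 3-8).
r, thetadot = 500.0, 0.05            # assumed radius (m) and angular rate (rad/s)
v = r * thetadot                     # Eqn (3-9b) with rdot = 0
a = abs(-r * thetadot ** 2)          # Eqn (3-10a) with rdot = rddot = 0
r_recovered = v ** 2 / a             # Eqn (3-8)
```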
3.3.2 Determination of Curvature Using Geometry

The velocity vector obtained from the IMU can be used to plot the trajectory of the vehicle. Because of the high frequency of the data collection (200 Hz), the error associated with the linear approximation tends to be relatively low with respect to the distance travelled. The basis of estimating the longitudinal displacement of the vehicle between two consecutive data points $i$ and $(i+1)$ is illustrated in Eqn (3-11), using the average velocities at those two points. Because the horizontal curve is two dimensional, only the horizontal velocities must be considered.

$$ds = \frac{1}{2}\left(v_i + v_{i+1}\right)dt \qquad (3\text{-}11)$$

where $ds$ is the longitudinal displacement that occurs when the vehicle travels between the points $i$ and $(i+1)$, $v_i$ and $v_{i+1}$ are the respective velocities at points $i$ and $(i+1)$, and $dt$ represents the time elapsed during the travel between points $i$ and $(i+1)$.

If the vehicle trajectory on the x-y plane is described by the function $y = f(x)$, the radius of curvature can be evaluated by the following equation:

$$R = \frac{\left[1 + \left(\frac{dy}{dx}\right)_i^2\right]^{3/2}}{\left(\frac{d^2y}{dx^2}\right)_i} \qquad (3\text{-}12)$$

Since the x and y coordinates are found in terms of time as discrete quantities, the second-order Forward Difference formulae can be used to numerically evaluate Eqn (3-12).
$$x^{\prime}_i = \frac{-3x_i + 4x_{i+1} - x_{i+2}}{2h} \qquad (3\text{-}13)$$

$$x^{\prime\prime}_i = \frac{2x_i - 5x_{i+1} + 4x_{i+2} - x_{i+3}}{h^2} \qquad (3\text{-}14)$$

where $h = t_{i+1} - t_i$ and $i$ is the index that represents all gathered data points. Similar expressions can be written for $y^{\prime}$ and $y^{\prime\prime}$ as well. The numerical forms of the derivatives needed for Eqn (3-12) can be derived as:

$$\frac{dy}{dx} = \frac{y^{\prime}_i}{x^{\prime}_i} \qquad (3\text{-}15)$$

$$\frac{d^2y}{dx^2} = \frac{y^{\prime\prime}_i x^{\prime}_i - y^{\prime}_i x^{\prime\prime}_i}{\left[x^{\prime}_i\right]^3} \qquad (3\text{-}16)$$

3.4 Estimation of Curvature in the Field

Two common methods of estimating the radius of curvature of a horizontal curve used by accident investigators are given in (Glenon 2003). Both of these methods yield accurate results when used on a circular section. Due to the ease of measurement and minimal labor requirement, these methods are widely used in the field. The two methods are outlined below:

3.4.1 Chord-Offset Method

This method uses a 100-foot tape and a carpenter's ruler to determine the radius of curvature. The 100-foot tape is placed on either end at the precise edge of the roadway so
that an arc is separated from the curve. Then the carpenter's ruler is used to measure the distance between the mid-point of the 100-foot tape and the edge of the roadway perpendicularly across from the mid-point of the 100-foot tape (the offset). Once these two measurements are obtained, the radius of curvature can be estimated by the following geometrical relationship:

$$R = \frac{b^2}{8m} + \frac{m}{2} \qquad (3\text{-}17)$$

where R is the radius of the curvature, b is the chord length, and m is the measured middle offset.

3.4.2 Compass Method

This method uses the distance traveled along a segment of the circular section, or the arc length, and the angle subtended by that segment of the curve on the center of the circle (the included angle) to estimate the radius of curvature. The arc length can be measured by a measuring wheel or the vehicle DMI. A compass can be used to find the included angle, since it is equal to the deviation between the tangents to the circular segment at its extremes. Once the arc length and the included angle are available, the radius of curvature can be estimated by:

$$R = \frac{s}{\theta} \qquad (3\text{-}18)$$
where R is the radius of curvature of the horizontal curve, s is the arc length of the considered circular segment, and θ is the included angle measured in radians.

In order to address the undulations of the predicted radius values, the same data are first averaged, and a moving average and a moving median are used to respectively smooth out and eliminate the outliers. Averaging has been performed by considering a pre-determined number (n) of consecutive data points at one time, according to Eqn (3-19):

$$p_{avg} = \frac{p_1 + p_2 + p_3 + \cdots + p_n}{n} \qquad (3\text{-}19)$$

where p denotes the variable that needs to be averaged. The average values for the points (x, y) obtained from the operation of Eqn (3-19) are then used as the new x and y coordinates for the averaged time period (i = 1..n). The interval size chosen in this work was 10. Subsequently, the second-order forward difference algorithm (Eqns 3-13 and 3-14) was applied to the x and y coordinates to find the first and second derivatives needed for Eqn (3-12).
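The radius-estimation pipeline described above — block averaging per Eqn (3-19), the forward-difference evaluation of Eqns (3-12) through (3-16), and the two field formulas (3-17) and (3-18) — can be sketched as follows on a synthetic circular arc. All numeric values (radius, speed, sample rate) are illustrative assumptions.

```python
import numpy as np

def moving_average(p, n=10):
    """Eqn (3-19): average n consecutive samples at a time (the interval
    size n = 10 is the value quoted in the text)."""
    return p[: len(p) // n * n].reshape(-1, n).mean(axis=1)

def radius_forward_diff(x, y, h):
    """Eqns (3-12)-(3-16): radius of curvature from uniformly sampled (x, y)."""
    xp = (-3 * x[:-3] + 4 * x[1:-2] - x[2:-1]) / (2 * h)             # Eqn (3-13)
    xpp = (2 * x[:-3] - 5 * x[1:-2] + 4 * x[2:-1] - x[3:]) / h ** 2  # Eqn (3-14)
    yp = (-3 * y[:-3] + 4 * y[1:-2] - y[2:-1]) / (2 * h)
    ypp = (2 * y[:-3] - 5 * y[1:-2] + 4 * y[2:-1] - y[3:]) / h ** 2
    dydx = yp / xp                                                   # Eqn (3-15)
    d2ydx2 = (ypp * xp - yp * xpp) / xp ** 3                         # Eqn (3-16)
    return (1 + dydx ** 2) ** 1.5 / np.abs(d2ydx2)                   # Eqn (3-12)

def radius_chord_offset(b, m):
    """Eqn (3-17): field estimate from chord length b and middle offset m."""
    return b ** 2 / (8 * m) + m / 2

def radius_compass(s, theta):
    """Eqn (3-18): field estimate from arc length s and included angle (rad)."""
    return s / theta

# Synthetic check: a 500 m circular arc driven at 25 m/s, sampled at 200 Hz
R_true, v, h = 500.0, 25.0, 1.0 / 200.0
t = np.arange(0.0, 5.0, h)
theta = 0.5 + v * t / R_true                 # offset avoids a vertical tangent
x = moving_average(R_true * np.cos(theta))   # Eqn (3-19) smoothing
y = moving_average(R_true * np.sin(theta))
R_est = radius_forward_diff(x, y, 10 * h)    # averaged samples are 10h apart
```

On this noise-free arc the finite-difference estimate recovers the 500 m radius to well under a metre, and the two field formulas agree with the geometry of a circle by construction.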
3.4.3 Manual Estimation of Radius of Curvature

The radius of curvature was also estimated using a manual survey, the results of which are given in Table 3.

Figure 18: Illustration of notation used in Table 4 (survey instrument, stations 1-3, angles, distances d1-d3, and x-y axes).

Table 3: Data obtained from the manual survey.

Distance (d) (ft) | Horizontal angle (deg) | X (ft)  | Y (ft)
12.89             | 0                      | 12.8900 | 0
28.36             | 58.37611               | 14.8703 | -24.1488
46.33             | 68.54306               | 16.9476 | -43.119
66.39             | 72.21167               | 20.2822 | -63.216
86.58             | 73.74528               | 24.2344 | -83.1191
106.36            | 74.20806               | 28.9453 | -102.346
126.57            | 74.25056               | 34.3550 | -121.818
161.37            | 73.59861               | 45.5652 | -154.803
205.11            | 72.77528               | 60.7372 | -195.911
225.29            | 72.7325                | 69.2826 | -215.136
244.66            | 72.96361               | 72.4964 | -233.672
264.86            | 73.38167               | 75.7486 | -253.797
284.17            | 73.92778               | 78.6716 | -273.063
CHAPTER 4

FUNDAMENTALS OF VISION NAVIGATION

4.1 Introduction

The utilization of a sequence of images obtained by a camera fixed on a moving vehicle in order to estimate the navigation parameters of the vehicle is the primary concern of this chapter. Estimation of navigation parameters, specifically the rotations and translations of the camera with respect to the location of the camera at the first image point, comes under a special area in Computer Engineering named computer vision. This specialization originated with the revolutionary ideas of the psychologist David Marr in the application of psychology and neurophysiology to visual processing. This application, machine vision, attempts to set up an analogy between human or biological vision systems and images obtained using a camera.

Machine vision deals with estimating the motion of a camera evaluated using a sequence of captured images. In understanding how one could extract useful navigation information from a sequence of images, it is important to grasp the fundamentals of how an image is created in a camera. In addition to the acquisition of an image, it is important to understand how a camera works and how one can model this computationally. The simplest and most fundamental model for image formation in a camera is the pinhole camera model. The pinhole camera model consists of a small hole which allows the light rays coming from a 3D point to travel through it and form an inverted object on the image
plane. To simplify the visualization of image formation, this image plane can be placed in between the principal point (the small hole in the pinhole model) and the object. This can be elaborated using Fig. 19.

Figure 19: Image formation in a pinhole camera. Here ABC represents an object in 3D space, abc is the corresponding image of the 3D points on the image plane, and the small hole (pinhole) is denoted by C1.

Note that the image plane in Fig. 19 is represented by a hypothetical plane in front of the pinhole to avoid the undesirable property of inverting the object on the image plane. This not only helps visualize the image formation easily but also simplifies the image analysis process. It can be seen from Fig. 19 that image formation creates a 2D image from 3D points, i.e. the image formation process is nothing but a transformation from the 3D space onto the 2D space by using straight lines that pass through a single point, the principal point. This is also called the perspective projection. In fact, Euclidean geometry is a special case of projective geometry, and the application of projective geometry in machine
vision makes the computational task simpler. The relationship between the 3D coordinates and the 2D coordinates in Fig 19 can be denoted as:

$$u = f\,\frac{x}{z} \qquad (4\text{-}1a)$$
$$v = f\,\frac{y}{z} \qquad (4\text{-}1b)$$

where (u, v) are image plane coordinates and (x, y, z) are 3D coordinates. The focal length of the camera is given as f. When Cartesian coordinates are used to represent the coordinates on the projected space, Eqns (4-1a, b), these representations become non-linear. This non-linearity in representation makes the computational task more complex. Therefore, a separate coordinate system, called the Homogeneous coordinate system, is introduced to make this non-linear representation linear.

4.1.1 Homogeneous Coordinate System

The primary use of the Homogeneous coordinate system in projective geometry is to avoid the non-linearity that is caused when using the Cartesian coordinate system. In addition, Homogeneous coordinates are preferred in computer vision since they represent the translations that occur between the image locations as a matrix operation. To translate the Homogeneous coordinates (x, y, 1) by the 2x1 vector A (= [a1, a2]^T), they have to be multiplied by:
$$T = \begin{bmatrix} 1 & 0 & a_1 \\ 0 & 1 & a_2 \\ 0 & 0 & 1 \end{bmatrix} \qquad (4\text{-}2)$$

Homogeneous coordinates of an N-dimensional projective space can be represented by a vector of length (N+1) where not all the components are zero. For instance, a 3D Homogeneous coordinate system can be expressed by a vector $(x, y, z, w)^T$ where at least one of the components is non-zero. Application of linear algebra to the Homogeneous coordinate system becomes complicated, as any given point in that system is obtained by multiplying another point by a non-zero scale factor. That is,

$$x \equiv \lambda y \quad \text{for all } \lambda \neq 0$$

where x and y are homogeneous coordinates which represent the same point. The basis of an N-D projective space contains (N+2) points such that (N+1) of them are independent.

If one compares Homogeneous coordinates and Cartesian coordinates, one can see that for every value of w other than w = 0, Homogeneous coordinates can be represented by an equivalent Cartesian coordinate point given by

$$\left(\frac{x}{w},\ \frac{y}{w},\ \frac{z}{w},\ 1\right)^T$$

When w = 0, the point is impossible to represent in Cartesian coordinates, and this plane is identified as the plane at infinity of the Homogeneous coordinate system. This plane makes the Homogeneous coordinate system a compact space, distinguishing it from the Cartesian coordinate system. This property alone makes the Homogeneous coordinate system more suitable for computer vision technology.
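A minimal sketch of the pinhole projection of Eqns (4-1a, b) and the homogeneous translation of Eqn (4-2); the point values are illustrative assumptions.

```python
import numpy as np

def project(point_3d, f):
    """Eqns (4-1a, b): pinhole projection (u, v) = (f*x/z, f*y/z)."""
    x, y, z = point_3d
    return np.array([f * x / z, f * y / z])

def translate_homogeneous(point_xy, a1, a2):
    """Eqn (4-2): translation of (x, y, 1) as a single matrix product."""
    T = np.array([[1.0, 0.0, a1],
                  [0.0, 1.0, a2],
                  [0.0, 0.0, 1.0]])
    return T @ np.array([point_xy[0], point_xy[1], 1.0])

uv = project((1.0, 2.0, 4.0), f=8.0)        # -> [2., 4.]
p = translate_homogeneous(uv, 10.0, -2.0)   # -> [12., 2., 1.]
```

The translation, a non-linear operation in Cartesian coordinates, becomes a single linear matrix product on the homogeneous vector, which is exactly why the representation is preferred here.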
4.2 Estimating Intrinsic Parameters of the Camera

In revising the IMU data with respect to its inherent error accumulation problem, one needs to obtain the position and orientation at each camera location. Since the camera is rigidly fixed to the vehicle, these camera positions and orientations can be considered as the vehicle's navigation parameters as well. However, it is essential that the camera be first calibrated for its intrinsic parameters, such as the focal length, principal point, and distortion parameters. Furthermore, the fixed transformation between the b-frame and the c-frame must be determined.

The intrinsic properties of the camera involve six parameters: focal length, scale factor, radial distortion coefficients (2 unknowns for a second-order approximation), and tangential distortion coefficients (2 unknowns). In this work, the algorithm introduced by Heikkila et al. (Heikkila 2000) is used to estimate the intrinsic properties of the camera. The basic models used for the camera and the distortion parameters (Heikkila 2000) are summarized below. The perspective camera model used in homogeneous coordinate systems can be expressed as:

$$\begin{bmatrix} u \\ v \\ 1 \end{bmatrix} = F \begin{bmatrix} X \\ Y \\ Z \\ 1 \end{bmatrix} = P\,M \begin{bmatrix} X \\ Y \\ Z \\ 1 \end{bmatrix} \qquad (4\text{-}3)$$
where the P matrix denotes the intrinsic properties, the M matrix denotes the extrinsic properties related to rotation and translation, and F is the combination of P and M, known as the perspective transformation matrix. u and v are the measured pixel coordinates obtained from the image. P and M can be expressed as:

$$P = \begin{bmatrix} sf & 0 & u_0 & 0 \\ 0 & f & v_0 & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix} \qquad (4\text{-}4)$$

$$M = \begin{bmatrix} R & T \\ 0^T & 1 \end{bmatrix} \qquad (4\text{-}5)$$

where s is the aspect ratio, f is the focal length, and $(u_0, v_0)$ are the coordinates of the principal point on the image plane. R and T denote the rotational and translational matrices between the two frames, i.e. from the camera frame to the global frame.

The camera lens introduces nonlinear distortions due to the fact that the lens system is composed of several optical elements. Therefore, the perspective projection is insufficient for accurate modeling without the incorporation of the magnitude of distortion. The distortion is divided into radial and tangential (de-centering) components. The second-order radial distortion model used in this work can be expressed as:

$$D_r = \begin{bmatrix} \bar{u}_d\left(k_1 r_d^2 + k_2 r_d^4\right) \\ \bar{v}_d\left(k_1 r_d^2 + k_2 r_d^4\right) \end{bmatrix} \qquad (4\text{-}6)$$

where $k_1$, $k_2$ are coefficients that express the radial distortion, $\bar{u}_d = u_d - u_0$, $\bar{v}_d = v_d - v_0$, $r_d^2 = \bar{u}_d^2 + \bar{v}_d^2$, and $(u_d, v_d)$ are the distorted coordinates of any point of the image. Herein
only the first two terms of the radial distortion model, Eqn (4-6), are considered, due to their dominant role in describing the radial distortion and also to avoid the numerical instability that would be caused by a more complicated model (Zhang 2006, Heikkila 2000). Similarly, the tangential distortion model is given as:

$$D_t = \begin{bmatrix} 2p_1 \bar{u}_d \bar{v}_d + p_2\left(r_d^2 + 2\bar{u}_d^2\right) \\ p_1\left(r_d^2 + 2\bar{v}_d^2\right) + 2p_2 \bar{u}_d \bar{v}_d \end{bmatrix} \qquad (4\text{-}7)$$

where $p_1$, $p_2$ are coefficients that express the tangential distortion. Therefore, the corrected coordinates of a point $(u_c, v_c)$ on the image can be expressed as:

$$\begin{bmatrix} u_c \\ v_c \end{bmatrix} = \begin{bmatrix} u_d \\ v_d \end{bmatrix} + D_r + D_t \qquad (4\text{-}8)$$

The process of estimating the intrinsic and extrinsic properties of the camera is initiated by forming the initial P and F matrices from the nominal parameter values and the perspective model, respectively. Using the initial P, F, and the orthogonality constraint ($R^T R = I$) of the rotation matrix, the initial M can be determined. Once the initial estimates are available, the upgraded estimates of the intrinsic parameters can be obtained by minimizing the weighted sum of squares of the error between the observations and the model.
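The per-pixel correction of Eqns (4-6) through (4-8) can be sketched as below. The distortion coefficients are made-up illustrative values, and the convention of adding the correction terms to the distorted coordinates follows Eqn (4-8).

```python
import numpy as np

def correct_point(ud, vd, u0, v0, k1, k2, p1, p2):
    """Apply the radial (Eqn 4-6) and tangential (Eqn 4-7) corrections to a
    distorted pixel (ud, vd), returning (uc, vc) per Eqn (4-8)."""
    ub, vb = ud - u0, vd - v0            # centered coordinates
    r2 = ub ** 2 + vb ** 2               # r_d^2
    radial = k1 * r2 + k2 * r2 ** 2
    Dr = np.array([ub * radial, vb * radial])                    # Eqn (4-6)
    Dt = np.array([2 * p1 * ub * vb + p2 * (r2 + 2 * ub ** 2),   # Eqn (4-7)
                   p1 * (r2 + 2 * vb ** 2) + 2 * p2 * ub * vb])
    return np.array([ud, vd]) + Dr + Dt                          # Eqn (4-8)

# Hypothetical coefficients; at the principal point the correction vanishes
uc, vc = correct_point(320.0, 240.0, 320.0, 240.0, 1e-7, 1e-13, 1e-6, 1e-6)
```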
4.3 Feature Correspondence in Images

The camera is calibrated for the intrinsic and extrinsic parameters in order to use it in the estimation of motion parameters from a sequence of images obtained from it. The entire motion estimation problem is twofold:

1) The correspondence problem.
2) The motion estimation problem.

The two following sections address the motion estimation problem for each type of correspondence used in this work. Estimation of motion from a sequence of images is performed by establishing common objects that are visible in consecutive images. These common objects can be of any geometry, such as points or pixels, straight lines, etc., and are called in this dissertation "features", deviating slightly from the definition of features in the computer vision literature. The same features on two or more images are called correspondences, making the entire process of defining and locating features "feature correspondence". In this dissertation two types of features are considered, namely points and straight lines. Also, two different methods of correspondence are utilized:

1) Manual feature correspondence.
2) Automatic feature correspondence.

The typical point feature (or pixel) correspondences are any points on the image that can be reliably distinguished from the rest of the points and easily located in the very next image. Similarly, the straight lines must also be reliably recognized and located in the next two images. As is understood from the definition of the feature and reliable
tracking constraints, there is a certain ambiguity involved in the correspondence problem. Some of the most common and established methods of identifying point correspondences are:

1) Light reflected by any object or point. This method involves measuring the irradiance from a point and locating the point on the second image that has the same irradiance value. Since natural surfaces lie between matte (reflecting light in all directions at the same intensity) and glossy (acting totally as a mirror), locating the exact point that has the same illumination is quite difficult.

2) Identifying edge pixels. This method involves establishing pixels that are on the edge of any object, which can be distinguished from the rest with ease. This can be used to pick up correspondences of both types, i.e. points and straight lines. In this work the feature correspondences are picked according to this method.

In addition, to reduce the ambiguity involved in the correspondence problem, one can use some constraints. One of the most common constraints for establishing accurate point correspondences in two images is drawing the epipolar line. Establishing and drawing the epipolar lines and planes is described in depth in the next section. Furthermore, one can use geometric constraints arising from the object in order to obtain more accurate feature correspondences. In establishing the point correspondences in this research work, the authors utilized the well-established KLT (Kanade-Lucas-Tomasi)
feature tracker. This program is completely automated, and the point features are tracked through the sequence of images with replacement (Fig. 20). Of these feature correspondences, only the ones that are tracked in more than five images are identified and used as input to the motion estimation algorithm for estimating the rotation and translation. On the other hand, straight-line features of any given image were extracted by first using an edge detector program (Fig. 21-a), the Canny edge detector (Canny 1984), to identify the object boundaries or edges. Then the correspondence of lines in a sequence of images is identified manually (Fig. 21-b). This method can be implemented more conveniently for highway travel because at highway speeds the correspondence of line features in a sequence of images can be achieved more easily and more accurately than the correspondence of point features.

Figure 20: Established point correspondences from the KLT tracker.
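The KLT tracker used here is an existing, fully automated program. As a self-contained illustration of the underlying Lucas-Kanade idea (not the actual tracker), the following single-iteration sketch estimates a small translation between two synthetic patches by solving the optical-flow least-squares system.

```python
import numpy as np

def lk_translation(img1, img2):
    """One Lucas-Kanade step: estimate the (dx, dy) translation between two
    patches by solving Ix*dx + Iy*dy = -It in the least-squares sense.
    A single-iteration sketch of what a KLT tracker does per feature."""
    Ix = np.gradient(img1, axis=1)        # horizontal intensity gradient
    Iy = np.gradient(img1, axis=0)        # vertical intensity gradient
    It = img2 - img1                      # temporal difference
    A = np.column_stack([Ix.ravel(), Iy.ravel()])
    d, *_ = np.linalg.lstsq(A, -It.ravel(), rcond=None)
    return d                              # (dx, dy)

# Synthetic Gaussian blob shifted by a small sub-pixel amount (0.3, 0.1)
x, y = np.meshgrid(np.arange(32, dtype=float), np.arange(32, dtype=float))
img1 = np.exp(-((x - 16.0) ** 2 + (y - 16.0) ** 2) / 40.0)
img2 = np.exp(-((x - 16.3) ** 2 + (y - 16.1) ** 2) / 40.0)
dx, dy = lk_translation(img1, img2)
```

The real KLT tracker iterates this step over image pyramids and selects well-conditioned corners to track; the sketch only shows the core linearization.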
(a) (b)

Figure 21: (a) Image after edge detection, (b) Establishing correspondences.
When the feature correspondences are picked, one must be careful not to pick features that are moving with respect to the camera. This is due to the fact that the motion estimation algorithm considers the features to be stationary while the camera is in motion. One other important consideration in picking line correspondences is not to pick all the lines on the same plane, which would cause degeneracy in the motion estimation algorithm. In addition, when establishing correspondences, feature occlusion, i.e. depth discontinuities where the normal vector of the feature points away from the camera, must be avoided.

4.4 Estimating Motion from Point Correspondences

Once the point features are tracked in the sequence of images, they are used to estimate the rotation and translation of the camera between two consecutive images. The algorithm used to estimate the pose requires at least eight point correspondences that are non-coplanar to solve for the pose of the camera in two consecutive images. Therefore, the algorithm is called the eight-point algorithm. A description of the eight-point algorithm follows.
Figure 22: Rigidity constraint for estimating pose from point correspondences.

Fig. 22 shows two images captured by the camera at two consecutive time instances. Point p is a point common to both images, and o1 and o2 are the camera's coordinate frame centers at the two instances. Points p1 and p2 are respectively the projections of the 3D point p onto the two image planes. The epipoles, the points where the line joining the two coordinate centers intersects the two image planes, are denoted by e1 and e2 respectively, while the lines joining e1p1 and e2p2 are called epipolar lines. Such epipolar lines pass through a common point (the epipole), which corresponds to the image of the center of the camera at the first position on the second image plane. Then the second point of any correspondence pair (such as a point A
in Fig. 23) must lie on the epipolar line (BC in Fig. 23) obtained from its partner point in the first image with the epipole. This is only applicable in situations where the locus of the c-frame center is not parallel to any of the consecutive image planes considered, a condition which is generally satisfied in automated driving tasks. Thus the importance of an epipolar line lies in that it restricts the placement of any correspondence point on the second image to a straight line obtained using its partner point in the first image.

Figure 23: Epipolar lines drawn for the vision system.

Let the rotation and translation between the two images be denoted as R and T. If the coordinates of points p1 and p2 are denoted as $(x_1, y_1, z_1)$ and $(x_2, y_2, z_2)$ respectively, then the two coordinates can be related as:

$$\begin{bmatrix} x_2 & y_2 & z_2 \end{bmatrix}^T = R \begin{bmatrix} x_1 & y_1 & z_1 \end{bmatrix}^T + T \qquad (4\text{-}9)$$
From Fig. 24-a, it is clear that the lines connecting the 3D point p with the camera centers and the line connecting the two centers are on the same plane. This constraint, which is shown geometrically in Fig. 22, is given in algebraic form (Faugeras 1996) in Eqn (4-10). Since the three vectors lie on the same plane,

$$p_2^T \left( T \times R\,p_1 \right) = 0 \qquad (4\text{-}10)$$

where $p_1$ and $p_2$ in Eqn (4-10) are the homogeneous coordinates of the projections of the 3D point onto the two image planes respectively. Both R and T are in 3D space, so there will be nine unknowns involved in Eqn (4-10). But since all the measurements obtained from a camera are scaled in depth, one has to solve for only eight unknowns in Eqn (4-10), and the coefficient matrix of the resulting linear system must therefore have rank 8. Let $E = \hat{T}R$ (with $\hat{T}$ the skew-symmetric matrix of T), let the unknowns in E be taken as $(e_1\ e_2\ e_3\ e_4\ e_5\ e_6\ e_7\ e_8)$, and let the scaled ninth parameter be taken to be 1. Then one can set up Eqn (4-10) as:

$$A\,\mathbf{e} = 0 \qquad (4\text{-}11)$$

where A is known, each row being formed from one correspondence pair — for image points $(x_1, y_1)$ and $(x_2, y_2)$ with focal length f, a row is $\begin{bmatrix} x_2 x_1 & x_2 y_1 & x_2 f & y_2 x_1 & y_2 y_1 & y_2 f & f x_1 & f y_1 & f^2 \end{bmatrix}$ — and the unknowns are $\mathbf{e} = \begin{bmatrix} e_1 & e_2 & e_3 & e_4 & e_5 & e_6 & e_7 & e_8 & 1 \end{bmatrix}^T$. Once a sufficient number of correspondence points is obtained, Eqn (4-11) can be solved and E estimated.

Based on the epipolar condition, the camera centers at each camera location, o1 and o2, and the images of the point p on the first and second image planes, p1 and p2, are on the same plane. Then one can restrict the search space for the second corresponding point on the second image plane to a 1D line. This 1D line search space is given as e2p2, the epipolar line on the second image plane corresponding to the point p in Fig. 22. In other words, if one knows the
essential matrix, E, and the correspondence of a 3D point p on the first image plane, the correspondence for the same 3D point p on the second image plane must lie on the epipolar line e2p2. This is true for all the correspondences that are picked on both images. The epipolar line on the second image plane for a point $p_1$ on the first image plane can be given as $l_2 = E\,p_1$.

In estimating the essential matrix (E) using the data obtained from the vision system, i.e. the correspondence points obtained by the tracking algorithm, the only variable involved in the estimation process is the focal length of the camera. Although the essential matrix can be estimated using Eqn (4-11), the value obtained here could be different from the actual value due to erroneous correspondences in consecutive images. This is evident from the epipolar line plot given in Fig. 24-a. The reason for these erroneous epipolar lines is that the false correspondences make the Essential matrix estimated from Eqn (4-11) erroneous. Therefore, by removing the false correspondences it can be shown that the epipolar lines also become accurate (Fig. 24-b). In this work the false correspondences have been removed manually.
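The linear solve of Eqn (4-11), together with the closed-form recovery of T and R from E that Section 4.4 goes on to describe, can be sketched on synthetic, noise-free correspondences. The scene, motion, and scaling below are assumptions; E (and hence T) is recovered only up to scale, and the decomposition carries a sign/twist ambiguity that is resolved by other means (e.g. requiring points to lie in front of both cameras) in practice.

```python
import numpy as np

def skew(t):
    """Cross-product matrix: skew(t) @ w == np.cross(t, w)."""
    return np.array([[0.0, -t[2], t[1]],
                     [t[2], 0.0, -t[0]],
                     [-t[1], t[0], 0.0]])

def estimate_essential(p1, p2):
    """Eight-point solve of Eqn (4-11): one row per correspondence so that
    p2^T E p1 = 0, with the ninth unknown fixed to 1."""
    A = np.array([np.kron(q2, q1) for q1, q2 in zip(p1, p2)])
    e8, *_ = np.linalg.lstsq(A[:, :8], -A[:, 8], rcond=None)
    return np.append(e8, 1.0).reshape(3, 3)

def decompose_essential(E):
    """Closed-form recovery of T (up to sign/scale) and R from E = [T]x R:
    T T^T = (1/2) tr(E E^T) I - E E^T, and each rotation column
    r_i = (c_j x c_k + c_i x T) / |T|^2 with c_i the columns of E."""
    M = 0.5 * np.trace(E @ E.T) * np.eye(3) - E @ E.T    # equals T T^T
    k = int(np.argmax(np.diag(M)))
    T = M[:, k] / np.sqrt(M[k, k])                       # +/- T
    c = E.T                                              # c[i] = T x r_i
    R = np.column_stack([
        (np.cross(c[(i + 1) % 3], c[(i + 2) % 3]) + np.cross(c[i], T)) / (T @ T)
        for i in range(3)])
    return T, R

# Synthetic, noise-free setup: an assumed motion and 12 non-coplanar points
rng = np.random.default_rng(1)
ang, f = 0.1, 1.0
R_true = np.array([[np.cos(ang), 0.0, np.sin(ang)],
                   [0.0, 1.0, 0.0],
                   [-np.sin(ang), 0.0, np.cos(ang)]])
T_true = np.array([0.5, 0.1, 1.0])
P1 = rng.uniform([-1.0, -1.0, 4.0], [1.0, 1.0, 8.0], size=(12, 3))
P2 = P1 @ R_true.T + T_true
p1 = f * P1 / P1[:, 2:3]        # homogeneous image points (x, y, f)
p2 = f * P2 / P2[:, 2:3]

E = estimate_essential(p1, p2)
E_n = E / np.linalg.norm(E)     # normalize: E is defined only up to scale
T_rec, R_rec = decompose_essential(E_n)
```

The recovered pair satisfies the defining identity `skew(T_rec) @ R_rec == E_n`, and every correspondence satisfies the epipolar constraint of Eqn (4-10) against the estimated E.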
(a) (b)

Figure 24: Epipolar lines (a) estimated before removing false correspondences, (b) after removing false correspondences.
In locating false correspondences before obtaining the accurate Essential matrix, the flow direction of the correspondences was used in this work. Locating the false correspondences by observing the flow directions of tracked points in two consecutive images involves checking the flow pattern of each tracked path and removing the correspondence points that are randomly oriented. If the features are extracted in two consecutive images from the same location of a stationary object, their flow direction must align with the direction that corresponds to the travel direction of the vehicle. As the data for this work was collected in an outdoor, uncontrolled environment, movement of the objects used in extracting the features between the image capturing instances, changes of irradiance of the object, etc. could cause these subtle errors in extracting correspondences. This is illustrated in Fig. 25-a, b.
(a) (b)

Figure 25: Filtering point correspondences using motion tracking (a) before (b) after.

From Fig. 25 (a)-(b) one can see how the correspondence flow direction, or the tracked paths, can be utilized in removing erroneous features from the sequence of images. This
method can be used in removing both stationary and moving erroneous features tracked in two consecutive images. Once the erroneous feature correspondences are removed, the features that are tracked in more than five images are identified and subsequently used as input to the motion estimation algorithm to obtain the vehicle's rotation and translation between each of the frames.

Since $E = \hat{T}R$, once the matrix E is estimated it can be utilized to recover the translations and rotations. The translations (T) and rotations (R) can be obtained from:

$$T\,T^T = \tfrac{1}{2}\,\mathrm{tr}\!\left(E\,E^T\right) I - E\,E^T, \qquad r_i = \frac{c_j \times c_k + c_i \times T}{\|T\|^2} \qquad (4\text{-}12)$$

where $c_i = T \times r_i$ (for i = 1, 2, 3) are the columns of E, the column vectors of the R matrix are given as $r_i$, and the cross products $c_j \times c_k$ are the entries of $E^*$, the matrix of cofactors of the matrix E. In this work, in order to estimate the rotation and translation from the given correspondences, the specific algorithm outlined in (Faugeras 1996) was used.

4.5 Estimating Motion from Line Correspondences

From the edges located in an image using the edge detection algorithm, one can estimate the pose of the camera. Since line correspondences between the images are used in establishing the motion parameters of the camera, the problem at hand becomes more difficult than motion estimation using point correspondences. This can be attributed to the difficulty of constraining the rigid displacement between two consecutive images that have captured the same 3D line segment. In other words, the same 3D line
can be detected in the two consecutive images. However, mere observation of the line segments will not constrain the relative movement of the camera between the two image positions. Therefore, the estimation process becomes much more rigorous.

In order to estimate the pose of the camera using line correspondences one has to constrain the rigid motion of the camera so that it is possible to obtain an algebraic relationship for the motion. This can be achieved by considering the fact that the intersection of three planes is a line. In physical terms, this says that if one can find a 3D line, P, which appears in three consecutive images, then the three planes formed by each image's camera center, Ci (i = 1, 2, 3), and the 3D line, P, intersect in one straight line, which in this case is the 3D line P itself. This is shown schematically in Fig. 26.

Figure 26: Rigidity constraint for the motion estimation using line correspondences.

Here P denotes the 3D line and P′, P″ and P‴ represent the projections of the 3D line onto the respective image planes. C1, C2 and C3 are the three camera coordinate centers at each image location respectively. If the three normal vectors to each of the planes given
by C1P′, C2P″ and C3P‴ are denoted n′, n″ and n‴, the three planes, written as 4-vectors with respect to the coordinate system of the first camera frame (Π′, Π″ and Π‴), can be expressed as:

Π′ = (n′, 0)   (4-13)

Π″ = (R1^T n″, T1^T n″)   (4-14)

Π‴ = (R1^T R2^T n‴, (T2 + R2 T1)^T n‴)   (4-15)

where (R1, T1) and (R2, T2) are the rigid motions from the first to the second and from the second to the third camera frames respectively. Once these vectors are obtained one can set up a 4x3 matrix [Π′, Π″, Π‴] which can be used to solve for the rotations (R) and translations (T) using the aforementioned rigidity constraint: since the three planes intersect in a common line, this matrix has rank two. The rotations can be found by minimizing the function,

Σ_{i=1}^{N} [ n′_i ⋅ (R1^T n″_i × R1^T R2^T n‴_i) ]²   (4-16)

where N is the number of correspondences detected between the three images. Equation (4-16) expresses the fact that the three plane normals, brought into the first camera frame, must be coplanar. Once the two rotations are obtained, the translations can be estimated by minimizing Eqn. (4-17):

Σ_{i=1}^{N} || (T1^T n″_i)(n′_i × R1^T R2^T n‴_i) − ((T2 + R2 T1)^T n‴_i)(n′_i × R1^T n″_i) ||²   (4-17)

The solution for the rotations and translations between images 1, 2 and 3 can then be estimated using Eqns. (4-13)-(4-17). In this work the pose from line correspondences is estimated using the algorithm presented in (Taylor et al., 1996). Appendix A provides the details of this algorithm.
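The rotation constraint of Eqn (4-16) can be sketched numerically. The camera poses and the 3D line below are made-up values, used only to show that the scalar triple product of the three normals, mapped into the first camera frame, vanishes for the true rotations:

```python
import numpy as np

def rot_z(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, -s, 0], [s, c, 0], [0, 0, 1]])

def rot_y(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, 0, s], [0, 1, 0], [-s, 0, c]])

def triple_product_residual(n1, n2, n3, R1, R2):
    # Map the second and third plane normals into the first camera frame;
    # for the true rotations the three normals are coplanar (Eqn 4-16).
    return float(n1 @ np.cross(R1.T @ n2, R1.T @ R2.T @ n3))

# synthetic 3D line seen by three cameras (the poses are made-up values)
d, p = np.array([1.0, 0, 0]), np.array([0, 0, 5.0])      # line direction, point
R1, T1 = rot_z(0.2), np.array([0.5, 0, 0])               # frame 1 -> frame 2
R2, T2 = rot_y(0.1), np.array([0, 0.3, 0])               # frame 2 -> frame 3
n1 = np.cross(d, p)                                      # plane normal at C1
n2 = np.cross(R1 @ d, R1 @ p + T1)                       # plane normal at C2
n3 = np.cross(R2 @ R1 @ d, R2 @ (R1 @ p + T1) + T2)      # plane normal at C3
res = triple_product_residual(n1, n2, n3, R1, R2)        # ~0 for true rotations
```

In an estimator, the squared residual would be summed over all tracked lines and minimized over the two rotations.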
CHAPTER 5

KALMAN FILTERING

5.1 Introduction

The inherent error growth problem of an IMU has to be overcome in land navigation vehicles to obtain a meaningful and reliable navigation solution. In order to minimize the growing error in the IMU measurements, the IMU readings have to be updated by an alternative measurement at regular intervals. Since every measurement instrument has errors involved in its measurements, such as bias errors, random measurement errors etc., and since instruments have different measurement sensitivities, care must be taken before a measurement from another sensor is utilized in updating a sensitive instrument like an IMU. Therefore, to overcome this problem one needs to utilize a fusion technique that reduces the error involved in both measurement systems in a statistically optimized manner. Although there is a variety of statistical filtering techniques, in this work the authors selected the widely applied technique of Kalman filtering. A Kalman filter is a minimum variance estimator designed for linear systems. Therefore, for linear systems Kalman filtering provides the statistically optimized solution. One key reason to select Kalman filtering for the fusion of vision sensor data with IMU data was its proven accuracy in similar data fusion applications and its wide applicability in the fusion field. In addition, the system equations derived for the IMU error dynamics take a linear form with respect to the error terms, which motivates the use of the Kalman filtering technique in
the aforementioned problem. All the processes involved in the IMU/Vision fusion considered in this work are illustrated in Fig. 27.

Figure 27: Illustration of processes involved in the fusion algorithm.
Here (Fig. 27) the position and velocity obtained from the vision algorithm are fused with the position and orientation measured by the IMU. In other words, the inputs to the Kalman filter are positions and orientations obtained from two different sensor systems. Therefore, this filter closely resembles the decentralized Kalman filter described in (Wei 1990) if merely the format of the input is considered. This form of fusion, using a decentralized filter, is further supported by the local filter designed for the vision sensor system. The local vision filter, or the vision only Kalman filter, plays a prominent role in the overall fusion process due to the high measurement noise in the vision system, as illustrated in Fig. 28, in which the translations obtained from the IMU/GPS system are compared with those obtained from the vision system. The high variability in the vision system corresponds to the high noise associated with the vision estimation.

Figure 28: Variability of vision translations due to high noise.
Therefore, the fusion algorithm used in fusing the IMU and the vision systems can be illustrated as in Fig. 29.

Figure 29: The final fusion architecture of the IMU/Vision system.

In the following subsections a general description of the standard Kalman filter is given, followed by illustrations of both the vision only Kalman filter and the master Kalman filter designed in this work.

5.2 Standard Kalman Filter Equations

The standard Kalman filter (Kalman 1960) was developed for linear dynamic systems, for which it gives the statistically optimized solution available for the system. Due to the versatility of the Kalman filter it is used in almost all scientific research areas, including non-linear systems. Kalman filters have been developed to facilitate prediction, filtering and smoothing. However, in this work the Kalman filter will be
predominantly used as a filter and a smoother. The typical Kalman filter equations (Grewal 2001, Jazwinski 1970) can be given as,

Updating:

x_k(+) = x_k(−) + K_k (y_k − H_k x_k(−))   (5-1)

K_k = P_k(−) H_k^T (H_k P_k(−) H_k^T + R_k)^{-1}   (5-2)

P_k(+) = (I − K_k H_k) P_k(−)   (5-3)

Prediction:

x_{k+1}(−) = Φ_k x_k(+)   (5-4)

P_{k+1}(−) = Φ_k P_k(+) Φ_k^T + Q_k   (5-5)

These correspond to the state equations,

x_{k+1} = Φ_k x_k + u_k   (5-6)

y_k = H_k x_k + v_k   (5-7)

u_k ~ N(0, Q_k),   v_k ~ N(0, R_k)   (5-8)

where x_k is the state vector, y_k is the measurement at the kth time step, P_k is the error covariance matrix at the kth time step, K_k is the Kalman gain at time k, R_k and Q_k are the covariances associated with the measurement and process noises respectively, and the arguments (−) and (+) denote the a priori and updated estimates. Φ_k is the state transition matrix while H_k is the measurement sensitivity matrix.
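Eqns (5-1)-(5-5) amount to the following generic predict/update cycle. This is a minimal sketch of the standard equations, not the specific filter configuration used in this work; the scalar example at the end uses made-up noise values:

```python
import numpy as np

def kalman_step(x, P, y, Phi, H, Q, R):
    """One cycle of the standard Kalman filter equations (5-1)-(5-5)."""
    # prediction, Eqns (5-4) and (5-5)
    x_pred = Phi @ x
    P_pred = Phi @ P @ Phi.T + Q
    # update, Eqns (5-2), (5-1), (5-3)
    K = P_pred @ H.T @ np.linalg.inv(H @ P_pred @ H.T + R)
    x_new = x_pred + K @ (y - H @ x_pred)
    P_new = (np.eye(len(x)) - K @ H) @ P_pred
    return x_new, P_new

# scalar demo: repeatedly observing a constant value drives the state to it
x, P = np.array([0.0]), np.array([[10.0]])
for _ in range(50):
    x, P = kalman_step(x, P, np.array([1.0]), np.eye(1), np.eye(1),
                       np.eye(1) * 1e-4, np.eye(1) * 0.01)
```

The covariance P also contracts toward its steady-state value, reflecting growing confidence in the estimate.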
For the proposed model H_k and Φ_k could be non-linear. Therefore, one has to linearize them at the given time.

5.3 Design of Vision Only Kalman (Local) Filter

The pose estimated from the vision sensor system is corrupted by the various noise types present in the pose estimation algorithm. Thus, it is important to minimize this noise and optimize the estimated pose from the vision system. The vision sensor predictions can be optimized using a local Kalman filter. Kalman filters have been developed to facilitate prediction, filtering and smoothing; in this context the filter is used only to smooth the rotations and translations predicted by the vision algorithm. A brief description of this local Kalman filter designed for the vision system is outlined in this section, and a more thorough description can be found in (Grewal 2001, Jazwinski 1970).

The states relevant to this work consist of the translations, the rates of translation and the orientations. Due to the relative ease of formulating the differential equations, the associated linearity and the ability to avoid gimbal lock, the orientations are expressed in quaternions. Thus, the state vector can be given as:

X_k = [T_k  Tdot_k  q_k]^T   (5-9)
where T_k is the translation, q_k is the orientation given in quaternions and Tdot_k is the rate of translation, at time k. Then the updating equations for the translations and quaternions can be given as:

T_{k+1} = T_k + ∫_{t_k}^{t_{k+1}} Tdot dt,   q_{k+1} = q_k + (1/2) A q_k Δt   (5-10)

where A is given in Eqn (2-18). Then the state transition matrix can be obtained as:

Φ_k = [ I_{3x3}   Δt I_{3x3}   0_{3x4}
        0_{3x3}   I_{3x3}      0_{3x4}
        0_{4x3}   0_{4x3}      I_{4x4} + (1/2) A Δt ]   (5-11)

where I and 0 are the identity and null matrices of the indicated dimensions respectively and Δt represents the time difference between two consecutive images. The measurements in the Kalman formulation can be considered to be the translations and rotations estimated by the vision algorithm. Therefore, the measurement vector can be expressed as,

Y_k = [T_k  q_k]^T   (5-12)
Hence, the measurement transition matrix takes the form:

H_k = [ I_{3x3}   0_{3x3}   0_{3x4}
        0_{4x3}   0_{4x3}   I_{4x4} ]   (5-13)

Once the necessary matrices are set up using Eqns (5-9)-(5-13) and the initial state vector and the initial covariance matrix are obtained, the vision outputs can be smoothed using the Kalman filter equations. The initial conditions can be defined conveniently based on the IMU output at the starting location of the test section.

5.4 Design of Master Kalman Filter

The Kalman filter designed to fuse the IMU readings and vision measurements continuously evaluates the error between the two sensor systems and statistically optimizes it. A decentralized Kalman filter architecture (Wei 1990, Allerton 2004, Allerton 2005) is used in this work to fuse the two sensor systems. Since the main aim of the integration of the two systems is to correct the high frequency IMU readings for their error growth, the vision system is used as the updating, or precision, measurement. Hence, the IMU system constitutes the process model of the Kalman filter algorithm. Since the two sensor systems have two different data gathering frequencies, a multi-rate fusion approach (Armesto 2004) has been used in fusing the IMU and the vision systems. The function of the vision only Kalman filter, on the other hand, is to remove the significantly high noise associated with the vision readings, because of the relatively high accuracy of measurements demanded by the fusion algorithm. The system architecture of this master Kalman filter is shown in Fig. 30.
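Assembling the vision-only filter matrices of Eqns (5-11) and (5-13) is mechanical; the sketch below shows the block layout for the 10-element state [T, Tdot, q]. The quaternion-rate matrix A of Eqn (2-18) is not reproduced in this chunk, so it is passed in as a placeholder argument:

```python
import numpy as np

def make_phi(A, dt):
    """State transition of Eqn (5-11) for x = [T, T_dot, q] (10 states).
    A is the 4x4 quaternion-rate matrix of Eqn (2-18), supplied by the caller."""
    Phi = np.eye(10)
    Phi[0:3, 3:6] = dt * np.eye(3)               # T_{k+1} = T_k + dt * T_dot_k
    Phi[6:10, 6:10] = np.eye(4) + 0.5 * dt * A   # quaternion update block
    return Phi

def make_h():
    """Measurement matrix of Eqn (5-13): vision observes T and q directly."""
    H = np.zeros((7, 10))
    H[0:3, 0:3] = np.eye(3)      # translation part of the measurement
    H[3:7, 6:10] = np.eye(4)     # quaternion part of the measurement
    return H
```

With a stationary attitude (A = 0), the transition matrix simply advances the translation by dt times its rate and leaves the quaternion unchanged.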
Figure 30: Illustration of master Kalman filter.

The typical inputs to update the master Kalman filter consist of positions (in the e-frame) and the orientations of the b-frame and the c-frame with respect to the n-frame. Since the vision system provides rotations and translations between the camera frames, one needs the position and orientation of the first camera location. These can be conveniently taken as, respectively, the IMU position in the e-frame and the orientation between the b-frame and the n-frame. The orientation update of the camera between two consecutive images (at t_{k-1} and t_k) with respect to the n-frame can be given as:

VC_c^n(t_k) = VC_c^n(t_{k-1}) R(t_k)   (5-14)

where VC_c^n(t_k) is the transformation matrix between the camera orientation and the n-frame at t_k, and R(t_k) is the rotation estimated by the Eight Point algorithm at t_k (Faugeras 1996). The IMU used in the test vehicle is a navigational grade IMU which was assumed to be calibrated and aligned quite accurately. Therefore, the main error that
could occur in the IMU measurements is due to the biases of the gyroscopes and accelerometers. A more detailed explanation of inertial system errors can be found in (Grewal 2001 b). In this work only bias errors were considered in the error compensation for the IMU. For the gyroscopes and accelerometers, the manufacturer specified bias terms were used in correcting the IMU measurements. These bias terms were considered to propagate in time as:

b_i(t_k) = b_i(t_{k-1}) + w(t_k)   (5-15)

where b_i(t_k) denotes the bias of the ith sensor (i = accelerometer or gyroscope) at time t_k and w(t_k) is a random number. The process model of the Kalman filter consists of the error terms obtained by the perturbation analysis described in Chapter 2. Since these errors are linear, the standard Kalman filter equations can be utilized without any linearization.

There are sixteen (16) system states used for the Kalman filter employed in the IMU/vision integration. These are: (i) three states for the position, (ii) three states for the velocity, (iii) four states for the orientation, which are given in quaternions, and (iv) six states for the accelerometer and gyroscope biases. Therefore, the entire state vector for the system (in quaternions) takes the following form:

δX_k = [δφ  δλ  δh  δv_n  δv_e  δv_d  δq_w  δq_x  δq_y  δq_z  b_ax  b_ay  b_az  b_gx  b_gy  b_gz]^T   (5-16)

where δ denotes the estimated error in the state, v_N, v_E, v_D are respectively the velocity components along the n-frame directions, and φ, λ and h are the latitude, the
longitude and the altitude respectively. The error in the orientation is converted to the quaternion form and its elements are represented as δq_i where i = w, x, y, z. The bias terms of both the accelerometers and gyroscopes, i.e. i = a, g, along the three directions j = x, y and z are given as b_ij. The system equations in the form of Eqns (5-6) and (5-7) are used in the Kalman filter process since the measurements from both the IMU and the vision system are discrete.

The state transition matrix for this problem is a 16x16 matrix with the terms obtained from the navigation equations given in Eqns. (2-32)-(2-34). The measurement equation is obtained similarly by considering the measurement residual,

y_k = [(P_Vis − P_IMU)  (Θ_Vis − Θ_IMU)]^T   (5-17)

where P_i and Θ_i represent the position vector (3x1), given in geodetic coordinates, and the orientation quaternion (4x1) respectively, measured using the ith sensor system with i = vision or IMU. Then the measurement sensitivity matrix takes the form:

H_k = [ I_{3x3}   0   0        0
        0         0   I_{4x4}  0 ]   (5-18)

The last critical step in the design of the Kalman filter is to evaluate the process (Q_k) and measurement (R_k) noise covariances of the system. These parameters are quite important in that they define the relative confidence the filter places in the process model and the measurements (Jazwinski 1970). The optimum values for these parameters must be estimated based on the accuracy of the navigation solution, or otherwise a noisy input will dominate the filter
output, making it erroneous. In this work, to estimate R_k and Q_k, the authors used a separate data set: one of the three trial runs on the same section that was not used in the subsequent computations. The same Kalman filter was used as a smoother for this purpose. This was important specifically for the vision measurements, since they involve more noise.

When setting up the differential equations to obtain the navigation solution from the Kalman filter, especially for the orientation, quaternions are used in order to avoid the undesirable gimbal lock phenomenon associated with the use of Euler angles. In addition, when quaternions are used they can also be used to interpolate and obtain any desired orientation in between the known orientations.
CHAPTER 6

MATHEMATICAL FORMULATIONS FOR SYSTEM CALIBRATION

6.1 Introduction

This chapter is devoted to formulating the two main mathematical relationships that will be used in both the calibration and fusion algorithms. The importance of these two mathematical formulations is quite evident, as the estimates from both the IMU and the vision system need further processing to transform the measured sensor data into relevant information. In addition, these transformations are needed for the calibration and fusion algorithms. The two transformations explained are:

1) Determination of the unique mathematical transformation between the inertial and vision sensor measurements.

2) Determination of the position residual from the two sensor systems.

6.2 Determination of the Unique Mathematical Transformation between the Inertial and Vision Sensor Measurements

Fig. 31 illustrates the procedure used to determine the final transformation between the inertial and vision sensors. The details of obtaining the transformation are described below.
Figure 31: Steps involved in estimating the transformation between the two sensor systems.

The calibration process involves estimating the pose of the two systems, inertial and vision, at designated locations. These locations can be marked on the pavement (Fig. 32) by special reflective tapes placed at known distance intervals. The FDOT survey vehicle is equipped with a triggering system that is activated to record an event on all the measurement systems when the latter encounters these tapes. Therefore, the pose measured by the IMU at the exact tape location can be captured. Also, since the markings are identified on the
pavement, it is possible to evaluate the pose from a manual survey at these locations, which is helpful in verifying the accuracy of the transformation.

Figure 32: Measurement points marked on the pavement by reflective tapes.

However, the capture of images cannot be triggered by the above events (reflective tapes), and hence there will invariably be a time offset between the event positions and the image recording positions, as illustrated in Fig. 33. This problem can be addressed by interpolating the pose estimates obtained from the images. Since it is more accurate to interpolate pose estimates in the quaternion space (Shoemake 1985), spherical linear interpolation (SLERP) is used to predict the pose of the marked locations with respect to the camera. This is possible since the images also depict the locations of the tapes even though the camera cannot be triggered by them. The quaternion at any intermediate point can be expressed by the following (SLERP) equation:
Figure 33: Illustration of the coincidence of IMU and survey measurement points and the offset of image measurements.

slerp(q_1, q_2, t) = [sin((1−t)θ) / sin θ] q_1 + [sin(tθ) / sin θ] q_2,   cos θ = q_1 ⋅ q_2   (6-1)

where q_1 and q_2 are the known quaternions at the extremes, θ is the angle between the two quaternions and t is any scalar parameter, such as time, that relates the extremes and the intermediate points.

6.2.1 Determination of the Vision-Inertial Transformation

The unique transformation between the two sensor coordinate frames can be determined using a simple optimization technique. In this work it is assumed that the two frames have the same origin but different orientations. First, the orientation of the vehicle at a given position, measured with respect to the inertial and vision systems, is estimated. Then an initial transformation can be obtained from these measurements. At the subsequent measurement locations, this transformation is optimized by minimizing the total error
between the transformed vision data and the measured inertial data. The optimization produces the unique transformation between the two sensors.

In extending the calibration procedures reported in (Horn 1987, Alves 2003, Lang 2005), modifications must be made to the calibration equations in (Horn 1987, Lang 2005) to incorporate the orientation measurements, i.e. roll, pitch, and yaw, instead of 3D position coordinates. The transformations between each pair of the right-handed coordinate frames considered are illustrated in Fig. 34. In addition, the time-dependent transformations of each system relating the first and second time steps are also illustrated in Fig. 34. It is shown below how the orientation transformation between the inertial and vision sensors (Rvi) can be determined by using measurements which can easily be obtained in an outdoor setting.

In Fig. 34, OG, OI, and OV denote the origins of the global, inertial, and vision coordinate frames respectively. xk, yk, and zk define the corresponding right handed 3D axis systems, with k representing the respective coordinate frames (i-inertial, v-vision, and g-global). Furthermore, the transformations from the global frame to the inertial frame, from the global frame to the vision frame and from the inertial frame to the vision frame are defined respectively as Rig, Rvg, and Rvi.
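Before moving to the frame transformations, the SLERP interpolation of Eqn (6-1) can be sketched as follows. The linear fallback for nearly parallel quaternions is an implementation detail assumed here, not something discussed in the text:

```python
import numpy as np

def slerp(q1, q2, t):
    """Spherical linear interpolation between two quaternions, Eqn (6-1)."""
    q1 = np.asarray(q1, float) / np.linalg.norm(q1)
    q2 = np.asarray(q2, float) / np.linalg.norm(q2)
    dot = np.clip(np.dot(q1, q2), -1.0, 1.0)
    theta = np.arccos(dot)                    # angle between the quaternions
    if np.isclose(theta, 0.0):                # nearly parallel: fall back to lerp
        return (1 - t) * q1 + t * q2
    return (np.sin((1 - t) * theta) * q1 + np.sin(t * theta) * q2) / np.sin(theta)

# interpolate halfway between the identity and a 90-degree rotation about z
q_a = np.array([1.0, 0.0, 0.0, 0.0])
q_b = np.array([np.cos(np.pi / 4), 0.0, 0.0, np.sin(np.pi / 4)])
q_mid = slerp(q_a, q_b, 0.5)                  # a 45-degree rotation about z
```

Because the interpolation follows the great circle on the unit quaternion sphere, the angular rate between the two poses is constant in t, which is what makes it suitable for time-offset correction.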
Figure 34: Three coordinate systems associated with the alignment procedure and the respective transformations.

If Pg denotes the position vector measured in the global coordinate frame, the following equations can be written considering the respective transformations between the global frame and both the inertial and the vision frames:

P_i(t1) = R_ig(t1) P_g(t1)   (6-2)

P_v(t1) = R_vg(t1) P_g(t1)   (6-3)

And considering the transformation between the inertial (OI) and vision (OV) systems:

P_i(t1) = R_vi P_v(t1)   (6-4)

Substituting Eqn (6-2) and Eqn (6-3) into Eqn (6-4), the required transformation can be obtained as:

R_vi = R_ig(t1) R_vg(t1)^{-1}   (6-5)
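Eqn (6-5) can be checked numerically. The Euler angles below are made-up poses, used only to verify that the resulting Rvi maps vision-frame vectors onto inertial-frame vectors:

```python
import numpy as np

def euler_zyx(yaw, pitch, roll):
    """Rotation matrix for the z-y-x Euler sequence used in the text."""
    cz, sz = np.cos(yaw), np.sin(yaw)
    cy, sy = np.cos(pitch), np.sin(pitch)
    cx, sx = np.cos(roll), np.sin(roll)
    Rz = np.array([[cz, -sz, 0], [sz, cz, 0], [0, 0, 1]])
    Ry = np.array([[cy, 0, sy], [0, 1, 0], [-sy, 0, cy]])
    Rx = np.array([[1, 0, 0], [0, cx, -sx], [0, sx, cx]])
    return Rz @ Ry @ Rx

# made-up poses of the inertial and vision frames w.r.t. the global frame
R_ig = euler_zyx(0.30, -0.10, 0.05)
R_vg = euler_zyx(0.32, -0.08, 0.30)
R_vi = R_ig @ np.linalg.inv(R_vg)     # Eqn (6-5)
```

By construction, applying R_vi to a vision-frame measurement R_vg P_g reproduces the inertial-frame measurement R_ig P_g, which is exactly the consistency stated by Eqn (6-4).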
Although the transformations between the global-inertial and global-vision frames are time variant, the transformation between the inertial system and the vision system (Rvi) is time invariant, due to the fact that the vision and inertial systems are rigidly fixed to the vehicle.

Once the pose estimates for the IMU and vision systems are obtained, the corresponding rotation matrices (in the Euler form) can be formulated considering the rotation sequence z-y-x. Thus, Eqn (6-5) provides a simple method of determining the required transformation Rvi. The Euler angles obtained from this step can then be used in the optimization algorithm as initial angle estimates. These estimates can then be optimized, as illustrated in the ensuing section, to obtain more accurate orientations between the x, y, and z axes of the two sensor coordinate frames.

6.2.2 Optimization of the Vision-Inertial Transformation

If α, β and γ are the respective orientation differences between the axes of the inertial sensor frame and the vision sensor frame, then the transformation Rvi can be explicitly represented in the Euler form by R_vi(α, β, γ). Using Eqn (6-5) the rotation matrix for the inertial system at any time t can be expressed as:

R*_ig(t) = R_vi(α, β, γ) R_vg(t)   (6-6)

R_vg(t) can be determined from a sequence of images using the algorithm provided in (Taylor 1995, Faugeras 1996), and R*_ig(t) can be estimated using Eqn (6-6) for any given set (α, β, γ). On the other hand, one can determine R_ig(t) directly from the IMU measurements. Then a non-linear error function (e) can be formulated in the form:
e_pq(α, β, γ) = [ (R_ig(t))_pq − (R*_ig(t))_pq ]²   (6-7)

where p (= 1, 2, 3) and q (= 1, 2, 3) are the row and column indices of the R_ig matrix respectively. Therefore, the sum of errors can be obtained as:

E(α, β, γ) = Σ_pq e_pq   (6-8)

Finally, the optimum α, β and γ can be estimated by minimizing Eqn (6-8):

min_{α,β,γ} E = min_{α,β,γ} Σ_pq [ (R_ig)_pq − (R*_ig(t))_pq ]²   (6-9)

Minimization can be achieved by the gradient descent of Eqn (6-10), as follows:

x_i = x_{i-1} − η ∇E(x_{i-1})   (6-10)

where x_i and x_{i-1} are two consecutive sets of orientations, η is the step length and ∇E(x_{i-1}) is the first derivative of E evaluated at x_{i-1}:

∇E(x_{i-1}) = [ ∂E/∂α  ∂E/∂β  ∂E/∂γ ]^T |_{x_{i-1}}   (6-11)

Once the set of angles (α, β, γ) corresponding to the minimum E in Eqn (6-8) is obtained for time step t, the above procedure can be repeated for a number of further time steps. When it is verified that the set (α, β, γ) is invariant with time, it can be used in building the unique transformation (Rvi) matrix between the two sensor systems.
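The minimization of Eqns (6-8)-(6-10) can be sketched with a numerical gradient. The step length, iteration count, finite-difference step and the synthetic noise-free measurements below are illustrative assumptions, not values from this work:

```python
import numpy as np

def euler_zyx(a, b, g):
    cz, sz = np.cos(a), np.sin(a)
    cy, sy = np.cos(b), np.sin(b)
    cx, sx = np.cos(g), np.sin(g)
    return (np.array([[cz, -sz, 0], [sz, cz, 0], [0, 0, 1]]) @
            np.array([[cy, 0, sy], [0, 1, 0], [-sy, 0, cy]]) @
            np.array([[1, 0, 0], [0, cx, -sx], [0, sx, cx]]))

def cost(angles, R_ig, R_vg):
    # Eqn (6-8): sum of squared element-wise differences between the measured
    # R_ig and the one reconstructed from the vision measurement, Eqn (6-6)
    return np.sum((R_ig - euler_zyx(*angles) @ R_vg) ** 2)

def gradient_descent(angles, R_ig, R_vg, eta=0.1, iters=500, h=1e-6):
    x = np.asarray(angles, float)
    for _ in range(iters):
        g = np.zeros(3)
        for j in range(3):              # central-difference gradient, Eqn (6-11)
            e = np.zeros(3); e[j] = h
            g[j] = (cost(x + e, R_ig, R_vg) - cost(x - e, R_ig, R_vg)) / (2 * h)
        x = x - eta * g                 # descent step, Eqn (6-10)
    return x

true = np.array([0.20, -0.05, 0.10])    # hidden vision-inertial angles
R_vg = euler_zyx(0.5, 0.1, -0.2)        # made-up vision-global rotation
R_ig = euler_zyx(*true) @ R_vg          # synthetic, noise-free IMU measurement
est = gradient_descent(np.zeros(3), R_ig, R_vg)
```

With noise-free synthetic data the descent recovers the hidden angles; with real measurements the residual E would instead settle at the noise floor.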
6.2.3 Verification with the Survey Measurements

Once the unique transformation between the two sensor systems is derived, it can be verified by comparing the predicted vehicle rotation maneuvers with those measured from a manual survey. The orientations of a series of survey locations along the roadway can be estimated with respect to the first survey point. Since the orientations are measured with respect to the first point, a separate formulation is needed to estimate the transformation between the survey and inertial systems. This also can be done based on Fig. 34, by replacing the global frame with the survey frame.

If the transformation between two time steps in any of the frames (Fig. 34) is given as R_j(t1-t2) (j = i-inertial, j = v-vision, and j = s-survey), then the position vector at the second step can be obtained as:

P_s(t1) = R_s(t1-t2) P_s(t2)   (6-12)

The transformations in the vision and inertial frames between the first and second time steps can also be expressed in a similar fashion as:

P_i(t1) = R_i(t1-t2) P_i(t2)   (6-13)

By using Eqn. (6-5):

P_s(t2) = R_is(t2) P_i(t2)   (6-14)

Eqns (6-2), (6-12) and (6-14) can be combined to eliminate P_i(t1), P_i(t2), P_s(t1), and P_s(t2) to obtain the survey-inertial transformation for the second time step as:
R_is(t2) = R_s(t1-t2)^{-1} R_is(t1) R_i(t1-t2)   (6-15)

Similarly, the vision-survey transformation can be deduced as:

R_vs(t2) = R_s(t1-t2)^{-1} R_is(t1) R_i(t1-t2) R_v(t1-t2) R_vi(α, β, γ)^{-1}   (6-16)

where R_vs(t2) is the vision measurement transformed onto the survey frame at the time step t2. While R_i(t1-t2) and R_v(t1-t2) can be obtained from the vehicle kinematics, R_s(t1-t2) can be established based on the surveying. Since the vision measurements can be transformed to the inertial frame using Eqn (6-5) and then to the survey frame by using Eqn (6-16), they can be compared with the survey measurements to determine the accuracy of the estimated transformation.

6.3 Determination of Position Residual from the Two Sensor Systems

Although one can obtain the translations between the camera positions from a sequence of images collected using the vision sensor, the translation estimates are scaled in the z direction of the c-frame (Fig. 12b). In other words, due to the perspective projection of the 3D points onto the image plane, the total recovery of the depth is not possible. Therefore, the translations derived from the vision algorithm are in fact normalized translations, or unit vectors, providing only the directional information and not the actual magnitude.

One of the inputs (measurements) to the Kalman filter that executes the fusion between the IMU and vision information is the position residual estimated by those two sensors. Of these, the IMU provides a vector expressing the actual translations while the vision
sensor expresses the translation as a unit vector. Thus the fusion of vision and inertial measurements requires this vision-based translation to be expressed as a regular vector rather than a unit vector. Hence, a special technique had to be devised to obtain the position residual. The method followed in this work to estimate the measurement residual is explained below.

In this work, the authors first transform both measurements (IMU and vision) onto the e-frame (Fig. 11). Then, the translations measured by both sensors are projected onto a unit sphere (Fig. 35), and the two respective unit vectors, and hence the difference between them that produces the measurement residual, can be estimated. Finally, as needed by the input to the fusion algorithm, these unit residuals are multiplied by the magnitude of the IMU based translation vector.

Figure 35: Illustration of measurement residual estimation.
In Fig. 35, P_IMU is the position vector between two consecutive camera locations in the e-frame as estimated by the IMU, whereas U_Vis and U_IMU denote the unit translation vectors estimated from the vision system and the IMU respectively, transformed to the e-frame. First, the transformation of the vision system measurements from the c-frame (Fig. 12b) into the e-frame (Fig. 11) can be performed as:

U_Vis^e(t_k) = C_n^e(t_k) C_b^n(t_k) C_c^b T^c   (6-17)

where C_c^b is the transformation between the c-frame and the b-frame, which can be obtained using the vision system calibration procedure outlined in Chapter 5, and C_b^n can be estimated by Eqns (2-17)-(2-19). The superscript e indicates that the quantities are expressed in the e-frame. The transformation between the n-frame and the e-frame can be obtained by considering the instantaneous latitude and longitude. It can be deduced that,

C_n^e = R_3(−λ(t_k)) R_2(φ(t_k) + π/2)   (6-18)

where R_i represents the Euler rotation about the ith axis (i = 2, 3) and φ(t_k) and λ(t_k) are the instantaneous latitude and longitude. The first camera location can be established as the corresponding IMU position in the e-frame. Then the position at any other time is estimated using Eqn (6-17). Similarly, the IMU translations can be transformed into the e-frame by:

P_IMU^e(t_k) = C_n^e(t_k) C_b^n(t_k) T_IMU^b   (6-19)
where the term T_IMU^b is the translation estimated from the IMU measurements between two consecutive camera locations. The IMU position vector obtained from Eqn (6-19) is then normalized and the unit vector U_IMU^e(t_k) associated with the IMU is determined (Fig. 35). Once the two unit vectors are estimated, the measurement residual, an input to the fusion filter, is obtained by:

dT^e = |P_Vis^e(t_k)| (U_Vis^e(t_k) − U_IMU^e(t_k))   (6-20)

where |P_Vis^e(t_k)| is the magnitude estimated from the IMU measurement and dT^e is the required translation residual in the e-frame.
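The residual of Eqn (6-20) reduces to a few lines once both translations are in the e-frame. The vectors below are made-up values:

```python
import numpy as np

def translation_residual(P_imu_e, U_vis_e):
    """Eqn (6-20): project both translations onto the unit sphere, difference
    the unit vectors, and rescale by the IMU translation magnitude."""
    mag = np.linalg.norm(P_imu_e)
    U_imu = P_imu_e / mag                       # IMU direction on the unit sphere
    U_vis = np.asarray(U_vis_e, float)
    U_vis = U_vis / np.linalg.norm(U_vis)       # vision provides direction only
    return mag * (U_vis - U_imu)

P_imu = np.array([3.0, 4.0, 0.0])               # IMU translation in the e-frame
U_vis = np.array([0.80, 0.58, 0.05])            # noisy vision unit translation
dT = translation_residual(P_imu, U_vis)         # residual fed to the master filter
```

When the two sensors agree on the direction of travel, the residual is the zero vector; any angular disagreement is scaled to metric units by the IMU magnitude.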
CHAPTER 7

EXPERIMENTAL SETUP AND RESULTS

7.1 Experimental Setup

The data for the fusion process was collected on a test section on Eastbound State Road 26 in Florida. The total test section was divided into three separate segments: one short run and two other relatively longer runs. On the short segment, data from three measuring frames, i.e. survey, vision, and inertial, were collected. The two longer sections were selected in such a way that they would include the typical geometric conditions encountered on a roadway, such as straight sections, horizontal curves and vertical curves. Data collected on the two longer runs, (1) straight and (2) horizontal curve, on State Road 26 was used for validation purposes. A typical road section used for data collection is illustrated in Fig. 17. Since the manual survey is relatively expensive and time and labor intensive, it was not performed on the longer segments. The longer segments were demarcated so that there would be an adequate number of sample points in each of them.

7.2 Experimental Verification of Highway Geometric Data

7.2.1 Cross Slope Data

Two experimental data sets obtained by the FDOT highway evaluation vehicle at two separate test locations were used to verify the algorithms used in the INS/DGPS system.
The first data set was obtained during a test run on Interstate-10 in Florida, where cross-slope data measured by a manual survey was also available. The second data set, obtained on a reverse curve along NE 8th Avenue in Gainesville, Florida, contained the entire raw roadway geometric data obtained using a PC card. Fig. 36 illustrates the parabolic fit obtained for a section within the former test section. Tables 4 and 5 compare the results obtained in two consecutive runs using three different methods: (1) the slope obtained from the manufacturer's (ICC) software, (2) the slope obtained by the manual survey, and (3) the slopes obtained by LSA fits, both linear and parabolic, for the same section.

Figure 36: Parabolic fit for a point along the test section, 50 ft away from the starting point of the test section.

Table 4: Comparison of cross-slopes for first run on I-10.

Distance (ft.) | ICC slope (%) | Survey slope (%) | Linear slope (%) | Parabolic slope (%) | Error ICC (%) | Error linear fit (%) | Error para. fit (%) | Corr. coeff. linear | Corr. coeff. para.
50   |  1.123 |  2.46 |  2.300 |  2.273 | 54.35 |  6.50 |  7.60 | 0.9805 | 1
200  |  0.676 |  1.44 |  2.010 |  1.996 | 53.06 | 39.60 | 38.61 | 0.9914 | 0.9969
800  | -0.996 | -1.84 | -1.774 | -1.792 | 45.87 |  3.57 |  2.61 | 0.9891 | 0.9985
Table 5: Comparison of cross-slopes for second run on I-10.

Distance (ft.) | ICC slope (%) | Linear slope (%) | Parabolic slope (%) | Error ICC (%) | Error linear fit (%) | Error para. fit (%) | Corr. coeff. linear fit | Corr. coeff. para. fit
50   |  1.248 |  2.362 |  2.337 | 49.27 |  3.98 |  5.00 | 0.9847 | 0.9997
200  |  0.450 |  1.049 |  1.034 | 68.75 | 27.15 | 28.19 | 0.9641 | 0.9884
800  | -1.006 | -1.531 | -1.550 | 45.33 | 16.80 | 15.76 | 0.9859 | 0.9999

The results in Tables 4 and 5 show reasonable repeatability except at the location 200 ft from the start. In both Tables 4 and 5 the International Cybernetics Corporation (ICC) slope is obtained from the manufacturer's software using Eqn (3-9). The third column of Table 4 provides the slope determined from the manual survey, while the fourth and fifth columns contain the slopes obtained by the linear and parabolic LSA fits, using Eqns (3-11) and (3-12). It is seen that the LSA parabolic fit approximates the roadway cross-slopes much more accurately than any other method. The significant differences observed between the surveyed cross-slope values and all of the fitting methods at a distance of 200 ft from the starting location can possibly be attributed to inaccurate identification of the manually surveyed location. This hypothesis is further supported by the unsatisfactory repeatability of the IMU measurements at this location observed in Tables 4 and 5.
7.2.2 Radius of Curvature Data

Data obtained from the test section on a reverse curve at NE 8th Avenue in Gainesville, Florida was used to test the algorithms for determining the radius of curvature.

Application of the geometric method of determining the radius

Fig. 37(a) provides the variation of radius with time based on the algorithm in Eqn (3-12).

Figure 37: Radius calculated by the geometric method for a data acquisition frequency of 200 Hz (a) without averaging, (b) with averaging.
Figure 38 shows the variation of the radius with time calculated based on v, the velocity in the longitudinal direction, and the first derivative of the platform heading with respect to time (Eqn (3-9b)) determined using Eqn (3-13).

Figure 38: Variation of radius with time evaluated from Eqn (3-12) (frequency 200 Hz).

In Figure 38 a cut-off value of 2000 meters is imposed on the radius to overcome the issue of undulations of large radii of curvature on the tangent sections. In comparing Fig. 38 with Figs. 37(a) and (b), it is seen that the former plots the radius with less variability. Furthermore, both the tangent sections and the curved sections show less oscillation and are easily distinguishable.

Application of the kinematic method of determining the radius

Fig. 39 shows a typical plot of predicted body acceleration in the lateral (Y) direction (aby), which also exhibits significant undulations within both the tangents and the curves. In order to separate the tangents from the curves, upper and lower bounds of (μ ± 3σ) were used, where μ and σ are the mean and the standard deviation of the lateral acceleration data obtained from the tangent sections.

Figure 39: Plot of body acceleration in the lateral (Y) direction with time for the total section.

The acceleration values of the tangent section in Fig. 39 consistently lie within the upper and lower bounds defined for the tangent section. This pattern changes once the vehicle enters a curve, where the lateral acceleration component remains deviated from these bounds. By applying the above technique to the lateral acceleration record, it is possible to demarcate curved sections from tangent sections. Figure 40 shows radii of curvature computed after separating the two curves from the linear sections using Eqn (3-12). From Figure 40 one can also clearly see the change in radius within the transition curves and the constant radii in the circular sections.
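The Fig. 38 style computation, radius as longitudinal speed divided by the heading rate with a 2000 m cut-off, can be sketched as follows. The sampling rate, signals, and clipping behavior are illustrative assumptions, not the dissertation's exact Eqn (3-13) implementation.

```python
import numpy as np

def radius_from_heading(v, heading_rad, dt, cutoff_m=2000.0):
    """Radius of curvature from longitudinal speed and the time
    derivative of platform heading: R = v / (d(psi)/dt).
    Radii beyond `cutoff_m` (near-tangent sections) are clipped."""
    psi_dot = np.gradient(heading_rad, dt)   # finite-difference heading rate
    with np.errstate(divide="ignore"):
        radius = np.abs(v) / np.abs(psi_dot)
    return np.minimum(radius, cutoff_m)

# Vehicle at 20 m/s on a 100 m circle: heading changes at v/R = 0.2 rad/s
dt = 0.005                      # 200 Hz sampling, as in Fig. 38
t = np.arange(0.0, 2.0, dt)
v = np.full_like(t, 20.0)
heading = (20.0 / 100.0) * t
R = radius_from_heading(v, heading, dt)
```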
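The (μ ± 3σ) demarcation of curves from tangents described above can be sketched as below; the synthetic acceleration record, the known tangent calibration stretch, and the thresholds are assumptions for illustration.

```python
import numpy as np

def separate_curves(a_lat, tangent_mask_calib):
    """Flag curved samples in a lateral-acceleration record.
    The mean and standard deviation are estimated from a known tangent
    stretch; samples falling outside mean +/- 3*sigma are treated as
    lying on a curve."""
    mu = a_lat[tangent_mask_calib].mean()
    sigma = a_lat[tangent_mask_calib].std()
    lower, upper = mu - 3.0 * sigma, mu + 3.0 * sigma
    return (a_lat < lower) | (a_lat > upper)

# Synthetic record: noisy tangent, then a curve at ~1.5 m/s^2
rng = np.random.default_rng(0)
a = np.concatenate([rng.normal(0.0, 0.05, 400), np.full(200, 1.5)])
calib = np.zeros(600, dtype=bool)
calib[:400] = True               # first 400 samples known to be tangent
on_curve = separate_curves(a, calib)
```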
Figure 40: Curved sections separated from the tangent sections, using upper and lower bounds.

Figure 41: Comparison of radii values calculated by (1) the kinematic method, (2) the geometric method, and (3) the modified kinematic method.

Finally, in Fig. 41 the radii values estimated by the three different approaches are compared: (1) the kinematic method using Eqn (3-8), (2) the geometric method using Eqn (3-12), and (3) the modified kinematic method using Eqn (3-12). It is clear that the radii values match well within the parts of the curve which are circular but deviate in the transition sections. The results emphasize the fact that the most accurate and stable solution for the transition sections is provided by the modified kinematic method.

Table 6: Comparison of radius values obtained for the two curves.

         Manual      Compass     Geometric   Kinematic   Modified kinematic
         method (m)  method (m)  method (m)  method (m)  method (m)
Curve 1  108.63      109.30      102.73      113.83      112.50
Curve 2  152.82      151.17      143.08      162.45      157.28

Radii computations from all of the methods are tabulated in Table 6. From Table 6 one can also see that the two curve segments of the reverse curve in Fig. 41 have different radii values. Figure 42 shows the radius of curvature calculated for curve 1 and curve 2 using the compass method described in Section IV. The angular difference at two consecutive points is obtained from the platform heading, and the distance between the two points is estimated using Eqn (3-11). Once the angular difference and the distance between the two points are available, the radius is calculated using Eqn (3-18).
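The compass-method relation, travelled distance divided by the change in platform heading (arc length = R Δθ), can be sketched in a few lines; the numeric example is hypothetical, and the exact form of Eqn (3-18) may differ.

```python
def compass_radius(heading1_rad, heading2_rad, distance_m):
    """Compass-method radius: the distance travelled between two
    consecutive points divided by the change in platform heading."""
    dtheta = abs(heading2_rad - heading1_rad)
    return distance_m / dtheta

# 10 m of travel while the heading swings by 0.1 rad -> R = 100 m
R = compass_radius(0.00, 0.10, 10.0)
```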
Figure 42: Radius values obtained by the compass method for the two curves.
Figure 43: Comparison of radius values obtained from (1) the compass method, (2) the geometric method, (3) the kinematic method and (4) the modified kinematic method for (a) curve 1 and (b) curve 2, including transition curves.
7.3 Calibration for Intrinsic Properties of the Vision System

A 2D calibration target (Fig. 44) consisting of a pattern of circular control points was used to calibrate the camera for its intrinsic properties discussed in Section V. In order to evaluate the above six intrinsic properties, an adequate number of circular data points was distributed on the target. These circles were set up with radii of 1 cm and a suitable center-to-center spacing so that the algorithm in (Heikkila 2000) could be used directly to determine the intrinsic parameters of the camera. Once the intrinsic parameters of the camera are estimated, they can be used to depict the distortion that occurs in the vision system by plotting the distorted data against the original data. The camera of the FDOT survey vehicle has a field of view of 55°-65° and is mounted on the hood of the vehicle, about 7.5 ft above the ground (Fig. 1). Therefore, it is impractical to position the target so that the camera captures it in its entirety. Thus, the images were captured in such a way that the target would occupy the maximum practically possible area of the image.

Figure 44: Calibration target with circular control points.
Table 7 summarizes the intrinsic properties of the vision system in the survey vehicle. Table 7 also shows that there is in fact significant distortion occurring in the images, in both the radial and tangential forms. Also, the pixel coordinates of the principal point, the scale factor, and the focal length in Table 7 are useful parameters in the accurate estimation of pose from vision.

Table 7: Intrinsic parameters of the vision system.

Intrinsic Property                       Value
Focal length (mm)                        24.53
Scale factor                             0.79
Principal point        X0 (pixels)       693.23
                       Y0 (pixels)       495.52
Radial distortion      k1 (mm^-2)        -2.19 x 10^-2
                       k2 (mm^-4)         2.04 x 10^-3
Tangential distortion  T1 (mm^-1)         1.16 x 10^-2
                       T2 (mm^-1)         2.38 x 10^-3

Fig. 45 shows the image of the calibration target shown in Fig. 44, in which the original input data points (ud, vd) are darkened while the output from the distortion model (Heikkila 2000), (uc, vc), is shown as circles. Fig. 45 clearly depicts the magnitude of the distortion in both the radial and tangential directions. Based on Fig. 45 it can be concluded that the corrected data points match reasonably well with the corresponding locations on the target.
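A radial-plus-tangential distortion model of the kind estimated here (coefficients k1, k2, T1, T2 as in Table 7) can be sketched as below, in the spirit of (Heikkila 2000); the normalization of coordinates and the parameter conventions are illustrative assumptions, and the dissertation's exact model may differ.

```python
def distort(xn, yn, k1, k2, t1, t2):
    """Apply a radial (k1, k2) plus tangential (t1, t2) distortion model
    to normalized image coordinates (xn, yn) and return the distorted
    normalized coordinates."""
    r2 = xn * xn + yn * yn                       # squared radial distance
    radial = 1.0 + k1 * r2 + k2 * r2 * r2        # radial scaling factor
    xd = xn * radial + 2.0 * t1 * xn * yn + t2 * (r2 + 2.0 * xn * xn)
    yd = yn * radial + t1 * (r2 + 2.0 * yn * yn) + 2.0 * t2 * xn * yn
    return xd, yd

# With all coefficients zero the model reduces to the identity
xd, yd = distort(0.3, -0.2, 0.0, 0.0, 0.0, 0.0)
```

Calibration inverts this mapping: the coefficients are chosen so that the modeled distorted positions of the circle centers match their observed image positions.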
Figure 45: Comparison of the original and the distorted data points from the model on the calibration target.

7.3.1 Estimating the Transformation between Inertial Sensor and Vision Sensor

Vision data was collected at all three test segments, one short segment and two relatively longer segments, with the image capturing distance interval set to 5 ft. This corresponds to the highest allowable data logging frequency of the system under normal operating speeds. Meanwhile, the IMU frequency can be set as high as 200 Hz. Once the images are collected, the vision pose can be estimated from them as discussed in Chapter 5. Pose from the inertial sensor is available in the desired format in the manufacturer's post-processing software, Applanix POSPac. Due to the different data logging frequencies of the two sensor systems, and since the reflective tape cannot activate the vision system at the desired location, the pose estimated from the vision system is interpolated to obtain the pose at the taped locations (Fig. 32) using Eqn (6-11). The transformation illustrated in Fig. 34 was used to convert the vision data, in terms of roll, pitch and yaw, to the inertial frame for comparison with the actual inertial data. Since the data logging frequency of the inertial sensor is much higher than that of the vision sensor, the
data from the inertial sensor was extracted at the locations where vision data is available, for comparison with the converted vision data. The inertial and vision data collected on the longer test sections were used for the development and validation of the transformation described in Sections VII (2) and (3), while the data from the short run, i.e. inertial, vision and survey, were used to verify the transformation against the ground truth.

Table 8 summarizes the optimized transformations obtained for the inertial-vision system. It shows the initial estimates used in the optimization algorithm (Eqn 6-9) and the final optimized estimates obtained from the error minimization process at three separate test locations (corresponding to times t', t'' and t'''). It is clear from Table 8 that the optimization process converges to a unique (roll, pitch, yaw) set irrespective of the initial estimates provided. Since the two-sensor system is rigidly fixed to the vehicle, the inertial-vision transformation must be unique. Therefore, the average of the optimized transformations can be considered as the unique transformation that exists between the two sensor systems.
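For a rigidly mounted sensor pair, the boresight rotation can also be recovered in closed form from matched direction observations via the Kabsch/SVD construction. This is a stand-in sketch under assumed inputs, not the iterative error minimization of Eqn (6-9) actually used in the dissertation.

```python
import numpy as np

def boresight_rotation(v_cam, v_imu):
    """Least-squares rotation R aligning direction vectors observed in
    the camera frame with the same directions in the IMU frame
    (Kabsch method): R = argmin sum ||R a_i - b_i||^2.
    Rows of v_cam and v_imu are matched vectors."""
    H = v_cam.T @ v_imu
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))   # guard against reflections
    D = np.diag([1.0, 1.0, d])
    return Vt.T @ D @ U.T

# Recover a known 0.1 rad yaw offset from three matched directions
c, s = np.cos(0.1), np.sin(0.1)
R_true = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
v_cam = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.7, 0.7, 0.1]])
v_imu = v_cam @ R_true.T
R_est = boresight_rotation(v_cam, v_imu)
```

Like the dissertation's optimization, the recovered rotation is independent of any initial estimate, which is consistent with the convergence behavior reported in Table 8.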
Table 8: Orientation difference between the two sensor systems estimated at three locations.

                              Initial Angle   Optimized Angle
Point 1 (t')    Roll (rad)      -0.00401        -0.03304
                Pitch (rad)     -0.00713         0.01108
                Yaw (rad)        1.23723        -0.08258
Point 2 (t'')   Roll (rad)      -0.03101        -0.03304
                Pitch (rad)     -0.00541         0.01108
                Yaw (rad)        1.34034        -0.08258
Point 3 (t''')  Roll (rad)      -0.01502        -0.03304
                Pitch (rad)     -0.00259         0.01108
                Yaw (rad)        1.32766        -0.08258

7.3.2 Validation of Results

Once the poses from the inertial and vision systems were obtained as described in Chapters 2 and 4, the transformation determined in Chapter 6 was applied to the vision data, and the transformed vision data was compared with the inertial sensor data. It must be noted that this transformed output is in its raw format, whereas the inertial sensor data have already been statistically filtered. Therefore, the inertial data tend to be smoother than the transformed measurements of the vision pose. To address this disparity, a simple moving median filter was used to smooth the transformed vision pose data as well.

The comparison of the transformed vision and inertial data is illustrated in Fig. 46 and Fig. 47 for the straight run and the horizontal curve, respectively. It is clear that the original inertial data and the transformed vision measurements are in reasonably close agreement. The observed minor deviations can be attributed to measurement errors,
noise problems (Alves 2003), various outdoor factors such as vibrations and, above all, the coarseness of the data computational interval, which is as large as 5 ft.

Figure 46: Comparison of raw inertial data with transformed vision data for (a) roll, (b) pitch, and (c) yaw for the straight section (longer run).

Figure 47: Comparison of raw inertial data with transformed vision data for (a) roll, (b) pitch, and (c) yaw for the horizontal curve (longer run).
7.3.3 Verification with Ground Truth

Five intermediate points of the shorter segment, spaced at 11.5 ft, were demarcated and taped in such a way that the vehicle would undergo translations in all three directions, and rotations about all three axes (roll, pitch, and yaw), between consecutive locations. The manual survey was performed using a total station which could capture all the locations from one temporary benchmark. At each taped location, four separate points in the cross section of the pavement (edge, center, and two points in between) were surveyed. Moreover, at each of these points, the total station measurements were repeated in order to eliminate any possible errors in the line of collimation. By considering the first surveyed point as the reference, the horizontal and vertical angles between each pair of measurement points were estimated. From these measurements the roll, pitch, and yaw of the vehicle at all subsequent measurement points could be estimated with respect to the first point. The above survey measurements were then compared with the transformed vision measurements, Eqn (6-15), and the transformed inertial measurements, Eqn (6-14). Since the optimized transformation between the inertial and vision systems is available (Rvi), the required inertial-survey, Eqn (6-14), and vision-survey, Eqn (6-15), transformations can be obtained as shown in Chapter 6. Comparisons of these two transformed measurements with the survey measurements are illustrated in Fig. 48. From Fig. 48 it is clear that the survey measurements and the two transformed measurements show reasonably good agreement. The discrepancies can again be attributed primarily to the coarseness of the data measurement interval, which cannot be lowered below 5 ft due to practical
difficulties. This affects the accuracy of determining the vision pose in particular (Chapter 4).

Figure 48: Comparison of transformed inertial and transformed vision data with survey data for (a) roll, (b) pitch, and (c) yaw on the short section.

7.4 Comparison of the Two Vision Pose Measurement Systems

The SFM algorithm provides two outputs: (i) the rotation of the camera coordinate frame at the capturing instant of a given image with respect to that at the capturing instant of the next image, and (ii) the normalized translation of the vehicle between the above two instants. It is noted that, due to the scale factor associated with the depth direction, only a normalized (unit) vector can be provided for the translations. The rotation angles can be expressed in Euler form, enabling them to be compared with the IMU rotations. The normalized translations cannot be directly compared with the translations derived from the IMU, since the latter contain both directions and magnitudes. Therefore, in
order to compare the IMU translations with those of SFM, the former have to be normalized.

Figs. 49(A-B) show the comparison of orientations obtained from three different methods, IMU, manual SFM and auto SFM, plotted on the IMU frame for the two test sections. It can be noted from Figs. 49(A-B) that the orientations derived from both SFM algorithms generally agree with the orientations obtained from the IMU system. Moreover, it is also clear that the orientations predicted from manually picked correspondences, i.e. manual SFM, are closer to the IMU orientations than those predicted from the automated SFM algorithm, where the correspondences are automatically identified.
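Normalizing the IMU displacement vectors for comparison with the scale-free SFM translations can be sketched as below; the displacement data are hypothetical.

```python
import numpy as np

def normalize_translations(dxyz):
    """Scale IMU-derived displacement vectors to unit length so they
    can be compared with the scale-free SFM translation directions."""
    norms = np.linalg.norm(dxyz, axis=1, keepdims=True)
    return dxyz / norms

# Two hypothetical inter-frame displacements (meters)
steps = np.array([[3.0, 4.0, 0.0], [0.0, 0.0, 2.0]])
unit = normalize_translations(steps)
```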
[Figure 49 panels: (a) SFM Auto, (b) IMU, (c) SFM Manual; axes: image sequence vs. radians]

Figure 49: Comparison of orientations on the (A) straight section, (B) horizontal curve for (a) roll, (b) pitch and (c) yaw.
In order to reduce the variability associated with the SFM predictions and remove possible outliers inherent in the orientation estimation, first a median filter and then a moving average filter can be used. The raw orientation data are processed with the median filter first to reduce the effects of the outliers, and then the moving average filtering is performed on the refined data to reduce the fluctuations. The processed orientation data are plotted in Figs. 50(A-B).
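The two-stage smoothing described above (moving median, then moving average) can be sketched as follows; the window sizes and edge padding are illustrative assumptions.

```python
import numpy as np

def smooth_orientations(angles, median_k=5, avg_k=5):
    """Two-stage smoothing of raw orientation estimates: a moving
    median first suppresses outliers, then a moving average damps
    the remaining fluctuations."""
    pad = median_k // 2
    padded = np.pad(angles, pad, mode="edge")
    med = np.array([np.median(padded[i:i + median_k])
                    for i in range(len(angles))])
    kernel = np.ones(avg_k) / avg_k
    pad2 = avg_k // 2
    padded2 = np.pad(med, pad2, mode="edge")
    return np.convolve(padded2, kernel, mode="valid")

# A single outlier spike is rejected by the median stage
raw = np.zeros(20)
raw[10] = 5.0
smooth = smooth_orientations(raw)
```

Applying the median stage first matters: a plain moving average would smear the spike across the window instead of removing it.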
[Figure 50 panels: (a) SFM Auto, (b) IMU, (c) SFM Manual; axes: image sequence vs. radians]

Figure 50: Comparison of orientations after moving median and moving average filtering for the (A) straight section, (B) horizontal curve, for (a) roll, (b) pitch and (c) yaw.
Fig. 51 illustrates the comparison of the normalized translations obtained from the SFM algorithm and the IMU system. From Fig. 51 it can be seen that the raw translations obtained from the automated SFM deviate somewhat from the IMU predictions. However, Fig. 51 also illustrates how the SFM translations processed by the Kalman filter algorithm match well with the IMU predictions.

[Figure 51 panels: (a) Before KF, (b) IMU, (c) After KF; axes: image sequence vs. normalized translations]

Figure 51: Comparison of normalized translations for the straight run before and after Kalman filtering in the (a) x-direction, (b) y-direction and (c) z-direction.
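As a sketch of the kind of filtering applied to the noisy SFM translation components, a scalar Kalman filter with assumed process and measurement variances can be written as below; this is an illustration only, not the dissertation's actual fusion filter.

```python
def kalman_smooth(z, q=1e-4, r=0.05):
    """Scalar Kalman filter with a constant-state model: predict adds
    process variance q, update blends each measurement with gain k.
    q and r are assumed variances, chosen for illustration."""
    x, p = z[0], 1.0
    out = []
    for zk in z:
        p = p + q                      # predict: state uncertainty grows
        k = p / (p + r)                # Kalman gain
        x = x + k * (zk - x)           # update with measurement zk
        p = (1.0 - k) * p
        out.append(x)
    return out

# A noisy, nominally constant translation component near 0.5
z = [0.5, 0.62, 0.41, 0.55, 0.47, 0.52, 0.49, 0.53]
est = kalman_smooth(z)
```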
[Figure 52 panels: (a) Before KF, (b) IMU, (c) After KF; axes: image sequence vs. normalized translations]

Figure 52: Comparison of normalized translations for the horizontal curve before and after Kalman filtering in the (a) x-direction, (b) y-direction and (c) z-direction.

It is clear from Figs. 49-52 that the rotations predicted from the manual correspondences yield more accurate results than their automatic counterpart. The primary reason for the higher accuracy could be that the manual correspondence algorithm uses straight lines in estimating the rotations, while the automatic algorithm uses discrete points. Matching corresponding points in two consecutive images is much more tedious than matching corresponding straight lines. In addition, the manual correspondences identified by a human would be less prone to errors than the automated correspondences assigned by a computer program.

7.5 Results of the IMU/Vision Integration

The translations and rotations of the test vehicle were estimated from the vision sensors using the point correspondences tracked by the KLT tracker on both sections. In order to
estimate the pose from the vision system, the correspondences given in Fig. 20, filtered using the two methods mentioned in Chapter 5, were used. Figures 53(a)-(c) compare the orientations obtained from both the pre-filtered vision system and the IMU/GPS system.

Figure 53: Comparison of (a) roll, (b) pitch and (c) yaw of the IMU and the filtered vision system.

Similarly, the normalized translations are compared in Figures 54(a)-(c).
Figure 54: Comparison of translations in the (a) x-direction, (b) y-direction and (c) z-direction.

It is clear from Figs. 53 and 54 that the orientations and normalized translations obtained by the IMU/GPS system and the filtered vision system match reasonably well. Hence, both sets of data were determined to be appropriate for a meaningful fusion and update. These data were then used in the fusion process to obtain the positions shown in Figs. 55(a) and (b).
Figure 55: Comparison of (a) latitude and (b) longitude.

Furthermore, it is also clear from Fig. 55 that the IMU/Vision system estimates are much closer to those of the IMU/GPS system than the corresponding estimates of the IMU-only system. This is clearly shown in the latitude comparison (Fig. 55(a)), where the IMU-only estimate consistently deviates from the IMU/GPS readings whereas the IMU/Vision
estimates approach the latter after the initial deviation. The comparison results are summarized in Table 9.

Table 9: Maximum and minimum errors between the IMU/GPS, IMU-only and IMU/Vision systems.

           IMU/GPS     IMU/Vision   IMU/Vision-      IMU-only     IMU-only-       Error (%) between errors
                                    IMU/GPS error                 IMU/GPS error   estimated in cols. 4 and 6
Latitude    0.517426    0.5174266    2.824E-07        0.5174269    3.40E-07        20.38
Longitude  -1.442371   -1.4423747    3.354E-06       -1.442375     3.59E-06         7.02

Table 9 summarizes the two errors associated with the IMU/Vision system and the IMU-only system with respect to the IMU/GPS system. It is clear that the IMU-only data consistently deviates from the IMU/GPS system due to the IMU's inherent error growth. The last column of Table 9 indicates that at the end of the run, the respective latitude and longitude estimates of the IMU/Vision system are 20% and 7% closer to the IMU/GPS system than the corresponding estimates of the IMU-only system. The errors estimated from the Kalman filter are shown in Fig. 56.
Figure 56: Error associated with (a) latitude and (b) longitude.
Figures 55-56 and Table 9 show that the position, i.e. the latitude and longitude, estimated by the IMU/Vision integration agrees quite well with that given by the IMU/DGPS integration. These results also clearly show that the IMU/Vision system can supplement the IMU measurements without a significant loss of accuracy during a GPS outage. Furthermore, the error estimation of the IMU/Vision fusion algorithm was investigated in Fig. 56. Figure 56 shows that the Kalman filter used for the fusion achieves convergence and that the error involved in the position estimation reduces with time. These results are encouraging, since they further signify the potential of the vision system as an alternative to GPS in updating IMU errors.

7.6 Validation of IMU/Vision Orientation Results with Ground Truth

The survey for this validation can be performed as previously discussed in Section 7.1. These survey measurements were then compared with the IMU/Vision system orientations transformed into the appropriate frame for comparison with the survey readings. This transformation can be found in (Randeniya 2006). Figure 57 illustrates the comparison between the IMU/DGPS, IMU/Vision and survey orientations for the short test section.
Figure 57: Comparison of IMU/Vision orientations with survey data.

It is clear from Fig. 57 that the IMU/Vision orientations are relatively closer to the survey orientations than those of the IMU/DGPS system. Since these results are obtained from a single test on an actual roadway, one must be cautious and employ further experimentation to generalize this conclusion. Table 10 shows the maximum error percentages estimated between the IMU/Vision and IMU/GPS systems and the survey. It is clear from Figure 57 and Table 10 that the orientations match reasonably well. The maximum percent errors are quite satisfactory considering the relatively large distance interval at which the images are captured.
Table 10: Maximum error percentages between the IMU/Vision and IMU/GPS system estimates and the actual survey.

          Survey          IMU/Vision                  IMU/GPS
Rotation  estimate (rad)  Estimate (rad)  Error (%)   Estimate (rad)  Error (%)
Roll       0.0347          0.0296          14.71       0.051           46.97
Pitch     -0.018          -0.023           28.01      -0.030           66.67
Yaw        0.392           0.332           15.31       0.273           30.36

Finally, Table 10 compares the orientation measurements of the IMU/Vision and IMU/GPS systems with those of the actual survey. It is seen that for this particular test run, the IMU/Vision system estimates orientations that are closer to the survey measurements than those of the IMU/GPS system.
CHAPTER 8
CONCLUSIONS

The work presented here primarily addresses two important issues involved in the process of fusing vision and inertial sensors: (1) estimating the intrinsic properties of the vision sensor (camera) and determining the optimized transformation between the inertial sensor and the vision sensor in an outdoor setting, and (2) fusing the IMU sensor system and the vision system to aid autonomous navigation tasks as an alternative to the IMU/GPS system. Two validations were performed: (1) comparing the transformed IMU/Vision measurements with the IMU/GPS data, and (2) matching the transformed IMU/Vision measurements with global reference data (manual survey). The validation results show that the transformed vision measurements match reasonably well in both cases.

It was shown in this work that a vision system attached to a vehicle can be used to estimate the rotations and translations of that vehicle using a sequence of images. The results also showed that the vision data can be used successfully in updating the IMU measurements to address the inherent error growth. The fusion of IMU/Vision measurements was performed for a sequence of images obtained on an actual roadway and compared successfully with the IMU/DGPS readings. The IMU/DGPS readings were used as the basis for comparison since the main task of this work was to explore an alternative reliable system that can be used successfully in situations where the GPS signal is unavailable. Also, it was shown in Figs. 53-54 that the noisy vision
measurements could be successfully used in the fusion after being processed by a pre-filter. This resulted in a successful IMU/Vision fusion effort in absolute position coordinates, as shown in Fig. 55. In addition, the convergence of the errors involved in the Kalman filter within a short time (Fig. 56) depicts the promise and effectiveness of the fusion algorithm. Furthermore, the orientations obtained from the fused system were also successfully validated against a manual survey performed at the section. The author is confident that the accuracy of the IMU/Vision integrated system can be further improved by a more closely spaced image sequence.

The author also found that the use of accurate correspondences is essential in executing the vision algorithm successfully. It was also seen that significant improvements can be made by employing special techniques such as epipolar lines and correspondence motion fields to eliminate the errors due to false correspondences. Therefore, the two-sensor (vision and inertial) system can be used successfully to estimate the orientation of the vehicle to a considerable accuracy. However, some discrepancies do occur between the actual measurements and the predicted data. These discrepancies can be attributed to the large image capturing interval of the MPSV, and improved predictions can be expected once significantly finer intervals are used to collect the sequence of images from the vision sensor. It is clearly seen that this multi-sensor fusion effort would certainly enhance ITS and other artificial intelligence technologies and, more importantly, will provide a more reliable form of navigation in GPS-deprived environments.
8.1 Future Research

Due to the successful integration of the IMU/Vision system in a decentralized architecture, this work can be extended in the following directions:
1) Fusing the IMU/Vision system in a tightly coupled architecture.
2) Implementing the fused system in a land vehicle and using it in real-time navigation.
REFERENCES

M. Cramer. (2005, December). GPS/INS Integration. [Online]. http://www.ifp.uni-stuttgart.de/publications/phowo97/cramer.pdf

M. Wei and K. P. Schwarz, "Testing a decentralized filter for GPS/INS integration," Position Location and Navigation Symposium, IEEE PLANS, March 1990.

S. Feng and C. L. Law, "Assisted GPS and its Impact on Navigation Transportation Systems," in Proc. 5th IEEE International Conference on ITS, Singapore, September 3-6, 2002.

J. Farrell and M. Barth, "Integration of GPS/INS and Magnetic Markers for Advanced Vehicle Control," PATH-UC Berkeley Report No. UCB-ITS-PRR-2002-32, October 2002.

Y. Zhao, "Mobile Phone Location Determination and its Impact on Intelligent Transportation Systems," IEEE Trans. on ITS, Vol. 1, No. 1, March 2000.

P. Kumar, S. Ranganath, H. Weimin, and K. Sengupta, "Framework for Real-Time Behavior Interpretation from Traffic Video," IEEE Trans. on ITS, Vol. 6, No. 1, March 2005.

K. Sun, G. Arr, R. Ramachndran, and S. Ritchie, "Vehicle Reidentification Using Multidetector Fusion," IEEE Trans. on ITS, Vol. 5, No. 3, September 2004.

U. Franke and S. Heinrich, "Fast Obstacle Detection for Urban Traffic Situations," IEEE Trans. on ITS, Vol. 3, No. 3, September 2002.

F. Dellaert and C. Thorpe, "Robust car tracking using Kalman filtering and Bayesian templates," Conference on ITS, 1997.

M. Sotelo, F. Rodriguez, and L. Magdalena, "VIRTUOUS: Vision-Based Road Transportation for Unmanned Operation on Urban-Like Scenarios," IEEE Trans. on ITS, Vol. 5, No. 2, June 2004.

W. Li and H. Leung, "Simultaneous Registration and Fusion of Multiple Dissimilar Sensors for Cooperative Driving," IEEE Trans. on ITS, Vol. 5, No. 2, June 2004.

D. Diel, "Stochastic constraints for vision aided inertial navigation," M.S. thesis, Dept. of Mechanical Eng., Massachusetts Institute of Technology, Cambridge, MA, USA, 2005.

J. Hespanha, O. Yakimenko, I. Kaminer, and A. Pascoal, "Linear parametrically varying systems with brief instabilities: an application to vision/inertial navigation," IEEE Trans. on Aerospace and Electronic Systems, Vol. 40, No. 3, July 2004.

B. M. Scherzinger, "Precise Robust Positioning with Inertial/GPS RTK," in Proc. ION-GPS 2000, Salt Lake City, Utah, September 2000.

C. J. Taylor and D. J. Kriegman, "Structure and motion from line segments in multiple images," IEEE Trans. on Pattern Analysis and Machine Intelligence, Vol. 17, No. 11, November 1995, pp. 1021-.

J. Canny, "A computational approach to edge detection," IEEE Trans. on Pattern Analysis and Machine Intelligence, Vol. 8, No. 6, November 1986.

Y. Ma, J. Kosecka, and S. Sastry, "Motion Recovery from Image Sequences: Discrete viewpoint vs. Differential viewpoint," ECCV, 1998.

S. Birchfield. (2006, November). KLT: An implementation of the Kanade-Lucas-Tomasi Feature Tracker. [Online]. http://www.ces.clemson.edu/~stb/klt/

O. Faugeras, Three Dimensional Computer Vision: A Geometric Viewpoint, 2nd edition, MIT Press, Cambridge, MA, 1996.

K. Shoemake, "Animating rotation with quaternion curves," in ACM SIGGRAPH Computer Graphics, Vol. 19, No. 3, San Francisco, July 22-26, 1985.

D. Randeniya, M. Gunaratne, S. Sarkar, and A. Nazef, "Calibration of Inertial and Vision Systems as a Prelude to Multi-Sensor Fusion," under review, Transportation Research Part C (Emerging Technologies), Elsevier, June 2006.

D. H. Titterton and J. L. Weston, Strapdown Inertial Navigation Technology, IEE Radar, Sonar, Navigation and Avionics Series 5, E. D. R. Shearman and P. Bradsell, Eds., London: Peter Peregrinus Ltd, 1997.

M. Grewal and A. Andrews, Kalman Filtering: Theory and Practice Using MATLAB, 2nd edition, John Wiley & Sons, NY, USA, 2001.

B. K. P. Horn, "Closed-form solution of absolute orientation using unit quaternions," Journal of the Optical Society of America, Vol. 4, pp. 629-, April 1987.

J. Alves, J. Lobo, and J. Dias, "Camera-inertial sensor modeling and alignment for visual navigation," in Proc. 11th International Conference on Advanced Robotics, Coimbra, Portugal, June 30-July 3, 2003.

P. Lang and A. Pinz, "Calibration of hybrid vision/inertial tracking systems," in Proc. 2nd Integration of Vision and Inertial Sensors, Barcelona, Spain, April 18, 2005.

E. Foxlin and L. Naimark, "Miniaturization, calibration and accuracy evaluation of a hybrid self-tracker," in Proc. 2nd IEEE and ACM International Symposium on Mixed and Augmented Reality, Tokyo, Japan, October 7-10, 2003.

J. Heikkila, "Geometric camera calibration using circular control points," IEEE Trans. on Pattern Analysis and Machine Intelligence, Vol. 22, No. 10, October 2000.

Z. Zhang. (2006, February). A flexible new technique for camera calibration. Microsoft Research. [Online]. http://research.microsoft.com/~zhang/Calib/

F. P. Beer and E. R. Johnston Jr., Vector Mechanics for Engineers: Dynamics, McGraw-Hill, New York, 1977.

POS/LV V3 Installation and Operation Guide, Applanix Corporation, Ontario, Canada, 2000.

POSPac User Manual, Applanix Corporation, Ontario, Canada, 2002.

C. P. Van Dine and J. Overturf, "Route mapping and linear referencing with photolog geometric data," in Proc. GEOTEC Event, Vancouver, BC, March 2003.

Minnesota DOT, Roadway Design Manual (Metric), http://www.dot.state.mn.us/tecsup/rdm/metric/3m.pdf, October 1999, accessed 07/13/2004.

J. C. Glenon and J. R. Loumiet, "Measuring Roadway Curve Radius Using the Compass Method," January 2003.

R. L. Greenspan, "Inertial navigation technology from 1970," Navigation: Journal of The Institute of Navigation, Vol. 42, No. 1, p. 165, 1995 (special issue).

D. Gebre-Egziabher, J. D. Powell, and P. K. Enge, "Design and performance analysis of a low-cost aided dead reckoning navigation system," in Proc. International Conference on Integrated Navigation Systems, St. Petersburg, Russia, 2001.

D. Bevly, J. Ryu, and J. C. Gerdes, "Integrating INS sensors with GPS measurements for continuous estimation of vehicle sideslip, roll, and tire cornering stiffness," IEEE Trans. on ITS, Vol. 7, No. 4, December 2006.

R. Gregor, M. Lutzeler, M. Pellkofer, K. H. Siedersberger, and E. D. Dickmanns, "EMS-Vision: A perceptual system for autonomous vehicles," IEEE Trans. on ITS, Vol. 3, No. 1, March 2002.

M. Bertozzi, A. Broggi, A. Fascioli, and S. Nichele, "Stereo vision-based vehicle detection," in Proc. IEEE Intelligent Vehicles Symposium, 2000.

A. Huster and S. Rock, "Relative position sensing by fusing monocular vision and inertial rate sensors," in Proc. 11th International Conference on Advanced Robotics, Portugal, 2003.

Z. Hu, U. Keiichi, H. Lu, and F. Lamosa, "Fusion of vision, 3D gyro and GPS for camera dynamic registration," in Proc. 17th International Conference on Pattern Recognition, 2004.

M. Bertozzi and A. Broggi, "GOLD: A parallel real-time stereo vision system for generic obstacle and lane detection," IEEE Trans. on Image Processing, Vol. 7, No. 1, January 1998.

M. Bertozzi, A. Broggi, and A. Fascioli, "Vision-based intelligent vehicles: State of the art and perspectives," Robotics and Autonomous Systems, Vol. 32, Elsevier, 2000.

J. C. McCall and M. M. Trivedi, "Video-based lane estimation and tracking for driver assistance: Survey, system, and evaluation," IEEE Trans. on ITS, Vol. 7, No. 1, March 2006.

S. I. Roumeliotis, A. E. Johnson, and J. F. Montgomery, "Augmenting inertial navigation with image-based motion estimation," in Proc. IEEE International Conference on Robotics & Automation, Washington DC, 2002.

J. Chen and A. Pinz, "Structure and motion by fusion of inertial and vision-based tracking," in Proc. 28th OAGM/AAPR Conference, Vol. 179 of Schriftenreihe, 2004, pp. 55-62.

S. You and U. Neumann, "Fusion of vision and gyro tracking for robust augmented reality registration," IEEE Conference on Virtual Reality, 2001.

E. Shin, "Estimation techniques for low cost inertial navigation," PhD dissertation, University of Calgary, Canada, 2005.
C Jekeli, Inertial navigation systems w ith geodetic applications, Walter de Gruyter GmbH & Co., Berlin, Germany, 2000.  N. Barbour and G Schmidt, Ine rtial sensor technology trends, Trans. IEEE Sensors Journal Vol 1, No.4, Dec. 2001.
153  D. Randeniya, M. Gunaratne, S. Sarkar a nd A. Nazef, Calibrati on of Inertial and Vision Systems as a Prelude to Multi-Sensor Fusion, Transportation Research Part C (Emerging Technologies) Elsevier, to be published.  D. Randeniya, M. Gunaratne, S. Sarkar and A. Nazef, Structure from Motion Algorithms for Inertial a nd Vision based Vehicle Navi gation, under review by the Journal of Intelligent Trans portation Systems, December 2006.  D. A. Forsyth and J. Ponce, Computer Vision A Modern Approach, Prentice Hall, NJ, USA, 2003.  D.J. Allerton and H. Jia, A review of multi sensor fusion methodologies for aircraft navigation systems, The Journal of Navigation, The Royal institute of Navigation, 58, 2005, pp 405-417.  D.J. Allerton and A.J. Clare, Senso r fusion methods for synthetic vision systems, Proc. of the 23rd Digital Avionics Systems conference (DASC 04), Oct. 2004.  L. Armesto, S. Chroust, M. Vincze and J. Tornero, Multi-rate fusion with vision and inertial sensors, Proc. of International conf erence on Robotics and Automation April 2004.  M. Grewal, L. Weill and A. Andrews, Global positioning systems, inertial navigation and integration, John Wiley & Sons, NY, USA, 2001.  A.H. Jazwinski, Stochastic Processes and Filtering Theory, Academic press, NY, USA, 1970.  A. Mraz Evaluation of Digital Imaging Systems Used in Highway Aplications, Doctoral Dissertation, University of South Florida, July 2004.  Gunaratne, M., Mraz, A. and Sokolic I., Study of the Feasibility Of Video Logging with Pavement Condition Evaluation, Report No: BC 965, July 2003.  Gunaratne M and Randeniya D, "Use of an Improved Algorithm to Predict Road Curvature from Inertial Measurements", Program Report submitted to the Florida Department of Tr ansportation, September 2004.  Min Shin, Goldgof D, Bowyer K, Compa rison of Edge Detection Algorithms using a Structure from Motion Task, IEEE Trans. On Systems, Man and CyberneticsPart B: Cybernetics Vol.31, No 4, August 2001.  
Sharp C, Shakernia O, Sastry S, A Vision System for Landing an Unmanned Aerial Vehicle, Proceedings of the2001 IEEE Inter national Conference on Robotics and Automation Seoul, Korea, May 2001.
154  R.E. Kalman, A new approach to lin ear filtering and pr ediction problem, Trans. On ASME Journal of Basic Engineering 82 (Series D), 35-45, 1960.
Appendix A: Structure from Motion Algorithms

A.1 Motion Estimation from Line Correspondences (Taylor et al., 1995)

Recovering the three-dimensional structure of a scene with the aid of a moving camera is useful in estimating the translational and rotational vectors of the vehicle. Taylor et al. introduced a methodology to estimate Structure from Motion (SFM) for a scene composed of straight line segments, using the image data obtained by a camera attached to the system in motion. Extraction of the translational and rotational components from the captured images is achieved by minimizing an objective function of the form

O = \sum_{j=1}^{m} \sum_{i=1}^{n} \mathrm{Error}\big( f(p_i, q_j),\, u_{i,j} \big)    (A-1)

where
f(p_i, q_j) : function representing the image formation process,
p_i : position of the three-dimensional line i,
q_j : position and orientation of the camera for image j,
u_{i,j} : position and orientation measurement of the projection of feature i in image j,
Error( ) : positive real-valued function that measures the difference between u_{i,j} and f(p_i, q_j).

The Error function in (A-1) can be expressed as

\mathrm{Error} = m^T A^T B A\, m    (A-2)

where m = (m_x, m_y, m_z) is the normal to the plane passing through the camera center and the projected line, and
A = \begin{bmatrix} x_1 & y_1 & 1 \\ x_2 & y_2 & 1 \end{bmatrix}, \qquad B = \frac{l}{3\,(m_x^2 + m_y^2)} \begin{bmatrix} 1 & 0.5 \\ 0.5 & 1 \end{bmatrix}

Here (x_1, y_1) and (x_2, y_2) are the two end points (in pixel coordinates) of the corresponding line segment in the image, and l is the length of the observed segment.

The minimization mentioned above is a two-stage hybrid process consisting of global and local minimization components. To start the process, a random initial estimate for the camera orientation is selected. Once the initial estimates for the camera orientation are available, the second stage of the minimization, local minimization, begins. The latter is achieved in four stages. If two coordinate frames are considered, i.e., the global coordinate frame denoted by w and the vision (camera) coordinate frame denoted by c, the perpendicular distance vector to any line direction v, measured in the camera coordinate frame (d^c), can be expressed as

d^c = R_w^c \Big( d^w - \big( t_c^w - (t_c^w \cdot v^w)\, v^w \big) \Big)    (A-3)

where R_w^c expresses the rotation between the two frames, t_c^w is the translation between the two frames measured in the global coordinate frame, and d^w is the perpendicular distance vector to the line direction v^w from the vision (camera) frame origin, measured in the global coordinate frame. The normal to the plane through the camera center and the projected line can be obtained as

m^c = v^c \times d^c = R_w^c \big( v^w \times (d^w - t_c^w) \big)    (A-4)

From (A-4) it is possible to deduce two constraints:
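As a numerical illustration (not part of the original appendix), the per-segment error of equation (A-2) can be sketched in Python. The function name and the use of NumPy are choices of this sketch; the matrices A and B follow the definitions above, with l taken as the observed segment length.

```python
import numpy as np

def line_segment_error(m, p1, p2):
    """Error contribution of one observed line segment, per equation (A-2).

    m      : (3,) normal m = (mx, my, mz) to the plane through the camera
             center and the predicted 3-D line.
    p1, p2 : pixel coordinates (x1, y1) and (x2, y2) of the observed
             segment's end points.
    Returns the scalar Error = m^T A^T B A m.
    """
    x1, y1 = p1
    x2, y2 = p2
    # A maps the normal onto the signed distances from each end point
    # to the predicted image line.
    A = np.array([[x1, y1, 1.0],
                  [x2, y2, 1.0]])
    # Segment length l and the weighting matrix B of equation (A-2).
    l = np.hypot(x2 - x1, y2 - y1)
    mx, my = m[0], m[1]
    B = (l / (3.0 * (mx**2 + my**2))) * np.array([[1.0, 0.5],
                                                  [0.5, 1.0]])
    return float(m @ A.T @ B @ A @ m)
```

For example, with m = (0, 1, 0) (the predicted image line y = 0), a segment lying on that line contributes zero error, while a parallel segment offset by one pixel contributes a positive error that grows with the offset and the segment length.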
(m^c)^T \big( R_w^c\, v^w \big) = 0    (A-5)

(m^c)^T \big( R_w^c (d^w - t_c^w) \big) = 0    (A-6)

From the constraint (A-5) it is possible to deduce the generalized optimization function

C_1 = \sum_{j=1}^{m} \sum_{i=1}^{n} \big( m_{ij}^T R_j v_i \big)^2    (A-7)

where m_{ij} is the evaluated normal to the plane passing through the camera center and the observed edge, R_j is the selected random camera orientation, and v_i are the projection line directions. Therefore, once the random rotations are obtained at the global optimization step, (A-7) can be used to obtain the initial projection line directions. This is the first stage of the local minimization process. In the second stage, the initial values obtained from the first stage are further optimized using (A-7). In the third stage, the initial estimates for the camera translation and projection line positions are obtained. For this task a different objective function is deduced from the constraint given by (A-6):

C_2 = \sum_{j=1}^{m} \sum_{i=1}^{n} \big( m_{ij}^T R_j (d_i - t_j) \big)^2    (A-8)

where d_i are the projection line positions and t_j is the camera translation for image j. Once the values for R_j, v_i, d_i and t_j are obtained, the optimization function given in (A-1) is used to derive the optimized solution for both the structure of the scene and the position of the camera.
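The first stage of the local minimization, recovering a projection-line direction from constraint (A-5) given candidate rotations, can be sketched as follows. This is an illustrative implementation, not Taylor et al.'s exact procedure: the helper name is hypothetical, and the closed-form eigenvector solution is a standard way of minimizing a homogeneous least-squares objective such as (A-7) under a unit-norm constraint on v.

```python
import numpy as np

def estimate_line_direction(normals, rotations):
    """Recover one projection-line direction v from constraint (A-5),
    m_ij^T (R_j v) = 0, across several images.

    normals   : list of (3,) plane normals m_ij, one per image j.
    rotations : list of (3,3) camera rotation matrices R_j (e.g. the
                random estimates produced by the global stage).
    Minimizes C1 = sum_j (m_ij^T R_j v)^2 over unit vectors v; the
    optimum is the eigenvector of M = sum_j a_j a_j^T, a_j = R_j^T m_ij,
    associated with the smallest eigenvalue.
    """
    M = np.zeros((3, 3))
    for m, R in zip(normals, rotations):
        a = R.T @ m          # constraint (A-5) becomes a^T v = 0
        M += np.outer(a, a)
    # Symmetric eigendecomposition; eigenvalues returned in ascending order.
    _, V = np.linalg.eigh(M)
    return V[:, 0]           # unit direction minimizing C1
```

With identity rotations and normals (1, 0, 0) and (0, 1, 0), the recovered direction is (0, 0, 1) up to sign, i.e., the line direction perpendicular to both plane normals.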
Appendix B: Multi-Purpose Survey Vehicle's IMU and Camera Specifications

B.1 IMU Specifications (Northrop Grumman LN-200)
B.2 Forward View Camera Specifications (DVC Company)
ABOUT THE AUTHOR

Duminda Randeniya received his B.Sc. (Eng.) with honors in Mechanical Engineering from the University of Peradeniya, Sri Lanka, in 2001. He earned his Master's degree in Applied Mathematics from Texas Tech University in 2003 and enrolled in the PhD program at the University of South Florida in 2004.

While pursuing the PhD, Mr. Randeniya actively participated in numerous research projects with the Florida Department of Transportation. He also coauthored three journal papers and two conference papers.