Learning from Biometric Distances: Performance and Security Related Issues in Face Recognition Systems

by

Pranab Mohanty

A dissertation submitted in partial fulfillment
of the requirements for the degree of
Doctor of Philosophy in Computer Science and Engineering
Department of Computer Science and Engineering
College of Engineering
University of South Florida

Co-Major Professor: Sudeep Sarkar, Ph.D.
Co-Major Professor: Rangachar Kasturi, Ph.D.
Dmitry Goldgof, Ph.D.
Ravi Sankar, Ph.D.
Dmitry Khavinson, Ph.D.

Date of Approval: November 2, 2007

Keywords: Security and Privacy Issues in Biometrics, Modeling Face Recognition Algorithm, Face Template Reconstruction, Multidimensional Scaling, Distance Based Embedding, Face Template Indexing, Hill Climbing Approach, Hacking Face Recognition Systems

© Copyright 2007, Pranab Mohanty

DEDICATION

To my Mother

ACKNOWLEDGEMENTS

In the first place I would like to record my gratitude to Prof. Sudeep Sarkar for his supervision, advice, and guidance in every phase of this dissertation as well as giving me extraordinary experiences through the course of my PhD. Above all and the most needed, he provided me unflinching encouragement and support in numerous ways. His truly scientific intuition and passion towards research inspires and enriches my growth as a student and a researcher. I am indebted to him more than he knows.

I gratefully acknowledge Prof. Rangachar Kasturi for his supervision, support and often crucial advice to this dissertation and my overall research. His involvement and originality has triggered and nourished my intellectual maturity that I will benefit from for a long time to come. I am extraordinarily fortunate in having Prof. Rangachar Kasturi as my co-major professor.

I am deeply indebted to Prof. Dmitry Goldgof, Prof. Ravi Sankar and Prof. Dmitry Khavinson for their constructive comments on this dissertation.
I am thankful that in the midst of all their activities, they accepted to be members of my PhD committee. It was a pleasure and honor to work with my committee members.

I am also thankful to Vasant Manohar, Himanshu Vajaria, Tanmoy Islam, Dr. Padmanabhan Soundararajan and other colleagues for their helpful and willing interactions during the numerous discussions in our lab. Warm acknowledgments are also owed to many friends at the University of South Florida, Tampa, whose presence and encouragement was not only helpful and memorable, but kept me refreshed and dedicated to this task. Furthermore, I would like to thank the friendly and helpful staff of Computer Science department for providing me a smooth working environment.

Finally, I would like to extend my deepest gratitude to my dad and brother and my friends for always being there in perplexing situations and adding color to the life through their love and affection.

TABLE OF CONTENTS

LIST OF TABLES
LIST OF FIGURES
ABSTRACT
CHAPTER 1 INTRODUCTION
CHAPTER 2 RELATED WORK
  2.1 Linear Models for Biometric Algorithms
  2.2 Biometric Template Reconstruction and Its Impact on Security Issues
    2.2.1 Hill Climbing Based Attacks
CHAPTER 3 MODELING FACE RECOGNITION ALGORITHMS
  3.1 Derivation of the Linear Model
    3.1.1 Computing Point Configuration
    3.1.2 Choice of Initial Point Configuration
    3.1.3 Classical Multidimensional Scaling
    3.1.4 Affine Transformation
  3.2 Experimental Setup
    3.2.1 Data Sets
    3.2.2 Face Recognition Algorithms and Distance Transformation
    3.2.3 Train and Test Sets
    3.2.4 Performance Metrics for the Linear Model
  3.3 Modeling Results
    3.3.1 Model Space
    3.3.2 Recognition Performances
    3.3.3 Local Manifold Structure
    3.3.4 Effect of Distance Measures and Score Normalization
  3.4 Discussion
CHAPTER 4 INDEXING FACE TEMPLATES
  4.1 Indexing with Linear Model
  4.2 Experimental Setup
  4.3 Results
  4.4 Discussion
CHAPTER 5 FACE TEMPLATE RECONSTRUCTION
  5.1 Affine Transformation From Match Scores
  5.2 Embedding and Reconstruction
  5.3 Experimental Setup
    5.3.1 Database
    5.3.2 Face Recognition Algorithms
    5.3.3 Distance Measure
  5.4 Results
    5.4.1 Reconstructed Face Templates
  5.5 Discussion
CHAPTER 6 PRIVACY AND SECURITY ISSUES RELATED TO FACE RECOGNITION SYSTEMS
  6.1 Probability of Break-in
  6.2 Comparison with Hill Climbing Approach
  6.3 Effect of Score Quantization
  6.4 Discussion
CHAPTER 7 CONCLUSIONS
  7.1 Future Work
REFERENCES
ABOUT THE AUTHOR

LIST OF TABLES

Table 3.1 Similarity/Dissimilarity Measures of Different Face Recognition Algorithms.
Table 3.2 Summary of Train and Test Sets.
Table 3.3 Effect of Distance Measure on the Model: Error in Modeling for PCA Algorithm on the FERET fafb (1195 subjects) Probe Set.
Table 4.1 Indexing Error Table: Value of K at Three Different Indexing Error Rates for Rank 1 (Rank 5) Identification Rate on the FERET fafb (1195 Subjects) Probe Set.
Table 4.2 Indexing Error Table: Value of K at Three Different Indexing Error Rates for Rank 1 (Rank 5) Identification Rate on the FERET dup1 (722 Subjects) Probe Set.
Table 6.1 Comparison of Reconstructed Templates Using Our Approach Against Hill Climbing Approach.
Table 6.2 Effect of Quantization of Match Scores on Reconstructed Templates.

LIST OF FIGURES

Figure 1.1 General Overview of Biometric Applications and Contribution from this Dissertation.
Figure 1.2 Approximating Face Recognition Algorithms by Linear Models.
Figure 2.1 Schematic of Processes in (a) Hill Climbing Attack and (b) the Proposed Model Based Approach.
Figure 2.2 Visualization of the Search Process of a Hill Climbing Attack and the Proposed Model Based Approach.
Figure 3.1 Modeling Face Recognition Algorithms.
Figure 3.2 Steps to Compute Initial Configuration Points $Y^{(0)}$.
Figure 3.3 From Distance Metric to Points in Model Space.
Figure 3.4 Sample Face Images.
Figure 3.5 Top Four Dimensions of Corresponding Affine Spaces for Each Algorithm.
Figure 3.6 Eigenvalues of Distance Matrix for Each Algorithm.
Figure 3.7 ROC Curves: Comparison of Recognition Performance of PCA Algorithm with the Corresponding Linear Model.
Figure 3.8 ROC Curves: Comparison of Recognition Performance of LDA Algorithm with the Corresponding Linear Model.
Figure 3.9 ROC Curves: Comparison of Recognition Performance of ICA Algorithm with the Corresponding Linear Model.
Figure 3.10 ROC Curves: Comparison of Recognition Performance of BAY Algorithm with the Corresponding Linear Model.
Figure 3.11 ROC Curves: Comparison of Recognition Performance of EBGM Algorithm with the Corresponding Linear Model.
Figure 3.12 ROC Curves: Comparison of Recognition Performance of COM with the Corresponding Linear Model.
Figure 3.13 ROC Curves: Comparison of Recognition Performance of COM with the Corresponding Linear Model.
Figure 3.14 Comparison of Recognition Performances on the FRGC Experiments with the Corresponding Linear Model.
Figure 3.15 Similarity of the Local Manifold Structure Between the Original Algorithm and the Linear Model as Captured by the Nearest Neighbor Agreement Metric using the FERET fafb Probe Set.
Figure 3.16 ROC Curves Indicating Score Normalization Effect on Proposed Modeling Scheme.
Figure 4.1 Outline of Indexing Approach.
Figure 4.2 Indexing Trade-Off on the FERET fafb set with Rank-1 and Rank-5.
Figure 4.3 Indexing Trade-Off on the FERET dup1 set with Rank-1 and Rank-5.
Figure 5.1 Outline of Face Template Reconstruction Process.
Figure 5.2 Block Diagram of Embedding Scheme.
Figure 5.3 Block Diagram of Reconstruction Scheme.
Figure 5.4 Sample Images from (a) Break-in Set and (b) Gallery Set.
Figure 5.5 The Eigenvalues of the Matrix B for Three Face Recognition Algorithms.
Figure 5.6 Reconstructed Face Templates Using a Break-in Set with 600 Images.
Figure 5.7 Variation in Reconstructed Templates for Three Algorithms.
Figure 6.1 Probability of Break-in Using the Reconstructed Face Template.
Figure 6.2 Probability of Break-in Using Five Different Break-in Sets for Three Algorithms at 1% FAR on the FERET Gallery Set.
Figure 6.3 Comparison of Probability of Break-in at 1% FAR for Commercial Algorithm.
Figure 6.4 Probability of Break-in at 1% FAR for Commercial Algorithm with Quantized Scores.
Figure 7.1 Easy/Difficult Face Images for Commercial Algorithm.

ABSTRACT

We present a theory for constructing linear black box approximations to face recognition algorithms and empirically demonstrate that a surprisingly diverse set of face recognition approaches can be approximated well using a linear model. The construction of the linear model to a face recognition algorithm involves embedding of a training set of face images constrained by the distances between them, as computed by the face recognition algorithm being approximated. We accomplish this embedding by iterative majorization, initialized by classical multidimensional scaling (MDS). We empirically demonstrate the adequacy of the linear model using six face recognition algorithms, spanning both template based and feature based approaches on standard face recognition benchmarks such as the Facial Recognition Technology (FERET) and Face Recognition Grand Challenge (FRGC) data sets.
The experimental results show that the average Error in Modeling for the six algorithms is 6.3% at 0.001 False Acceptance Rate (FAR) for the FERET fafb probe set, which contains the maximum number of subjects among all the probe sets. We demonstrate the usefulness of the linear model for algorithm dependent indexing of face databases and find that it results in more than 20 times reduction in face comparisons for the Bayesian Intra/Extra-class person classifier (BAY), the Elastic Bunch Graph Matching algorithm (EBGM), and the commercial face recognition algorithms. We also propose a novel paradigm to reconstruct face templates from match scores using the linear model and use the reconstructed templates to explore the security breach in a face recognition system. We evaluate the proposed template reconstruction scheme using three fundamentally different face recognition algorithms: Principal Component Analysis (PCA), the Bayesian Intra/Extra-class person classifier (BAY), and a feature based commercial algorithm. With an operational point set at 1% False Acceptance Rate (FAR) and 99% True Acceptance Rate (TAR) for 1196 enrollments (FERET gallery), we show that at most 600 attempts (score computations) are required to achieve a 73%, 72% and 100% chance of breaking in as a randomly chosen target subject for the commercial, BAY and PCA based face recognition systems, respectively. We also show that the proposed reconstruction scheme has a 47% higher probability of breaking in as a randomly chosen target subject for the commercial system as compared to a hill climbing approach with the same number of attempts.

CHAPTER 1 INTRODUCTION

In the past few decades, research in biometric applications has drawn a lot of attention from the computer science community, with much success. Initially, the focus was centered on the development of novel algorithms for biometric template matching in various challenging environments.
Fusion of various biometric modalities is also a major topic of research, to compensate for the performance of individual biometrics in non-cooperative environments. Recently, with cheaper biometric sensors and the high accuracy of many biometric modalities, various advanced yet practical research topics in biometric authentication systems have been proposed and discussed at some length. Efficient storage and retrieval of biometric templates, fast and non-intrusive biometric sensors, secure transmission of biometric templates over networks, biometric system robustness against different types of hacking mechanisms, and user privacy are among several advanced topics in biometrics applications. Due to the uniqueness of biometric templates and the inability to reissue them, template security and user privacy are the major hurdles in wide-scale deployment of biometric sensors in real-time applications, and these advanced topics play a significant role in biometric applications.

In this dissertation, we propose a novel linear modeling mechanism for a biometric authentication system and unveil the strength of such a linear model to help in indexing biometric templates, identifying possible threats to biometric systems, and template reconstruction. A broad look at biometrics related research is depicted in Fig. 1.1, along with pointers to the contributions from this dissertation in the related areas. In this dissertation, we have used face images as the primary biometric modality and several face recognition systems to demonstrate our ideas, but the theoretical findings of this dissertation can be extended to other biometric modalities too.

Figure 1.1 General Overview of Biometric Applications and Contribution from this Dissertation.

Among several biometrics, face images emerge as a strong contender for biometric applications for various reasons, such as low-cost deployment, public acceptance and convenient enrollment procedures. Intensive research has produced an amazingly diverse set of approaches for face recognition (see [23, 25, 70] for excellent reviews). The approaches differ in terms of the features used, distance measures used, need for training, and matching methods. Systematic and regular evaluations such as the FERET (Facial Recognition Technology) [40], FRGC (Face Recognition Grand Challenge) [41] and Face Recognition Vendor Test [35] have enabled us to identify the top performing approaches.

In general, a face recognition algorithm is a module that computes the distance (or similarity) between two face images. Just as linear systems theory allows us to characterize a system based on inputs and outputs, we seek to characterize a face recognition algorithm based on the distances (the outputs) computed between two faces (the inputs). Can we model the distances, $d_{ij}$, computed by any given face recognition algorithm, as a function of the given face images, $x_i$ and $x_j$ (actually the row-scanned vector representations of the images)? Mathematically, what is the function $f$ such that the error $\left( d_{ij} - \lVert f(x_i) - f(x_j) \rVert \right)^2$ is minimized? This problem is depicted in Fig. 1.2.

Figure 1.2 Approximating Face Recognition Algorithms by Linear Models. Distance Between Two Face Images Observed by (a) Original Face Recognition Algorithm (b) Linear Model.

Apart from sheer intellectual curiosity, the answer to this question has some practical benefits. First, it would let us have a better understanding of the algorithms than just the recognition rates. For instance, if $f$ is an identity operator then it would suggest that the underlying face recognition algorithm is essentially performing a rigid rotation and translation to the face representations, similar to principal component analysis (PCA).
If $f$ is a linear operator then it would suggest that the underlying algorithms can be approximated fairly well by a linear transformation (rotation, shear, stretch) of the face representations. Second, if a linear approximation can be built then it can be used to reconstruct face templates just from scores. We have demonstrated this ability in [33]. This has serious security and privacy implications. Third, we can use the linear approximation to build efficient indexing mechanisms for face images. This is particularly important for identification scenarios where one has to perform one-to-many matches, especially using a computationally expensive face recognition algorithm. While there are efficient indexing mechanisms for fingerprints based on minutiae points, such mechanisms are lacking for faces because there are no such global features for face images. A possible fourth application could be the modeling of networked multi-biometric systems, where the linear model of the face recognition algorithm can be used to construct closed-form analytical models.

Specifically, we consider $f$'s that are affine transformations, defining a linear subspace spanned by possibly non-orthogonal vectors. To arrive at this model we need a set of face images (training set) and the distances between these face images as computed by the face recognition algorithm. Thus, we treat the algorithm as a black box. For computational reasons, we decompose the overall transformation into two parts: a rigid transformation, which can be obtained by any orthogonal subspace finding algorithm such as principal component analysis, and a non-rigid, affine, transformation. Note that the dimensions of the overall transformation need not be orthonormal. To construct the affine subspace we embed the training set of face images constrained by the distances between them, as computed by the face recognition algorithm being approximated.
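
As a compact restatement of the modeling problem posed above, the objective and the two-part decomposition can be written as follows. The symbols $A$, $A_r$ and $A_{nr}$ (overall, rigid and non-rigid transformations) are notation assumed here for illustration only and are not defined in this chapter; any constant offset of the affine map cancels in the difference $f(x_i) - f(x_j)$, so it is omitted.

```latex
\begin{align}
  f(x) &= A\,x, \qquad A = A_{nr}\,A_{r}, \\
  \hat{A} &= \arg\min_{A} \sum_{i<j}
      \Bigl( d_{ij} - \bigl\lVert A\,(x_i - x_j) \bigr\rVert \Bigr)^{2}.
\end{align}
```

Here $d_{ij}$ is the distance reported by the face recognition algorithm for training images $x_i$ and $x_j$, so the sum is the squared modeling error of the text accumulated over all training pairs.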
We accomplish this distance preserving embedding by iterative majorization, initialized by classical multidimensional scaling (MDS) [13, 19]. This process results in a set of coordinates for the training images. The affine transformation defines the relationship between these embedding coordinates and the rigid (PCA) space coordinates.

We analyze our proposed linear modeling scheme on some of the popular face recognition algorithms, namely Eigenfaces (PCA + distance metrics) [58], Linear Discriminant Analysis (LDA) [7], the Bayesian Intra/Extra-class person classifier [31], Elastic Bunch Graph Matching (EBGM) [67], Independent Component Analysis (ICA) [18] and one of the top performing commercial face recognition algorithms. The choice of face recognition algorithms includes template based approaches such as PCA, LDA, ICA and BAY, and feature based ones such as EBGM and a commercial algorithm. We use a subset of the FRGC [41] training set as our training set and test the quality of the approximation on the FERET [40] and the FRGC [41] data sets.

We use the proposed linear modeling scheme to perform the binning or algorithm specific indexing task with little overhead in computation. Of course, for algorithms such as PCA, LDA and ICA, which use a linear projection of the raw template, the model will result in no additional computational advantage. However, for algorithms such as BAY and EBGM, where numerical indexing of templates is not feasible, indexing through a linear model can reduce the overall computational complexity by selecting only a subset of gallery images to be matched with the probe image. With the help of the proposed modeling scheme, we demonstrate that the BAY algorithm requires at most 8 comparisons to achieve similar rank-1 performance as compared to using the complete probe set, which requires 1195 comparisons in the case of the fafb probe set.
Similarly, for the other two algorithms (EBGM and COM), at most 50 comparisons are sufficient to achieve similar identification performance at 0.01% error rate for rank-1 as well as rank-5 identification.

Using the proposed linear model approach to face recognition algorithms, we also unveil a possible security and privacy breach in a face recognition system. We demonstrate that the linear model built from match scores can be successfully used for face template reconstruction and eventually makes the system vulnerable to break-in. Using the match scores between face images from a local database, we build the corresponding affine transformation model of the face recognition algorithm. Note that the face recognition system is treated as a complete black box and we do not perform any reverse engineering on the recognition system. The assumption of the knowledge of the face recognition algorithm is a weak one. It might even be possible to identify the recognition algorithm from the score matrices of known algorithms. The modeling of the recognition system is an off-line procedure, and needs to be carried out only once for a given recognition algorithm. Once we have built such a model, we present the templates from our break-in set to the system to be broken and observe the match scores to an assumed identity. So, in real-time scenarios, our proposed method only requires access to a set of match scores, equal in number to the images in the break-in set. These match scores are then used to embed the unknown target template in the modeled affine space. Finally, we use the inverse of the affine transformation to reconstruct the unknown target template in the original image space. We validate our proposed template reconstruction scheme on three different types of face recognition systems using two standard public databases, Facial Recognition Technology (FERET) [40] and Face Recognition Grand Challenge (FRGC) [41].
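
The two-stage procedure described above (an off-line model built from distances within the break-in set, then an on-line embedding of the target from its match scores, followed by inversion of the affine map) can be illustrated with a minimal sketch. It rests on several assumptions that are not part of this dissertation: the black-box matcher is replaced by a hypothetical hidden linear map `M`, only classical MDS is used (the iterative majorization refinement is omitted), and all names (`classical_mds`, `black_box_distance`, `breakin`, `x_hat`) are illustrative.

```python
import numpy as np

def classical_mds(D, k):
    """Embed n points in k dimensions from an n x n distance matrix D."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n        # centering matrix
    B = -0.5 * J @ (D ** 2) @ J                # double-centered Gram matrix
    w, V = np.linalg.eigh(B)                   # eigenvalues in ascending order
    top = np.argsort(w)[::-1][:k]              # indices of the k largest
    return V[:, top] * np.sqrt(np.clip(w[top], 0.0, None))

rng = np.random.default_rng(0)
p, n = 5, 30                                   # template size, break-in set size

# Hypothetical stand-in for the black-box matcher: a hidden linear map M.
M = np.eye(p) + 0.1 * rng.normal(size=(p, p))

def black_box_distance(a, b):
    return float(np.linalg.norm(M @ (a - b)))

# Off-line step: model the matcher from distances within the break-in set.
breakin = rng.normal(size=(n, p))
D = np.array([[black_box_distance(a, b) for b in breakin] for a in breakin])
Y = classical_mds(D, p)                        # embedding coordinates (n x p)
mu = breakin.mean(axis=0)
A_T, *_ = np.linalg.lstsq(breakin - mu, Y, rcond=None)
A = A_T.T                                      # fitted model: y = A (x - mu)

# On-line step: one score per break-in template against the unknown target.
target = rng.normal(size=p)
delta = np.array([black_box_distance(target, b) for b in breakin])

# Embed the target by solving 2 Y y = (|y_i|^2 - mean) - (delta_i^2 - mean),
# which follows from |y - y_i|^2 = delta_i^2 when the rows of Y are centered.
sq = np.sum(Y ** 2, axis=1)
rhs = (sq - sq.mean()) - (delta ** 2 - np.mean(delta ** 2))
y_t, *_ = np.linalg.lstsq(2.0 * Y, rhs, rcond=None)

# Invert the fitted map to return to template space.
x_hat = mu + np.linalg.pinv(A) @ y_t
print(np.linalg.norm(x_hat - target))          # reconstruction error
```

Because the stand-in matcher is exactly linear and full rank, the reconstruction in this toy setting is essentially exact; for a real, non-linear algorithm the model is only an approximation and the reconstruction error would depend on how well the affine map fits.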
Two template based algorithms, namely principal component analysis with the cosine distance measure [59] (widely accepted as a baseline algorithm) and Moghaddam and Pentland's algorithm, popularly known as the Bayesian Intra/Extra-class person classifier (BAY), with maximum likelihood (ML) estimation [31], and a feature based commercial face recognition system are used for this experiment. At a cursory look, match scores from a biometric system may not appear to be a weak link in terms of security and privacy; however, with our proposed approach we reveal that even match scores carry sufficient information for reverse engineering of the original templates and should be protected in the same way as the original templates.

The rest of the dissertation is organized as follows. In Chapter 2, we review the existing work related to approximating biometric systems with linear models. We also discuss the existing approaches to indexing biometric templates and the inadequacy of such methods for indexing face templates. We present an extensive comparison of the existing hill climbing approach with our proposed template reconstruction method and briefly review the security and privacy issues in biometric authentication systems. In Chapter 3, we present our proposed linear modeling approach to face recognition algorithms along with the experimental setup and results. Chapter 4 discusses the indexing of face templates using the proposed linear scheme. In Chapter 5 we present the details of the face template reconstruction scheme, followed in Chapter 6 by a possible use of reconstructed templates to break into face recognition systems. Chapter 7 concludes the dissertation with an elaborate discussion of future research directions related to the findings of this dissertation.
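
The indexing use of the linear model outlined earlier in this chapter (short-list a few gallery templates in the model space, then run the expensive matcher only on that short list) can be sketched as below. This is only an illustration under stated assumptions: the matrix `A` is a random stand-in for a previously fitted linear model, `expensive_distance` is a hypothetical placeholder for an algorithm such as BAY or EBGM, and the function names `shortlist` and `identify` are not from this dissertation.

```python
import numpy as np

def shortlist(A, gallery, probe, K):
    """Indices of the K gallery templates closest to the probe in model space."""
    d = np.linalg.norm((gallery - probe) @ A.T, axis=1)
    return np.argsort(d)[:K]

def identify(probe, gallery, A, K, expensive_distance):
    """Run the expensive matcher only on the K short-listed templates."""
    candidates = shortlist(A, gallery, probe, K)
    scores = [expensive_distance(probe, gallery[i]) for i in candidates]
    return int(candidates[int(np.argmin(scores))]), len(scores)

rng = np.random.default_rng(1)
gallery = rng.normal(size=(200, 16))           # 200 enrolled templates
A = rng.normal(size=(4, 16))                   # stand-in for a fitted linear model
probe = gallery[37] + 0.01 * rng.normal(size=16)

def expensive_distance(a, b):                  # hypothetical costly matcher
    return float(np.linalg.norm(a - b))

best, n_comparisons = identify(probe, gallery, A, K=10,
                               expensive_distance=expensive_distance)
print(best, n_comparisons)
```

In this toy setting the expensive matcher is called 10 times instead of 200; the trade-off between the short-list size K and the chance of missing the true match is what an indexing error rate would measure.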
CHAPTER 2 RELATED WORK

2.1 Linear Models for Biometric Algorithms

As far as we know, there is no directly related work that considers the face recognition algorithm modeling problem as we have posed it. Perhaps the closest works are those that use multidimensional scaling (MDS) to derive models for standard classifiers, such as nearest neighbor, linear discriminant analysis, and linear programming, from the dissimilarity scores between objects [38]. Pekalska et al. [38] propose a similar embedding scheme for the dissimilarity information of different objects in a Euclidean or pseudo-Euclidean space, depending on whether the original distance matrix has the Euclidean property; unknown objects are then projected to the embedded space and classified with a Euclidean distance measure. A similar framework is also suggested by Roth et al. [48], where pairwise distance information is embedded in Euclidean space and an equivalence is drawn between several clustering approaches and similar distance based learning approaches.

There are also studies that statistically model similarity scores so as to predict the performance of the algorithm on large data sets based on results on small data sets. For instance, Wang et al. [64] recently proposed a novel approach to model and predict the performance of a face recognition algorithm based on analysis of similarity scores. Using similarity scores from gallery images only, the intrinsic parameters associated with the recognition algorithm are optimized to find the best similarity scores between the gallery images; this set of parameters is then used to predict the matching performance for query images. Using the principal component analysis based face recognition algorithm with different distance measures, encouraging results have been reported with the FRGC and the FERET experiments.
Such approaches require extensive knowledge of the parameters of the underlying face recognition algorithms which, in certain cases, are difficult to access, as with commercial or proprietary algorithms. Mitra et al. [30] used a random effect model to predict the performance of a face recognition system using the match scores from a small subset of the original data set. The match scores, or a monotonic transformation of the match scores, between multiple templates of a subject are characterized with a random effect model along with variables associated with the face image. A Bayesian estimation of these parameters is then incorporated to predict the set of parameters associated with each subject. Using three different face recognition algorithms, the authors reported that the equal error rate for all three algorithms can be predicted accurately up to the first decimal place. Observing the nature of the random effect model, one can infer that such an approach requires a significant number of face templates from each user in the training set to estimate the prior distribution of the parameters as well as to compute the conditional posterior of the unknown parameters using Gibbs sampling. Also, due to the nonlinear nature of the proposed model, along with the number of parameters that need to be estimated for each individual, the computational cost of such a model is directly proportional to the number of subjects in the data sets. Wang and Bhanu [65] proposed an expectation maximization (EM) based approach to predict the performance of a face recognition system and find the smallest gallery set required for such models to accurately predict the performance of a system. A similar approach was also presented in [24] to build a model of the cumulative match characteristic (CMC) curve from similarity scores.
Grother and Phillips [21] also proposed a joint density function to independently predict match scores and non-match scores from a set of match scores. Apart from face recognition systems, similar methods were also proposed to model and predict performance for other biometric modalities and object recognition [14, 26].

A few philosophical distinctions exist between our work and these related works. First, unlike these works, which try to statistically model the scores, we estimate an analytical model that characterizes the underlying face manifold and build a linear transformation from the original template to this global manifold. Second, unlike some of these methods, we do not place any restrictions on the distribution of scores in the training set, such as separation between the match score distribution and the non-match score distribution. Third, we empirically demonstrate the quality of the model under a very strict experimental framework with a complete separation of train and test.

2.2 Biometric Template Reconstruction and Its Impact on Security Issues

The increasing demand for biometric technologies can be well justified by their advantages over password or smart card based technologies, such as user convenience, high security and less fraud. However, like many other authentication technologies, biometric based systems also possess points vulnerable to security breaches [42]. The cost of replacing a biometric token or template is higher than that of a password or a smart card, with severe security and privacy implications. The templates can be reused over digital networks or can be used to reproduce synthetic biometric templates such as fake fingers or model faces [23, 28]. In the case of face templates, there is an additional risk that the identity of a person using a biometric access system in a highly secure facility can be revealed.
Several authors have successfully pointed out various sources of security breaches in biometric based authentication systems [22, 50]. Lately, some counter measures have also been proposed to nullify such threats [44, 53, 61], and the standardized biometric application programming interface (BioAPI) has been continuously updated with counter measure guidelines such as encrypting templates, avoiding storage and transmission of original templates, and quantizing match scores [9]. Recently, cancelable biometrics have been proposed to encrypt both gallery and probe templates in an attempt to restrict unauthorized access to biometric templates [45]. Similarly, in [15], Boult proposed revocable biometrics with a robust distance measure, where face templates are encrypted along with a redistribution of match scores within a predefined range for each subject in the gallery set. These recent developments further indicate that, without encryption of templates, biometric based authentication systems are vulnerable to breaches of security and of user privacy.

In general, most biometric authentication systems have four major modules [10]: a biometric template acquisition sensor, a matching module to compare a new template to an enrolled template, a decision module using predefined thresholds for particular operational points, and a database of enrolled templates (gallery). In many applications, it is not possible to integrate all these modules into one unit. In such scenarios, the information from one unit to the other is passed through digital channels and/or stored in digital media for off-line processing. As reported by many authors [23, 44], each of these modules carries a different level of security threat, and different counter measures are necessary to nullify such threats. For instance, liveness detection at the sensor unit will detect any attempts to hack the system with synthetic templates.
Similarly, a secure database or a secure digital channel will prevent any unauthorized access to templates over a network. In this dissertation, we make a successful attempt to explore one such point of vulnerability, between the matching module and the decision module. In applications where the matching module and decision module are not integrated together, we need to store the match scores in digital media or transmit the match scores through a digital channel to the decision module [34]. This scenario can arise in distributed network biometric systems with a central decision unit. Such networks can arise in wide area monitoring contexts. We pose the question: Can unauthorized access to these match scores result in security and privacy breaches?

2.2.1 Hill Climbing Based Attacks

Soutar [52] was the first to propose an iterative template adaptation scheme, popularly known as the hill climbing attack, to break into a biometric system based on match scores. The proposed scheme attacks the account of a specific subject, referred to as the target, by starting from an arbitrary face template and iteratively refining it. At every iteration, if the modified template results in a better score than the previous match score then the modified template is retained, or else it is discarded. The process is iterated until the template is accepted as the target subject. The basic block diagram of the hill climbing attack is shown in Fig. 2.1 (a). Note that with this method one might break into a system using a final template that does not look like any face, as long as it fools the system. In other words, it is not a face reconstruction method but rather a break-in strategy. Though Soutar did not report any quantitative results of biometric template reconstruction, good performance of similar approaches has been reported by several others [1, 60].

One counter measure for the first generation of hill climbing approaches is to quantize the scores.
With appropriate quantization it will not be possible to get the incremental feedback needed by these approaches.

Figure 2.1 Schematic of Processes in (a) Hill Climbing Attack and (b) the Proposed Model Based Approach. The Hill Climbing Attack is an Iterative Process that Starts From a Face Template and Then Iteratively Updates the Template until an Accept Decision is Generated by the System. The Proposed Approach is a Model Based, One-shot Method, Employing Multiple Face Templates, the Break-in Set.

So, Adler [1] proposed a modified hill climbing attack for a face recognition system with quantized match scores using an additional independent set of Eigenfaces. Recognition systems that output quantized match scores do not alter the match scores for small changes in the input images, which can prevent the first generation of hill climbing attacks. After initializing the process with an arbitrary face template, at every iteration, the previously updated template is multiplied with randomly selected Eigenfaces having different weights. This is expected to generate templates farther away from the previous template. The face template which results in a better match score is retained as the updated image for the next iteration. The process terminates when there is no further improvement in match scores. Experimental results on a commercial face recognition algorithm show that after nearly 4000 attempts a high match score is achieved with 99% confidence. Later, Adler [2] extended this idea to work with encrypted face templates.

Security breaches are possible not only for face biometrics but for other biometrics too. Uludag and Jain [60] extended the hill climbing attack idea to break into minutiae-based fingerprint recognition algorithms. Initially, random minutiae templates are created and matched against the target user by the fingerprint matching system.
The best matched template is then used to generate another set of minutiae templates by randomly adding and deleting existing minutiae. The iteration is continued until the system accepts the template. The authors reported that all 160 enrolled accounts could be broken with fewer than 1000 attempts for each account. Lopresti and Raim [27] proposed an attack on an online handwriting recognition system by randomly generating feature vectors through a generative model of human handwriting. A set of different text samples from the enrolled users were fed to the generative model. With a few text samples from the enrolled users, the model reproduced different text templates through random partition and concatenation of the input text until a template was accepted as a successful match. Preliminary results show that this attack succeeded 49% of the time.

Although hill climbing based attacks can quite successfully break a particular target account, effective countermeasures for such attacks can also be devised. One property of hill climbing based attacks is that they require a large number of attempts before success. Hence, one possible countermeasure is to restrict the number of consecutive unsuccessful attempts. However, this still leaves the system vulnerable to a spyware-based attack that interlaces its false attempts with attempts by genuine users (successful attempts) and collects information to iterate over a period of time. In most hill climbing based attacks, however, the templates at the $i$-th attempt (iteration) are generated from the $(i-1)$-th attempt and are very similar to each other. Hence, if we monitor all unsuccessful attempts for a particular target account within a fixed time interval, we will discover a pattern of similar faces with decreasing dissimilarity scores (see Fig. 2.2).
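The accept-or-discard loop described above, and the trail of decreasing dissimilarity scores it leaves behind, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the function name `hill_climb`, the Gaussian perturbation, and the toy matcher (Euclidean distance to a hidden target vector) are hypothetical stand-ins, not the schemes of Soutar, Adler, or Uludag and Jain.

```python
import numpy as np

def hill_climb(match_score, start, accept_threshold, step=0.05,
               max_iters=5000, seed=0):
    """Toy hill climbing attack (illustrative only).

    match_score returns a dissimilarity (lower is better). A random
    perturbation is kept only if it lowers the score, so the recorded
    history of retained scores is strictly decreasing.
    """
    rng = np.random.default_rng(seed)
    template = start.copy()
    best = match_score(template)
    history = [best]
    for _ in range(max_iters):
        if best <= accept_threshold:      # system would accept the template
            break
        candidate = template + step * rng.standard_normal(template.shape)
        score = match_score(candidate)
        if score < best:                  # retain only improving templates
            template, best = candidate, score
            history.append(best)
    return template, history

# Hypothetical matcher: Euclidean distance to a hidden target template.
rng = np.random.default_rng(1)
target = rng.standard_normal(16)
matcher = lambda t: float(np.linalg.norm(t - target))
start = rng.standard_normal(16)
final_template, history = hill_climb(matcher, start, accept_threshold=1.0)
```

The `history` list is exactly the kind of pattern a monitor of unsuccessful attempts could look for: consecutive templates that are near each other, with monotonically decreasing dissimilarity to one target account.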
So, continuous observation of unsuccessful match scores will help to detect hill climbing based spyware attacks. In this work, we expose a more severe form of vulnerability, for which such countermeasures will be hard to design, since we use scores from distinct face images with no obvious patterns in the scores.

In Fig. 2.2, we present a schematic visualization of the search process to illustrate the differences between a hill climbing attack and our proposed linear scheme. Our algorithm requires the distances or scores between the target and a set of faces from the break-in set that is distributed throughout the space. While a hill climbing based attack computes scores for faces along a trajectory of incremental scores from an arbitrary template to the target template, there are no obvious patterns in the scores needed by our approach; hence, the proposed scheme is not incrementally iterative. As discussed earlier, the statistically decreasing dissimilarity scores generated by a hill climbing based approach can indeed be used to detect such attacks, but a similar strategy cannot be applied to our proposed method.

Figure 2.2 Visualization of the Search Process of a Hill Climbing Attack and the Proposed Model Based Approach. The Dotted Line Represents One of the Possible Paths in a Hill Climbing Attack Starting from a Chosen Template. At Each Point on the Path, Distance to the Template is Needed. The Solid Arrow Represents One Time Comparison of Target Template with Members of Break-in Set Templates that is Needed by The Proposed Approach.

The hill climbing approach is considered a break-in strategy against a recognition system, whereas the proposed method is a template reconstruction scheme for any face recognition system. In our case, the break-in performance shows the accuracy of, and confidence in, the reconstructed templates.
As a result, the proposed algorithm has vulnerability implications for both the security and the privacy of the users. Also, the number of attempts in our break-in scheme is predefined by the number of images in the break-in set, which makes such attacks more feasible in real-time applications. In Chapter 6, we present a more systematic evaluation of our proposed method against hill climbing based attacks, showing the efficiency of our method both qualitatively, in terms of reconstructed templates, and quantitatively, in terms of the probability of breaking into a recognition system.

CHAPTER 3
MODELING FACE RECOGNITION ALGORITHMS

In order to model a particular algorithm from its outputs, we need to learn the underlying distribution of face images, also known as the manifold of face images, as induced by the specific algorithm. We also need a transformation to project new face images into the learned manifold. In this chapter, we present the mathematical derivation of the proposed affine transformation based modeling scheme for this manifold. Given a set of face images and the pairwise distances between these images, we first compute a point configuration that preserves these pairwise distances between projected points on the low dimensional manifold. We use stress minimization with iterative majorization to arrive at a point configuration from match scores between templates in the training set. Although the iterative majorization algorithm is guaranteed to converge from any starting point, an informative initial guess will reduce the number of iterations and speed up the process. We use classical multidimensional scaling for this purpose.

3.1 Derivation of the Linear Model

Notations: A few notational issues are in order. Let $d_{ij}$ be the distance between two images, $\mathbf{x}_i$ and $\mathbf{x}_j$ (row-scanned vector representations, $\mathbf{x}_i^T \in \mathbb{R}^N$), as computed by the given face recognition algorithm.
Here we assume that the face recognition algorithm outputs the dissimilarity scores of two templates. However, if a recognition algorithm computes similarities instead of distances, we can always convert the similarity scores $s_{ij}$ into distances using a variety of transformations, such as $(1 - s_{ij})$, $-\log(s_{ij})$, or $\frac{1}{s_{ij}} - 1$. These distances can be arranged as a $K \times K$ matrix $\mathbf{D} = [d_{ij}^2]$, where $K$ is the number of images in the training set. In this work, we denote matrices by bold capital letters, $\mathbf{A}$, and column vectors by bold small letters, $\mathbf{a}$. We denote the identity matrix by $\mathbf{I}$, a vector of ones by $\mathbf{1}$, a vector of zeros by $\mathbf{0}$, and the transpose of $\mathbf{A}$ by $\mathbf{A}^T$. For each image, we would like to find vectors $\mathbf{y}_i$ such that $\|\mathbf{y}_i - \mathbf{y}_j\| = f(d_{ij})$, where $f(\cdot)$ is a monotonic function and $\mathbf{y}_i^T \in \mathbb{R}^M$.

Figure 3.1 Modeling Face Recognition Algorithms. Starting with a Given Set of Training Face Images, X, We Compute the Pairwise Dissimilarities $d_{ij}$ Between these Images Using the Underlying Face Recognition Algorithm. We Convert the Pairwise Dissimilarities to an Equivalent Euclidean Distance Matrix and then Use the Stress Minimization Method to Arrive at the Model Space. The Underlying Algorithm is then Modeled by a Linear Transformation A which Transforms the Input Images X to Points of Configuration Y in the Model Space.

3.1.1 Computing Point Configuration

The objective is to find a point configuration $\mathbf{Y} = [\mathbf{y}_1, \mathbf{y}_2, \ldots, \mathbf{y}_K]$ such that the squared error in distances $\sum (d_{ij} - \delta_{ij})^2$ is minimum, where $d_{ij}$ is the distance computed between face templates $\mathbf{x}_i$ and $\mathbf{x}_j$, and $\delta_{ij}$ is the Euclidean distance between configuration points $\mathbf{y}_i$ and $\mathbf{y}_j$. The objective function can be written as

$$\text{minimize} \; \sum_{i} \sum_{j} w_{ij} \, (d_{ij} - \delta_{ij})^2 \tag{3.1}$$

where $w_{ij}$ are weights chosen appropriately. The incorporation of weights $w_{ij}$ in Eqn.
3.1 is a generalization of the objective function and can be associated with the confidence in the match scores, if the algorithm provides one. The weights can also be useful when a few match scores are missing, in which case the weights for those entries can be set to zero. If there are no confidence values associated with the similarity scores, then all the weights can be the same. In our experiments, the weights are all equal and set to 1. For generality, we develop the theory based on weighted scores. Let, with $\delta_{ij} = \|\mathbf{y}_i - \mathbf{y}_j\|$ the Euclidean distance between configuration points,

$$S(\mathbf{Y}) = \sum_{i<j} w_{ij} (d_{ij} - \delta_{ij})^2 = \sum_{i<j} w_{ij} d_{ij}^2 + \sum_{i<j} w_{ij} \delta_{ij}^2 - 2 \sum_{i<j} w_{ij} d_{ij} \delta_{ij} = \eta_d^2 + \eta^2(\mathbf{Y}) - 2\rho(\mathbf{Y}) \tag{3.2}$$

where $\eta_d^2$ is independent of the point configuration $\mathbf{Y}$ and

$$\eta^2(\mathbf{Y}) = \sum_{i<j} w_{ij} (\mathbf{y}_i - \mathbf{y}_j)^T (\mathbf{y}_i - \mathbf{y}_j) = \mathrm{tr}(\mathbf{Y}^T \mathbf{V} \mathbf{Y}) \tag{3.3}$$

where

$$v_{ij} = -w_{ij} \quad \text{for } i \neq j \tag{3.4}$$

$$v_{ii} = \sum_{j=1, j \neq i}^{n} w_{ij}. \tag{3.5}$$

Similarly,

$$\rho(\mathbf{Y}) = \sum_{i<j} w_{ij} \frac{d_{ij}}{\delta_{ij}} \delta_{ij}^2 = \sum_{i<j} w_{ij} \frac{d_{ij}}{\delta_{ij}} (\mathbf{y}_i - \mathbf{y}_j)^T (\mathbf{y}_i - \mathbf{y}_j) = \mathrm{tr}(\mathbf{Y}^T \mathbf{U}(\mathbf{Y}) \mathbf{Y}) \tag{3.6}$$

where $[\mathbf{U}(\mathbf{Y})] = [u_{ij}]$ and

$$u_{ij} = \begin{cases} -w_{ij} \dfrac{d_{ij}}{\delta_{ij}} & \text{if } \delta_{ij} \neq 0 \text{ and } i \neq j \\ 0 & \text{otherwise} \end{cases}
\qquad
u_{ii} = \begin{cases} \sum_{j=1, j \neq i} w_{ij} \dfrac{d_{ij}}{\delta_{ij}} & \text{if } \delta_{ij} \neq 0 \\ 0 & \text{otherwise} \end{cases}$$

Hence, from Eqn. 3.2,

$$S(\mathbf{Y}) = \eta_d^2 + \mathrm{tr}(\mathbf{Y}^T \mathbf{V} \mathbf{Y}) - 2\, \mathrm{tr}(\mathbf{Y}^T \mathbf{U}(\mathbf{Y}) \mathbf{Y}) \tag{3.7}$$

The configuration points $\mathbf{Y}$ can be found by minimizing $S(\mathbf{Y})$ in many different ways. In this work, we consider the iterative majorization algorithm proposed by Borg and Groenen [13].
Let

$$T(\mathbf{Y}, \mathbf{Z}) = \eta_d^2 + \mathrm{tr}(\mathbf{Y}^T \mathbf{V} \mathbf{Y}) - 2\, \mathrm{tr}(\mathbf{Y}^T \mathbf{U}(\mathbf{Z}) \mathbf{Z}) \tag{3.8}$$

then $T \geq S$ and $T(\mathbf{Y}, \mathbf{Y}) = S(\mathbf{Y})$; hence $T(\mathbf{Y}, \mathbf{Z})$ majorizes $S(\mathbf{Y})$. So the optimal set of configuration points $\mathbf{Y}$ can be found as follows:

$$\frac{\partial T}{\partial \mathbf{Y}} = 0 \;\Rightarrow\; 2\mathbf{V}\mathbf{Y} - 2\mathbf{U}(\mathbf{Z})\mathbf{Z} = 0 \tag{3.9}$$

Hence, the iterative formula to arrive at the optimal configuration points can be written as

$$\mathbf{Y}^{(k)} = \mathbf{V}^{\dagger}\, \mathbf{U}(\mathbf{Y}^{(k-1)})\, \mathbf{Y}^{(k-1)} \tag{3.10}$$

where $\mathbf{Y}^{(0)}$ is initialized to some random configuration points and $\mathbf{V}^{\dagger}$ represents the pseudoinverse of $\mathbf{V}$.

3.1.2 Choice of Initial Point Configuration

Although the iterative solution presented in Eqn. 3.10 can be initialized with any random starting configuration, an appropriate guess will reduce the number of iterations needed to find the optimal configuration points. We initialize the iterative algorithm with a set of configuration points derived by applying classical multidimensional scaling to the original distance matrix. Classical multidimensional scaling works well when the distance measure is a metric or, more specifically, when the distance matrix is Euclidean. Therefore, we first compute an approximate Euclidean distance matrix $\mathbf{D}_E$ from the original distance matrix $\mathbf{D}$, followed by the derivation of initial configuration points using classical multidimensional scaling adapted from Cox and Cox [19].

Given the original distance matrix $\mathbf{D}$, we first check whether it satisfies the Euclidean distance properties. If any such property is violated, then we replace the original distance matrix $\mathbf{D}$ with an equivalent distance matrix $\mathbf{D}_E$. The term equivalent is used in the sense that the overall behavior under both distance matrices, $\mathbf{D}$ and $\mathbf{D}_E$, remains the same. For example, in our case, adding a constant to all the entries of the original distance matrix $\mathbf{D}$ does not change the rank ordering of the match scores and hence does not alter the recognition performance of a face recognition system.
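The majorization update of Eqn. 3.10 can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions rather than the implementation used in the experiments: configuration points are stored as rows of `Y` (so that the products in Eqns. 3.3 and 3.6 are dimensionally consistent), `W` is assumed symmetric with a zero diagonal, and the function names `pairwise`, `stress`, and `majorize` are hypothetical.

```python
import numpy as np

def pairwise(Y):
    """Euclidean distances delta_ij between the rows of Y (K x M)."""
    diff = Y[:, None, :] - Y[None, :, :]
    return np.sqrt(np.sum(diff ** 2, axis=-1))

def stress(Y, D, W):
    """S(Y) = sum_{i<j} w_ij (d_ij - delta_ij)^2, as in Eqn. 3.2."""
    iu = np.triu_indices(len(D), k=1)
    return float(np.sum(W[iu] * (D[iu] - pairwise(Y)[iu]) ** 2))

def majorize(D, W, Y0, tol=1e-3, max_iter=500):
    """Iterate Y(k) = pinv(V) U(Y(k-1)) Y(k-1), as in Eqn. 3.10."""
    V = -W.astype(float)                  # v_ij = -w_ij for i != j
    np.fill_diagonal(V, 0.0)
    np.fill_diagonal(V, -V.sum(axis=1))   # v_ii = sum_{j != i} w_ij
    V_pinv = np.linalg.pinv(V)
    Y = Y0.copy()
    for _ in range(max_iter):
        delta = pairwise(Y)
        with np.errstate(divide="ignore", invalid="ignore"):
            ratio = np.where(delta > 0, D / delta, 0.0)
        U = -W * ratio                    # u_ij = -w_ij d_ij / delta_ij
        np.fill_diagonal(U, 0.0)
        np.fill_diagonal(U, -U.sum(axis=1))
        Y = V_pinv @ U @ Y
        if stress(Y, D, W) < tol:         # tolerance epsilon from the text
            break
    return Y

# Toy usage: recover a 2-D configuration from its own distance matrix.
rng = np.random.default_rng(0)
points = rng.standard_normal((12, 2))
D_toy = pairwise(points)
W_toy = np.ones_like(D_toy) - np.eye(len(D_toy))
Y_start = rng.standard_normal((12, 2))
Y_fit = majorize(D_toy, W_toy, Y_start)
```

Because each update minimizes a function that majorizes the stress, `stress(Y_fit, ...)` should not exceed `stress(Y_start, ...)`; a random start is used here only to keep the sketch self-contained, whereas the text initializes with classical multidimensional scaling.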
Although there are several possible approaches to verify the Euclidean property of a given distance matrix [37], we utilize the following theorem proposed by Gower and Legendre [20] to check the Euclidean property.

Theorem 3.1.1 A distance matrix $\mathbf{D}$ is Euclidean if and only if $\mathbf{B} = -\frac{1}{2} \mathbf{H} \mathbf{D}^2 \mathbf{H}$ is a symmetric and positive semidefinite matrix, where $\mathbf{H}$ is the centering matrix.

If the original distance matrix $\mathbf{D}$ is not Euclidean, as is the case for most face recognition algorithms, then we use the following propositions to derive an equivalent Euclidean distance matrix $\mathbf{D}_E$ from $\mathbf{D}$. In the context of biometric applications, we assume that the distance between a template and itself is zero and that all the distances are positive, since negative distances can be shifted to another interval to ensure positivity without affecting the overall performance. Therefore, the possibility of the distance matrix $\mathbf{D}$ not satisfying the Euclidean property arises mainly from two factors: $\mathbf{D}$ violates the positive semidefinite condition of Theorem 3.1.1, and/or $\mathbf{D}$ does not satisfy the triangle inequality. In such scenarios, the following propositions are useful to restore the triangle inequality and the positive semidefinite property in the equivalent matrix $\mathbf{D}_E$. A detailed discussion on deriving the equivalent Euclidean distance matrix $\mathbf{D}_E$ can also be found in our earlier work [33].

Proposition 3.1.2 If $\mathbf{D}$ is nonmetric, then the matrix $[d_{rs} + c]$, $(r \neq s)$, is metric, where $c \geq \max_{i,j,k} |d_{ij} + d_{ik} - d_{jk}|$.

Theorem 3.1.3 If $\mathbf{D}$ is a metric distance, then there exists a constant $h$ such that the matrix with elements $(d_{ij}^2 + h)^{1/2}$, $i \neq j$, is Euclidean, where $h \geq -2\lambda_n$ and $\lambda_n$ is the smallest (negative) eigenvalue of $\mathbf{HDH}$, with $\mathbf{H} = (\mathbf{I} - \frac{1}{K} \mathbf{1}\mathbf{1}^T)$ [19, 20].

The steps involved in verifying the Euclidean property of $\mathbf{D}$ and modifying the original distances are outlined in Fig. 3.2.
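The check and the two corrections above can be sketched as follows. This is a hedged illustration, not the procedure of Fig. 3.2 verbatim: the function names are hypothetical, `D` is assumed to hold unsquared, symmetric distances with a zero diagonal, and the additive constant is computed from the smallest eigenvalue of $\mathbf{B} = -\frac{1}{2}\mathbf{H}\mathbf{D}^2\mathbf{H}$ (the matrix of Theorem 3.1.1), which is the form this sketch assumes for Theorem 3.1.3.

```python
import numpy as np

def gram_from_distances(D):
    """B = -1/2 H D^2 H with H = I - (1/K) 1 1^T (D^2 is element-wise)."""
    K = len(D)
    H = np.eye(K) - np.ones((K, K)) / K
    return -0.5 * H @ (D ** 2) @ H

def smallest_eigenvalue(D):
    B = gram_from_distances(D)
    return float(np.linalg.eigvalsh((B + B.T) / 2).min())

def is_euclidean(D, tol=1e-8):
    """Theorem 3.1.1: D is Euclidean iff B is positive semidefinite."""
    return smallest_eigenvalue(D) >= -tol

def make_metric(D):
    """Proposition 3.1.2: add c >= max |d_ij + d_ik - d_jk| off-diagonal.
    Builds a K x K x K array, so it is only suitable for small K."""
    c = np.max(np.abs(D[:, :, None] + D[:, None, :] - D[None, :, :]))
    Dm = D + c
    np.fill_diagonal(Dm, 0.0)
    return Dm

def make_euclidean(D):
    """Theorem 3.1.3 (assumed form): d_ij -> sqrt(d_ij^2 + h), h = -2*lambda_n."""
    lam = smallest_eigenvalue(D)
    if lam >= 0:
        return D.copy()
    De = np.sqrt(D ** 2 - 2.0 * lam)
    np.fill_diagonal(De, 0.0)
    return De

# Toy usage: a dissimilarity matrix that violates the triangle inequality.
D_bad = np.array([[0.0, 1.0, 5.0],
                  [1.0, 0.0, 1.0],
                  [5.0, 1.0, 0.0]])
D_fixed = make_euclidean(make_metric(D_bad))
```

Both corrections add a constant to the off-diagonal entries (of the distances, then of the squared distances), so they are monotonic and leave the rank ordering of the scores unchanged.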
This verification step also provides additional information about the dimension of the model space as well as the distance measure to be used in the model space. If the original distance measure is verified to be Euclidean, then we use the Euclidean distance measure in the model space; for all other distance measures, we use the dot product distance in our model space.

Figure 3.2 Steps to Compute Initial Configuration Points $\mathbf{Y}^{(0)}$. First, We Verify the Euclidean Property of the Original Distance Matrix $\mathbf{D}$ and then Compute an Equivalent Euclidean Distance Matrix $\mathbf{D}_E$ If Necessary. The Dimension of the Model Space is also Determined during this Process.

Figure 3.3 From Distance Metric to Points in Model Space. We Use Classical MDS to Compute Configuration Points from Distance Matrix $\mathbf{D}_E$.

3.1.3 Classical Multidimensional Scaling

Given the equivalent Euclidean distance matrix $\mathbf{D}_E$, the objective is to find $K$ vectors, $\{\mathbf{y}_1, \ldots, \mathbf{y}_K\}$, such that

$$\mathbf{D}_E(i, j) = (\mathbf{y}_i - \mathbf{y}_j)^T (\mathbf{y}_i - \mathbf{y}_j) \tag{3.11}$$

Note that the above configuration points $\mathbf{y}_i$ are not unique. Any translation or rotation of the vectors $\mathbf{y}_i$ is also a solution to Eqn. 3.11 (see Fig. 3.3). To reduce such degrees of freedom of the solution set, we constrain the solution set of vectors to be centered at the origin, so that the sum of the vectors is zero, i.e. $\sum_i \mathbf{y}_i = \mathbf{0}$.

Eqn. 3.11 can be compactly represented in matrix form as

$$\mathbf{D}_E = \mathbf{c}\mathbf{1}^T + \mathbf{1}\mathbf{c}^T - 2\mathbf{Y}^T\mathbf{Y} \tag{3.12}$$

where $\mathbf{Y}$ is the matrix constructed using the vectors $\mathbf{y}_i$ as its columns, $\mathbf{Y} = [\mathbf{y}_1, \ldots, \mathbf{y}_K]$, and $\mathbf{c}$ is a column vector of the squared magnitudes of the vectors $\mathbf{y}_i$. Thus

$$\mathbf{c} = [\mathbf{y}_1^T\mathbf{y}_1, \ldots, \mathbf{y}_K^T\mathbf{y}_K]^T \tag{3.13}$$

To simplify Eqn. 3.12, if we pre- and post-multiply each side of the equation by the centering matrix $\mathbf{H} = (\mathbf{I} - \frac{1}{K}\mathbf{1}\mathbf{1}^T)$, we have

$$\mathbf{B} = -\frac{1}{2}\mathbf{H}\mathbf{D}_E\mathbf{H} = \mathbf{Y}^T\mathbf{Y} \tag{3.14}$$

Since $\mathbf{D}_E$ is a Euclidean matrix, the matrix $\mathbf{B}$ is also a distance matrix, representing dot product distances between the vectors $\mathbf{y}_i$, and is a symmetric, positive semidefinite matrix [13, 19].
Thus the initial configuration points can be derived as

$$\mathbf{Y}^{(0)} = \left( \mathbf{V}^{M}_{EVD} \, (\mathbf{D}^{M}_{EVD})^{\frac{1}{2}} \right)^T \tag{3.15}$$

where $\mathbf{D}^{M}_{EVD}$ is an $M \times M$ diagonal matrix consisting of the $M$ nonzero eigenvalues of $\mathbf{B}$ and $\mathbf{V}^{M}_{EVD}$ represents the corresponding eigenvectors of $\mathbf{B}$.

3.1.4 Affine Transformation

So far we have seen how to find a set of coordinates, $\mathbf{Y}$, such that the Euclidean distance between these coordinates is related to the distances computed by the recognition algorithm by an additive constant. We now find an affine transformation, $\mathbf{A}$, that relates these coordinates, $\mathbf{Y}$, to the images, $\mathbf{X}$, such that

$$\mathbf{Y} = \mathbf{A}(\mathbf{X} - \boldsymbol{\mu}) \tag{3.16}$$

where $\boldsymbol{\mu}$ is the mean of the images in the break-in set, i.e. the average face. We do not restrict this transformation to be orthonormal or rigid. We consider $\mathbf{A}$ to be composed of two subtransformations: a nonrigid transformation, $\mathbf{A}_{nr}$, and a rigid transformation, $\mathbf{A}_r$, i.e., $\mathbf{A} = \mathbf{A}_{nr}\mathbf{A}_r$. The rigid part $\mathbf{A}_r$ can be arrived at by any analysis that computes an orthonormal subspace from the given set of training images. In this experiment, we use principal component analysis (PCA) for the rigid transformation. Let the PCA coordinates corresponding to the nonzero eigenvalues, i.e. the non-null subspace, be denoted by $\mathbf{X}_r = \mathbf{A}_r(\mathbf{X} - \boldsymbol{\mu})$. The nonrigid transformation, $\mathbf{A}_{nr}$, relates these rigid coordinates, $\mathbf{X}_r$, to the distance based coordinates, $\mathbf{Y}$:

$$\mathbf{Y} = \mathbf{A}_{nr}\mathbf{X}_r \tag{3.17}$$

Substituting Eqn. 3.15 in Eqn. 3.17, we have

$$\mathbf{A}_{nr}\mathbf{X}_r = \left( \mathbf{V}^{M}_{EVD} \, (\mathbf{D}^{M}_{EVD})^{\frac{1}{2}} \right)^T \tag{3.18}$$

Multiplying both sides of Eqn. 3.18 by $\mathbf{X}_r^T$ and using the result that $\mathbf{X}_r\mathbf{X}_r^T = \boldsymbol{\Lambda}_{PCA}$, where $\boldsymbol{\Lambda}_{PCA}$ is the diagonal matrix of the nonzero eigenvalues computed by PCA, we have

$$\mathbf{A}_{nr} = \left( \mathbf{V}^{M}_{EVD} \, (\mathbf{D}^{M}_{EVD})^{\frac{1}{2}} \right)^T \mathbf{X}_r^T \boldsymbol{\Lambda}_{PCA}^{-1} \tag{3.19}$$

This nonrigid transformation, which allows for shear and stress, and the rigid transformation, computed by principal component analysis, together model the face recognition algorithm.
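Eqns. 3.14 to 3.19 can be strung together in a short NumPy sketch. It is an illustration under assumptions, not the code used for the experiments: images are the columns of `X` (size N x K), `D_sq` holds squared Euclidean distances as in Eqn. 3.11, PCA is computed through an SVD, the stress minimization step between Eqn. 3.15 and Eqn. 3.17 is skipped, and the names `classical_mds`, `fit_affine_model`, and `squared_distances` are hypothetical.

```python
import numpy as np

def squared_distances(Y):
    """Squared Euclidean distances between the columns of Y (M x K)."""
    g = Y.T @ Y
    c = np.diag(g)
    return c[:, None] + c[None, :] - 2.0 * g      # Eqn. 3.12

def classical_mds(D_sq, rel_tol=1e-10):
    """Eqns. 3.14-3.15: Y0 = (V_M D_M^(1/2))^T from B = -1/2 H D_E H."""
    K = len(D_sq)
    H = np.eye(K) - np.ones((K, K)) / K
    B = -0.5 * H @ D_sq @ H
    evals, evecs = np.linalg.eigh((B + B.T) / 2)
    keep = evals > rel_tol * evals.max()          # the M nonzero eigenvalues
    return (evecs[:, keep] * np.sqrt(evals[keep])).T   # M x K

def fit_affine_model(X, Y):
    """Return (A, mu) with A = A_nr A_r so that Y ~ A (X - mu)."""
    mu = X.mean(axis=1, keepdims=True)
    Xc = X - mu
    P, S, _ = np.linalg.svd(Xc, full_matrices=False)
    keep = S > 1e-10 * S.max()                    # non-null PCA subspace
    A_r = P[:, keep].T                            # rigid (orthonormal rows)
    X_r = A_r @ Xc                                # X_r X_r^T = diag(S^2)
    A_nr = (Y @ X_r.T) / (S[keep] ** 2)           # Eqn. 3.19
    return A_nr @ A_r, mu

# Toy usage: distances generated by a hidden linear map are reproduced.
rng = np.random.default_rng(0)
X_toy = rng.standard_normal((50, 10))             # 10 "images" of 50 pixels
A_hidden = rng.standard_normal((3, 50))
D_toy = squared_distances(A_hidden @ X_toy)
Y0 = classical_mds(D_toy)
A_model, mu_model = fit_affine_model(X_toy, Y0)
D_model = squared_distances(A_model @ (X_toy - mu_model))
```

In this toy setting the modeled distances `D_model` match `D_toy` on the training images because the centered configuration lies in the row space of the centered images; on real match scores the fit is approximate.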
Note that the rigid transformation does not depend on the face recognition algorithm; it is only the nonrigid part that is determined by the distances computed by the recognition algorithm. An alternative viewpoint is that the nonrigid transformation captures the difference between a PCA-based recognition strategy (the baseline) and the given face recognition algorithm.

The overall outline of the modeling approach can be summarized as follows.

Input:
A training set containing $K$ face images.
The distance matrix $\mathbf{D}$ computed on the training set using the face recognition algorithm.

Algorithm:
1. Compute the initial configuration points $\mathbf{Y}^{(0)}$ (see Fig. 3.2).
2. Use the iterative scheme in Eqn. 3.10 to arrive at the final configuration points. The iteration is terminated when the error $S(\mathbf{Y})$ is less than the tolerance parameter $\epsilon$, which is empirically set to 0.001 in our experiments.
3. Compute the rigid subtransformation $\mathbf{A}_r$ using PCA on the training set.
4. Compute the nonrigid subtransformation $\mathbf{A}_{nr}$.
5. $\mathbf{A} = \mathbf{A}_{nr}\mathbf{A}_r$ is the required model affine transformation.

We evaluate the accuracy of the proposed linear modeling scheme using six fundamentally different face recognition algorithms and compare the recognition performance of each of these algorithms with that of its respective model. We demonstrate the consistency of the modeling scheme on two standard face data sets (FERET and FRGC). In the following subsections, we provide more details about the face recognition algorithms and the distance measures associated with these algorithms, the train and test sets used in our experiments, and the metrics used to evaluate the strength of the proposed modeling scheme.
3.2 Experimental Setup

The experimental results presented in this section validate that the proposed linear modeling scheme generalizes to different probe sets representing different variations in face images (FERET probe sets) and is consistent with the performance of face recognition algorithms on large scale data sets (FRGC experiments) as well. We demonstrate that different distance measures with the PCA algorithm, and normalization of match scores, have minimal impact on the proposed modeling approach.

3.2.1 Data Sets

For our experiments, we use two popular face image data sets, namely the Facial Recognition Technology (FERET) data set [40] and the Face Recognition Grand Challenge (FRGC) data set [41]. These two data sets are publicly available and equipped with predefined training, gallery, and probe sets commonly used to evaluate face recognition algorithms. The FERET data set contains the maximum number of distinct subjects in the gallery set (1196) and the FRGC data set contains the maximum number of face images (16028 from 466 subjects). All the face images used in this experiment, except those for the EBGM algorithm, were normalized using the Colorado State University (CSU) Face Identification Evaluation System [8] to have the same eye location, the same size (150 x 130), and similar intensity distribution. A few normalized face images are shown in Fig. 3.4. The EBGM algorithm requires a special normalization process for face images, which is also available in [8].

Figure 3.4 Sample Face Images. Top Row Represents Sample Training Images from the FRGC Training Set. Middle Row Represents Sample Gallery Images from the FERET Data Set and Bottom Row Represents Sample Probe Images with Different Variations.

3.2.2 Face Recognition Algorithms and Distance Transformation

We evaluate our proposed modeling scheme on four different template based algorithms and two feature based face recognition algorithms.
The template based approaches include Principal Component Analysis (PCA), Independent Component Analysis (ICA), Linear Discriminant Analysis (LDA), and the Bayesian Intra/Extraclass person classifier (BAY). A commercial algorithm (COM) and the Elastic Bunch Graph Matching (EBGM) algorithm are selected to represent the feature based recognition algorithms. In this section, we briefly describe these algorithms and the respective distance measures used by these algorithms. For further details on these algorithms, readers may refer to the original papers or recent surveys on face recognition algorithms [23, 25, 70].

Face recognition based on Principal Component Analysis (PCA) [58] of face images is one of the most popular algorithms and is also considered a baseline algorithm for face recognition systems. The algorithm projects both the probe and gallery images to a lower dimensional space using a set of basis images, popularly known as Eigenfaces, computed from a set of training images. The distance between the probe and gallery images is then computed in the projected space to measure the dissimilarity between two faces. In this experiment, we use the Mahalanobis cosine distance measure, as this particular measure has reportedly better performance than other distance measures [39, 62].

A variation of the PCA algorithm is the Linear Discriminant Analysis (LDA) [7] algorithm, which also projects both the target and probe images into a lower dimensional space, followed by the distance measure between the two projected images. The dimensions are reduced by computing within-class scatter against between-class scatter and finding their generalized eigenvectors and eigenvalues subject to maximum discrimination between class scatters. Euclidean distance is then used to compute the dissimilarities of projected face images.
Algorithms based on PCA of face images treat each image as a random variable with Gaussian distribution and compute the Eigenfaces based on second order statistics. This assumption is relaxed in algorithms based on Independent Component Analysis (ICA) [18] of face images. ICA based algorithms compute second order as well as higher order statistics of face images to find statistically independent basis vectors that are not necessarily orthogonal to each other. Bartlett et al. [6] provided two different architectures for the ICA based face recognition algorithm. However, since both architectures have similar performance, in this experiment we only use Architecture I with cosine angle as the distance measure.

The Bayesian algorithm [31] proposed by Moghaddam and Pentland uses the difference between two images to probabilistically determine the similarity of two faces. Difference images arising from images of the same subject are called intrapersonal images, and difference images arising from images of two different subjects are called extrapersonal images. Each of these difference images is considered to be a point in a high dimensional space. The high dimensional space is, however, very sparsely populated, as the majority of the vacant space corresponds to difference images that never occur in practice. These difference images tend to form clusters, with the underlying assumption that each difference image belongs to one of the two clusters, intrapersonal or extrapersonal, and that these clusters are distinct and localized Gaussian distributions within the space of all possible images. However, the parameters of these distributions are unknown. These parameters can be estimated using the maximum a posteriori method or the maximum likelihood method.
In our experiment, we restrict ourselves to the Maximum Likelihood (ML) method, since it has been found to give results as good as the maximum a posteriori method while being less computationally intensive. The maximum likelihood estimate ignores the extrapersonal class information. When comparing a novel image to $N$ known gallery images, the gallery image yielding the highest similarity score is taken to be the person in the probe image.

The Elastic Bunch Graph Matching algorithm [67] differs from the other algorithms because it recognizes faces by comparing parts, instead of matching the image as a whole. The features of the images are represented by Gabor jets, also called model jets, obtained by convolving an image with Gabor filters. These model jets are collectively called bunch graphs, and each node in this graph is a collection of model jets of a particular landmark. These jets are extracted from manually selected landmark locations in the model images and added to the appropriate bunch graphs, which are used as reference data for landmark descriptions while locating landmarks in novel images. Locating a landmark is based on two steps. The location is first estimated from the known locations of other landmarks in the image, and the estimated location is then further refined by extracting a Gabor jet at that point and comparing it against a set of models. The most similar jet is selected from a bunch graph and then serves as a model. The algorithm begins by estimating the eye coordinates, because these estimates are very reliable, and then iteratively tries to locate the rest of the landmarks until it has reached the edge of the head. Face graphs are created for each image by extracting jets from the landmark locations, such as the eyes, nose tip, and corners of the lips. These graphs contain the physical locations of the landmarks as well as the values of the jets.
Jets are also extracted from locations at the midpoint between two landmarks. Since an image is now represented only by its face graph, the original image data can be discarded and an immense amount of memory can be saved. Similarity between two images is calculated as a function of the landmark locations and their jet values. Jet similarity can be computed using various methods: magnitude-only, phase, or displacement-compensated Gabor jet similarity. Another method to compute the similarity is based on the position of the landmark points. A simple way is to compute the Euclidean distance between these locations, the presumption being that images belonging to the same subject will differ very little in the landmark locations. Since in the EBGM algorithm the landmark points are selected manually, we used the FERET training set, where landmark points are already defined, to train our proposed model for the EBGM algorithm.

For this experiment, we also include a high performing feature based commercial algorithm. It uses Local Feature Analysis (LFA) to represent the face. The mathematical technique of LFA assumes that a facial image can be synthesized from an irreducible set of building elements. These elements can be derived from a set of model face images using statistical techniques. For identification purposes, the relative positions of these elements are as important as the characteristics of the elements themselves. Although several elements are possible, only a few are needed to describe a face completely. However, these elements do not necessarily correspond to facial features, even though they span just a few pixels. Compared to methods such as PCA, LFA is much more resilient to changes in expression and hence much more robust. The distance measure used in the commercial algorithm is unknown.
Note that, in this experiment, we do not use the inbuilt face detection module; instead, we use normalized face images with fixed size and fixed eye coordinates. We also use the raw match scores from the commercial system without any inbuilt score normalization routines.

The six face recognition algorithms and the distance measures associated with each of these algorithms are summarized in Table 3.1. Except for the commercial algorithm and the ICA algorithm, the implementations of all the algorithms are publicly available in the Colorado State University (CSU) Face Identification Evaluation System [8]. The implementation of the ICA algorithm has been adapted from [5]. The particular distance measure for each of these algorithms is selected due to its higher recognition rate compared to other distance measures. The last two columns in Table 3.1 indicate the range of the similarity/dissimilarity scores of the respective algorithms and the transformation used to convert these scores to a range such that the lower bound of all the transformed distances is the same, i.e. the distance between two similar face images is close to 0. Although the distance measure for the Bayesian Intra/Extraclass person classifier (BAY) is a probability measure, due to the numerical challenges associated with small probability values, the distances are computed as an approximation to this probability. The implementation details of the distance measures for the BAY algorithm and the EBGM algorithm can be found in [55] and [11], respectively. Also, in addition to the above transformations, the distance between two identical images is set to zero in order to maintain the reflexive property of the distance measure. All the above mentioned distance measures also exhibit the symmetric property; hence no further transformation is required to enforce the symmetry of the distance measure. Note that these distance transformations are monotonically increasing functions and hence do not alter the relative similarity/dissimilarity between two face images. So the recognition performance of the respective algorithms with the transformed scores remains the same as the recognition performance with the original scores.

Table 3.1 Similarity/Dissimilarity Measures of Different Face Recognition Algorithms.

| Algorithm | Measure | Similarity/Dissimilarity | Range of Scores | Transformation |
|---|---|---|---|---|
| PCA | Cosine distance in Mahalanobis space | Dissimilarity | $-1$ to $1$ | $1 + d_{ij}$ |
| LDA | Euclidean | Dissimilarity | $0$ to $\infty$ | none |
| ICA | Cosine angle in Mahalanobis space | Dissimilarity | $-1$ to $1$ | $1 + d_{ij}$ |
| BAY | Probability of Intra/Extra class of the template | Dissimilarity | $0$ to $\infty$ | none |
| EBGM | Face Graph Narrowing Local Search | Dissimilarity | $-1$ to $0$ | $-\log(-d_{ij})$ |
| COM | Unknown | Similarity | $a$ to $b$ | $b - d_{ij}$ |

3.2.3 Train and Test Sets

Of the six selected algorithms, all except the commercial algorithm require a set of face images as part of the algorithm training process. This training set is different from the training set required to model the individual algorithms. To distinguish between these two training sets, we define an algorithm train set (algo-train) and a model train set (model-train). We use a set of 600 controlled images from 150 subjects (in decreasing order of their numeric id) from the FRGC training set to train the individual algorithms. To build the linear model for each algorithm, we use another subset of the FRGC training set with 600 controlled images from the first 150 subjects (in increasing order of their numeric id), with four images per subject. Due to the limited number of subjects in the FRGC training set, a few subjects appear in both training sets; however, there is no common image in the algo-train and model-train sets.
Also, the feature-based EBGM algorithm differs from the other algorithms in that it uses a special normalization and localization process for face images and requires manually selected landmark points on the training images. Hence, for the EBGM algorithm, we use the FERET training set containing 493 images provided in the CSU face evaluation system, including the specially normalized images required for the EBGM algorithm. The algotrain and modeltrain sets for the EBGM algorithm are the same. The commercial system does not require any training images. However, while building the model for the commercial system, we empirically observe that the linear model is more accurate on the FERET probe sets when it is trained on the FERET training set. Similarly, the linear model is more accurate on the FRGC experiments when the number of images in modeltrain for the commercial system is expanded to 2048 images, containing an equal number of controlled and uncontrolled images from the FRGC training set. In the results section, we report the performance of our linear model of the commercial system with these two different modeltrain sets.

For test sets, to be consistent with other studies, we have selected the gallery set and four different probe sets as defined in the FERET data set. The gallery set contains 1196 face images of 1196 subjects with neutral or minimal facial expression and with frontal illumination. Four sets of probe images (fb, fc, dupI, dupII) are created to verify the recognition performance under four different variations of face images. The 'fb' set contains 1195 images from 1195 subjects with facial expressions different from those in the gallery images. The 'fc' set contains 194 images from 194 subjects with different illumination conditions. Both 'fb' and 'fc' images are captured at the same time as the gallery images.
However, the 722 images from 243 subjects in the probe set 'dupI' are captured between 0 and 1031 days after the gallery images were captured. Probe set 'dupII' is a subset of probe set 'dupI' containing 234 images from 75 subjects, captured at least one and a half years after the gallery images. The above-mentioned numbers of images in the probe and gallery sets are predefined within the FERET distribution.

Similar to the FERET evaluation protocols, the Face Recognition Grand Challenge (FRGC) is designed to evaluate face recognition algorithms on a large data set with more challenging conditions. Several experiments are designed in the FRGC framework, out of which Experiment one, Experiment two, and Experiment four involve intensity images in both gallery and probe sets. Experiments one and two are designed to study the performance of any face recognition algorithm where both target and query images are captured in controlled illumination conditions. Experiment four is designed to study the effect of illumination variation on recognition algorithms, with a target set containing controlled images and a query set containing uncontrolled images. The target set for all three experiments is the same and contains 16028 normalized images. The query sets of Experiments one and two are the same as the target set, whereas the query set for Experiment four contains 8016 uncontrolled images. In Experiment two, the match scores from the same subjects are further subsampled, which results in better performance for most face recognition algorithms. This is the only difference between the first two experiments of the FRGC [41]. We evaluate our proposed modeling scheme using the FRGC predefined target and query sets of Experiments one, two, and four using the commercial system.

Table 3.2 Summary of Train and Test Sets.
Algorithm  algotrain                      modeltrain (a)                  Test Sets
PCA        600 images (FRGC train set)    600 images (FRGC train set)     FERET Probe Sets
LDA        600 images (FRGC train set)    600 images (FRGC train set)     FERET Probe Sets
ICA        600 images (FRGC train set)    600 images (FRGC train set)     FERET Probe Sets
BAY        600 images (FRGC train set)    600 images (FRGC train set)     FERET Probe Sets
EBGM       493 images (FERET train set)   493 images (FERET train set)    FERET Probe Sets
COM        Unknown                        600 images (FRGC train set)     FERET Probe Sets
COM        Unknown                        493 images (FERET train set)    FERET Probe Sets
COM        Unknown                        2064 images (FRGC train set)    FRGC Exp1, Exp2, Exp4

(a) There are no common images in the algotrain set and the modeltrain set for the PCA, LDA, ICA, and BAY algorithms (see text).

3.2.4 Performance Metrics for the Linear Model

We compare the recognition rates of the algorithms with the recognition rates of the linear models in terms of the standard Receiver Operating Characteristic (ROC). Given the context of biometrics, this is a more appropriate performance measure than the error in individual distances. The question is how close the performance of the linear model is to that of the actual algorithm on image sets that are different from the train set. We augment this performance measure with a measure that reflects the local manifold structure captured by the linear model. We define a new metric, Nearest Neighbor Agreement, that captures the agreement in the identity of the nearest neighbor for each probe between the original algorithm and its linear model. While a comparison of the ROC curves of the original algorithm with those of the model demonstrates the strength of the linear model, the Error in Modeling metric quantifies the accuracy of the model at a particular False Acceptance Rate (FAR). We compute the Error in Modeling by comparing the True Positive Rate (TPR) of the linear model with the TPR of the original algorithm at a particular False Acceptance Rate (FAR).
\[
\text{Error in Modeling (\%)} = \frac{\left| TPR_{orig} - TPR_{model} \right|}{\max\left( TPR_{orig},\, TPR_{model} \right)} \times 100 \tag{3.20}
\]

where $TPR_{orig}$ and $TPR_{model}$ are the true positive rate of the original algorithm and the true positive rate of the model at a particular FAR. In order to closely examine our proposed model, we also define a stronger metric, Nearest Neighbor Agreement, to quantify the similarity between the local neighborhood of face images under the model transformation and under the original algorithm. For a given probe $P_k$, let $G_i$ be the nearest subject as computed by the algorithm and $G_j$ be the nearest subject based on the linear model. Let

\[
s_k(G_i, G_j) = \begin{cases} 1 & \text{if } i = j \\ 0 & \text{otherwise} \end{cases}
\]

Then the Nearest Neighbor Agreement between the model and the original algorithm can be quantified as

\[
S = \left( \frac{1}{P} \sum_{k=1}^{P} s_k \right) \times 100
\]

where $P$ is the total number of probes in the probe set. Note that the Nearest Neighbor Agreement metric $S$ is a stronger metric than the rank-1 identification rate in Cumulative Match Curves (CMCs). Two algorithms can have the same rank-1 identification rate while the Nearest Neighbor Agreement is low. For the latter to be high, the identities of both the correct and the incorrect matches should agree. In other words, a high value of this measure indicates that the model and the original algorithm agree on the neighborhood structure of the face manifold.

3.3 Modeling Results

In this section, we present experimental results of our proposed linear models of six different face recognition algorithms using the FERET probe sets. We also model the commercial algorithm on FRGC Experiments one, two, and four. Using the metrics defined in the previous section, we demonstrate the strength of the linear model on two different data sets and with complete separation of training and test sets. The experimental results show that the average Error in Modeling for the six algorithms is 6.3% for the fafb probe set, which contains the maximum number of subjects among the four probe sets.
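The Error in Modeling of Eqn. 3.20 can be illustrated with a short Python sketch. The helper `tpr_at_far`, the strict-threshold convention, and the toy distances are assumptions made for illustration only; they are not taken from the experiments, and the thesis does not specify how the operating threshold is chosen.

```python
def tpr_at_far(genuine, impostor, far):
    """True positive rate at the distance threshold set by a target FAR.

    genuine  -- distances between images of the same subject
    impostor -- distances between images of different subjects
    far      -- target false acceptance rate, e.g. 0.001
    """
    ranked = sorted(impostor)
    allowed = int(far * len(ranked))  # impostor pairs we may accept
    # A pair is accepted when its distance is strictly below the threshold,
    # so at most `allowed` impostor pairs pass.
    threshold = ranked[allowed] if allowed < len(ranked) else float("inf")
    return sum(1 for d in genuine if d < threshold) / len(genuine)


def error_in_modeling(tpr_orig, tpr_model):
    """Eqn. 3.20: relative TPR gap, in percent, at one fixed FAR."""
    largest = max(tpr_orig, tpr_model)
    if largest == 0:
        return 0.0  # both rates are zero; treated as no gap (assumption)
    return abs(tpr_orig - tpr_model) / largest * 100.0


# Toy distances (hypothetical numbers, not experimental data).
impostor = [0.25, 0.5, 0.7, 0.8]
tpr_orig = tpr_at_far([0.1, 0.2, 0.3, 0.9], impostor, far=0.5)
tpr_model = tpr_at_far([0.1, 0.6, 0.75, 0.9], impostor, far=0.5)
print(error_in_modeling(tpr_orig, tpr_model))
```

Because the denominator is the larger of the two rates, the measure is symmetric in the algorithm and the model and always lies between 0 and 100.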
We also observe that the proposed linear model exhibits an average Nearest Neighbor Agreement of 87% with the original algorithms. A detailed analysis and explanation of these results are presented in the following subsections.

3.3.1 Model Space

In order to visualize the difference between the algorithms and the underlying lower dimensional face space constructed by each of them, in Fig 3.5 we present the top four dimensions of the affine space for each of the algorithms. In the top row we present the first four dimensions of the Eigenfaces computed from the training set, followed by the first four dimensions of the model space for the individual algorithms. Note that the dimensions of the Eigenfaces and of the model space of the PCA algorithm are different.

[Figure 3.5 Top Four Dimensions of Corresponding Affine Spaces for Each Algorithm. For a comparison with Eigenfaces (rigid transformation), row one presents the first four dimensions of the Eigenfaces, followed by the PCA, LDA, ICA, BAY, EBGM, and COM algorithms, respectively. The last column indicates the total dimension of the respective model space: Eigenfaces 360, PCA 360, LDA 149, ICA 70, BAY 599, EBGM 480, COM 485.]

This indicates that the model space is also shaped by the underlying distance measure rather than only by the rigid transformation of faces. The distance measure used for the PCA algorithm is the cosine distance in Mahalanobis space (see the definition of the MahaCosine distance in [8]); as a result, the nonrigid transformation for the PCA algorithm is not an identity transformation, and consequently the Eigenfaces and the model space for the PCA algorithm are different.
Note that if we use the Euclidean distance in place of the MahaCosine distance for the PCA algorithm, then the nonrigid transformation for the PCA algorithm will be an identity matrix and the model space will be equal to the eigenspace. From Fig 3.5, we also observe that each recognition algorithm emphasizes different variations in corresponding dimensions compared to the other algorithms. However, there are notable resemblances between the variations captured by the different template-based algorithms (PCA, LDA, ICA, and BAY), though not in the same order. For example, the variations captured by the first dimension of PCA are similar to those of the fourth dimension of ICA; similarly, the variations captured in the fourth dimensions of LDA and BAY are similar. The last column in Fig 3.5 indicates the total dimension of the model space for the respective algorithms.

In most dimension reduction techniques, the choice of the dimension of the projected space plays an important role in the overall performance, yet there is no direct way to find the ideal dimension of the projected space. In practice, the eigenspectrum of the original data provides an approximation to the dimension of the projected space. However, as shown in other studies [37], even ignoring a few of the larger eigenvalues can yield better results. In the presence of negative eigenvalues with high magnitude, Pekalska and Duin [37] suggested a new embedding scheme of the data points in a pseudo-Euclidean space whose dimension is decided by both the positive and the negative eigenvalues of high magnitude. However, since in our case we have modified the original distance matrix, when necessary, to enforce the Euclidean property, the modified distance matrix does not have negative eigenvalues of large magnitude. So the dimension of the affine space is decided in the more conventional way of neglecting the smaller eigenvalues and keeping 99% of the energy of the eigenspectrum.
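As a minimal sketch of the 99% energy rule described above, the following Python function picks the number of dimensions from a list of eigenvalues. It assumes that "energy" means the sum of the positive eigenvalues and that non-positive eigenvalues are simply dropped; the function name and the toy spectrum are hypothetical.

```python
def model_dimension(eigenvalues, energy=0.99):
    """Smallest number of leading eigenvalues holding `energy` of the total.

    Non-positive eigenvalues are discarded first (assumption: energy is the
    sum of the positive part of the spectrum).
    """
    positive = sorted((v for v in eigenvalues if v > 0), reverse=True)
    total = sum(positive)
    running = 0.0
    for count, value in enumerate(positive, start=1):
        running += value
        if running >= energy * total:
            return count
    return len(positive)  # reached only when there are no positive values


# Toy spectrum (hypothetical numbers): the first four values hold 99.5%.
spectrum = [60, 25, 10, 4.5, 0.5, -0.2]
print(model_dimension(spectrum))        # dimensions kept at 99% energy
print(model_dimension(spectrum, 0.90))  # a looser threshold keeps fewer
```

Lowering the energy threshold trades model fidelity for a more compact model space.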
In Fig 3.6, we plot the eigenvalues of the matrix B. For the LDA-based algorithm, the original distance matrix is tested to be Euclidean; hence no correction for the metric property is necessary, and the Euclidean distance measure is used in the affine space to model the algorithm. Similarly, for the ICA algorithm, the magnitudes of the negative eigenvalues are very small compared to the largest positive eigenvalues; hence the dimension of the affine space is decided by dropping the negative eigenvalues. However, since the original distance matrix is not Euclidean, we use the cosine angle distance in the affine space to approximate the algorithm. For all other algorithms, the magnitude of the negative eigenvalues is large compared to the highest positive eigenvalues; hence we enforce the metric property and the positive semidefinite property by adding an appropriate constant to the off-diagonal elements of the respective distance matrix, and we use the cosine angle distance in the affine space to model each algorithm.

3.3.2 Recognition Performances

In Figs. 3.7 to 3.12, we compare the original performance of all six algorithms with the performance of the respective linear models. The four different probe sets are predefined in the FERET database. Note the log scale for the false alarm rate. We observe that not only does the recognition performance of the model match that of the original algorithm, but it also generalizes to the variations in face images represented by the four different probe sets. For example, the performance of the ICA algorithm on fafc (Fig. 3.9(b)) is lower than that of the rest of the algorithms, and the performance of its model is also lower, which is a good indication of an accurate model of the underlying algorithm. Similar behavior can also be observed for the LDA and BAY algorithms.
Also, for the fafb probe set, the Error in Modeling at 0.001 FAR is 3.8%, 7%, 9%, 5%, 4%, and 26% for the PCA, LDA, ICA, BAY, EBGM, and COM algorithms, respectively. The high error rate for the COM algorithm indicates that the linear model for the COM algorithm is undertrained. Note that the training set used for the commercial system and the score normalization techniques adopted to optimize its performance are unknown. However, we observe that the commercial system's performance is very high for the various FERET experiments, but similar performance cannot be duplicated on the FRGC experiments. So we retrain our linear model for the COM algorithm with the FERET training set containing 493 images and also study the effect of two standard score normalization methods on the proposed linear model for the commercial system. The performance of the COM system on the four FERET probe sets and the performance of the linear model trained using the FERET training set are presented in Fig 3.13. With the FERET training set, the Error in Modeling for the COM algorithm on the fafb probe set is reduced to 13%, and with the normalization process, the Error in Modeling for the commercial system is further reduced to 9%. The effect of score normalization on the proposed modeling scheme is discussed in Section 3.3.4.

[Figure 3.6 Eigenvalues of Distance Matrix for Each Algorithm (eigenvalue versus index of eigenvalue). (a) PCA with MahaCosine Distance (b) LDA with Euclidean Distance (c) ICA on Arch I (d) BAY Algorithm (e) EBGM Algorithm (f) Commercial Face Recognition System.]

[Figure 3.7 ROC Curves (true positive rate versus false positive rate, log scale): Comparison of Recognition Performance of PCA Algorithm with the Corresponding Linear Model. (a) FERET fafb (b) FERET fafc (c) FERET dupI (d) FERET dupII Probe Set with FERET Gallery Set.]

[Figure 3.8 ROC Curves: Comparison of Recognition Performance of LDA Algorithm with the Corresponding Linear Model. (a) FERET fafb (b) FERET fafc (c) FERET dupI (d) FERET dupII Probe Set with FERET Gallery Set.]

[Figure 3.9 ROC Curves: Comparison of Recognition Performance of ICA Algorithm with the Corresponding Linear Model. (a) FERET fafb (b) FERET fafc (c) FERET dupI (d) FERET dupII Probe Set with FERET Gallery Set.]

[Figure 3.10 ROC Curves: Comparison of Recognition Performance of BAY Algorithm with the Corresponding Linear Model. (a) FERET fafb (b) FERET fafc (c) FERET dupI (d) FERET dupII Probe Set with FERET Gallery Set.]

[Figure 3.11 ROC Curves: Comparison of Recognition Performance of EBGM Algorithm with the Corresponding Linear Model. (a) FERET fafb (b) FERET fafc (c) FERET dupI (d) FERET dupII Probe Set with FERET Gallery Set.]

[Figure 3.12 ROC Curves: Comparison of Recognition Performance of COM with the Corresponding Linear Model. (a) FERET fafb (b) FERET fafc (c) FERET dupI (d) FERET dupII Probe Set with FERET Gallery Set.]

Figure 3.13 ROC Curves: Comparison of Recognition Performance of COM with the Corresponding Linear Model.
(a) FERET fafb (b) FERET fafc (c) FERET dupI (d) FERET dupII Probe Set with FERET Gallery Set. The Linear Model is Trained Using 493 FERET Training Images.

In Fig. 3.14, we compare the performance of the commercial algorithm (blue lines) with that of the linear model (red lines) on the FRGC experiments. As mentioned earlier, the training set used to model the commercial system has been expanded to 2048 images from the FRGC training set to include an equal number of controlled and uncontrolled images. Following the guidelines of the FRGC distribution, we plot three ROC curves for the original algorithm as well as for the model. These three ROC curves, ROC I, ROC II, and ROC III, represent the temporal variation among probe and gallery images. ROC I represents performance on target and query images captured within a semester. Similarly, ROC II and ROC III represent performance within a year and between semesters, respectively [41]. The comparison of performance presented in Fig 3.14 reinforces the high quality of the built model.

[Figure 3.14 Comparison of Recognition Performances on the FRGC Experiments with the Corresponding Linear Model (true acceptance rate versus false acceptance rate, log scale; ROC I, ROC II, and ROC III for COM and for the affine model). (a) Commercial Algorithm on Experiment One (b) Commercial Algorithm on Experiment Two (c) Commercial Algorithm on Experiment Four.]
For Experiment one, the proposed linear model not only matches the performance of the commercial system, but it is also unaffected by the temporal variations observed in the three different ROCs. In our earlier work [32], we observed similar patterns of encouraging results for template-based algorithms (LDA and PCA). Considering ROC III, which has the highest temporal variation, the Error in Modeling for the commercial algorithm is 4% and 3% for Experiment one and Experiment two, respectively. The performance of the commercial system on Experiment four is too low to properly evaluate the Error in Modeling for this experiment. For Experiment four, the linear model demonstrates higher performance than the commercial system. Such behavior of the linear model is not totally unexpected. We believe the commercial system is not tuned to controlled-uncontrolled illumination variation in face images, and we cannot retrain the commercial system to include such variations. As a result, the face manifold created by the commercial system on Experiment four is not very robust or stable. Therefore, the nonrigid transformation could not have a significant impact on the rigid space, and the performance of the linear model of the commercial system resembles the baseline (PCA + MahaCosine) algorithm more than the commercial algorithm. In the next section, we discuss the local manifold structure of the model in more detail through the Nearest Neighbor Agreement metric.

3.3.3 Local Manifold Structure

Fig. 3.15 shows the similarity of the neighborhood relationship for the six different algorithms on the FERET fafb probe set. Observe that, irrespective of whether the match is correct or incorrect, the Nearest Neighbor Agreement metric has an average value of 87% over the six algorithms.
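The Nearest Neighbor Agreement metric $S$ of Section 3.2.4 can be sketched in a few lines of Python. The sketch assumes one gallery entry per subject, so that a gallery index identifies a subject, and it takes two probe-by-gallery distance tables as plain nested lists; the function names and toy numbers are hypothetical.

```python
def nearest_index(distances):
    """Index of the gallery entry with the smallest distance to the probe."""
    return min(range(len(distances)), key=distances.__getitem__)


def nearest_neighbor_agreement(algo_dist, model_dist):
    """Percentage of probes whose nearest gallery entry is the same under
    the original algorithm and under its linear model.

    Both arguments hold one row per probe and one column per gallery entry.
    The true identity of the probe is not used: agreement on a wrong match
    counts exactly like agreement on a correct one.
    """
    agree = sum(
        1
        for algo_row, model_row in zip(algo_dist, model_dist)
        if nearest_index(algo_row) == nearest_index(model_row)
    )
    return 100.0 * agree / len(algo_dist)


# Toy distances for 4 probes against 3 gallery entries (hypothetical).
algo = [[0.1, 0.5, 0.9], [0.7, 0.2, 0.4], [0.3, 0.8, 0.6], [0.9, 0.4, 0.1]]
model = [[0.2, 0.6, 0.7], [0.5, 0.3, 0.1], [0.1, 0.9, 0.4], [0.8, 0.3, 0.2]]
print(nearest_neighbor_agreement(algo, model))
```

Only the second probe disagrees in this toy example (the algorithm picks entry 1, the model picks entry 2), so three of four probes agree.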
It is also important to note that, for algorithms where the performance of the model is better than that of the original algorithm, the metric $S$ penalizes such improvement and pulls down the subject agreement values even though the model performs better than the original algorithm. Nonetheless, the high value of such a stringent metric validates the strength of the linear model. Even with little information about the training and optimization process of commercial algorithms, the linear model still exhibits 70% agreement for the commercial algorithm. As we observe from Fig. 3.12 and Fig. 3.14, the commercial system might be optimized for FERET-type data sets and may have used some score normalization techniques before transforming the match scores to a fixed interval [a, b]. In the next subsection, we explore the variation in the model's performance with different distance measures using the PCA algorithm, as well as the effect of score normalization on our proposed modeling scheme using the commercial system.

[Figure 3.15 Similarity of the Local Manifold Structure Between the Original Algorithm and the Linear Model as Captured by the Nearest Neighbor Agreement Metric using the FERET fafb Probe Set.]

3.3.4 Effect of Distance Measures and Score Normalization

Different face recognition algorithms use different distance measures, and in many cases the distance measure is unknown and non-Euclidean in nature. In order to study the effect of various distance measures on the proposed modeling scheme, we use the PCA algorithm with six different distance measures other than the MahaCosine distance. For a stronger comparison, we keep all other parameters, such as the training set and the dimension of the PCA space, the same as before. Only the distance measures are changed. These distance measures are implemented in the CSU Face Evaluation Tool, and we use them as per the definitions in [8].
In Table 3.3, we present the Error in Modeling (see Eqn. 3.20) for the PCA algorithm with different distance measures on the FERET fafb probe set. Note that, as described in Fig 3.2, the model uses the cosine distance in all cases except PCA + Euclidean distance. From the table, we observe that for the different distance measures, the Error in Modeling is on the order of $10^{-2}$ or less. Therefore, it is apparent that different distance measures have minimal impact on the proposed modeling scheme.

Table 3.3 Effect of Distance Measure on the Model: Error in Modeling for PCA Algorithm on the FERET fafb (1195 subjects) Probe Set.

                   False Acceptance Rate (%)
Distance Measure   0.001     0.01      0.1
CityBlock          0.022     0.018     0.007
Euclidean          0.003     0.0001    0.0001
Correlation        0.004     0.015     0.0005
Covariation        0.003     0.016     0.0002
L1 Norm            0.024     0.003     0.005
L2 Norm            0.008     0.006     0.0001

Biometric match scores are often augmented with some normalization procedure before a threshold-based decision is made. Most of these score normalization techniques are carried out as a post-processing routine and do affect the underlying manifold of the faces as observed by the face recognition algorithms. The most standard score normalization techniques used in biometric applications are Z-normalization and Min-Max normalization. To observe the impact of normalization on the proposed modeling scheme, we use the commercial algorithm with the Min-Max and Z-normalization techniques. Note that, in this case, the normalization techniques are considered part of the black box; hence the match scores used to train the model are also normalized in the same way. Fig 3.16 compares the recognition performance of the commercial system with score normalization to that of the model.
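For reference, the two normalization schemes named above can be written as follows. This is a generic textbook sketch; the thesis does not state over which set of scores the statistics are estimated or whether the population or sample standard deviation is used, so those choices (all scores passed in, population standard deviation) are assumptions.

```python
from statistics import mean, pstdev


def min_max_normalize(scores):
    """Map scores linearly onto [0, 1] using their observed min and max."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0 for _ in scores]  # constant scores carry no ordering
    return [(s - lo) / (hi - lo) for s in scores]


def z_normalize(scores):
    """Shift and scale scores to zero mean and unit standard deviation."""
    mu = mean(scores)
    sigma = pstdev(scores)  # population standard deviation (assumption)
    if sigma == 0:
        return [0.0 for _ in scores]
    return [(s - mu) / sigma for s in scores]


# Toy match scores (hypothetical numbers).
raw = [2.0, 4.0, 6.0]
print(min_max_normalize(raw))
print(z_normalize(raw))
```

Both maps are increasing in the raw score when applied with one fixed set of statistics, so they preserve the ordering of the scores they are applied to; the ordering across probes can change only if the statistics are estimated separately per probe.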
Score normalization is a post-processing method and does not reflect the original manifold of the face images; so, in the presence of such normalization, the model actually predicts the original scores instead of the normalized match scores. Hence, we apply the same score normalization techniques to the match scores of the model and plot the result for a better comparison. As expected, the proposed modeling scheme emphasizes the underlying manifold of the face images irrespective of any post-processing such as normalization of the match scores. However, the difference between the algorithm with normalized match scores and the model with the same normalization of match scores is minimal.

[Figure 3.16 ROC Curves Indicating Score Normalization Effect on Proposed Modeling Scheme (true positive rate versus false positive rate, log scale; curves for COM, Affine Model, and Affine Model + Normalization). (a) Min-Max Normalization (b) Z-normalization Techniques on the FERET fafb Probe Set.]

3.4 Discussion

We proposed a novel linear modeling scheme for different face recognition algorithms based on the match scores. Starting with a distance matrix representing the pairwise match scores between face images, we used an iterative stress minimization algorithm to obtain an embedding of the distance matrix in a low dimensional space. We then proposed a novel linear out-of-sample projection scheme for test images. The linear transformation used to project new face images into the model space is divided into two sub-transformations: a rigid transformation of face images obtained through principal component analysis of the face images, followed by a nonrigid transformation responsible for preserving pairwise distance relationships between face images.
To validate the proposed modeling scheme, we used six fundamentally different face recognition algorithms, covering both template-based and feature-based approaches, on four different probe sets from the FERET face image database. We compared the recognition rate of each algorithm with that of its respective model and demonstrated that the recognition rates are consistent on each of the probe sets. Experimental results showed that the proposed linear modeling scheme generalizes to different probe sets representing different variations in face images (FERET probe sets) and is consistent with the performance of face recognition algorithms on large scale data sets (FRGC experiments). A 6.3% average Error in Modeling for the six algorithms is observed at 0.001 False Acceptance Rate (FAR) for the FERET fafb probe set, which contains the maximum number of subjects among all the probe sets. Similarly, for FRGC Experiments one and two, we observed an average of 3.5% Error in Modeling for the commercial algorithm at 0.001 FAR. The proposed linear model also exhibited an average Nearest Neighbor Agreement of 87% with the original algorithms. Although the choice of distance measure varies from algorithm to algorithm, we showed that such variations in distance measures have little impact on our proposed modeling scheme. Similarly, many biometric systems use score normalization as a post-processing routine, and we observed that the same score normalization routine, when applied to match scores obtained through the affine model of the algorithm, yields the expected recognition performance.

CHAPTER 4
INDEXING FACE TEMPLATES

In the identification scenario, one has to perform one-to-many matches to identify a new face image (query) among a set of gallery images.
In such scenarios, the query image needs to be compared to all the images in the gallery, so the response time of the system is directly proportional to the gallery size. The entire process is computationally expensive for large gallery sets. One possible approach to avoid such expensive computation and to provide a faster response time is to index or bin the gallery set. In the case of well-developed biometrics such as fingerprints, a binning process based on ridge patterns such as whorls, loops, and arches is used for indexing [17, 46]. For other biometrics where a template is represented by a set of d-dimensional numeric features, Mhatre et al. [29] proposed a pyramid indexing technique to index the database. Unfortunately, for face images there is no straightforward and global solution to binning or indexing. Since different algorithms use different strategies to compute the template or features from face images, such an indexing strategy cannot be independent of the underlying algorithm. For example, the Bayesian intra/extra-class approach computes the difference image of the probe with all the gallery templates, so a feature-based indexing scheme is not applicable to this algorithm.

4.1 Indexing with Linear Model

We can use the proposed linear modeling scheme to perform the binning or algorithm-specific indexing task with little computational overhead. Of course, for algorithms such as PCA, LDA, and ICA, which use a linear projection of raw templates, the model will result in no additional computational advantage. However, for algorithms such as the Bayesian and EBGM algorithms, where numerical indexing of templates is not feasible, indexing through a linear model can reduce the overall computational complexity by selecting only a subset of gallery images to be matched with the probe images. For a given face recognition algorithm, we design the corresponding linear model as described in Chapter 3.
We then project a given probe image into the model space of the face recognition algorithm and find the K nearest gallery images. We use the original algorithm to match the K selected gallery images with the probe image and output the rank of the probe image. A graphical comparison of the face identification system with and without the indexing scheme is outlined in Fig. 4.1. Note that, for a perfect model, a system with indexing and a system without indexing will produce the same rank for any probe set.

Figure 4.1 Outline of Indexing Approach. (a) Face Identification System without Indexing (b) Face Identification System with Indexing Through Linear Model.

4.2 Experimental Setup

To evaluate the proposed indexing mechanism, we select the FERET gallery with the FERET fafb and dup1 probe sets. These are the two probe sets containing the maximum number of distinct subjects. The training of the individual face recognition algorithms is the same as described earlier in Chapter 3. In order to quantify the error in the indexing scheme, we use the difference in rank values for a given probe set with and without the indexing scheme. If the model extracts the same K nearest gallery images as the original algorithm, then the rank of a particular probe will not change with the use of the indexing procedure and the identification rate at a particular rank will remain the same. However, if the K nearest gallery subjects selected by the model do not match the K nearest subjects selected by the original algorithm, then the identification rate at a particular rank will decrease. To quantify the error in indexing, let $I_r$ represent the identification rate of the algorithm at rank $r$ without using the indexing of the gallery set and let $I_r(K)$ represent the identification rate of the algorithm at rank $r$ using the indexing scheme.
We compute the error in the indexing scheme as follows

\[ E_r = \frac{\max\left( I_r - I_r(K),\; 0 \right)}{I_r} \tag{4.1} \]

where $E_r$ represents the error in the indexing approach at rank $r$. Note that, if a probe image has rank higher than $K$, then we penalize the indexing scheme by setting $I_r(K)$ to 0, ensuring the highest possible value of $E_r$. The maximum is taken to avoid penalizing the indexing scheme in cases where indexing of gallery images yields a better identification rate than the original algorithm; for example, in the case where the model of an algorithm has a better recognition rate than the original algorithm. Hence, from Eqn. 4.1, it is obvious that there is a trade-off between the value of $K$, which is directly proportional to the response time of the system, and the error tolerance due to the indexing scheme. A higher value of $K$ will ensure higher accuracy in the rank through indexing of face templates but will increase the number of template matches required by the algorithm.

In Fig. 4.2 and Fig. 4.3, we plot the error in indexing versus the indexing parameter $K$ at rank 1 and rank 5 for the BAY, EBGM and COM algorithms, using the fafb and dup1 probe sets, respectively. As we can observe from Fig. 4.2 (a), for the BAY algorithm, all the subjects with rank 1 are correctly identified by setting the value of $K$ to 1. However, for subjects with rank 5 on the fafb probe set, the value of $K$ needs to be set to 12 to nullify any error due to the indexing mechanism. A similar trend for the value of $K$ is also observed for the BAY algorithm on the dup1 probe set. Hence, we conclude that for the BAY algorithm, the value of $K$ can be set to 12 to achieve an identification rate similar to that of the original algorithm. However, for the other two algorithms the value of $K$ is observed to be close to 50 to achieve the best performance from the indexing scheme.
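The indexing procedure of Section 4.1 and the error measure of Eqn. 4.1 can be sketched together on toy data. This is only a minimal illustration, not the implementation used in the experiments: the function names, the list-of-lists layout for distances, and the choice to count a probe whose true match falls outside the K candidates as a miss are all assumptions made for the example.

```python
import math


def k_nearest(model_dists, k):
    """Indices of the k gallery entries closest to the probe in model space."""
    order = sorted(range(len(model_dists)), key=lambda g: model_dists[g])
    return order[:k]


def rank_of_match(algo_dists, true_idx, candidates=None):
    """1-based rank of the true gallery entry under the original algorithm.

    If `candidates` is given, only those gallery entries are matched; a true
    match that was not selected gets rank infinity (assumed penalty).
    """
    if candidates is None:
        candidates = range(len(algo_dists))
    candidates = list(candidates)
    if true_idx not in candidates:
        return math.inf
    closer = sum(1 for g in candidates if algo_dists[g] < algo_dists[true_idx])
    return closer + 1


def identification_rate(ranks, r):
    """Fraction of probes whose true match is found at rank r or better."""
    return sum(1 for x in ranks if x <= r) / len(ranks)


def indexing_error(algo_dists, model_dists, true_idx, k, r):
    """E_r of Eqn. 4.1 for a probe set, given per-probe distance rows."""
    full = [rank_of_match(a, t) for a, t in zip(algo_dists, true_idx)]
    indexed = [
        rank_of_match(a, t, k_nearest(m, k))
        for a, m, t in zip(algo_dists, model_dists, true_idx)
    ]
    i_r = identification_rate(full, r)
    i_r_k = identification_rate(indexed, r)
    if i_r == 0:
        return 0.0  # nothing to lose if the original algorithm finds nothing
    return max(i_r - i_r_k, 0.0) / i_r


# Toy data: 2 probes, 4 gallery entries (hypothetical distances).
algo = [[0.1, 0.5, 0.9, 0.7], [0.6, 0.2, 0.3, 0.9]]
model = [[0.2, 0.4, 0.8, 0.9], [0.1, 0.5, 0.3, 0.9]]
truth = [0, 1]
print(indexing_error(algo, model, truth, k=1, r=1))
print(indexing_error(algo, model, truth, k=3, r=1))
```

In this toy case the model ranks the second probe's true match only third, so K = 1 loses that probe while K = 3 recovers it, which is the trade-off between K and indexing error described above.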
4.3 Results

Table 4.1 Indexing Error Table: Value of K at Four Different Indexing Error Rates for Rank 1 (Rank 5) Identification Rate on the FERET fafb (1195 Subjects) Probe Set.

                 Error in Indexing
Algorithm   0.1      0.01      0.001     0.0001
BAY         1 (5)    1 (7)     1 (12)    1 (12)
EBGM        1 (5)    9 (15)    43 (43)   48 (48)
COM         1 (5)    18 (22)   46 (46)   50 (50)

Table 4.2 Indexing Error Table: Value of K at Four Different Indexing Error Rates for Rank 1 (Rank 5) Identification Rate on the FERET dup1 (722 Subjects) Probe Set.

                 Error in Indexing
Algorithm   0.1      0.01      0.001     0.0001
BAY         1 (5)    1 (5)     1 (9)     8 (13)
EBGM        1 (5)    15 (15)   20 (22)   20 (22)
COM         1 (5)    22 (38)   22 (50)   26 (50)

In Table 4.1 and Table 4.2, we show the specific values of the indexing parameter K at four different indexing error rates for rank 1 and rank 5 identification, using the fafb and dup1 probe sets, respectively. These two tables show that, if we allow a small error tolerance for the indexing approach, then the number of comparisons needed to find the rank of a particular probe image reduces significantly. For example, for the Bayesian algorithm, the value of the indexing parameter is as low as 8 with an error in indexing equal to 0.01%. This means that, with the help of the proposed model, the Bayesian algorithm requires at most eight comparisons to achieve rank 1 performance similar to that obtained using the complete probe set, which requires 1195 comparisons in the case of the fafb probe set.
Similarly, for the other two algorithms, at most 50 comparisons are sufficient to achieve similar identification performance at 0.01% error rate for rank 1 as well as rank 5 identification.

Figure 4.2 Indexing Trade-Off on the FERET fafb Set with Rank 1 and Rank 5. [Plots not reproduced: panels for BAY, EBGM and COM at rank 1 and rank 5; x-axis Index Parameter (K), 5 to 50; y-axis Error in Indexing, 0 to 0.2.]

Figure 4.3 Indexing Trade-Off on the FERET dup1 Set with Rank 1 and Rank 5. [Plots not reproduced: panels for BAY, EBGM and COM at rank 1 and rank 5; x-axis Index Parameter (K), 5 to 50; y-axis Error in Indexing, 0 to 0.2.]

4.4 Discussion

In this Chapter we have demonstrated a very useful application of our earlier proposed linear modeling scheme.
We proposed an indexing mechanism for face templates with the help of a linear model of a computationally expensive face recognition algorithm. Unlike other biometric modalities, face templates do not have common global feature sets to use for binning the templates. Since we use the linear model of the algorithm, our approach to indexing face templates is algorithm specific. Nonetheless, with more than 1000 unique face templates, we experimentally showed that the proposed indexing scheme results in more than a 20 times reduction in face comparisons for the Bayesian, EBGM, and commercial face recognition algorithms. With this indexing scheme, the response time is reduced to a fraction $\frac{C_m}{C_a} + \frac{K}{N}$ of the original, where $C_a$ and $C_m$ are the times required to match two face images using the original algorithm and its linear model, respectively, and $N$ represents the number of gallery images. (Identifying one probe costs $N C_a$ without indexing and $N C_m + K C_a$ with indexing.) Since our proposed modeling scheme is a linear projection of face images, in most cases (such as the BAY and EBGM algorithms) $C_a \gg C_m$, and therefore $\frac{C_m}{C_a} + \frac{K}{N} \ll 1$. However, for algorithms such as PCA, LDA and ICA, which use a linear projection of the raw template, the model will not result in a computational advantage, as in these cases $C_a = C_m$.

CHAPTER 5 FACE TEMPLATE RECONSTRUCTION

Biometric templates such as fingerprints, face or voice are unique physical characteristics of an individual person. This uniqueness property of biometric templates is a major motivation for using biometrics in authentication and security systems. However, the uniqueness of biometric templates also has a drawback in terms of stolen templates. Unlike in a password or smart card based authentication system, biometric templates are very difficult to regenerate. Hence, secured storage and encryption of digital biometric templates is very important.
Even if the original templates are encrypted or stored in a secured database, unauthorized reconstruction of a biometric template can pose a security and privacy threat to the biometric system. In the following two Chapters, we discuss and identify a possible security breach of a face recognition system through template match scores. On the surface, match scores from a biometric system may not appear to be a weak link in terms of security and privacy. However, in the following two Chapters, we reveal that even match scores carry sufficient information for reconstruction of the original templates and should be protected in the same way as the original templates. Therefore, guidelines such as encryption of match scores, a limited number of attempts within a fixed time interval, and periodic updates of enrolled templates should be incorporated in every installation of a biometric authentication system.

With the proposed linear model, we demonstrate that face templates can indeed be reconstructed from match scores using a non-iterative process. Once we have successfully built a linear model (see Chapter 3) of a face recognition system, using a novel distance based embedding technique, we embed the target face in the model space. Then, we use the inverse of the linear model transformation on the target face in model space to reconstruct the face in the original face image space. Recall that, while constructing the linear model of a face recognition algorithm, we seek an approximating affine transformation $A$ that is a composition of an orthogonal (or rigid) and a non-rigid (shear and stretch) transformation, $A = A_{nr} A_r$. The approximating affine transformation preserves the distances among the templates generated by the face recognition system. Given this affine space, we can embed any template in this space based on its distance $d$ from a known set of templates, the set of break-in templates.
Once we have the embedded affine coordinates for template $y_z$, we can reconstruct the face by inverting the affine transformation. A graphical outline of the template reconstruction process is presented in Fig. 5.1.

Figure 5.1 Outline of Face Template Reconstruction Process.

We build the affine transformation model $A$ of the face recognition algorithm using a local set of face images (the break-in set). This break-in set is independent of the gallery and probe sets. Note that the face recognition system is treated as a complete black box and we do not perform any reverse engineering on the recognition system. The assumption of knowledge of the face recognition algorithm is a weak one. (It might even be possible to identify the recognition algorithm given score matrices of known algorithms. However, we do not explore that angle here.) The modeling of the recognition system is an offline procedure and needs to be carried out only once for a given recognition algorithm. Once we have built such a model, we present the templates from our break-in set to the system to be broken and observe the match scores to an assumed identity. So, in real time scenarios, our proposed method only requires access to a set of match scores, equal in number to the images in the break-in set. These match scores are then used to embed the unknown template of a targeted subject in the modeled affine space. Finally, we use the inverse of the affine transformation to reconstruct the unknown template of the targeted subject in the original image space. We validate our proposed template reconstruction scheme on three different types of face recognition systems using two standard public databases, FERET [40] and FRGC [41].
Two template based algorithms, namely Principal Component Analysis with the cosine distance measure [59] (widely accepted as a baseline algorithm) and Moghaddam and Pentland's Bayesian Intra/Extra-class person classifier (BAY) with Maximum Likelihood (ML) estimation [31], and a feature based commercial face recognition system are used for this experiment.

5.1 Affine Transformation From Match Scores

Similar to the modeling scheme described in Chapter 3, and with the same mathematical notation, let $d_{ij}$ be the distance between two images $x_i$ and $x_j$ ($x_i^T \in \mathbb{R}^N$) in the break-in set as computed by the given face recognition algorithm. Here we also assume that the face recognition algorithm outputs dissimilarity scores for two templates. However, if a recognition algorithm computes similarities instead of distances, we can always convert the similarity scores $s_{ij}$ into distances using a variety of transformations, such as $(1 - s_{ij})$, $-\log(s_{ij})$, $\frac{1}{s_{ij}} - 1$, etc. Then, these distances can be arranged as a $K \times K$ matrix $D = [d_{ij}^2]$, where $K$ is the number of images in the break-in set. For biometric systems, if the original match score between two templates is not modified based on other templates in the gallery, then a monotonically increasing transformation of the distances does not affect the model of the system.

For many recognition algorithms the underlying distance measure may not be Euclidean, and in some cases the observed dissimilarity matrix may not exhibit metric properties either. In such cases, we need to transform the distance matrix $D$ to an equivalent Euclidean distance matrix $D_E$. Although the process of converting a non-Euclidean distance matrix to an equivalent Euclidean distance matrix is not feasible in all cases, an approximation to the non-Euclidean distance matrix $D$ can be used for such embedding [38].
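As a small illustration of the similarity-to-distance step, the three transformations listed above can be applied to a toy similarity matrix and arranged as the squared-distance matrix $D = [d_{ij}^2]$. This is only a sketch: the function name, the dictionary of transforms, and the assumption that similarities lie in (0, 1] are choices made for the example and are not taken from the experiments.

```python
import math

# The three example transformations from the text; all assume 0 < s <= 1.
TRANSFORMS = {
    "one_minus": lambda s: 1.0 - s,
    "neg_log": lambda s: -math.log(s),
    "reciprocal": lambda s: 1.0 / s - 1.0,
}


def similarity_to_squared_distance(S, transform="one_minus"):
    """Turn a K x K similarity matrix into the matrix D = [d_ij^2].

    Diagonal entries are set to zero, since a template is at distance zero
    from itself regardless of the transformation used.
    """
    f = TRANSFORMS[transform]
    k = len(S)
    D = [[0.0] * k for _ in range(k)]
    for i in range(k):
        for j in range(k):
            if i != j:
                d = f(S[i][j])
                D[i][j] = d * d
    return D


# Toy 2 x 2 similarity matrix (hypothetical scores).
S = [[1.0, 0.5], [0.5, 1.0]]
for name in TRANSFORMS:
    print(name, similarity_to_squared_distance(S, name))
```

All three transformations are monotonically decreasing in the similarity, so they preserve the ordering of matches; which one is preferable depends on the score range of the algorithm being modeled.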
Although many different schemes [49, 56] can be used to arrive at a set of configuration points which preserve the pairwise distances given by an input distance matrix, for this experiment we followed a simple scheme commonly known as classical scaling or metric multidimensional scaling (MDS) [13, 19]. Given the Euclidean distance matrix $D_E$, the objective here is to find $K$ vectors $\{y_1, \ldots, y_K\}$ such that

\[ D_E(i, j) = (y_i - y_j)^T (y_i - y_j) \tag{5.1} \]

Note that the above configuration points $y_i$ are not unique. Any translation or rotation of the vectors $y_i$ is also a solution to Eqn. 5.1. To reduce such degrees of freedom of the solution set, we constrain the solution set of vectors to be centered at the origin, so that the sum of the vectors is zero, i.e. $\sum_i y_i = 0$.

The next task is to find a set of vectors such that $Y^T Y = B$, where $B$ is the matrix of dot product distances derived from the monotonically increasing transformation of the distances computed by the face recognition algorithm being modeled. One such solution strategy is to use the eigenvalue decomposition (EVD) of $B$. Since $B$ is a symmetric, positive semidefinite matrix, assuming the rank of $B$ is $M \le N$, $B$ has $M$ positive eigenvalues and $N - M$ zero eigenvalues.
\[ B = V_{EVD} \, D_{EVD} \, V_{EVD}^T \tag{5.2} \]

where $D_{EVD}$ is an $N \times N$ diagonal matrix whose first $M$ diagonal entries are the nonzero eigenvalues of matrix $B$, sorted in ascending order, and $V_{EVD}$ contains the corresponding eigenvectors of $B$. The solution is then given by

\[ Y = \left( V^{M}_{EVD} \, (D^{M}_{EVD})^{1/2} \right)^T \tag{5.3} \]

where $D^{M}_{EVD}$ is an $M \times M$ diagonal matrix consisting of the $M$ nonzero eigenvalues of $B$ and $V^{M}_{EVD}$ contains the corresponding eigenvectors of $B$.

Now, following a discussion similar to that presented in Chapter 3, we can derive the affine transformation $A$ as a composition of two sub-transformations: a non-rigid transformation $A_{nr}$ and a rigid transformation $A_r$, i.e., $A = A_{nr} A_r$. The rigid part $A_r$ can be arrived at by any analysis that computes an orthonormal subspace from the given set of training images. In this experiment, we use principal component analysis (PCA) for the rigid transformation. Let the PCA coordinates corresponding to the nonzero eigenvalues, i.e. the non-null subspace, be denoted by $X_r = A_r X$. The non-rigid transformation $A_{nr}$ relates these rigid coordinates $X_r$ to the distance based coordinates $Y$ and can be expressed as

\[ A_{nr} = \left( V^{M}_{EVD} \, (D^{M}_{EVD})^{1/2} \right)^T X_r^T \, \Lambda_{PCA}^{-1} \tag{5.4} \]

This non-rigid transformation, allowing for shear and stretch, and the rigid transformation, computed by principal component analysis, together model the face recognition algorithm. Note that the rigid transformation is not dependent on the face recognition algorithm; it is only the non-rigid part that is determined by the distances computed by the recognition algorithm. An alternative viewpoint could be that the non-rigid transformation captures the difference between a PCA based recognition strategy (the baseline) and the given face recognition algorithm.

5.2 Embedding and Reconstruction

For the break-in scenario, we will not have access to the target image; however, we will be able to retrieve distances of the target image to any given image.
So, we need a mechanism to compute the coordinates of the target image from the given distances, i.e. to embed the target image in the modeling affine space. Given the embedded coordinates, we will use the inverse transformation to reconstruct the target image. In this section, we explain the embedding solution outlined in Fig. 5.2.

Figure 5.2 Block Diagram of Embedding Scheme.

Figure 5.3 Block Diagram of Reconstruction Scheme.

Let $y_z$ be the unknown target template coordinate vector in affine space. Let $d = [d_1, d_2, \ldots, d_K]^T$ be the vector of distances of $y_z$ from the $K$ images $Y = [y_1, \ldots, y_K]$ in the break-in set, as computed by the face recognition algorithm along with the Euclidean correction factor that was found during estimation of the recognition algorithm. Based on the nature of the construction of the affine space, these distances would be equal to the Euclidean distances between the vectors $y_z$ and $y_i$. Mathematically,

\[ d_i^2 = \| y_i - y_z \|^2 = \| y_i \|^2 + \| y_z \|^2 - 2\, y_i^T y_z, \quad \forall i = 1, \ldots, K \tag{5.5} \]

Subtracting $d_i^2$ from $d_{i+1}^2$ and simplifying, we have

\[ E\, y_z = F \;\Rightarrow\; y_z = E^{\dagger} F \tag{5.6} \]

where

\[ E = \begin{bmatrix} (y_2 - y_1)^T \\ (y_3 - y_2)^T \\ \vdots \\ (y_K - y_{K-1})^T \end{bmatrix} \tag{5.7} \]

\[ F^T = [f_i], \quad f_i = \frac{1}{2} \left[ \left( d_i^2 - \| y_i \|^2 \right) - \left( d_{i+1}^2 - \| y_{i+1} \|^2 \right) \right] \tag{5.8} \]

and $E^{\dagger}$ represents the pseudo-inverse of $E$. Here, we assume that $E$ does not map all points to the null space of $F$; hence the pseudo-inverse of $E$ exists. Thus, given distances $d = [d_1, d_2, \ldots, d_K]$ of any unknown template $y_z$ from the $K$ images in the break-in set, we can use Eqn. 5.6 to compute the coordinates of $y_z$ in the approximating affine space.

Once we obtain the coordinates of any unknown template in affine space, we invert the transformation to reconstruct the template.
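The embedding of Eqn. 5.6 can be checked end to end on synthetic data, where the distances are exactly Euclidean. The sketch below assumes NumPy is available; the function names `classical_mds` and `embed_target`, the eigenvalue tolerance, and the random test points are illustrative assumptions, not part of the experiments. It first recovers break-in coordinates from squared distances by classical scaling (Eqns. 5.2 and 5.3) and then embeds an unseen point from its distances alone.

```python
import numpy as np


def classical_mds(D2, tol=1e-9):
    """Coordinates Y (M x K) from a K x K matrix of squared distances.

    Double-centers D2 into B = -1/2 H D2 H and keeps the eigenvectors with
    eigenvalues above a relative tolerance, so that Y.T @ Y approximates B.
    """
    K = D2.shape[0]
    H = np.eye(K) - np.ones((K, K)) / K
    B = -0.5 * H @ D2 @ H
    evals, evecs = np.linalg.eigh(B)  # eigenvalues in ascending order
    keep = evals > tol * evals.max()
    return (evecs[:, keep] * np.sqrt(evals[keep])).T


def embed_target(Y, d):
    """Solve E y_z = F (Eqn. 5.6) for the target coordinates y_z.

    Y holds the break-in coordinates as columns; d holds the target's
    distances to those K points.
    """
    E = (Y[:, 1:] - Y[:, :-1]).T              # rows (y_{i+1} - y_i)^T
    g = d ** 2 - np.sum(Y ** 2, axis=0)       # d_i^2 - ||y_i||^2
    F = 0.5 * (g[:-1] - g[1:])                # f_i of Eqn. 5.8
    return np.linalg.pinv(E) @ F


# Synthetic check: 8 break-in points and one hidden target in R^3.
rng = np.random.default_rng(0)
P = rng.normal(size=(3, 8))
z = rng.normal(size=3)

diff = P[:, :, None] - P[:, None, :]
D2 = (diff ** 2).sum(axis=0)                  # squared pairwise distances
d = np.linalg.norm(P - z[:, None], axis=0)    # distances target -> break-in

Y = classical_mds(D2)
y_z = embed_target(Y, d)
recovered = np.linalg.norm(Y - y_z[:, None], axis=0)
print(np.allclose(recovered, d))
```

Because classical scaling fixes the configuration only up to rotation and reflection, the check compares distances rather than raw coordinates: the embedded point should sit at the observed distances from every break-in point.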
Mathematically, if $y_z$ is the embedding coordinate of the unknown template $x_z$, then

\[ A_{nr} A_r\, x_z = y_z \;\Rightarrow\; x_z = A_r^T A_{nr}^{\dagger}\, y_z \tag{5.9} \]

In summary, the individual steps involved in reconstructing target templates are as follows:

Inputs
  - Knowledge of the face recognition algorithm
  - A set of K face images (break-in set)
  - A set of match scores between the templates from the break-in set and the assumed identity's template

Modeling
  - Compute the distance matrix $D$ between these K templates using the underlying face recognition algorithm.
  - If $D$ is not Euclidean, then compute the equivalent Euclidean distance matrix $D_E$.
  - Calculate $X_{EVD}$ from $D_E$:
      - Construct the matrix $B$ by double centering with $H$. This step centers the given distances and converts them into equivalent dot product distances: $B = -\frac{1}{2} H D H$
      - Compute the EVD of $B$ as $B = V_{EVD} D_{EVD} V_{EVD}^T$
      - Compute the coordinates $Y$ as $Y = \left( V^{M}_{EVD} (D^{M}_{EVD})^{1/2} \right)^T$
  - Build the affine transformation $A = A_{nr} A_r$:
      - The rigid part $A_r$ of the affine transformation can be arrived at by PCA. Let the PCA coordinates be denoted by $X_r = A_r X$.
      - The non-rigid part $A_{nr}$ of the transformation is given by $A_{nr} = \left( V^{M}_{EVD} (D^{M}_{EVD})^{1/2} \right)^T X_r^T \Lambda_{PCA}^{-1}$, where $\Lambda_{PCA}$ is the diagonal matrix with the PCA eigenvalues.

Embedding and Reconstruction
  - Find the MDS coordinate $y_z$ of the target template $x_z$:
      - Compare the templates in the break-in set with the target template to create the distance vector $d$.
      - The coordinate of the target template $y_z$ in MDS space is constructed as $y_z = E^{\dagger} F$.
  - Reconstruct the unknown template $x_z$ using Eqn. 5.9.

5.3 Experimental Setup

We demonstrate our reconstruction scheme using three fundamentally different face recognition algorithms: Principal Component Analysis (PCA) with the Mahalanobis cosine distance measure, the Bayesian Intra/Extra-class person classifier (BAY), and a feature based commercial algorithm. In order to emphasize the true independence of the break-in set and gallery set, we use two distinct public databases.
The FERET [40] database is used for the gallery images and the Face Recognition Grand Challenge (FRGC) database [41] is used to construct different break-in sets. In this section, we first provide an overview of the two databases and the face recognition algorithms used in our experiments. Then we present the reconstructed templates and corresponding break-in performance for each of the face recognition algorithms. Later, we compare our approach with hill climbing based attacks and show the efficiency of our proposed method over a hill climbing based approach [1], both in terms of quality of reconstructed templates and break-in performance. Finally, we demonstrate the robustness of our proposed algorithm to score quantization.

Figure 5.4 Sample Images from (a) Break-in Set and (b) Gallery Set. Break-in Set and Gallery Set are Independent of Each Other and have No Subjects in Common.

5.3.1 Database

The face images used in this experiment are selected from the FERET [40] and Face Recognition Grand Challenge (FRGC) face databases [41]. To ensure the distinctiveness of the break-in set from the gallery set, we choose our break-in set from a subset of the FRGC training set and reconstruct all the images from the FERET gallery set, containing 1196 images from 1196 subjects. The FERET face database is a widely used public database, and the gallery set is predefined (feret gallery.srt in [8]) in that database. We use the Colorado State University (CSU) Face Identification Evaluation System to normalize the original face images [8]. The normalized face images have the same eye location, the same size (150 x 130), and similar intensity distribution. A few preprocessed face images are shown in Fig. 5.4.
For break-in sets, we selected a subset of the FRGC training set with 600 controlled images from the first 150 subjects (in increasing order of their numeric id) with 4 images per subject. In order to validate the effectiveness of the proposed template reconstruction scheme and break-in strategy, it is necessary that the selected face recognition algorithms have high recognition rates at low false acceptance rates (FAR). Since most face recognition algorithms perform poorly on a data set with one or more variations in face images [41], we restrict our experiments to controlled frontal face images only. Similarly, current template based algorithms require the images to be scaled to the same size with the same eye location, so a preprocessing step is inevitable for such algorithms. However, if a face recognition system has high performance without such restrictions on the variation or size of the face images, then the proposed scheme can be extended naturally to such systems.

5.3.2 Face Recognition Algorithms

We evaluate the proposed reconstruction scheme on two template based algorithms and one feature based face recognition system: the Principal Component Analysis approach with the Mahalanobis cosine angle as the distance measure, which by default is considered the baseline algorithm for face recognition systems [59]; Moghaddam and Pentland's algorithm, popularly known as the Bayesian Intra/Extra-class person classifier [31]; and a commercial face recognition system. The commercial system is based on Local Feature Analysis of face images and is widely regarded as being among the best available at present. Both the baseline and BAY algorithms were trained using the break-in set from the FRGC training set, but the commercial algorithm did not require any training process and was used as a black box in all of our experiments.
Since all the face images are normalized with fixed eye coordinates and fixed size (150 x 130), we did not utilize the face and eye detector module embedded in the commercial face recognition system. Using the fafb probe set of the FERET database, we observe that the baseline, BAY and commercial algorithms have 97%, 95% and 99% true acceptance rates at 1% false acceptance rate (FAR), respectively.

5.3.3 Distance Measure

The three algorithms used in this experiment have completely different approaches to comparing two faces and generate similarity and/or dissimilarity scores with different distance measures. The baseline algorithm uses the Mahalanobis cosine angle and has dissimilarity scores between -1 and 1. Similarly, the Bayesian maximum likelihood classifier reports the similarity between two faces in terms of the probability of the difference image belonging to the intrapersonal/extrapersonal space. For this experiment, we use the CSU implementation of the BAY algorithm [8], where a negative logarithm transformation is applied to the probabilistic similarity scores to convert them into a distance measure [12]. However, in order to have an upper bound for the dissimilarity scores, we row normalize the distances to the interval [0, 1]. The similarity measure used in the feature based commercial algorithm is not known, but the similarity scores are within a finite range $[S_{min}, S_{max}]$. We convert similarity scores to distances by simply subtracting each match score $S_{ij}$ from the maximum possible match score ($S_{max} - S_{min}$). We observe that all three distance measures used by the respective algorithms exhibit the symmetric property but violate the triangle inequality. Hence, we enforce the triangle inequality in the respective distance matrices.
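One generic way to restore the triangle inequality in a symmetric dissimilarity matrix is to add a constant to every off-diagonal entry. The sketch below illustrates that idea only; it is an assumption for illustration, not the correction procedure of Chapter 3, and the function names and the toy matrix are hypothetical. Adding any constant $c \ge \max_{i,j,k} (d_{ij} - d_{ik} - d_{kj})$ suffices, because the right-hand side of each triangle gains $2c$ while the left-hand side gains only $c$.

```python
def triangle_violation(D):
    """Largest value of d_ij - (d_ik + d_kj) over distinct i, j, k.

    A result <= 0 means the triangle inequality holds everywhere.
    """
    n = len(D)
    worst = float("-inf")
    for i in range(n):
        for j in range(n):
            for k in range(n):
                if i != j and j != k and i != k:
                    worst = max(worst, D[i][j] - D[i][k] - D[k][j])
    return worst


def enforce_triangle_inequality(D):
    """Add the smallest sufficient constant to all off-diagonal entries."""
    c = max(triangle_violation(D), 0)
    n = len(D)
    return [[0 if i == j else D[i][j] + c for j in range(n)] for i in range(n)]


# Toy symmetric dissimilarities: 5 > 1 + 1 violates the triangle inequality.
D = [[0, 1, 5],
     [1, 0, 1],
     [5, 1, 0]]
fixed = enforce_triangle_inequality(D)
print(triangle_violation(D), fixed, triangle_violation(fixed))
```

Note that this only makes the matrix metric; a metric matrix is not necessarily Euclidean, so a further correction of the kind discussed in Section 5.1 may still be needed before embedding.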
The values of V learned from the break-in set are 1.297, 2.094 and 19.970 for the baseline, BAY and commercial algorithms, respectively.

5.4 Results

Our first objective is to model the behavior of each face recognition algorithm in terms of an affine transformation. In other words, the distance between two templates computed by these algorithms should be close to the Euclidean distance between the two templates in the respective affine spaces. Here, we present some of the intermediate results showing the accuracy of our modeling scheme and the behavior of the constructed affine spaces.

Figure 5.5 The Eigenvalues of the B Matrix for Three Face Recognition Algorithms. (a) Baseline Algorithm (b) BAY Algorithm (c) Commercial Algorithm. [Plots not reproduced: x-axis Index of Eigen Values; y-axis Eigen Values.]

In Fig. 5.5, we plot the eigenvalues of the transformed distance matrices $B$ defined in Eqn. 3.14. The eigenvalues of the individual algorithms reflect the nature of the affine space for each individual algorithm. The plots of the eigenvalues of the three distance matrices from the three algorithms appear different due to the different scales of eigenvalues for each algorithm. In Fig. 5.5 (a), we can observe that the eigenvalues drop from 9.4 to zero at eigenvector index 360, which is about 60% of the total number of images. Thus, it can be inferred that the baseline algorithm uses the top eigenvectors, which contribute 60% of the total energy. Moreover, Fig. 5.5 also provides an estimate of the dimension of each affine space.
For example, we can expect that for the baseline algorithm any break-in set with more than 360 images will result in approximately the same probability of break-in. In other words, 360 images (attempts) are sufficient to achieve optimal break-in performance for the baseline algorithm. However, in the case of the BAY algorithm, it appears that the eigenvalues do not drop to zero even with 600 images in the break-in set. Hence, we can expect a loss of sharp features in the reconstructed images for the BAY algorithm with 600 or fewer images in the break-in set. Similarly, for the commercial algorithm, we can expect near optimal performance with 600 images in the break-in set.

To quantify the modeling error, we compute the Euclidean distances between the projected images in affine space and compare them with the actual distance matrices computed by the respective algorithms after correction by the additive constant factor. The normalized error $e$ is then computed as follows.

\[ e = \frac{\tilde{d}_{ij} - d_{ij}}{d_{ij}} \]

where $\tilde{d}_{ij}$ represents the Euclidean distance between projected images $i$ and $j$ in affine space and $d_{ij}$ represents the actual distance computed by the recognition algorithm. We observe that the means of the normalized errors $e$ are 0.002, 0.0457 and 0.1881, with standard deviations 0.1563, 0.0915 and 0.2554, for the baseline, BAY and commercial algorithms, respectively.

5.4.1 Reconstructed Face Templates

To study the effect of the number of images in the break-in set on the quality of reconstructed templates and break-in performance, we created five different break-in sets from the FRGC training set. Two break-in sets contain 75 and 150 images with one image per subject, and the other three break-in sets contain 300, 450, and 600 images, respectively, with multiple images per subject. Sample images from the break-in set and gallery set are shown in Fig. 5.4. We reconstructed all 1196 images in the FERET gallery set using each of the five break-in sets. In Fig.
5.6 we present some of the reconstructed images using a break-in set with 600 images. In Fig. 5.7 we show the reconstructed templates of a particular target with all five break-in sets. As expected, the reconstruction of the target template improves with the number of images in the break-in set. The noise in the reconstructed images is due to the fact that the break-in set and gallery set are from two distinct databases collected in totally different environments. In the case of the BAY algorithm, the reconstructed images appear much smoother than the original image. As mentioned earlier, the BAY algorithm requires more than 600 images in the break-in set for better reconstruction.

Figure 5.6 Reconstructed Face Templates Using a Break-in Set with 600 Images. First Row Represents the Original Templates; Second, Third and Fourth Rows Represent the Reconstructed Templates for Baseline, BAY and Commercial Algorithm Respectively.

Figure 5.7 Variation in Reconstructed Templates for Three Algorithms Using Five Different Break-in Sets with 75, 150, 300, 450 and 600 Images. First Column Represents the Original Target Template. First, Second and Third Rows Represent the Reconstructed Templates for Baseline, BAY and Commercial Algorithm Respectively.

5.5 Discussion

In this Chapter we proposed a novel paradigm to reconstruct face templates from match scores using a linear approach. First, we modeled the behavior of the given face recognition algorithm by an affine transformation. The goal of the modeling is to approximate the distances computed by a face recognition algorithm between two faces by distances between points, representing these faces, in an affine space. Given this space, templates from an independent image set (break-in set) were matched only once with the enrolled template of the targeted subject and the match scores were recorded.
These scores were then used to embed the targeted subject in the approximating affine (non-orthogonal) space. Given the coordinates of the targeted subject in the affine space, the original template of the targeted subject was reconstructed using the inverse of the affine transformation. We demonstrated our ideas using three fundamentally different face recognition algorithms: Principal Component Analysis (PCA) with the Mahalanobis cosine distance measure, the Bayesian intra/extra-class person classifier (BAY), and a feature based commercial algorithm. To demonstrate the independence of the break-in set from the gallery set, we selected face templates from two different databases: the Face Recognition Grand Challenge (FRGC) database and the Facial Recognition Technology (FERET) database. In this chapter we presented a qualitative comparison of reconstructed face templates with the original face templates. How good are these reconstructed face templates compared to the original templates? Can we use these reconstructed face templates to gain unauthorized access to the system? Can a reconstructed face template reveal the identity of a person who has access to the system? We answer these security and privacy related questions in the next chapter.

CHAPTER 6
PRIVACY AND SECURITY ISSUES RELATED TO FACE RECOGNITION SYSTEMS

The requirement for robust, secure authentication systems and the availability of commercial off-the-shelf (COTS) biometric based authentication systems encourage the deployment of such systems in relatively low to very high risk secure facilities. At the same time, the incorporation of biometrics in operational scenarios reveals several other issues related to the security and privacy of the users.
Regeneration of enrolled gallery templates will reveal the identity of the persons with exclusive access to highly secured areas, or can be used to negate any classified operation such as monitoring certain subjects on a watch list. Also, the templates can be reused over digital networks or can be used to reproduce synthetic biometric templates such as fake fingers or model faces [16, 54]. Since the cost of replacing a biometric token or template is much higher compared to a password or smart card, unauthorized access to or regeneration of biometric templates is a major issue in biometric applications [22, 42, 47]. The standardized biometric application programming interface (BioAPI) has been continuously updated with new guidelines, such as encryption of templates, avoiding storage and transmission of templates, and quantization of match scores, to neutralize such security breaches [9]. In order to make biometric applications robust to such threats, it is necessary to point out the various ways by which templates can be regenerated and to propose countermeasures to such attacks [3, 4, 15, 45].

Most biometric authentication systems have three major components: a biometric template acquisition sensor, a matching module to compare a new template to the enrolled template(s), and a decision module with predefined thresholds for particular operational points [10]. In applications where the matching module and the decision module are not integrated together, it is required to store the match score in digital media or to transmit the match score through a digital channel to the decision module. Can unauthorized access to these match scores result in security breaches? In this chapter we demonstrate that biometric templates can indeed be reconstructed from match scores using a non-iterative process.

Methods that have been proposed to reconstruct biometric templates from match scores are based on hill climbing techniques.
Soutar [52] proposed an iterative template reconstruction scheme popularly known as the hill climbing attack. This scheme selects the account of a specific subject, referred to as the target, starts with an arbitrary face template, and iteratively refines it. At each iteration, if the current match score is better than the previous match score, then the modified template is retained; otherwise it is discarded. The process is iterated until the template is accepted as the target subject. Though Soutar did not report any quantitative results of biometric template reconstruction, the performance of similar approaches has been reported by several others [1, 60]. In Chapter 2, we discussed other hacking methods for different biometric modalities and presented a short comparison of our proposed template reconstruction scheme and break-in method with a general hill climbing approach. Although hill climbing based attacks can successfully break a particular target account by regenerating the target template, this type of attack has drawbacks. Hill climbing based attacks require a significant number of attempts before success. Besides, each of these attempts uses an incrementally different face. These two properties can be used together to block a hill climbing attack.

Following the template reconstruction method proposed in the last chapter, in this chapter we present a quantitative analysis of reconstructed face templates and also compare the performance of our break-in approach with one of the hill climbing based attacks. To measure the strength of the proposed hack, we define a strong metric called the probability of break-in and use this metric to evaluate our scheme against a hill climbing based attack at a fixed False Acceptance Rate (FAR).
With an operational point set at 1% FAR and 99% True Acceptance Rate (TAR) for 1196 enrollments (FERET gallery), we demonstrate that at most 600 attempts (score computations) are required to achieve a 73% chance of breaking in as a randomly chosen target subject for the commercial face recognition system. With a similar operational setup, we achieve a 72% and a 100% chance of breaking in for the BAY and PCA based face recognition systems, respectively. With three different levels of score quantization, we achieve 69%, 68% and 49% probability of break-in, indicating the robustness of our proposed scheme to score quantization. We also show that the proposed reconstruction scheme has a 47% higher probability of breaking in as a randomly chosen target subject for the commercial system compared to a hill climbing approach with the same number of attempts. Given that the proposed template reconstruction method uses distinct face templates to reconstruct faces, this work exposes a more severe form of vulnerability than a hill climbing kind of attack, where incrementally different versions of the same face are used. Also, the ability of the proposed approach to reconstruct the actual face templates of the users increases privacy concerns in biometric systems.

6.1 Probability of Break-in

Figure 6.1 Probability of Break-in Using the Reconstructed Face Template.

To quantify the performance of breaking into a system with reconstructed images, we compute the probability of break-in, which is defined as the probability of breaking in as a randomly chosen targeted subject:

$$\text{Probability of break-in} = \frac{\text{No. of successfully verified targeted subjects using reconstructed images}}{\text{Total no. of enrolled subjects}} \qquad (6.1)$$

The reconstructed templates of the targeted subjects are matched against the original templates of the targeted subjects, and a dissimilarity matrix is generated for each of the three algorithms.
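As a minimal sketch of how Eq. 6.1 could be evaluated at a fixed FAR, the following assumes that match scores are distances (smaller means more similar), that a set of impostor distances is available to calibrate the threshold, and that ties at the threshold are negligible. The function names `threshold_at_far` and `probability_of_breakin` are hypothetical and are not taken from the dissertation.

```python
import numpy as np

def threshold_at_far(impostor_distances, far=0.01):
    """Return a distance threshold that accepts at most a fraction `far` of impostors."""
    impostor = np.sort(np.asarray(impostor_distances, dtype=float))
    # Number of impostor comparisons that may be accepted (distance <= threshold).
    n_accept = int(np.floor(far * impostor.size + 1e-9))
    if n_accept == 0:
        # No impostor may pass: put the threshold just below the smallest distance.
        return float(np.nextafter(impostor[0], -np.inf))
    return float(impostor[n_accept - 1])

def probability_of_breakin(recon_to_target_distances, threshold):
    """Fraction of targets whose reconstructed template is accepted (Eq. 6.1)."""
    d = np.asarray(recon_to_target_distances, dtype=float)
    return float(np.mean(d <= threshold))
```

Here `recon_to_target_distances` holds one entry per enrolled subject: the distance between that subject's reconstructed template and the enrolled original.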
From each of these dissimilarity matrices, we compute the True Acceptance Rate (TAR) at 1% False Acceptance Rate (FAR) from the respective Receiver Operating Characteristic (ROC) curves. The probability of breaking into any particular face recognition system is computed as the ratio between the number of accounts successfully broken at 1% FAR and the total number of accounts. Note that this performance measure for break-in is different from that used in other related works [27, 60], where the number of attempts is considered as a measure of breaking into a security system. In our case, the number of attempts is fixed and is the same as the number of images in the break-in set.

Figure 6.2 Probability of Break-in Using Five Different Break-in Sets for Three Algorithms at 1% FAR on the FERET Gallery Set.

In Fig. 6.2 we demonstrate the tradeoff between the probability of break-in and the number of attempts, which is the same as the number of images in the break-in set. For the baseline algorithm, only 300 attempts are sufficient to achieve a 100% success rate in breaking into the system. For the feature based commercial algorithm, 600 attempts are needed to break into the system with a probability of 0.73. For the template based BAY algorithm, 600 attempts are required to break into the system with a probability of 0.72. Note that at 1% FAR, the probability of break-in with any random template after 600 attempts is 0.01.

6.2 Comparison with Hill Climbing Approach

In this section, we compare our proposed template reconstruction scheme with a hill climbing based approach [1, 60] on the commercial face recognition system. The Eigenfaces required to modify the previous template in the hill climbing approach are created using the 600 images from the break-in set. At each iteration, a randomly selected Eigenface is added to or subtracted from the previous template.
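The hill climbing loop described above can be sketched as follows. This is only an illustrative sketch under stated assumptions: `score_fn` is a hypothetical black-box matcher that returns a similarity to the target (higher is better), `threshold` is the acceptance threshold, and the step size and uniform random choice of Eigenface are assumptions rather than details given in the text.

```python
import numpy as np

def hill_climb(score_fn, eigenfaces, start, threshold,
               max_attempts=600, step=1.0, seed=0):
    """Greedy hill climbing against a black-box matcher (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    eigenfaces = np.asarray(eigenfaces, dtype=float)
    current = np.array(start, dtype=float)
    best = score_fn(current)   # scoring the initial guess counts as one attempt
    attempts = 1
    while best < threshold and attempts < max_attempts:
        k = rng.integers(len(eigenfaces))       # pick a random Eigenface
        sign = rng.choice([-1.0, 1.0])          # add or subtract it
        candidate = current + sign * step * eigenfaces[k]
        score = score_fn(candidate)
        attempts += 1
        if score > best:
            # Keep the modified template only if the match score improved.
            current, best = candidate, score
    return current, best, attempts
```

Because every retained candidate differs from its predecessor by a single Eigenface step, the sequence of probes is incrementally different, which is the property the chapter notes a system could use to detect this kind of attack.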
Table 6.1 Comparison of Reconstructed Templates Using Our Approach Against Hill Climbing Approach.

Target Face   Hill Climbing Approach                        Our Approach
              Initial Guess   300 Attempts   600 Attempts   300 Attempts   600 Attempts
Easy          Reject          Reject         Accept         Accept         Accept
Moderate      Reject          Reject         Reject         Reject         Accept
Difficult     Reject          Reject         Reject         Reject         Reject

Due to the computational demand of the hill climbing process, we restrict our version of the hill climbing method to the first 100 subjects of the FERET gallery set, and a maximum of 600 attempts are allowed per subject. The commercial algorithm is set to operate at 1% FAR with 99% TAR, and we let the system decide the acceptance or rejection of a probe template based on this operational setup. We count the number of target accounts that are successfully broken by the hill climbing method and compare that with the number of successfully accepted reconstructed templates using our break-in set with 600 images. It should be noted that, once we reconstruct a target face template, we treat the reconstructed template as an original face template and match it with the gallery set. This comparison shows the efficiency of our approach against the hill climbing approach after 600 iterations. In Table 6.1, we present a few reconstructed templates from the hill climbing approach at 300 and 600 iterations and the corresponding reconstructed templates with our approach using the same number of comparisons. In the first column of Table 6.1, we show three different accounts enrolled with target templates marked as easy, moderate, and hard accounts to break into.

Figure 6.3 Comparison of Probability of Break-in at 1% FAR for Commercial Algorithm. We Use the First 100 Subjects of the FERET Gallery Set. The Commercial Algorithm is Set to Operate at a Predefined Threshold such that TAR = 99% at 1% FAR.
The first row of Table 6.1 represents a target template (easy) whose account is broken by both the hill climbing approach and our approach. However, it should be noted that the hill climbing approach requires 600 attempts to break into this easy account, whereas the same result can be achieved with only 300 attempts using our proposed scheme. Similarly, in the second row of Table 6.1, we present a target template (moderate) whose account cannot be broken by the hill climbing approach after 600 attempts, but the proposed scheme successfully broke that account with 600 attempts. Finally, in the third row we present a target template (hard) whose account cannot be hacked by either scheme. In Fig. 6.3, we compare the overall break-in performance of both schemes using the first 100 subjects from the FERET gallery set. We can observe that the proposed scheme has a 47% higher chance of breaking into a random account compared to the hill climbing attack with 600 attempts. It is worth mentioning here that in [1], Adler shows that this particular hill climbing based approach requires approximately 3000 to 4000 iterations to successfully break an account, which is much higher than the 600 iterations we used here. This count does not include the comparisons needed during the modeling procedure, which is done offline.

6.3 Effect of Score Quantization

One countermeasure for the first generation hill climbing attack is to quantize match scores. Systems with quantized match scores do not alter the output match scores with small changes in input images, which can prevent general hill climbing attacks. In such cases, if two similar probe templates, when matched with a target template, have original match scores of, say, 42.56 and 43.4 in the range 0 to 100, and if the system quantizes the output match scores to the nearest integer (round-off), then both scores will be quantized to 43.
For such quantized scores, a hill climbing based approach will fail to observe the improvement in the modified template and will fail to regenerate a template for the next iteration. However, such quantization of match scores has minimal effect on the proposed break-in scheme. Though Adler proposed in [2] a modified hill climbing approach for systems with quantized match scores, our version of the hill climbing approach failed with quantized match scores, and therefore we did not compare the break-in performance of the hill climbing approach with our approach on quantized match scores. In our proposed scheme, we compare different face templates to target templates and do not need to observe any improvement in match scores; hence the proposed scheme is robust to systems with quantized match scores. In this experiment, we compute the probability of break-in using quantized match scores for the commercial face recognition system.

Figure 6.4 Probability of Break-in at 1% FAR for Commercial Algorithm with Quantized Scores. The Value of Quantization Parameter $\Delta S$ in Eqn. 6.3 is Set to 0.001, 0.01, 0.1 and 1 to Quantize the Original Match Score at Four Different Levels.

We define a quantization index $Q_s$, which controls the level of quantization:

$$S_{quant} = \left\lfloor \frac{S_{orig} - S_{min}}{\Delta S} \right\rfloor \Delta S + S_{min} \qquad (6.2)$$

$$Q_s = \frac{\Delta S}{S_{max} - S_{min}} \qquad (6.3)$$

where $S_{orig}$, $S_{quant}$, $S_{min}$ and $S_{max}$ represent the original, the quantized, the minimum and the maximum match score of the recognition system, respectively. In Eqn. 6.3, the parameter $\Delta S$ controls the level of quantization of the original scores and is defined as the length of the quantized intervals, i.e. the difference between two successive quantized scores. To be consistent with the variable range of match scores for different algorithms, we define the quantization index $Q_s$ by normalizing $\Delta S$ over the possible range of match scores of a recognition system.
If the quantization index is set to 0.1, then the original scores are quantized at 10 different points, and if $Q_s$ equals 0.01, then the original scores are quantized at 100 different points. For this experiment, we use four different levels of quantization by setting the value of $Q_s$ to 0.0001, 0.001, 0.01 and 0.1, respectively. Fig. 6.4 shows the probability of break-in at 1% FAR for the commercial algorithm with the proposed quantization of match scores. We observe that the probability of break-in does not change significantly when the original scores are quantized with $Q_s$ equal to 0.0001 or 0.001, and the probability of break-in drops from 0.68 to 0.45 when $Q_s$ equals 0.01. However, for $Q_s$ equal to 0.1, the probability of break-in drops from 0.45 to 0.22. In Table 6.2, we demonstrate the effect of quantization on a reconstructed template along with the acceptance/rejection decision from the system using that particular reconstructed template. As we can observe, with increasing values of $Q_s$ the quality of the reconstructed template starts to degrade and the template is eventually rejected by the system. If the system outputs very coarsely quantized scores, e.g. with $Q_s = 0.1$, then the original match scores are highly distorted, the affine modeling of the underlying algorithm is erroneous and, as a result, the overall break-in performance is affected. However, it should be observed that such quantization of match scores has a tradeoff with the operational flexibility of a system. For example, if the recognition system, with original scores in the interval [0, 100], quantizes the original scores at 10 different points with $Q_s$ equal to 0.1 (i.e. output scores are multiples of 10), then the system is restricted to operate only at these 10 distinct operational points (thresholds) and loses the flexibility to operate at any intermediate threshold or FAR.
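The quantization rule of Eqns. 6.2 and 6.3 can be sketched in a few lines. The function name `quantize_scores` and its argument names are hypothetical; the sketch assumes the floor-based rule of Eqn. 6.2, so 42.56 maps to the lower edge of its interval rather than to the nearest level as in the round-off illustration given earlier.

```python
import numpy as np

def quantize_scores(scores, s_min, s_max, q_s):
    """Quantize match scores with quantization index q_s (Eqns. 6.2 and 6.3)."""
    # Eqn. 6.3 rearranged: interval length from the normalized index.
    delta_s = q_s * (s_max - s_min)
    scores = np.asarray(scores, dtype=float)
    # Eqn. 6.2: snap each score down to the start of its interval.
    return np.floor((scores - s_min) / delta_s) * delta_s + s_min
```

With scores in [0, 100] and `q_s = 0.1`, the interval length is 10, so two nearby scores such as 42.56 and 43.4 both map to 40 and a hill climbing attacker sees no change between them.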
Table 6.2 Effect of Quantization of Match Scores on Reconstructed Templates.

Quantization Index Q_s      0.0001   0.001    0.01     0.1
Decision for Target Face    Accept   Accept   Reject   Reject

6.4 Discussion

We present a novel scheme to reconstruct face images from match scores. We use an affine transformation to approximate the behavior of the face recognition system using an independent set of face templates termed the break-in set. Selected subsets of templates in the break-in set are then matched only once with the enrolled templates of the target account and the match scores are noted. Given the distances of the target template, we embed the target in the learnt affine space and invert the modeling affine transformation to arrive at the original template. We observed that with the proposed scheme, at most 450 attempts are required to achieve a 0.93 probability of breaking into a commercial face recognition algorithm. The major advantages of the proposed scheme over the hill climbing attack are that it is non-iterative and requires fewer attempts. This scheme also uses distinct templates in each attempt when compared to the target template. Therefore such an attack is difficult to detect automatically and cannot be neutralized by simple quantization of match scores.

With the proposed scheme we exploit a serious security breach in biometric authentication systems using only match scores. On the surface, match scores from a biometric system may not appear to be a weak link in terms of security and privacy issues. However, in this experiment we reveal that even match scores carry sufficient information for reconstruction of the original templates and should be protected in the same way as the original templates. Therefore guidelines such as encryption of match scores, a limited number of attempts within a fixed time interval, periodic updates of enrolled templates, etc. should be incorporated with every installation of a biometric authentication system.
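The reconstruction step summarized in this discussion (embed the target from its distances to the break-in templates, then invert the modeling transformation) can be illustrated with the sketch below. It is not the dissertation's derivation: it swaps in ordinary linear least-squares trilateration for the embedding step, and it assumes the affine model reduces to a known matrix `A` applied to mean-subtracted templates. The names `embed_target` and `reconstruct_template` are hypothetical.

```python
import numpy as np

def embed_target(Y, d):
    """Locate the target in the model space from distances to known points.

    Y: (K, n) model-space coordinates of the break-in templates.
    d: (K,) distances between each break-in template and the target.
    """
    Y = np.asarray(Y, dtype=float)
    d = np.asarray(d, dtype=float)
    # ||y_i - z||^2 = d_i^2; subtracting the i = 0 equation cancels ||z||^2:
    # 2 (y_i - y_0) . z = ||y_i||^2 - ||y_0||^2 - d_i^2 + d_0^2
    lhs = 2.0 * (Y[1:] - Y[0])
    sq = np.sum(Y ** 2, axis=1)
    rhs = sq[1:] - sq[0] - d[1:] ** 2 + d[0] ** 2
    z, *_ = np.linalg.lstsq(lhs, rhs, rcond=None)
    return z

def reconstruct_template(A, z):
    """Map model-space coordinates back to template space via the pseudo-inverse of A."""
    return np.linalg.pinv(A) @ z
```

Each entry of `d` corresponds to one match-score computation, so the number of attempts equals the number of break-in templates, and none of the probes is a refinement of a previous one.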
With the proposed scheme, we exposed a serious security breach using match scores of face recognition systems. At a cursory look, match scores from a biometric system may not appear to be a weak link in terms of security and privacy issues; however, in this experiment we revealed that even match scores carry sufficient information for reverse engineering of the original templates and should be protected in the same way as the original templates. The major advantage of the proposed scheme over the earlier proposed hill climbing attack is that it is not based on local search and requires fewer attempts. Our scheme also uses distinct templates in each attempt when compared to a target template. Therefore such an attack is difficult to detect automatically and cannot be neutralized by simple quantization of match scores. Thus, future face recognition systems should emphasize issues related to the privacy of the face template and system robustness to such types of attacks. Recently, a few interesting developments towards encryption of face templates, such as cancelable biometrics [45] and revocable biometrics [15], have been proposed. However, in this work we have not addressed such encrypted systems; in the future, the proposed template reconstruction scheme can be explored further to deal with encrypted systems. Nevertheless, template encryption and redistribution of match scores are progressive directions to address issues related to template security in biometric applications.

CHAPTER 7
CONCLUSIONS

We proposed a novel, linear modeling scheme for different face recognition algorithms based on match scores. Starting with a distance matrix representing the pairwise match scores between face images, we used an iterative stress minimization algorithm to obtain an embedding of the distance matrix in a low dimensional space.
We then proposed a novel linear out-of-sample projection scheme for test images. The linear transformation used to project new face images into the model space is divided into two sub-transformations: a rigid transformation of face images obtained through principal component analysis of face images, followed by a non-rigid transformation responsible for preserving the pairwise distance relationships between face images. To validate the proposed modeling scheme, we used six fundamentally different face recognition algorithms, covering both template based and feature based approaches, on four different probe sets using the FERET face image database. We compared the recognition rate of each of the algorithms with that of its respective model and demonstrated that the recognition rates are consistent on each of the probe sets. Experimental results showed that the proposed linear modeling scheme generalizes to different probe sets representing different variations in face images (FERET probe sets) and is consistent with the performance of face recognition algorithms on large scale data sets (FRGC experiments). A 6.3% average Error in Modeling for the six algorithms is observed at 0.001 False Acceptance Rate (FAR) for the FERET fafb probe set, which contains the maximum number of subjects among all the probe sets. Similarly, for the FRGC experiments, we observed an average of 3.5% Error in Modeling for the commercial algorithm at 0.001 FAR. The proposed linear model also exhibited an average of 87% accuracy when measured for similar neighborhood relationships with the original algorithms. We also demonstrated the usefulness of such a modeling scheme for algorithm specific indexing of face databases. Although the choice of distance measure varies from algorithm to algorithm, we showed that such variations in distance measures have less impact on our proposed
Similarly man y biometric systems uses score normalization as a post processing routine and we observ ed that similar score normalization routine when applied to match scores obtained through the af ne model of the algorithm yields e xpected recognition performances. W e discussed a no v el scheme to reconstruct f ace images from match scores and e xploit a security breach in the f ace recognition systems. W e observ ed that with the proposed scheme, only 300 attempts were required to achie v e a 100% probability of breaking into the baseline f ace recognition algorithm and 600 attempts were required for the B A Y algorithm to achie v e 72% success. F or commercial algorithm, we achie v ed 73% success rate to break into the system with 600 attempts. This observ ation leads us to in v estig ate further on easiness/hardness property of a particular tar get f ace. In Fig. 7.1, we presented fe w tar get images which were hard to break with 600 number of attempts as well as we sho wed fe w tar get images which were easy to break with only 75 attempts. It will be interesting to in v estig ate further on the hardness/easiness of a particular tar get f ace and the abilities to quantize an y tar get f ace in terms of a hard/easy template to be used to sneak into a system. The breakin set chosen for the proposed scheme w as selected from the FRGC database and tar get templates were selected from the FERET g allery set. These tw o databases were collected independently without an y o v erlap of subjects or equipments. Ho we v er for both the g allery set and breakin set, we used indoor controlled images with neutral e xpression and controlled illumination. The results indicate that inclusion of multiple images per subject yields better accurac y for template reconstruction. In particular we observ ed that, the breakin performance with the based recognition system impro v es with the breakin set containing more than one image from subjects. 
With three different levels of score quantization, we achieved 69%, 68%, and 49% probability of break-in, respectively, indicating the robustness of the proposed scheme to score quantization. We also showed that the proposed reconstruction scheme has a 47% higher probability of breaking in as a randomly chosen target template for the commercial system compared to a hill climbing based approach with 600 attempts.

Figure 7.1 Easy/Difficult Face Images for Commercial Algorithm. (a) Sample of Difficult Target Images which Cannot be Broken with 600 Attempts. (b) Sample of Easy Target Images which Can be Broken with Only 75 Attempts.

7.1 Future Work

Although distance based learning approaches such as multidimensional scaling are established theory in statistics, human perception and manifold learning, the exploration of such methodologies for biometric distances is a novel idea in this dissertation, along with the construction of linear models of face recognition algorithms. Offline performance analysis of any face recognition system without decoding details of the algorithm design, indexing of face databases, and the impact of face template reconstruction on system security and users' privacy are a few other directions which are new and have significant impact on biometric applications.

The major contribution of this dissertation is the affine modeling of a face recognition algorithm from the distance matrix. The solution strategy that we have outlined, from distances to coordinates, is adopted from classical multidimensional scaling (MDS) [13, 19] as well as stress minimization using iterative majorization techniques. The goal of MDS is to find a low dimensional placement of a given set of points which would approximate the distances between them in a higher dimensional space.
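For readers unfamiliar with the distances-to-coordinates step, classical MDS can be sketched as below. This is the standard textbook procedure (double centering followed by an eigendecomposition), not the dissertation's full pipeline, which also involves stress minimization by iterative majorization. The function name `classical_mds` is hypothetical.

```python
import numpy as np

def classical_mds(D, n_dims):
    """Embed points in n_dims dimensions from a symmetric distance matrix D."""
    D = np.asarray(D, dtype=float)
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n      # centering matrix
    B = -0.5 * J @ (D ** 2) @ J              # double-centered Gram matrix
    eigvals, eigvecs = np.linalg.eigh(B)     # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_dims]
    # Negative eigenvalues arise when D is not Euclidean; clip them to zero.
    top_vals = np.clip(eigvals[order], 0.0, None)
    return eigvecs[:, order] * np.sqrt(top_vals)
```

When `D` is exactly Euclidean and `n_dims` matches the intrinsic dimension, the pairwise distances of the returned coordinates reproduce `D` up to numerical precision; for match scores from an unknown algorithm the embedding is only an approximation.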
It is essentially an unsupervised learning method for the metric embedded in the given data set or, in this particular case, associated with an unknown algorithm. MDS based embeddings are useful in applications such as clustering and pattern classification, where a predefined metric or (dis)similarity measure may not achieve the desired accuracy. In our scheme we use this embedding to model a recognition algorithm and convert the unknown distance measure to an equivalent Euclidean distance measure in a lower dimensional space. Wang et al. [63] extended a similar embedding technique to distance based queries on a high dimensional database where the distance measure is predefined and the low dimensional projected space is restricted to be orthogonal. In [68], a distance metric learning method is proposed that learns a distance metric preserving the similarity/dissimilarity (binary 0 or 1) relationship between a set of points. The method is based on posing the distance metric learning problem as a convex optimization problem. A distance metric learning algorithm with kernels was proposed in [57, 69], describing a feature weighting method that works in the input space as well as in the kernel space. Basically, it performs a non-parametric kernel adaptation. The distance learning method described in [51] learns by relative comparisons, which is a flexible way of describing qualitative training data as a set of constraints. These constraints lead to a convex quadratic programming problem solved by adapting standard methods for Support Vector Machine (SVM) training. It can learn a distance metric from qualitative and relative examples. Future research could determine if any of these methods can be used to learn a distance function based solely on match and non-match biometric examples. Other than MDS, there
Other than MDS, there 90 PAGE 101 are other linear and nonlinear approaches such as Locally Linear Embedding (LLE), ISOMAP and nonlinear manifold learning methods that rely on embedding general metric and nonmetric spaces into lo w dimensional Euclidean spaces. The Isomap [56] and LLE [49] algorithms use a collection of local neighborhoods or e xploit the spectral properties of adjacenc y graphs from which the global geometry of the manifold is reconstructed. The distrib utional scaling method presented in [43] describes a method for embedding metric as well as non metric spaces in lo w dimensional Euclidean spaces. Based on semidenite programming, a ne w method for dimensionality reduction or learning underlying manifolds w as proposed by W einber ger and Saul [66]. It combines the concepts of semidenite programming for learning k ernel matrices with spectral methods of non linear dimensionality reduction. W e presently ha v e v ery encouraging results with the simplest form of embedding, i.e. classical MDS; ho we v er in our future research we can consider the possibility of learning transformations to directly map an y unseen image to these nonlinear manifolds. F or the reconstruction of original biometric templates from lo wer dimensional coordinates, we could e xplore the use of pseudoin v erse of approximated af ne transform. W e w ould also lik e to e xplore v arious in v erse scaling methods to reconstruct the original coordinates from lo wer dimensional v ectors. W ith the help of the proposed modeling scheme, future research will e xplore the possibility of nding optimal performance of an y f ace recognition algorithm with respect to a gi v en training set. 
Also, instead of classical scaling, other possible choices to arrive at the MDS coordinates include metric least-squares scaling, which allows for metric transformations of the given dissimilarities so as to minimize a given loss function capturing the differences, possibly weighted, between the transformed dissimilarities and the distances in the embedded space. Note that "metric" in metric scaling refers to the transformation and not to the point configuration space. In non-metric scaling, arbitrary monotonic transformations are allowed as long as rank orders are preserved. These could be the focus of future work. However, as we have seen in this work, stress minimization along with classical MDS suffices for modeling most face recognition algorithms. There is also the danger that complicated schemes might overfit to the given distances.

Although we have demonstrated the accuracy of the linear model on different face recognition algorithms, we are also interested in extending the proposed linear modeling scheme to the human face recognition process. Recently, O'Toole et al. [36] compared and analyzed the performance of face recognition systems against the human ability to recognize similar face images. A successful linear model of human performance will provide a more in-depth comparison of a person's ability to recognize face images with that of face recognition algorithms. Also, modeling and analyzing algorithms using linear models is not limited to face recognition algorithms or, more generally, to biometric modalities. The proposed learning approach can also be extended to the object recognition process, as well as to local manifold exploration for any classification problem. A closer look at our modeling scheme will reveal that the proposed modeling approach is in fact a systematic approach to approximate the global manifold of face images as observed by the face recognition algorithms.
In the case of object classifiers, we can therefore extend the proposed work to approximate any manifold with a cascade of linear models, which is equivalent to a piecewise linear approximation of the manifold. At present, the distance-based learning approach is limited to distances between objects in spatial domains; however, it can also be extended to incorporate temporal information, which would make it useful for analyzing video processing algorithms.

REFERENCES

[1] A. Adler. Images can be regenerated from quantized biometric match score data. In Canadian Conference on Electrical and Computer Engineering, pages 469-472, 2004.
[2] A. Adler. Vulnerabilities in biometric encryption system. In International Conference on Audio and Video based Biometric Person Authentication, pages 1100-1109, 2005.
[3] A. Antonelli, R. Cappelli, D. Maio, and D. Maltoni. Fake finger detection by skin distortion analysis. IEEE Transactions on Information Forensics and Security, 1(3):360-373, 2006.
[4] D. Baldisserra, A. Franco, D. Maio, and D. Maltoni. Fake fingerprint detection by odor analysis. In International Conference on Biometric Authentication, pages 265-272, 2006.
[5] M. Bartlett. Face Image Analysis by Unsupervised Learning. Kluwer Academic Publishers, 2001.
[6] M. Bartlett, J. Movellan, and T. Sejnowski. Face recognition by independent component analysis. IEEE Transactions on Neural Networks, 13(6):1450-1464, 2002.
[7] P. Belhumeur, J. Hespanha, and D. Kriegman. Eigenfaces vs. Fisherfaces: Recognition using class specific linear projection. IEEE Transactions on Pattern Analysis and Machine Intelligence, 19(7):711-720, 1997.
[8] R. Beveridge, D. Bolme, M. Teixeira, and B. Draper. The CSU face identification evaluation system. Machine Vision and Applications, 16(2):128-138, 2005.
[9] BioAPI. BioAPI 2.0, international version. In BioAPI Consortium. Available online: http://www.bioapi.org/internationalversion.html, 2005.
[10] R. Bolle, J. H. Connell, S. Pankanti, N. K. Ratha, and A. W. Senior. Guide to Biometrics. Springer, 2004.
[11] D. Bolme. Elastic bunch graph matching. Master's thesis, Colorado State University, 2003.
[12] D. Bolme. The Bayesian intrapersonal/extrapersonal classifiers. Master's thesis, Colorado State University, July 2003.
[13] I. Borg and P. Groenen. Modern Multidimensional Scaling. Springer Series in Statistics, Springer, 1997.
[14] M. Boshra and B. Bhanu. Predicting performance of object recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, 22(9):956-969, 2000.
[15] T. Boult. Robust distance measures for face-recognition supporting revocable biometric tokens. In International Conference on Automatic Face and Gesture Recognition (FGR06), pages 560-566, 2006.
[16] R. Cappelli, D. Maio, and D. Maltoni. Synthetic fingerprint image generation. In International Conference on Advances in Pattern Recognition, volume 3, pages 475-478, 2000.
[17] R. Cappelli, D. Maio, D. Maltoni, and L. Nanni. A two-stage fingerprint classification system. In ACM SIGMM Workshop on Biometrics Methods and Applications, pages 95-99, 2003.
[18] P. Comon. Independent component analysis, a new concept? Signal Processing, 36(3):287-314, 1994.
[19] T. Cox and M. Cox. Multidimensional Scaling. Chapman and Hall, second edition, 1994.
[20] J. Gower and P. Legendre. Metric and Euclidean properties of dissimilarity coefficients. Journal of Classification, 3:5-48, 1986.
[21] P. Grother and P. Phillips. Models of large population recognition performance. In Proc. of IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pages 68-75, 2004.
[22] G. Hachez, F. Koeune, and J. Quisquater. Biometrics, access control, smart cards: A not so simple combination. In 4th Smart Card Research and Advanced Applications Conference, volume 4, pages 273-288, 2000.
[23] A. K. Jain and S. Li. Handbook of Face Recognition. Springer, 2005.
[24] A. Johnson, J. Sun, and A. Bobick. Using similarity scores from a small gallery to estimate recognition performance for larger galleries. In IEEE International Workshop on Analysis and Modeling of Faces and Gestures, pages 100-104, 2003.
[25] S. G. Kong, J. Heo, B. R. Abidi, J. Paik, and M. A. Abidi. Recent advances in visual and infrared face recognition: a review. International Journal on Computer Vision and Image Understanding, 97(1):103-135, 2005.
[26] D. Litman, J. Hirschberg, and M. Swerts. Predicting automatic speech recognition performance using prosodic cues. In First Conference on North American Chapter of the Association for Computational Linguistics, pages 218-225, 2000.
[27] D. Lopresti and J. Raim. The effectiveness of generative attacks on an online handwriting biometric. In International Conference on Audio and Video based Biometric Person Authentication, pages 1090-1099, 2005.
[28] T. Matsumoto, H. Matsumoto, K. Yamada, and S. Hoshino. Impact of artificial gummy fingers on fingerprint systems. In Proc. of SPIE, Optical Security and Counterfeit Deterrence Techniques IV, volume 4667, pages 275-289, 2002.
[29] A. Mhatre, S. Palla, S. Chikkerur, and V. Govindaraju. Efficient search and retrieval in biometric databases. SPIE Defense and Security Symposium, 5779:265-273, 2005.
[30] S. Mitra, M. Savvides, and A. Brockwell. Statistical performance evaluation of biometric authentication systems using random effects models. IEEE Transactions on Pattern Analysis and Machine Intelligence, 29(4):517-530, 2007.
[31] B. Moghaddam and A. Pentland. Beyond eigenfaces: Probabilistic matching for face recognition. In International Conference on Automatic Face and Gesture Recognition, pages 30-35, 1998.
[32] P. Mohanty, S. Sarkar, and R. Kasturi. Designing affine transformations based face recognition algorithms. In IEEE Workshop on Face Recognition Grand Challenge: IEEE Conference on Computer Vision and Pattern Recognition, volume 3, pages 173-174, 2005.
[33] P. Mohanty, S. Sarkar, and R. Kasturi. From scores to face template: A model-based approach. IEEE Transactions on Pattern Analysis and Machine Intelligence, 29(12):2065-2078, 2007.
[34] S. Nanavati, M. Thieme, and R. Nanavati. Biometrics: Identity Verification in a Networked World. Wiley, 2005.
[35] NIST. Face recognition vendor test (FRVT). In National Institute of Standards and Technology, http://face.nist.gov/frvt/, 2006.
[36] A. J. O'Toole, P. J. Phillips, F. Jiang, J. Ayyad, N. Pénard, and H. Abdi. Face recognition algorithms surpass humans matching faces across changes in illumination. IEEE Transactions on Pattern Analysis and Machine Intelligence, 29(9):1642-1646, 2007.
[37] E. Pekalska and P. W. Duin. The Dissimilarity Representation for Pattern Recognition: Foundations and Applications, volume 64 of Series in Machine Perception and Artificial Intelligence. World Scientific, first edition, 2006.
[38] E. Pekalska, P. Paclik, and R. Duin. A generalized kernel approach to dissimilarity based classification. Journal of Machine Learning Research, 2:175-211, 2001.
[39] V. Perlibakas. Distance measures for PCA-based face recognition. Pattern Recognition Letters, 25(6):711-724, 2004.
[40] P. Phillips, H. Wechsler, J. Huang, and P. J. Rauss. The FERET database and evaluation procedure for face recognition algorithms. In Image and Vision Computing, volume 16, pages 295-306, 1998.
[41] P. J. Phillips, P. Flynn, T. Scruggs, K. Bowyer, J. Chang, K. Hoffman, J. Marques, J. Min, and W. Worek. Overview of the face recognition grand challenge. In IEEE Conference on Computer Vision and Pattern Recognition, volume 1, pages 947-954, 2005.
[42] S. Prabhakar, S. Pankanti, and A. Jain. Biometric recognition: Security and privacy concerns. IEEE Security and Privacy Magazine, 1(2):33-42, 2003.
[43] M. Quist and G. Yona. Distributional scaling: An algorithm for structure-preserving embedding of metric and nonmetric spaces. Journal of Machine Learning Research, 5:339-420, 2004.
[44] N. Ratha, J. Connell, and R. Bolle. An analysis of minutiae matching strength. In International Conference on Audio and Video based Biometric Person Authentication, pages 223-228, 2001.
[45] N. Ratha, J. Connell, and R. Bolle. Enhancing security and privacy in biometrics-based authentication systems. IBM Systems Journal, 40(3):614-634, 2001.
[46] N. Ratha, K. Karu, S. Chen, and A. Jain. A real-time matching system for large fingerprint databases. IEEE Transactions on Pattern Analysis and Machine Intelligence, 18(8):799-813, 1996.
[47] N. K. Ratha, J. Connell, and R. M. Bolle. Enhancing security and privacy in biometrics-based authentication systems. IBM Systems Journal, 40(3):614-634, 2001.
[48] V. Roth, J. Laub, M. Kawanabe, and J. M. Buhmann. Optimal cluster preserving embedding of nonmetric proximity data. IEEE Transactions on Pattern Analysis and Machine Intelligence, 25(12):1540-1551, 2003.
[49] S. Roweis and L. Saul. Nonlinear dimensionality reduction by locally linear embedding. Science Magazine, 290(5500):2323-2326, 2000.
[50] S. Schimke, C. Vielhauer, and T. Kalker. Security analysis for biometric data in ID documents. In Proc. of SPIE-IS&T Electronic Imaging, volume 5681, pages 474-485, 2005.
[51] M. Schultz and T. Joachims. Learning a distance metric from relative comparisons. In Advances in Neural Information Processing Systems, pages 41-48, 2004.
[52] C. Soutar. Biometric system security. Secure, 5:46-49, 2002.
[53] C. Soutar, D. Roberge, S. A. Stojanov, R. Gilroy, and B. V. K. Vijaya. Biometric encryption using image processing. In Proc. SPIE, Optical Security and Counterfeit Deterrence Techniques II, volume 3314, pages 178-188, 1998.
[54] G. W. Stewart and J. Sun. Risk of Masquerade Arising from the Storage of Biometrics. Master's thesis, Australian National University, 1990.
[55] M. L. Teixeira. The Bayesian intrapersonal/extrapersonal classifier. Master's thesis, Colorado State University, 2003.
[56] J. Tenenbaum, V. Silva, and J. Langford. A global geometric framework for nonlinear dimensionality reduction. Science Magazine, 290(5500):2319-2323, 2000.
[57] I. Tsang and J. Kwok. Distance metric learning with kernels. In International Conference on Artificial Neural Networks, pages 126-129, 2003.
[58] M. A. Turk and P. Pentland. Face recognition using eigenfaces. In IEEE Conference on Computer Vision and Pattern Recognition, pages 586-591, 1991.
[59] M. A. Turk and P. Pentland. Face recognition using eigenfaces. In IEEE Conference on Computer Vision and Pattern Recognition, pages 586-591, 1991.
[60] U. Uludag and A. Jain. Attacks on biometric systems: A case study in fingerprints. In Proc. SPIE-EI 2004, Security, Steganography and Watermarking of Multimedia Contents, pages 622-633, 2004.
[61] U. Uludag, S. Pankanti, S. Prabhakar, and A. K. Jain. Biometric cryptosystems: Issues and challenges. In Proc. of IEEE, Special Issue on Enabling Security Technologies for Digital Rights Management, volume 92, pages 948-960, 2004.
[62] B. D. W. Yambor and R. Beveridge. Analyzing PCA-based face recognition algorithms: Eigenvector selection and distance measures. In World Scientific Press, 2002.
[63] J. L. Wang, X. Wang, D. Shasha, and K. Zhang. MetricMap: An embedding technique for processing distance-based queries in metric spaces. IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, 35(5):973-987, 2005.
[64] P. Wang, Q. Ji, and J. Wayman. Modeling and predicting face recognition system performance based on analysis of similarity scores. IEEE Transactions on Pattern Analysis and Machine Intelligence, 29(4):665-670, 2007.
[65] R. Wang and B. Bhanu. Learning models for predicting recognition performance. In IEEE International Conference on Computer Vision (ICCV), pages 1613-1618, 2005.
[66] K. Weinberger and L. Saul. Unsupervised learning of image manifolds by semidefinite programming. In IEEE Conference on Computer Vision and Pattern Recognition, volume 2, pages 988-995, 2004.
[67] L. Wiskott, J. Fellous, N. Kruger, and C. Malsburg. Face recognition by elastic bunch graph matching. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2(7):1160-1169, 1997.
[68] E. Xing, A. Ng, M. Jordan, and S. Russell. Distance metric learning with application to clustering with side information. In Advances in Neural Information Processing Systems, volume 16, pages 505-512, 2003.
[69] Z. Zhang. Learning metrics via discriminant kernels and multidimensional scaling: Towards expected Euclidean representation. In Proc. of 20th International Conference on Machine Learning, pages 872-879, 2003.
[70] W. Zhao, R. Chellappa, A. Rosenfeld, and P. J. Phillips. Face recognition: A literature survey. ACM Computing Surveys (CSUR), 35(4):399-458, 2003.

ABOUT THE AUTHOR

Pranab Mohanty received the M.S. degree in Mathematics from Utkal University, Orissa, India, in 1997 and the M.S. degree in computer science from the Indian Statistical Institute, Calcutta, India, in 2000. He is currently a PhD candidate at the University of South Florida. His research interests include biometrics, image and video processing, computer vision, and pattern recognition.
Catalog record (MARC21)

Control number: 001935123
Identifier: E14SFE0002298
OCLC number: (OCoLC)225863952
Cataloging source: FHM
Call number: TK7885 (ONLINE)
Author: Mohanty, Pranab.
Title: Learning from biometric distances: performance and security related issues in face recognition systems [electronic resource] / by Pranab Mohanty.
Publication: [Tampa, Fla.]: University of South Florida, 2007.

Abstract: We present a theory for constructing linear, black box approximations to face recognition algorithms and empirically demonstrate that a surprisingly diverse set of face recognition approaches can be approximated well using a linear model. The construction of the linear model to a face recognition algorithm involves embedding of a training set of face images constrained by the distances between them, as computed by the face recognition algorithm being approximated. We accomplish this embedding by iterative majorization, initialized by classical multidimensional scaling (MDS). We empirically demonstrate the adequacy of the linear model using six face recognition algorithms, spanning both template based and feature based approaches on standard face recognition benchmarks such as the Facial Recognition Technology (FERET) and Face Recognition Grand Challenge (FRGC) data sets. The experimental results show that the average Error in Modeling for six algorithms is 6.3% at 0.001 False Acceptance Rate (FAR), for FERET fafb probe set which contains maximum number of subjects among all the probe sets. We demonstrate the usefulness of the linear model for algorithm dependent indexing of face databases and find that it results in more than 20 times reduction in face comparisons for Bayesian Intra/Extraclass person classifier (BAY), Elastic Bunch Graph Matching algorithm (EBGM), and the commercial face recognition algorithms. We also propose a novel paradigm to reconstruct face templates from match scores using the linear model and use the reconstructed templates to explore the security breach in a face recognition system. We evaluate the proposed template reconstruction scheme using three, fundamentally different, face recognition algorithms: Principal Component Analysis (PCA), Bayesian Intra/Extraclass person classifier (BAY), and a feature based commercial algorithm. With an operational point set at 1% False Acceptance Rate (FAR) and 99% True Acceptance Rate (TAR) for 1196 enrollments (FERET gallery), we show that at most 600 attempts (score computations) are required to achieve 73%, 72% and 100% chance of breaking in as a randomly chosen target subject for the commercial, BAY and PCA based face recognition system, respectively. We also show that the proposed reconstruction scheme has 47% more probability of breaking in as a randomly chosen target subject for the commercial system as compared to a hill climbing approach with the same number of attempts.

Thesis note: Dissertation (Ph.D.), University of South Florida, 2007.
Bibliography: Includes bibliographical references.
Format: Text (Electronic dissertation) in PDF format.
System requirements: World Wide Web browser and PDF reader. Mode of access: World Wide Web.
General note: Title from PDF of title page. Document formatted into pages; contains 97 pages. Includes vita.
Advisor: Sudeep Sarkar, Ph.D.
Keywords: Security and privacy issues in biometrics. Modeling face recognition algorithms. Face template reconstruction. Multidimensional scaling. Distance based embedding. Hill climbing approach. Hacking face recognition systems. Face template indexing.
Subject: Dissertations, Academic; USF; Computer Science and Engineering; Doctoral.
Host item: USF Electronic Theses and Dissertations.
URL: http://digital.lib.usf.edu/?e14.2298