MARCXML record (http://www.loc.gov/MARC21/slim):

LDR      nam 2200409Ka 4500
001      002069508
005      20100422142326.0
007      cr mnuuuuuu
008      100422s2009 flu s 000 0 eng d
024 8    $a E14-SFE0003288
035      $a (OCoLC)608534882
040      $a FHM $c FHM
049      $a FHMM
090      $a TK7885 (Online)
100 1    $a Manohar, Vasant.
245 0    $a Facial skin motion properties from video : $b modeling and applications $h [electronic resource] / $c by Vasant Manohar.
260      $a [Tampa, Fla.] : $b University of South Florida, $c 2009.
500      $a Title from PDF of title page.
500      $a Document formatted into pages; contains 101 pages.
500      $a Includes vita.
502      $a Dissertation (Ph.D.)--University of South Florida, 2009.
504      $a Includes bibliographical references.
516      $a Text (Electronic dissertation) in PDF format.
520      $a
ABSTRACT: Deformable modeling of facial soft tissues has found use in application domains such as human-machine interaction for facial expression recognition. More recently, such modeling techniques have been used for tasks like age estimation and person identification. This dissertation focuses on the development of novel image analysis algorithms that follow facial strain patterns observed through video recordings of faces in expression. Specifically, we use the strain pattern extracted from nonrigid facial motion as a simplified yet adequate way to characterize the underlying material properties of facial soft tissues. Such an approach has several unique features. The strain pattern, rather than image intensity, is used as the classification feature. Strain is related to biomechanical properties of facial tissues that are distinct for each individual. The strain pattern is less sensitive to illumination differences (between enrolled and query sequences) and face camouflage, because the strain pattern of a face remains stable as long as reliable facial deformations are captured. A finite element modeling based method enforces regularization, which mitigates issues (such as temporal matching and noise sensitivity) related to automatic motion estimation; the computational strategy is therefore accurate and robust. Images or videos of facial deformations are acquired with a video camera and without special imaging equipment. Experiments using range images on a dataset of 50 subjects provide the necessary proof of concept that strain maps indeed have discriminative value. On a video dataset containing 60 subjects undergoing a particular facial expression, experimental results using the computational strategy presented in this work demonstrate the discriminatory and stability properties of strain maps across adverse data conditions (shadow lighting and face camouflage). Such properties make the strain map a promising feature for image analysis tasks that can benefit from auxiliary information about the human face. Strain maps add a new dimension to our ability to characterize a human face. They also foster new ways to capture facial dynamics from video which, if exploited efficiently, can lead to improved performance in tasks involving the human face. In a subsequent effort, we model the material constants (Young's modulus) of the skin in subregions of the face from the motion observed in multiple facial expressions. On a public database of 40 subjects undergoing a set of facial motions, we present an expression-invariant strategy for matching faces using the Young's modulus of the skin. Such an efficient way of describing underlying material properties from the displacements observed in video has an important application in deformable modeling of physical objects, which is usually gauged by simplicity and adequacy. The contributions of this work will have an impact on the broader vision community because of its highly novel approaches to the longstanding problem of motion analysis of elastic objects. In addition, its value lies in its cross-disciplinary nature and its focus on applying image analysis algorithms to the difficult and important problem of characterizing the material properties of facial soft tissues and their applications. We believe this research provides a special opportunity to use video processing to enhance our ability to make unique discoveries through the facial dynamics inherent in video.
538      $a Mode of access: World Wide Web.
538      $a System requirements: World Wide Web browser and PDF reader.
590      $a Co-advisor: Dmitry B. Goldgof, Ph.D.
590      $a Co-advisor: Sudeep Sarkar, Ph.D.
653      $a Face
653      $a Deformable modeling
653      $a Strain pattern
653      $a Finite element method
653      $a Person identification
690      $a Dissertations, Academic $z USF $x Computer Science and Engineering $x Doctoral.
773      $t USF Electronic Theses and Dissertations.
856 4    $u http://digital.lib.usf.edu/?e14.3288
Facial Skin Motion Properties from Video: Modeling and Applications

by

Vasant Manohar

A dissertation submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy
Department of Computer Science and Engineering
College of Engineering
University of South Florida

Co-Major Professor: Dmitry B. Goldgof, Ph.D.
Co-Major Professor: Sudeep Sarkar, Ph.D.
Rangachar Kasturi, Ph.D.
Tapas K. Das, Ph.D.
Thomas A. Sanocki, Ph.D.

Date of Approval: October 29, 2009

Keywords: Face, Deformable Modeling, Strain Pattern, Finite Element Method, Person Identification

Copyright 2009, Vasant Manohar
DEDICATION

To Amma, Appa, Anna, and Manni.
ACKNOWLEDGEMENTS

I would like to express my sincere thanks to Prof. Dmitry Goldgof and Prof. Sudeep Sarkar for giving me an opportunity to work on this topic and for their adept guidance and encouragement through the course of my doctorate. I am grateful to Prof. Rangachar Kasturi for his support during my stay at USF and for providing me with numerous prospects to flourish as a researcher. I am deeply indebted to Prof. Tapas Das and Prof. Thomas Sanocki for being on my committee and providing an orthogonal direction to my research through their vast expertise in the areas of statistical analysis and human cognition. It was a pleasure and an honor to work with my committee. Special thanks are due to Dr. Yong Zhang and Ms. Sangeeta Kundu for their insightful discussions on understanding the concept of using material properties for object recognition. I am also thankful to Dr. Padmanabhan Soundararajan, Dr. Pranab Mohanty, and Dr. Himanshu Vajaria for their helpful and willing interactions during my stay at USF. I would like to thank Mr. Daniel Prieto and Mr. Peter Schiavo (CSE technical support team) for their quick responses to all my computer and software needs. Finally, I would like to extend my deepest gratitude to my family and friends for always being there in perplexing situations and adding color to life through their love and affection.
TABLE OF CONTENTS

LIST OF TABLES  iv
LIST OF FIGURES  v
ABSTRACT  vii

CHAPTER 1  INTRODUCTION  1
  1.1  Motivation  2
  1.2  Problem Description  3
  1.3  Presented Approach  3
  1.4  Supportive Evidence  5
  1.5  Contributions of the Dissertation  7
    1.5.1  Main Thrust Areas  7
      1.5.1.1  Computer Vision and Deformable Object Analysis  7
      1.5.1.2  Biometrics  8
    1.5.2  Other Thrust Areas  8
      1.5.2.1  Perceptual Psychology  8
      1.5.2.2  Computer Graphics  9
  1.6  Organization of the Dissertation  9

CHAPTER 2  LITERATURE REVIEW  10
  2.1  Deformable Modeling  10
    2.1.1  Non-physical Modeling  11
      2.1.1.1  Spline Techniques  11
      2.1.1.2  Free-form Deformation  12
    2.1.2  Physical Modeling  13
      2.1.2.1  Mass-Spring-Damper Systems  14
      2.1.2.2  Finite Difference Method  15
      2.1.2.3  Boundary Element Method  16
      2.1.2.4  Finite Element Method  17
    2.1.3  Face Modeling and Biomechanics  19
  2.2  Person Identification  21
    2.2.1  Image-Based Algorithms  22
      2.2.1.1  2D Algorithms  22
      2.2.1.2  3D Algorithms  23
    2.2.2  Video-Based Algorithms  25
      2.2.2.1  Algorithms Using Spatio-Temporal Representation  26
      2.2.2.2  Facial Dynamics for Identity Recognition  27

CHAPTER 3  THEORETICAL BACKGROUND  29
  3.1  Theory of Elasticity  29
    3.1.1  Stress and Strain  30
  3.2  Motion Analysis  32
    3.2.1  Optical Flow  32
    3.2.2  Flow Computation  33
    3.2.3  Handling Failure Cases  35
  3.3  Subspace Learning for Face Recognition  37
    3.3.1  Principal Component Analysis  37
      3.3.1.1  Preprocessing  37
      3.3.1.2  Training  38
      3.3.1.3  Testing  38
      3.3.1.4  Analysis  39

CHAPTER 4  THE ALGORITHM AND COMPUTATIONAL METHODS  42
  4.1  Computation of Motion Fields  43
  4.2  Strain Computation Using the Finite Difference Method  45
  4.3  Strain Computation Using the Finite Element Method  47
  4.4  Proof of Concept Using Range Images  51
    4.4.1  Computing Strain with Range Data  52
    4.4.2  PCA Results and Analysis  54

CHAPTER 5  RESULTS ON PERSON IDENTIFICATION  56
  5.1  The Finite Difference Method  56
    5.1.1  Video Acquisition Conditions  56
    5.1.2  Strain Images  56
    5.1.3  Discriminatory and Stability Properties  57
    5.1.4  Experiments and Results  58
      5.1.4.1  Test 1  59
      5.1.4.2  Test 2.1 and Test 2.2  59
  5.2  The Finite Element Method  61
    5.2.1  Video Acquisition Conditions  61
    5.2.2  Strain Images  62
    5.2.3  Discriminatory and Stability Properties  63
    5.2.4  Experiments and Results  63
      5.2.4.1  Test 1  64
      5.2.4.2  Test 2  65
      5.2.4.3  Test 3  66
CHAPTER 6  TOWARDS EXPRESSION INVARIANT MODELING  68
  6.1  Algorithm for Modeling Young's Modulus from Multiple Expressions  69
    6.1.1  Automatic Landmark Detection  70
    6.1.2  The Face Model  70
  6.2  Search Algorithm for Computing Material Constants  72
    6.2.1  Genetic Algorithm  73
      6.2.1.1  Genetic Coding  74
      6.2.1.2  Genetic Operators  74
    6.2.2  Training  76
  6.3  Combination Rules in a Multi-Feature Classification System  77
  6.4  Experiments and Results  80
    6.4.1  BU-4DFE Dataset  80
    6.4.2  Nonrigid Motion Tracking  81
    6.4.3  Expression Invariant Matching  83

CHAPTER 7  DISCUSSION AND CONCLUSIONS  87
  7.1  Impact on Image Analysis Applications  88
    7.1.1  Expression Spotting in Video  88
    7.1.2  Micro-Expression Detection  89
    7.1.3  Age Estimation  89
  7.2  Ideas for Future Work  90

REFERENCES  92

ABOUT THE AUTHOR  End Page
LIST OF TABLES

Table 4.1  Range and intensity images for each subject.  52
Table 4.2  Experiment design for person identification using strain maps computed from range data.  54
Table 5.1  Experiment design for person identification using strain maps computed from the finite difference method.  59
Table 5.2  Experiment design for person identification using strain maps computed from the finite element method.  64
Table 6.1  Parameter settings used in the genetic algorithm.  76
Table 6.2  Performance evaluation of model-based tracking and interpolation-based tracking.  82
Table 6.3  Experiment design for matching based on the Young's moduli modeled from multiple expressions.  84
Table 6.4  Comparing the performance of max-rule and min-rule decision principles for the expression invariant matching experiment at 5% false acceptance rate.  84
LIST OF FIGURES

Figure 3.1  Flow field produced by a typical optical flow algorithm on an image pair.  33
Figure 3.2  The preprocessing step in Principal Component Analysis (PCA).  38
Figure 4.1  System flow of the presented approach for person identification using strain patterns.  43
Figure 4.2  Dense motion field generated by the optical flow method using two video frames between which the subject opened his mouth slightly.  45
Figure 4.3  Sparse motion field generated by the optical flow method using two video frames between which the subject opened his mouth slightly.  45
Figure 4.4  Strain magnitude images computed from the optical flow data using the finite difference method.  47
Figure 4.5  2D finite element face model.  48
Figure 4.6  Strain magnitude images computed from the optical flow data using the finite element method.  51
Figure 4.7  Samples of range and 2D intensity images.  52
Figure 4.8  Samples of strain pattern calculated using range data, each for a different subject.  53
Figure 4.9  Intra-subject and inter-subject variation of strain maps computed from range data.  54
Figure 4.10  ROC curves of the PCA experiment using strain maps computed with range data.  55
Figure 5.1  Video acquisition conditions.  56
Figure 5.2  Strain images computed from the finite difference method.  57
Figure 5.3  Samples of facial strain pattern (computed from the finite difference method) that illustrate inter-subject variability.  58
Figure 5.4  Samples of facial strain pattern (computed from the finite difference method) that illustrate intra-subject consistency under different conditions.  58
Figure 5.5  Intra-subject and inter-subject variation of strain maps (computed from the finite difference method) in Test 1 (regular face: normal vs. shadow lighting).  60
Figure 5.6  ROC curve of Test 1 (regular face: normal vs. shadow lighting).  60
Figure 5.7  Intra-subject and inter-subject variation of strain maps (computed from the finite difference method) in Test 2.1 (regular face normal vs. regular face shadow lighting) and Test 2.2 (regular face normal vs. camouflaged face shadow lighting).  61
Figure 5.8  Video acquisition conditions.  62
Figure 5.9  Strain images computed from the finite element method.  62
Figure 5.10  Strain maps of the same individual under different lightings.  63
Figure 5.11  Strain maps of different individuals under normal lighting.  64
Figure 5.12  Intra-subject and inter-subject variation of strain maps (regular face: normal vs. low lighting).  65
Figure 5.13  Intra-subject and inter-subject variation of strain maps (regular face: normal vs. shadow lighting).  65
Figure 5.14  Intra-subject and inter-subject variation of strain maps (normal lighting: regular vs. camouflaged faces).  66
Figure 6.1  Results from the Viola-Jones landmark detection algorithm.  71
Figure 6.2  FE face model with finer subdivisions.  71
Figure 6.3  Plot of the objective function for different values of relative stiffness for Area 2 and Area 3 shown in Figure 6.2.  72
Figure 6.4  Genetic coding of the Young's modulus in a finite element model.  75
Figure 6.5  Sample expression model sequences (male and female).  81
Figure 6.6  Intra-subject and inter-subject variation of Young's modulus using the max-rule decision principle.  85
FACIAL SKIN MOTION PROPERTIES FROM VIDEO: MODELING AND APPLICATIONS

Vasant Manohar

ABSTRACT

Deformable modeling of facial soft tissues has found use in application domains such as human-machine interaction for facial expression recognition. More recently, such modeling techniques have been used for tasks like age estimation and person identification. This dissertation focuses on the development of novel image analysis algorithms that follow facial strain patterns observed through video recordings of faces in expression. Specifically, we use the strain pattern extracted from nonrigid facial motion as a simplified yet adequate way to characterize the underlying material properties of facial soft tissues. Such an approach has several unique features. The strain pattern, rather than image intensity, is used as the classification feature. Strain is related to biomechanical properties of facial tissues that are distinct for each individual. The strain pattern is less sensitive to illumination differences (between enrolled and query sequences) and face camouflage, because the strain pattern of a face remains stable as long as reliable facial deformations are captured. A finite element modeling based method enforces regularization, which mitigates issues (such as temporal matching and noise sensitivity) related to automatic motion estimation. Therefore, the computational strategy is accurate and
robust. Images or videos of facial deformations are acquired with a video camera and without special imaging equipment. Experiments using range images on a dataset of 50 subjects provide the necessary proof of concept that strain maps indeed have discriminative value. On a video dataset containing 60 subjects undergoing a particular facial expression, experimental results using the computational strategy presented in this work demonstrate the discriminatory and stability properties of strain maps across adverse data conditions (shadow lighting and face camouflage). Such properties make the strain map a promising feature for image analysis tasks that can benefit from auxiliary information about the human face. Strain maps add a new dimension to our ability to characterize a human face. They also foster new ways to capture facial dynamics from video which, if exploited efficiently, can lead to improved performance in tasks involving the human face. In a subsequent effort, we model the material constants (Young's modulus) of the skin in subregions of the face from the motion observed in multiple facial expressions. On a public database of 40 subjects undergoing a set of facial motions, we present an expression-invariant strategy for matching faces using the Young's modulus of the skin. Such an efficient way of describing underlying material properties from the displacements observed in video has an important application in deformable modeling of physical objects, which is usually gauged by simplicity and adequacy. The contributions of this work will have an impact on the broader vision community because of its highly novel approaches to the longstanding problem of motion analysis of elastic objects. In addition, its value lies in its cross-disciplinary nature and its focus on applying image analysis algorithms to the difficult and important problem of characterizing the material properties of facial soft tissues and their applications. We believe this research provides a special opportunity to use video processing to enhance our ability to make unique discoveries through the facial dynamics inherent in video.
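The subspace matching referred to in the abstract (PCA on strain maps, detailed later in Sections 3.3 and 4.4.2) amounts to the standard eigenface recipe applied to vectorized strain maps rather than intensity images. The following is a minimal illustrative sketch with synthetic data; the function names and dimensions are invented here and this is not the dissertation's code:

```python
import numpy as np

def pca_train(gallery, k):
    """Fit a k-dimensional PCA subspace to row-vectorized strain maps."""
    mean = gallery.mean(axis=0)
    centered = gallery - mean
    # SVD of the centered data; rows of vt are the principal axes
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return mean, vt[:k]

def pca_match(probe, gallery, mean, axes):
    """Index of the gallery strain map nearest to the probe in the subspace."""
    g = (gallery - mean) @ axes.T
    p = (probe - mean) @ axes.T
    return int(np.argmin(np.linalg.norm(g - p, axis=1)))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    gallery = rng.normal(size=(5, 64))               # 5 enrolled strain maps
    probe = gallery[3] + 0.01 * rng.normal(size=64)  # noisy query of subject 3
    mean, axes = pca_train(gallery, k=4)
    print(pca_match(probe, gallery, mean, axes))
```

In the dissertation's experiments, genuine and impostor distances from such matching are what feed the reported ROC curves.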
CHAPTER 1
INTRODUCTION

The human face has been the subject of tremendous scrutiny in the fields of human cognition, computer vision, image processing, and computer graphics, and is perhaps the most extensively researched facet of our body. This is hardly surprising given the importance of its function in social interactions [50]. The face is the primary medium for conveying identity, cognitive state, and emotive intent, and for disseminating affective responses [63]. Although humans use language as the main channel for conveying facts and schemes, it does not match the face's capacity for communicating emotions. The study of how facial expressions transmit affective information was pioneered in behavioral psychology by Ekman and his associates [35]. From these studies it was discovered that facial expressions are uniquely human, that basic emotional expressions are consistent in exhibition and interpretation across all demographics, and that emotional messages are constructed by the actions of certain facial muscles. Due to its complex dynamic nature, the study of the face from a computational perspective has resulted in increasingly sophisticated tools for nonrigid object modeling and tracking, object parameterization and recognition, occlusion handling, and extracting invariant features in real-world settings. Automated facial analysis has been motivated by some very desirable applications, which consider both its static nature, such as identity recognition, and its dynamic nature, such as expression recognition, visual speech recognition, and the realistic animation of virtual humans. This analysis has also paved the way for new insights and tools in medical applications, behavioral science, security, education, and human-computer interaction.
1.1 Motivation

Much of the above-mentioned analysis has been conducted on static 2D or 3D images or short 2D image sequences. However, there has been very little work investigating facial dynamics in video-rate data. With respect to the human face, one important use for this is to analyze the ways in which individuals can, or are able to, deform their face while performing an expression or speech. We can use this analysis to explore the question of whether it is possible to characterize an individual based on their facial motion [52, 64]. For example, how can we quantify the similarity between two people's smiles? Which expressions best discriminate individuals? Is it possible to build a prototypical model of how people smile? How are individual differences reflected as a deviation from this model? To the best of our knowledge, these questions have received very little attention in 2D image analysis. From an identification system's point of view, the concept of recognizing a person based on facial motion is attractive; since facial movements comprise a complex sequence of muscle activations, it is fairly difficult to imitate another person's facial expressions, and these facial motion characteristics are discriminative to an individual [121]. Furthermore, the use of facial motion is in certain respects more robust to fraudulent attacks than current static face recognition algorithms, which fail if presented with a physical model of a person's face. Facial dynamics are also independent of lighting, pose, and appearance changes (such as a face covered with sand/dust, wearing makeup, etc.), which are apparent in real-world environments. In experimental psychology, the precise role of facial motion in determining identity is still largely unknown, and is actively pursued [52, 96]. By exploring this notion using computer vision techniques, we will be able to evaluate the strength of the dynamic cue in identity recognition.
1.2 Problem Description

The central goal of this dissertation is to validate the following hypothesis: Using a video-rate image procuring system capturing a person's face undergoing some set of facial motions, it is possible to extract motion properties of the facial skin which possess distinct individual characteristics. Furthermore, these properties will be more robust and stable than previously attempted facial motion analysis systems, which have only considered motion from two image `snapshots' [121, 89], rather than deriving subtle properties in the spatio-temporal dimension.

We extract these skin motion properties from a public facial motion database and a more challenging database collected as part of this research. This comprises at least 60 individuals, each performing a number of facial expressions under different lighting conditions. A few subjects have data collected with face camouflage. The motivation for collecting motion data under camouflage is to investigate robustness and stability even under such challenging yet realistic conditions, something which has not been explored in the computer vision literature. The number of subjects in our database has been chosen as a trade-off between the human labor involved in data collection and being sufficiently large to represent the characteristics of facial motion and to facilitate empirical evaluation. The methods we employ to extract individual motion signatures include deformable surface modeling for explaining and tracking face scans over time, and statistical techniques to uncover a low-dimensional motion manifold embedded in the high-dimensional spatio-temporal space. Further details of our research are presented in the rest of this manuscript.
1.3 Presented Approach

Deformable modeling of facial soft tissues has found use in application domains such as human-machine interaction for facial expression recognition [36]. More recently,
such modeling techniques have been used for tasks like age estimation [66] and person identification [121, 89, 78]. Existing modeling approaches can be divided into two major groups. Models based on solving continuum mechanics problems under consideration of material properties and other physical constraints are called physical models. All other modeling techniques, even if they are related to mathematical physics, are known as non-physical models. A comprehensive review of deformable modeling techniques and their applications in computer vision, graphics, and medical imaging can be found in [41] and [82], respectively. Though physical modeling, i.e., numerically solving the partial differential equations (PDEs) of elasticity theory, provides a highly accurate and robust solution strategy, the major problem with such approaches is that the observed physical phenomena can be very complex, and solving the underlying PDEs requires substantial computational cost. The remedies thus lie in finding an adequate simplified model of the given problem covering the essential observations, and in applying efficient numerical techniques for solving the PDEs. In this work, we use the strain pattern extracted from non-rigid facial motion as a simplified and adequate way to characterize the underlying material properties of facial soft tissues. The presented method has several unique features:

It is the strain pattern, instead of the image intensity, that is used as a classification feature.

Strain is related to the biomechanical properties of facial tissues that are distinct for each individual.
The strain pattern is less sensitive to illumination differences (between enrolled and query sequences) and face camouflage, because the strain pattern of a face remains stable as long as reliable facial deformations are captured.

Images or videos of facial deformations can be acquired with a regular video camera; no special imaging equipment is needed.

A subsequent finite element modeling based method enforces regularization, which mitigates issues related to automatic motion estimation (such as noise sensitivity, temporal matching, and motion discontinuity). Therefore, the computational strategy is accurate and robust.

Besides providing an identity signature for an individual, the facial strain pattern of a face in expression also reveals the facial dynamics of a person. By facial dynamics, we refer to the non-rigid movement of facial features, in addition to the rigid movement of the head [48]. Recent psychological and neural studies indicate that changing facial expressions and head movements provide additional evidence for face recognition [88]. This hints that facial dynamics, if exploited efficiently, can lead to improved performance in automatic face recognition. Thus, the focus of this research is on the development of novel image analysis algorithms to follow strain patterns observed through video recordings of faces in expression. The objective is to model the soft tissue properties of individual faces at a coarse level that is sufficient for certain image analysis applications. In particular, the task we address is person identification from video.

1.4 Supportive Evidence

Besides the results observed through this research, we support our claim with the following evidence found in the existing literature:
Facial movements during the formation of an expression are the result of a complex sequence of muscle activations which are distinct to an individual in both `style' (i.e., the sequence of muscular activations in performing an expression) and `content' (i.e., the underlying anatomical structure of the individual's face) [121]. This gives a strong indication that facial motion fields can be used to recognize identity.

In psychological studies, it has been observed that humans use facial motion as a cue for identity recognition, and this becomes increasingly important with increased familiarity with that person [48, 96]. There is also evidence to suggest humans can recognise a person based solely on facial motion [52, 64].

Recent studies in using facial motion for identity recognition have shown positive results, although we believe there is room for considerable improvement. Pamudurthy et al. [89] constructed a deformation vector field by corresponding skin pore locations in a neutral and a smiling expression from two high-resolution digital images. They showed that it is possible to achieve superior face recognition performance compared with the popular PCA+LDA static method when a person's appearance has been altered (in this case, by the application of makeup). Zhang et al. [121] also proposed a method for recognising individuals based on the biomechanical characteristics of facial tissue. They manually compute an elastic strain pattern at the region between the cheek bone and jaw line, covering the masseter muscle. The results were positive: a verification rate of 67.4% at a false alarm rate of 5% using a small data set.

One way in which these results could be improved is to consider dense facial motion over time, rather than motion computed using only the neutral and expression apex states, as is done in these studies. We believe that this will increase
the reproducibility of the motion signature. Intuitively, it is difficult to have a person hold exactly the same facial expression as was recorded in the recognition database. Indeed, this variability may be greater than the variability between individuals' expressions, thus severely limiting recognition accuracy. Another limitation of existing techniques is that the expression has to be identical between the enrolled and query sequences. We overcome this limitation by modeling the material constants from multiple expressions and using those properties in the recognition process (see Chapter 6 for more details).

1.5 Contributions of the Dissertation

The central contribution of this research to the community is to develop algorithms for extracting, analyzing, and characterizing the motion of complex deformable surfaces, guided by the specific application of identifying individuals based on their facial motion. This work will be important in a number of applications and future research directions. A selection of these is listed below.

1.5.1 Main Thrust Areas

1.5.1.1 Computer Vision and Deformable Object Analysis

Many aspects of the research conducted as part of this dissertation will generalize beyond facial motion analysis and may be used as tools for analyzing the deformation of arbitrary objects over time. The mathematical techniques developed in our research for modeling motion and analyzing surface deformation will find many applications. These may include studying the deformation of objects during impact or load bearing, clinical applications such as automatically assessing the range of mobility of regions of the body after injury, surgery, and rehabilitation, and the automatic analysis of growth, which finds uses in cancerous growth assessment and treatment.
1.5.1.2 Biometrics

At present, biometrics based on the human face use features derived solely from its static appearance or shape. Although face recognition is now seen to perform very reliably in constrained environments, the lack of dynamic information makes it vulnerable to fraudulent attacks using synthetic models. Furthermore, since facial dynamics are independent of texture, they are robust to changes in appearance brought about by lighting variation, surface reflectance, and facial texture (for example, when wearing makeup). Static face recognition also generally fails to distinguish identical twins, yet work on facial dynamics has indicated that discriminatory characteristics can be found using motion cues [89]. It has also been postulated that facial motion remains more invariant to age than appearance [121]. In summary, the use of facial motion signatures in addition to static signatures will provide an increased layer of security and robustness in recognising identity.

1.5.2 Other Thrust Areas

1.5.2.1 Perceptual Psychology

Facial motion provides a human observer with not only a wide variety of social information but also a potentially unique source of identity information. At present, there are two competing theories in perceptual psychology attempting to explain why facial motion assists in human face recognition, which disagree on the precise role of facial motion [88, 97]. One states that motion facilitates the construction of a more complete structural model of the face, thus enhancing the quality of the perceptual representation of a face. The second states that we use facial dynamics as a separate cue learned from the idiosyncratic patterns of motion, which is complementary to structural information. In this work, we evaluate the extent to which facial motion can discriminate individuals, which will be useful as empirical evidence for resolving the competing theories.
Furthermore, the motion models we will develop may be used as important tools in experimental psychology. For example, extracted motion signatures can be manipulated to form complementary or inconsistent identity, gender, or age cues. They can also be combined to investigate whether complex expressions may be perceived by combining more primitive motion models associated with the basic expressions.

1.5.2.2 Computer Graphics

Much of the work in facial dynamics analysis was pioneered by research in computer graphics, where the goal is to create realistic virtual, emotive faces. We expect that facial motion signatures may be combined in novel ways to animate a face in ways which were not observed when the motion data was captured.

1.6 Organization of the Dissertation

The structure of the present manuscript is as follows. We start with a review of the existing approaches for modeling of deformable objects (with an emphasis on face modeling) and for person identification (Chapter 2). In Chapter 3, we present an overview of the essential background knowledge in the theory of elasticity, motion analysis, and dimensionality reduction in face recognition. Chapter 4 describes our algorithm for computing the strain pattern from a motion field and presents a proof of concept that the strain pattern has the desired discriminative value through our experiments using range data. Chapter 5 reports our results on person identification using strain maps on a moderately sized dataset with challenging data conditions. In Chapter 6, we describe our method for modeling material constants of facial regions from multiple facial expressions and present results on its application to nonrigid motion tracking and expression-invariant person identification. Finally, we conclude the thesis and provide an outlook for future research (Chapter 7).
CHAPTER 2

LITERATURE REVIEW

2.1 Deformable Modeling

Deformable modeling of physical objects has a long history. Since computers became an indispensable tool in modeling, sophisticated simulation of complex physical scenes has become a major, lasting trend in computer graphics and in many other applications dealing with the computer-assisted modeling of physical reality. The simulation of deformable objects is essential for many applications. Historically, deformable models appeared in computer graphics and were used to create and edit complex curves, surfaces, and solids. Computer-aided design uses deformable models to simulate the deformation of industrial materials and tissues. In image analysis, deformable models are used for fitting curved surfaces, boundary smoothing, registration, and image segmentation. Later, deformable models were used in character animation and computer graphics for the realistic simulation of skin, clothing, and human or animal characters [60, 81, 87, 41]. The modeling of deformable soft tissue is, in particular, of great interest for a wide range of medical imaging applications where realistic interaction with virtual objects is required. In particular, computer-assisted surgery (CAS) applications demand physically realistic modeling of complex tissue biomechanics. Generally, existing modeling approaches can be divided into two major groups. Models based on solving continuum mechanics problems under consideration of material properties and other environmental constraints are called physical models. All other modeling techniques, even if they are somehow related to mathematical physics, are
known as nonphysical models. A comprehensive review of deformable modeling for computer vision, graphics, and medical applications can be found in [82].

2.1.1 Nonphysical Modeling

Nonphysical methods for modeling of deformable objects are usually based on purely heuristic geometric techniques, or they use a form of simplified physical principles to achieve a lifelike effect. These techniques are very popular in computer graphics and are sometimes used in real-time applications, since they are computationally efficient in comparison with expensive physical approaches.

2.1.1.1 Spline Techniques

Many early approaches for modeling deformable objects were developed in the field of computer-aided geometric design (CAGD), where flexible tools for the creation of interpolating curves and surfaces, as well as intuitive ways to modify and refine these objects, were needed. From this need came Bezier curves and subsequently many other methods of compact description of warped curves and surfaces by a small vector of numbers, including B-splines, nonuniform rational B-splines (NURBS), and other types of spline techniques. The spline technique is based on the representation of both planar and 3D curves and surfaces by a set of control points, also called landmarks. The main idea of spline-based methods is to modify the shape of complex objects by varying the positions of a few control points. The number of landmarks, as well as their weights, can also be used to adjust the object deformation. Such a parameter-based object representation is computationally efficient and supports interactive modification. A comprehensive introduction to curve and surface modeling with splines can be found in [10]. A particular group of landmark-based techniques comprises methods which are used in elastic image registration and based on radial basis functions derived from some
special closed-form solutions of elasticity theory. In [22], a spline technique based on the radial basis function r^2 log(r), derived from the linear elastic solution of the thin-plate deformation problem, is proposed. Such thin-plate splines (TPS), globally defined in the image domain, are used for interpolation of the deformation given by the prescribed displacements of control points. Extended TPS techniques are described in [98]. In [34], an analogous landmark-based approach is proposed, where an elastic body spline (EBS), derived from a special solution of 3D elasticity, is used as the interpolating radial basis function.

2.1.1.2 Free-Form Deformation

Free-form deformation (FFD) became popular in computer-assisted geometric design and animation in the last two decades. The main idea of FFD is to deform the shape of an object by deforming the space in which it is embedded. In early work [9], a general method based on geometric mappings of 3D space was proposed. This deformation technique uses a set of hierarchical transformations for deforming an object, including rigid motion, stretching, bending, twisting, and other operators. The elementary space warpings are obtained by using the surface normal vector of the undeformed surface and a transformation matrix to calculate the normal vector of an arbitrarily deformed smooth surface. Complex objects can be created from simpler ones, since the deformations are easily combined in a hierarchical structure. The position and normal vectors in more complex objects are calculated from the position and normal vectors in simpler objects. Each level in the deformation hierarchy requires an additional matrix multiplication for the normal vector calculation.
The term free-form deformation was introduced in a later work [102], where a more generalized approach, based on embedding an object in a grid of mesh points of some standard geometry such as a cube or cylinder, was proposed. The basic FFD method has been extended by several others [32, 27].
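As a concrete sketch of the landmark-based spline techniques discussed above, the following fits a 2D thin-plate spline through prescribed control-point values using the standard r^2 log(r) kernel. Function names and the toy data are illustrative; this is not the implementation of the cited works.

```python
import numpy as np

def tps_kernel(r):
    # TPS radial basis U(r) = r^2 log(r), with U(0) = 0 by convention
    with np.errstate(divide="ignore", invalid="ignore"):
        return np.where(r > 0, r**2 * np.log(r), 0.0)

def tps_fit(ctrl, vals):
    """Fit TPS weights so the interpolant matches vals at the control points.

    ctrl: (n, 2) control-point coordinates; vals: (n,) prescribed values
    (e.g., one displacement component). Solves the standard augmented
    linear system [K P; P^T 0] [w; a] = [vals; 0].
    """
    n = len(ctrl)
    r = np.linalg.norm(ctrl[:, None, :] - ctrl[None, :, :], axis=-1)
    K = tps_kernel(r)
    P = np.hstack([np.ones((n, 1)), ctrl])      # affine part: [1, x, y]
    A = np.zeros((n + 3, n + 3))
    A[:n, :n] = K
    A[:n, n:] = P
    A[n:, :n] = P.T
    b = np.concatenate([vals, np.zeros(3)])
    sol = np.linalg.solve(A, b)
    return sol[:n], sol[n:]                     # RBF weights, affine coefficients

def tps_eval(ctrl, w, a, pts):
    """Evaluate the fitted spline at query points pts of shape (m, 2)."""
    r = np.linalg.norm(pts[:, None, :] - ctrl[None, :, :], axis=-1)
    return tps_kernel(r) @ w + a[0] + pts @ a[1:]
```

By construction the interpolant reproduces the prescribed values exactly at the landmarks while remaining smooth (minimal bending energy) in between, which is what makes TPS attractive for interpolating sparse landmark displacements over a whole image domain.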
2.1.2 Physical Modeling

In applications which demand the realistic simulation of deformable physical bodies, there is no alternative to consistent physical modeling, i.e., numerically solving the partial differential equations (PDEs) of elasticity theory. The major problem of physical modeling is that the observed physical phenomena can be very complex, and the solution of the underlying PDEs requires substantial computational expense. Two questions therefore arise: how to find an adequate simplified model of the given problem covering the essential observations, and how to apply efficient numerical techniques for solving the PDEs. The variety of approaches for deformable modeling which have been developed in the past were bound to give their particular answers to these two questions. It is difficult to trace who first proposed a working physical model of deformable living tissue; the list of names and research groups which have contributed to this topic is quite long. The study of the biomechanical properties of living tissues and their numerical modeling was triggered by individual research programs of the automotive, space, and military industries beginning in the 1950s, and was later substantially boosted in the early 1980s with the development of computed tomography [7]. Further physically motivated techniques for elastic registration and segmentation of medical images are in [8, 61]. At the same time, the first fundamental theoretical and experimental investigations of tissue biomechanics appeared. In the last two decades, a plethora of approaches and applications related to biomechanical modeling has been developed. These methods can be classified by different criteria. One such classification is based on the type
of the numerical technique used in the modeling approach. There are four common numerical methods for physically based modeling of deformable objects: mass-spring-damper systems, the finite difference method, the boundary element method, and the finite element method.

2.1.2.1 Mass-Spring-Damper Systems

In the early approaches to soft tissue modeling, an approximation of the mechanical continuum by a mass-spring-damper (MSD) system was used. The physical body is represented by a set of mass points connected by springs, which exert forces on neighboring points when a mass is displaced from its rest position. MSD systems can be seen as a simplified model of particle interaction, since physical bodies in fact consist of discrete subelements, atoms and molecules. The spring forces F_s are usually considered to be linear (Hookean), F_s = k u, where u is the displacement of the mass point and k denotes the spring constant corresponding to the material stiffness. In one of the first works in the field of facial animation [92], a muscle model based on MSD systems, which essentially solves the static system K u = F_ext, is presented. The face is modeled as a two-dimensional mesh of points connected by linear springs. Muscle actions are represented by forces applied to the corresponding region of mesh nodes. This approach was expanded in later work, where a more sophisticated MSD model of muscles was developed. In [113], muscles directly displace nodes within zones of influence, which are parameterized by radius, falloff coefficients, and other parameters. In [107], dynamic mass-spring systems for facial modeling are described. In this approach, a multilayer mesh of mass points representing three anatomically distinct facial tissue layers: the dermis, the subcutaneous fat layer, and the muscle
layer is used. This approach has been extended in [68], where a mesh adaptation algorithm is used that tailors a generic mesh to individual features by locating these features in a laser-scanned image. For improved realism, this formulation also includes constraint forces to prevent muscles and facial nodes from penetrating the skull. In [62], a mass-spring model of facial tissue for soft tissue prediction in craniofacial surgery simulations is proposed. The major drawback of MSD systems is their insufficient approximation of true material properties. Being a very simplified model of the mechanical continuum, particle systems do not provide the required accuracy for the realistic simulation of complex composite materials such as soft tissue. MSD systems are also weak if complex, arbitrarily shaped objects, such as thin surfaces which are resistant to bending, are to be modeled.

2.1.2.2 Finite Difference Method

The finite difference method (FDM) is historically the first true discretization technique for solving partial differential equations. The general approach of the FDM is to replace the continuous derivatives within the given boundary value problem with finite difference approximations on a grid of mesh points that spans the domain of interest. Consequently, the differential operator is approximated by an algebraic operator. The resulting system of equations can then be solved by a variety of standard techniques. A general algorithm for the finite difference discretization of linear boundary value problems is as follows:

- Convert continuous variables to discrete variables.
- Approximate the derivatives at each point using formulae derived from a Taylor series expansion, using the most accurate approximation available that is consistent with the given problem.
- Assemble the linear system of equations with respect to the nodal values.
- Apply boundary conditions at the boundary points separately.
- Solve the resulting set of coupled equations using either direct or iterative schemes, as appropriate for the given problem.

The FDM achieves efficiency and accuracy when the geometry of the problem is regular. The FDM is usually applied on cubic grids, which are naturally given by the pixels or voxels of 2D or 3D digital images, respectively. However, the discretization of objects with irregular geometry becomes extremely dense, which requires extensive computational resources for data storage and system solving. In [101], the FD approach is applied for the linear elastic prediction of facial tissue in craniofacial surgery planning.

2.1.2.3 Boundary Element Method

A general principle of solving the boundary value problem, given by the partial differential equation (PDE) and the boundary conditions, consists in bringing the differential problem into an integral form. For a certain class of problems, the resulting integration over the whole domain of interest can be substituted by integration over the boundary. Consequently, only the boundary of the domain has to be discretized, which in turn means that the dimension of the resulting system of equations is significantly smaller than in the case of total volume discretization, and the difficult problem of volumetric mesh generation becomes redundant. For the differential operator of elasticity theory, such a boundary integral formulation can be obtained. In [14], the boundary element method (BEM) for static and dynamic problems of continuum mechanics is described. Unfortunately, the volume integrals in the BEM can be completely eliminated only if
the material is homogeneous and no volumetric forces are given. This is generally not the case in soft tissue modeling. Furthermore, the system matrix when using the BEM is fully occupied, which makes the application of efficient iterative solving techniques difficult or even impossible. An example of the application of the boundary element method to the modeling of deformable objects is given in [55].

2.1.2.4 Finite Element Method

The finite element method (FEM) has become the state-of-the-art technique in physically based modeling and simulation. The FEM is superior to all previously discussed methods when an accurate solution of continuum mechanics problems with complex geometry has to be found. It also provides the most flexible modeling platform, free of limitations with respect to the material type and the boundary conditions. More accurate physical models treat deformable objects as a mechanical continuum: solid bodies with mass and energies distributed throughout the three-dimensional domain they occupy. Unlike the discrete MSD systems, the FEM is derived directly from the equations of continuum mechanics. In contrast to the FDM, the differential operators are not approximated by simple algebraic expressions, but applied as they are on subspaces of admissible solution fields. The difference from the BEM consists in the volume integration, which enables a more general approach to continuum modeling. In elasticity theory, the deformation of a physical body is described as the equilibrium of external forces and internal stresses. The static equilibrium for an infinitesimal volume is given by partial differential equations, which express the relationship between the deformation variables, such as stresses, strains, or displacements, and the applied force density, and which also contain the constants describing the object's material properties.
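As a point of reference, for a linear isotropic material this static equilibrium can be written in the standard Navier form; this is a textbook statement supplied here for context, not an equation quoted from the dissertation:

```latex
% Static equilibrium of an infinitesimal volume (Navier form):
%   \mathbf{u}        -- displacement field
%   \mathbf{f}        -- applied body force density
%   \lambda, \mu      -- Lame material constants
\mu \,\nabla^2 \mathbf{u}
  + (\lambda + \mu)\,\nabla\left(\nabla \cdot \mathbf{u}\right)
  + \mathbf{f} = \mathbf{0}
```

Here the material constants (lambda, mu) are exactly the kind of object-specific parameters referred to above, and the displacement field u is the deformation variable the numerical methods solve for.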
To compute the object deformation, the PDEs of elasticity theory have to
be integrated over the domain occupied by a body. Since it is usually impossible to find a closed-form analytical solution for an arbitrary domain, numerical methods are used to approximate the object deformation for a discrete number of points (mesh nodes). MSD or FD methods approximate objects as a finite mesh of nodes and discretize the equilibrium equation at the mesh nodes. The FEM divides the object into a set of elements and approximates the continuous equilibrium equation over each element. The main advantage of the FEM over the node-based discretization techniques is the more flexible node placement and the substantial reduction of the total number of degrees of freedom needed to achieve the required accuracy of the solution. The main idea of continuum-based deformable modeling consists in the minimization of the stored deformation energy, since the object reaches equilibrium when its potential energy is at a minimum. The basic steps of the FEM approach to computing object deformations are the following:

- Derive an equilibrium equation for a continuum with given material properties.
- Select the appropriate finite elements and corresponding interpolation functions for the problem.
- Subdivide the object into the elements. All relevant variables on each element have to be interpolated by the interpolation functions.
- Assemble the set of equilibrium equations for all of the elements into a single system.
- Implement the given boundary constraints.
- Solve the system of equations for the vector of unknowns.
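The steps above can be illustrated end to end on the simplest possible case, a one-dimensional Poisson problem with linear elements. The equation, load, and mesh are illustrative choices for exposition, not the facial-tissue model used later in this work.

```python
import numpy as np

def fem_1d_poisson(n_elem, f=1.0):
    """Solve -u'' = f on (0, 1) with u(0) = u(1) = 0 using linear elements.

    Follows the FEM steps listed above: element stiffness and consistent
    load computation, assembly into a global system, application of the
    Dirichlet boundary constraints, and a linear solve for the unknowns.
    """
    h = 1.0 / n_elem
    n = n_elem + 1                                   # number of nodes
    K = np.zeros((n, n))                             # global stiffness matrix
    F = np.zeros(n)                                  # global load vector
    ke = (1.0 / h) * np.array([[1.0, -1.0], [-1.0, 1.0]])  # element stiffness
    fe = f * h / 2.0 * np.ones(2)                    # consistent element load
    for e in range(n_elem):                          # assembly over elements
        idx = [e, e + 1]
        K[np.ix_(idx, idx)] += ke
        F[idx] += fe
    # Dirichlet boundary conditions u(0) = u(1) = 0: solve for interior nodes
    u = np.zeros(n)
    u[1:-1] = np.linalg.solve(K[1:-1, 1:-1], F[1:-1])
    return np.linspace(0.0, 1.0, n), u
```

For this 1D problem with exactly integrated loads, the linear-element Galerkin solution is nodally exact (it matches u(x) = x(1 - x)/2 at the mesh nodes for f = 1), a classical property that makes the example a convenient correctness check.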
Finite element methods enable the most realistic simulation of deformable living objects. However, even this sophisticated approach has its limitations. The material properties of living tissues are highly complex and usually have to be estimated empirically. Living objects are composite materials with a very complex geometrical structure. Various contact and obstacle problems are associated with the modeling of such multibody systems. A general problem concerns the modeling of large deformations: the widely used linear elastic approach can only be applied under the assumption of small deformations, which often does not hold for soft tissue rearrangements in craniofacial surgery interventions. All these and many other problems make consistent FE-based modeling of soft tissue a very challenging task. FE analysis is widely used for modeling deformable living tissues in medical imaging and CAS applications [25, 33, 24, 38]. The most advanced FE-based approaches for modeling facial tissue within the scope of craniofacial surgery planning are in [65, 99]. Throughout these and other early works, the linear elastic approximation of soft tissue behavior is typically used. To date, no investigations of nonlinear FE-based models of facial tissue are known. In this work, we investigate the finite difference method and the finite element method as our solution approach.

2.1.3 Face Modeling and Biomechanics

Efforts have been made to assist face animation and recognition using a highly accurate model that takes into account anatomical details of a face, such as bones, musculature, and skin tissues [116, 123]. This is based on the premise that the nuances recognizable by humans can be synthesized (and fully explained) only by an elaborate biomechanical model.
The early studies in this direction developed models with a hierarchical biomechanical structure that were capable of simulating linear and sphincter facial movements [108, 113]. Models of a muscle structure using quadric segments were
also proposed [58, 59]. Zhang et al. [123, 122] studied a model that incorporates a more detailed 3-layer skin module to characterize the behavior of, and interaction among, the epidermis, dermis, hypodermis, and muscular units. A major challenge of using a sophisticated anatomy-based model is the high computational complexity involved. Although the aforementioned models adopted a much simplified spring-mass system, it is still a daunting task to employ these models in a real-time application. For example, it is a non-trivial task to create a detailed individual model for each subject in the database for a biometric study, which could consist of a few hundred to tens of thousands of subjects [45]. An alternative is to extract biomechanical information (that might be adequate for certain tasks) from images and videos without building a full-scale model. Essa and Pentland [36] developed a finite element model to estimate visual muscle activations and to generate motion-energy templates for expression analysis. However, automatic identification of the action units that estimate the muscle activations is still a topic of open research. In our approach, which is also based on biomechanics, we go a step further by quantifying the soft tissue properties through their elasticity and effectively representing them by means of strain maps. The study of facial strain requires high-quality motion data generated by robust tracking methods, an extensively investigated subject in computer vision. The trend is to integrate various image cues and prior knowledge into a face model, such as structural information, depth maps, feature positions, optical flow, statistical shape priors, as well as internal regularization assumptions [30, 20, 12, 4, 54]. Most of the methods rely on a certain degree of user intervention, for either model initialization or tracking guidance.
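Of the image cues just listed, optical flow is the one this dissertation later builds on. Under brightness constancy, estimating the flow in a small window reduces to a least-squares problem in the image gradients. The following is a minimal Lucas-Kanade-style sketch; the window size and the synthetic test data are illustrative assumptions, not the robust estimator adopted in this work.

```python
import numpy as np

def lucas_kanade_window(img1, img2, y, x, half=7):
    """Estimate a single (u, v) flow vector in a window centered at (y, x).

    Solves the brightness-constancy equation I_x u + I_y v + I_t = 0
    in the least-squares sense over the window.
    """
    Iy, Ix = np.gradient(img1)        # spatial gradients (rows, cols)
    It = img2 - img1                  # temporal difference
    sl = (slice(y - half, y + half + 1), slice(x - half, x + half + 1))
    A = np.stack([Ix[sl].ravel(), Iy[sl].ravel()], axis=1)
    b = -It[sl].ravel()
    flow, *_ = np.linalg.lstsq(A, b, rcond=None)
    return flow                       # estimated (u, v) in pixels
```

On a smooth synthetic image shifted by one pixel along x, the estimate recovers a flow close to (1, 0); real facial sequences additionally need the robustness machinery (outlier handling, smoothness constraints) discussed in the text.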
Methods that avoid the use of hand-labeled features and manual correspondence were also investigated. Such methods employed conformal geometry and soft optical flow to facilitate the tracking process [112, 119]. Statistical heuristics learned from training were also explored to improve the tracking robustness [100, 3]. However, those
methods either require an extensive collection of training samples or work only for certain facial motions, which makes them unsuitable for handling all the data types that we have. Therefore, in this study we adopt an algorithm in its basic form: a robust optical flow method. Optical flow is an approximation of the motion field observed on the image plane. Many algorithms have been developed to solve the fundamental optical flow equation, which is based on the principle of image brightness constancy. They are usually accompanied by local or global smoothness constraints. The solution approach adopted in this study is based on a robust estimation framework [18].

2.2 Person Identification

It is presumptuous to assume that one can give a complete review of existing face recognition techniques. There are countless papers in the literature that differ in the data and models they use to solve the problem. Given the scope of this work, where a survey of current techniques is not the primary objective, this section gives an overview of the most important work in this direction. In a broad sense, face recognition approaches can be classified into two categories:

- Image-Based Face Recognition Algorithms
- Video-Based Face Recognition Algorithms

The following sections give a brief outline of algorithms that fall into one of the two categories, with an emphasis on video-based approaches that extract dynamic information for improved performance.
2.2.1 Image-Based Algorithms

2.2.1.1 2D Algorithms

Principal Component Analysis (PCA) [109, 90, 86] is a dimensionality reduction approach which can be used as a recognition technique by learning a subspace whose basis vectors correspond to the maximum variance directions in the original image space. It is one of the first and most popular approaches to face recognition, with reasonable performance in most scenarios. Independent Component Analysis (ICA) [11, 74] attempts to find the basis vectors along which the data are statistically independent. This is done by minimizing the second-order and higher-order dependencies in the input data. Linear Discriminant Analysis (LDA) [77, 13, 124] finds the basis vectors in the original space that result in maximal separation among different classes. For all samples of all classes, the goal of LDA is to maximize the between-class scatter while minimizing the within-class scatter. It is generally believed that object recognition algorithms based on LDA are superior to those based on PCA. However, in [80], Martinez and Kak argue that this is not always the case: results on a face database showed that PCA outperforms LDA when the training data set is small, and also that PCA is less sensitive to the choice of training data set. Evolutionary Pursuit (EP) [75], a specific kind of genetic algorithm, is an eigenspace-based dynamic approach that searches for the best set of projection axes in order to maximize an objective function, while accounting for the classification accuracy and generalization capability of the solution. All of the above methods fall into the category of template-matching approaches. Elastic Bunch Graph Matching (EBGM) [114] is a feature-based technique in which faces are represented as graphs, with nodes positioned at anchor points detected on the face (eyes, nose, etc.) and edges labeled with 2D distance vectors.
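Before continuing with the feature-based methods, the template-matching subspace approaches above can be made concrete in their simplest, PCA-based form. The data shapes, component count, and nearest-neighbor matching rule below are illustrative assumptions, not the specific systems cited.

```python
import numpy as np

def pca_face_matcher(train, labels, n_components=4):
    """Fit a PCA subspace to gallery faces and return a matcher.

    train: (n_samples, n_pixels) matrix of vectorized face images.
    The basis vectors are the maximum-variance directions of the
    mean-centered gallery (right singular vectors of the data matrix).
    """
    mean = train.mean(axis=0)
    centered = train - mean
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    basis = vt[:n_components]                 # top variance directions
    coords = centered @ basis.T               # gallery in subspace coords

    def identify(img):
        # Project the query into the subspace, return nearest gallery label
        q = (img - mean) @ basis.T
        return labels[np.argmin(np.linalg.norm(coords - q, axis=1))]

    return identify
```

Recognition then amounts to nearest-neighbor matching in the low-dimensional subspace, which is the template-matching pattern shared by PCA, ICA, and LDA variants.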
Recognition is then based on these labeled graphs. Trace transforms [57, 106] are an image processing technique used to recognize objects under transformations, i.e., rotation, scaling,
and translation. First, a functional is computed along tracing lines of an image, and then recognition is performed using a set of trace functionals. The Bayesian Intra/Extrapersonal Classifier (BIC) [85, 73] presents a probabilistic framework based on the Bayesian belief that image intensity differences are characteristic of typical variations in the appearance of an individual. Similarity among individuals is measured using the Bayesian rule. Apart from these, there is a different class of face recognition algorithms based on modeling the subspace to better characterize the specific case of faces. Kernel methods [115, 6, 128, 127] were developed based on the premise that the face manifold in the subspace need not be linear. These methods essentially explore direct nonlinear manifold schemes to generalize the linear methods. Methods using Support Vector Machines (SVM) [47, 51] handle face recognition as a binary classification problem. They find a hyperplane that keeps the largest fraction of data samples of one class on the same side, while maximizing the distance of either class from the hyperplane. PCA is first used to extract features of face images, and then discrimination functions between each pair of images are learned by SVMs. An excellent review of 2D face recognition algorithms can be found in [28, 125].

2.2.1.2 3D Algorithms

2D algorithms show very good recognition rates over a restricted set of inputs. However, they are not flexible with respect to pose, illumination, or scale changes, which require multiple templates to perform recognition over varying conditions.
In light of this, the need to explore newer sources of information for face recognition was felt. Using range data was found to be a promising direction, because curvature data, which can be computed from depth information, opens up new horizons for face recognition methods in terms of characterizing a face and being viewpoint independent. 3D algorithms can be further classified into two branches:
- Using range information only
- Using range and texture information

Methods that use depth information alone perform recognition using geometric properties of the facial surface, such as curvature, depth maps, etc. On the other hand, algorithms that use both range and texture cues use integrated models which combine a model of shape variation with a model of the appearance variations in a shape-normalized frame. An Active Appearance Model (AAM) [31] contains a statistical model of the shape and gray-level appearance of the face which can be generalized to any valid face. Matching is done by finding the parameters of the model which minimize the difference between the original image and the projected image obtained from a synthesized model. The morphable model approach [19, 21], proposed by Blanz et al., is based on the hypothesis that a human face is intrinsically a surface lying in 3D space. They proposed a face recognition method based on a morphable face model that encodes shape and texture in terms of the parameters of the model, together with an algorithm to recover these parameters from a single image of a face. Another seminal work in this direction is a recognition method based on canonical surfaces [26]. The novel contribution of this work is the ability to compare facial surfaces irrespective of deformations caused by facial expressions. The range image is preprocessed by removing certain parts, such as hair and eyebrows, which can complicate the recognition step. A canonical form of the facial surface is then computed. Since such a representation is less sensitive to head orientation and facial expression, the recognition process is significantly simplified. Though using 3D information overcomes the problems that pose and illumination introduce, the latency involved in range-sensing equipment makes it unfavorable for practical purposes. In [23], Bowyer et al.
present an elaborate discussion of the existing 3D and multimodal 2D + 3D face recognition approaches and the challenges that need to be addressed.
2.2.2 Video-Based Algorithms

Face recognition in image sequences has received more attention in the last decade. Video-based face recognition algorithms can be classified into two categories: methods that do not utilize the motion information present in video, and methods that integrate the motion information for recognition. The first category comprises algorithms that do not exploit the dynamics of video efficiently and apply still-image-based techniques to some "good" frames selected from image sequences [28]. An example of such an approach is [44], where a video-based face recognition system based on tracking the positions of the nose and eyes is proposed. The locations of these points are used to decide whether the face pose is acceptable for a still-image-based recognition technique to be launched; otherwise, the tracking continues until such a frame occurs. Other approaches in this category include 3D reconstruction and recognition via structure from motion or structure from shading. It is clear that the above approaches exploit the abundance of frames in video, and not essentially the facial dynamics, by mainly using the spatial information. The second category comprises algorithms that attempt to simultaneously use the spatial and temporal information for recognizing faces undergoing some movement. These methods efficiently exploit the temporal information by choosing a form of spatiotemporal representation that includes both the facial structure and its dynamics. Some of the work using this principle includes the condensation method [129] and the method based on Hidden Markov Models (HMM) [76]. Soatto et al. used a linear dynamic system model [104] to capture the spatiotemporal information in image sequences [2]. A discussion of the second class of algorithms, those that use motion information, follows in the next section.
2.2.2.1 Algorithms Using Spatio-Temporal Representation

In [71], Li proposed an approach for modeling facial dynamics using identity surfaces. A set of model trajectories constructed on identity surfaces is matched with the face trajectory constructed from the discriminating features. A recognition rate of 93.9% is reported on a dataset containing 12 training sequences and testing sequences of 3 subjects. In [69], Li and Chellappa report enhancement over the frame-to-frame matching scheme by using trajectories of tracked features in video. Gabor filters were used to extract the features of interest. In the popular work of [126], Zhou and Chellappa proposed a framework for simultaneous tracking and recognition of faces by adding an identification variable to the state vector of the model.

The condensation algorithm is another way of characterizing temporal information. Though traditionally used for tracking and recognizing multiple spatio-temporal features, it has been extended to video-based face recognition [129, 126]. In methods using Hidden Markov Models [76], an HMM is created to learn both the statistics and temporal structure of each subject during the training phase. In the testing phase, the temporal dynamics of the sequence is analyzed over time by the HMM corresponding to each individual. The highest likelihood score provided by an HMM establishes the face identity in the video. In the work of Soatto [104], the autoregressive and moving average (ARMA) model was used to characterize a moving face as a linear dynamical system. Other methods for video-based face recognition include algorithms that incorporate manifold learning. Lee et al. proposed one such approach based on probabilistic appearance manifolds [67].

The algorithm proposed in this work falls into the category of video-based face recognition methods that adopt a spatio-temporal representation.
For a face undergoing some expression, non-rigid displacements are used to characterize the properties of the soft tissue and bone structure that constitute a face. The strain induced by the deformation
during the expression is used to define the elastic properties of the facial skin, which in turn is used to perform recognition. This opens up newer avenues to the face recognition community in the context of modeling a face using features beyond visible cues.

2.2.2.2 Facial Dynamics for Identity Recognition

Non-rigid motion information of the facial skin has been exploited in the traditional face recognition framework to overcome some of the challenges faced by automatic recognition algorithms. Chen et al. [29] augmented an appearance-based method with facial motion to overcome illumination problems. Motion data were transformed into a feature vector by concatenating a subset of flow estimates in an image sequence, which was then processed by PCA and LDA (Linear Discriminant Analysis). Experimental results showed that, using the intensity data augmented by dynamic information, face recognition performed more robustly under varying lighting conditions. Pamudurthy et al. [89] suggested an approach that uses dynamic facial features. Features were extracted from still images, and correspondence at the skin pore level was established by a Digital Image Skin Correlation (DISC) method. Two new images were composed using intensities derived from the projection of displacement magnitude. Experiments using a small sample set (some subjects wearing makeup) showed that the disguised subjects were successfully recognized by the DISC method but missed by an appearance-based method based on PCA+LDA. Empirical evidence was provided showing that identity signatures based on facial dynamics could help distinguish between identical twins.

In our work, we used facial dynamics for identity recognition by exploiting the non-rigid motion of facial skin [121]. The identity signature is based on the elastic property of facial tissue.
By considering a face region influenced by the masseter muscle, we showed that the strain pattern computed from two face images of closed and opened jaw positions can reveal the underlying muscular characteristics. In [78], we extended
the method to video sequences. The strain pattern was put forward as soft forensic evidence that can augment existing person identification techniques under drastic conditions such as face camouflage and strong shadows. The robustness of the method was substantiated by a comprehensive system design and extensive experimental results. In [79], we improved the system through better characterization of facial soft tissue by means of a finite element modeling based approach incorporating relative material properties of individuals. The FEM-based approach works with a sparse motion field and enforces regularization (a smoothness controller for the motion field), which makes the computational method accurate and robust.
CHAPTER 3

THEORETICAL BACKGROUND

3.1 Theory of Elasticity

When a body is subjected to external forces, the forces acting on it can be either surface forces, which act over the surface of the solid, or body forces, which are distributed over the volume of the solid. The dissemination of these forces through a solid causes the generation of internal forces. To study these forces, we use a continuum model of the material in which matter is assumed to be continuously distributed across the solid [37]. Continuum mechanics deals with the movement of materials when subjected to applied forces. The motion of a continuous and deformable solid can be described by a continuous displacement field resulting from a set of forces acting on the solid body. In general, the displacements and forces may vary continuously with time, but for the purposes of this work we use a two-state quasi-static model. The initial unloaded state of the material is referred to as the reference or undeformed state, as the displacements are zero everywhere. The material then reconfigures due to applied loads and reaches an equilibrium state referred to as the deformed state. The concepts of strain, a measure of length change or displacement gradient, and stress, the force per unit area on an infinitesimally small plane surface within the material, are of fundamental importance for finite deformation elasticity theory.
3.1.1 Stress and Strain

We define the quantity stress to measure the intensity of each of the external forces when a solid body is in equilibrium. Stress is used to measure the state of the force acting on the solid; it defines the force acting per unit area. Stresses can be decomposed into three components based on the different forces acting on a plane of the solid [37], as shown in Equation 3.1:

\[ \sigma_{xx} = \lim_{\Delta A \to 0} \frac{\Delta F_x}{\Delta A}, \qquad \sigma_{xy} = \lim_{\Delta A \to 0} \frac{\Delta F_y}{\Delta A}, \qquad \sigma_{xz} = \lim_{\Delta A \to 0} \frac{\Delta F_z}{\Delta A}. \tag{3.1} \]

The component \(\sigma_{xx}\) is the normal stress, which measures the intensity of the normal force on the plane at a point. The components \(\sigma_{xy}\) and \(\sigma_{xz}\) are the shear stresses, which measure the intensity of the shear force on the plane. Normal stress provides for change in the volume of the material, while shear stresses are responsible for the deformation of the material without affecting its volume. It can be shown that the normal and shear stresses on any three orthogonal planes are sufficient to completely describe the state of stress at a given point. The stress tensor comprising the stress components can be expressed in matrix form [84, 37], as shown in Equation 3.2:

\[ [\sigma] = \begin{bmatrix} \sigma_{xx} & \sigma_{xy} & \sigma_{xz} \\ \sigma_{yx} & \sigma_{yy} & \sigma_{yz} \\ \sigma_{zx} & \sigma_{zy} & \sigma_{zz} \end{bmatrix}. \tag{3.2} \]

There are six distinct stress components, since the complementary pairs of shear stresses across the diagonal are identical, i.e., \(\sigma_{xy} = \sigma_{yx}\), \(\sigma_{zy} = \sigma_{yz}\), and \(\sigma_{xz} = \sigma_{zx}\).

Strain is another measure which has to be considered when a body undergoes deformation. The effect on the body's geometry under external forces can be defined in
terms of the displacements of each point in the body. There are two types of displacements possible:

- Rigid body displacements
- Deformation or non-rigid displacements

While rigid body displacements consist of translations and rotations of the body as a whole, deformation consists of displacements of points within the body relative to one another [37]. Strain is used to quantify the deformation undergone. The direct strain (\(\epsilon\)) is defined as:

\[ \epsilon = \frac{ds' - ds}{ds}, \tag{3.3} \]

where \(ds\) is the length before deformation and \(ds'\) is the length after deformation. The infinitesimal strain tensor is defined as:

\[ \epsilon = \frac{1}{2}\left[ \nabla \mathbf{u} + (\nabla \mathbf{u})^T \right], \tag{3.4} \]

where \(\mathbf{u}\) is a displacement vector and \(\nabla\) is the gradient operator defined as:

\[ \nabla \mathbf{u} = \begin{bmatrix} \frac{\partial u}{\partial x} & \frac{\partial u}{\partial y} & \frac{\partial u}{\partial z} \\ \frac{\partial v}{\partial x} & \frac{\partial v}{\partial y} & \frac{\partial v}{\partial z} \\ \frac{\partial w}{\partial x} & \frac{\partial w}{\partial y} & \frac{\partial w}{\partial z} \end{bmatrix}. \tag{3.5} \]

Consequently, the strain tensor in 3D Cartesian coordinates can be expressed as:

\[ \epsilon = \begin{bmatrix} \frac{\partial u}{\partial x} & \frac{1}{2}\left(\frac{\partial u}{\partial y} + \frac{\partial v}{\partial x}\right) & \frac{1}{2}\left(\frac{\partial u}{\partial z} + \frac{\partial w}{\partial x}\right) \\ \frac{1}{2}\left(\frac{\partial v}{\partial x} + \frac{\partial u}{\partial y}\right) & \frac{\partial v}{\partial y} & \frac{1}{2}\left(\frac{\partial v}{\partial z} + \frac{\partial w}{\partial y}\right) \\ \frac{1}{2}\left(\frac{\partial w}{\partial x} + \frac{\partial u}{\partial z}\right) & \frac{1}{2}\left(\frac{\partial w}{\partial y} + \frac{\partial v}{\partial z}\right) & \frac{\partial w}{\partial z} \end{bmatrix}. \tag{3.6} \]

Using the linear constitutive law (material linearity, the generalized Hooke's law) and the isotropic property, the following stress-strain relationship can be derived:
\[ \sigma = \lambda (\operatorname{tr} \epsilon) I + 2\mu\epsilon = \lambda (\nabla \cdot \mathbf{u}) I + \mu\left[ \nabla \mathbf{u} + (\nabla \mathbf{u})^T \right], \tag{3.7} \]

where \(I\) is the identity matrix, \(\operatorname{tr}\) denotes trace, and \(\lambda\) and \(\mu\) are the Lamé constants. The symmetry of the stress tensor (\(\sigma = \sigma^T\)) follows automatically from the symmetry of the strain tensor (\(\epsilon = \epsilon^T\)). With the above linear conditions, the governing motion equations can be rewritten in terms of displacement, describing the deformation of an inhomogeneous, isotropic, and linear elastic object:

\[ \rho \frac{\partial^2 \mathbf{u}}{\partial t^2} = \nabla \cdot \left\{ \lambda (\nabla \cdot \mathbf{u}) I + \mu\left[ \nabla \mathbf{u} + (\nabla \mathbf{u})^T \right] \right\} + \mathbf{F}. \tag{3.8} \]

Material properties more commonly known in the engineering literature, such as Young's modulus (\(E\)) and Poisson's ratio (\(\nu\)), are related to the Lamé constants through the following transformations and can be directly substituted into the motion equation:

\[ \mu = \frac{E}{2(1 + \nu)}, \qquad \lambda = \frac{E\nu}{(1 + \nu)(1 - 2\nu)}. \tag{3.9} \]

Since both Young's modulus and Poisson's ratio are considered spatial functions, the motion equations are not further simplified into Navier's equation.

3.2 Motion Analysis

3.2.1 Optical Flow

Optical flow reflects the changes in the image due to motion during a time interval \(dt\). It describes the velocity field that represents the three-dimensional motion of object points across a two-dimensional image. In [105], Sonka et al. quote the following salient features of an optical flow algorithm:
- Optical flow should not be sensitive to illumination changes and motion of unimportant objects (e.g., shadows).
- Non-zero optical flow is detected if a fixed sphere is illuminated by a moving source.
- A smooth sphere rotating under constant illumination provides no optical flow.

Figure 3.1 gives an example of the vector field that would be produced by a typical optical flow algorithm.

Figure 3.1 Flow field produced by a typical optical flow algorithm on an image pair: (a) time t1; (b) time t2; (c) optical flow. (This example is from the book Image Processing: Analysis and Machine Vision by Sonka et al., Chapter 14: Motion Analysis.)

3.2.2 Flow Computation

Optical flow computation is based on the following assumptions [53]:

- The observed brightness of any object point is constant over time.
- Nearby points in the image plane move in a similar manner (the velocity smoothness constraint).

Let \(f(x, y, t)\) be a continuous image. We can use a Taylor series to represent a dynamic image as a function of position and time:

\[ f(x + dx, y + dy, t + dt) = f(x, y, t) + f_x\,dx + f_y\,dy + f_t\,dt + O(\partial^2), \tag{3.10} \]
where \(f_x\), \(f_y\), and \(f_t\) are the partial derivatives of \(f\) and \(O(\partial^2)\) denotes the higher-order terms in the expansion. When an immediate neighborhood of \((x, y)\) is translated some small distance \((dx, dy)\) during the interval \(dt\), the event can be mathematically expressed as:

\[ f(x + dx, y + dy, t + dt) = f(x, y, t). \]

If \(dx\), \(dy\), \(dt\) are very small, the higher-order terms in the expansion vanish, and Equation 3.10 reduces to:

\[ -f_t = f_x \frac{dx}{dt} + f_y \frac{dy}{dt}. \tag{3.11} \]

The goal of the optical flow method is to determine the velocity \(\mathbf{c} = (u, v) = \left(\frac{dx}{dt}, \frac{dy}{dt}\right)\). \(f_x\), \(f_y\), and \(f_t\) can be computed, or at least approximated, from \(f(x, y, t)\). Motion velocity can then be calculated as:

\[ -f_t = f_x u + f_y v = \nabla f \cdot \mathbf{c}, \tag{3.12} \]

where \(\nabla f\) is the two-dimensional image gradient. From Equation 3.12, it can be seen that the gray-level difference \(f_t\) at the same location of the image at times \(t\) and \(t + dt\) is a product of the spatial gray-level difference and the velocity at this location. However, Equation 3.12 does not completely define the velocity vector; rather, it provides only the component in the direction of the brightness gradient. To handle this, a smoothness constraint is introduced which states that the velocity vector field changes slowly in a given neighborhood. Thus, the problem reduces to minimizing the squared error quantity:

\[ E^2(x, y) = (f_x u + f_y v + f_t)^2 + \lambda\left(u_x^2 + u_y^2 + v_x^2 + v_y^2\right), \tag{3.13} \]

where \(u_x^2\), \(u_y^2\), \(v_x^2\), \(v_y^2\) denote the squared partial derivatives of the velocity components, which act as the error terms.
The first term in Equation 3.13 corresponds to Equation 3.12 and the second term is the smoothness criterion; \(\lambda\) is a Lagrange multiplier. The minimization reduces to solving the differential equations:

\[ (\lambda^2 + f_x^2)\,u + f_x f_y\,v = \lambda^2 \bar{u} - f_x f_t, \qquad f_x f_y\,u + (\lambda^2 + f_y^2)\,v = \lambda^2 \bar{v} - f_y f_t, \tag{3.14} \]

where \(\bar{u}\), \(\bar{v}\) are the mean values of the velocity in the \(x\) and \(y\) directions in some neighborhood of \((x, y)\). A solution [105] to the differential equations of 3.14 is:

\[ u = \bar{u} - f_x \frac{P}{D}, \qquad v = \bar{v} - f_y \frac{P}{D}, \tag{3.15} \]

\[ P = f_x \bar{u} + f_y \bar{v} + f_t, \qquad D = \lambda^2 + f_x^2 + f_y^2. \]

Measurement of the optical flow is then based on a Gauss-Seidel iteration method [56] using pairs of consecutive images.

3.2.3 Handling Failure Cases

Errors in optical flow computation occur when the brightness constancy and velocity smoothness assumptions are violated. Such violations are quite common in real data. Highly textured regions, moving boundaries, and depth discontinuities are a few examples where optical flow computation fails dramatically. In addition to constraint violations, global relaxation methods of optical flow computation also allow flow estimation errors to propagate across the solution. The reason for this is that global methods find the smoothest velocity field consistent with the image data. Thus, a small number of problem areas may cause widespread errors and poor optical flow estimates.
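Before turning to remedies for these failure cases, the baseline Gauss-Seidel update of Equation 3.15 can be sketched as follows. This is a minimal NumPy illustration, not the implementation used in this work: the derivative estimates, the four-neighbor average standing in for the local means, and the wrap-around boundary handling are simplified choices for brevity.

```python
import numpy as np

def horn_schunck(f1, f2, lam=1.0, n_iter=100):
    """Iterate the update of Eq. 3.15 on a pair of grayscale frames.
    lam plays the role of the smoothness weight lambda in Eq. 3.13."""
    f1 = f1.astype(float)
    f2 = f2.astype(float)
    # Spatial derivatives averaged over the two frames; temporal difference f_t.
    fx = (np.gradient(f1, axis=1) + np.gradient(f2, axis=1)) / 2.0
    fy = (np.gradient(f1, axis=0) + np.gradient(f2, axis=0)) / 2.0
    ft = f2 - f1
    u = np.zeros_like(f1)
    v = np.zeros_like(f1)
    D = lam**2 + fx**2 + fy**2
    for _ in range(n_iter):
        # Four-neighbor averages stand in for the local means u_bar, v_bar.
        u_bar = (np.roll(u, 1, 0) + np.roll(u, -1, 0) +
                 np.roll(u, 1, 1) + np.roll(u, -1, 1)) / 4.0
        v_bar = (np.roll(v, 1, 0) + np.roll(v, -1, 0) +
                 np.roll(v, 1, 1) + np.roll(v, -1, 1)) / 4.0
        # Eq. 3.15: u = u_bar - fx * P / D, v = v_bar - fy * P / D.
        P = fx * u_bar + fy * v_bar + ft
        u = u_bar - fx * P / D
        v = v_bar - fy * P / D
    return u, v
```

On a horizontal intensity ramp shifted by one pixel, the iteration converges toward a uniform horizontal flow of one pixel per frame, which matches the fixed point of the update above.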
Local optical flow estimation was found to be a natural solution to these problems. The basic idea is to divide the image into small regions where the assumptions hold. Though this solves the error propagation problem, it has its own pitfall: in regions where the spatial gradients change slowly, the optical flow estimation becomes ill-conditioned because of the lack of motion information, and the flow cannot be detected correctly. If a global method is applied to the same region, the information from neighboring image parts propagates and provides a basis for optical flow computation even where the local information is not sufficient by itself. The conclusion of this discussion is that global sharing of information is beneficial for constraint sharing but adverse with respect to error propagation.

A natural question then is "when to use a global method and when to use a local approach?" An answer is best obtained by finding where the smoothness constraint is violated. In order to detect regions in which the smoothness constraint holds, we have to select a threshold to decide which flow value differences should be considered substantial. This has its own inherent problems:

- If the threshold is too low, many points are considered positioned along flow discontinuities.
- If the threshold is too high, some points violating smoothness remain part of the computational net.

Black and Anandan [18] consider the problem of accurately estimating optical flow from a pair of images using a novel framework based on robust estimation, which addresses violations of the brightness constancy and spatial smoothness assumptions. We use this algorithm in our computations of facial motion. The details of this algorithm are summarized in Chapter 4.
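The error-norm functions at the heart of such robust formulations (discussed further in Chapter 4) can be illustrated with a few one-line definitions. This is a sketch for intuition only; the scale parameters are illustrative, and the Lorentzian shown follows the form commonly used in robust statistics, not necessarily the exact parameterization of [18].

```python
import numpy as np

def quadratic(r):
    # Standard least-squares norm: outliers dominate the solution.
    return r**2

def truncated_quadratic(r, thresh=1.0):
    # Quadratic up to a fixed threshold, constant beyond it.
    return np.minimum(r**2, thresh**2)

def lorentzian(r, sigma=1.0):
    # rho(r) = log(1 + (r/sigma)^2 / 2): grows only slowly for large
    # residuals, so outliers exert a gradually diminishing influence.
    return np.log1p(0.5 * (r / sigma)**2)
```

Plotting these over a range of residuals makes the trade-off visible: the quadratic penalizes a residual of 5 twenty-five times more than a residual of 1, whereas the robust norms cap or dampen that penalty.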
3.3 Subspace Learning for Face Recognition

3.3.1 Principal Component Analysis

Principal Component Analysis (PCA) is a dimensionality reduction technique wherein the features along which maximum variation in the dataset is captured are retained. The classification thus reduces from a higher dimension to a lower dimension called the eigenspace, which is the space defined by the principal components, or eigenvectors, of the dataset. In the PCA technique, efforts have been made on both fully automatic and partially automatic algorithms. Partially automatic algorithms are ones in which the coordinates of landmark points on the image are supplied to the normalization routine, i.e., there is no automatic tracking of landmark points. In this work, we use the partially automatic technique. There are four steps involved in the partially automatic eigen approach, described in the following sections.

3.3.1.1 Preprocessing

Location of seed points. As mentioned earlier, the PCA approach considered in this study requires that anchor points be supplied to the normalization routine. So we need to locate 2 seed points which are present in all the subjects in the dataset and which will be further used for normalization. In our study, for the normalization of profile images, we use the top of the nose and the corner of the ear as our 2 points, as seen in Figure 3.2.

Geometric normalization. In this step, the human-chosen seed points are lined up across the subjects. Two fixed locations in the code, (lx, ly) and (rx, ry), were decided such that the chosen seed points for all the subjects rest on these 2 fixed points. Translation, rotation, and scaling are performed to align them.

Masking. This is done to crop the (scaled and aligned) image using a rectangular mask and the image borders such that only the face from the forehead
to chin and ear to nose is visible. This removes unwanted areas such as hair, background, etc. The mask for the face/strain map is manually specified from the mean face/strain map image. In our case, we used image sequences of size 720x480, which were reduced to 200x250 pixels.

Figure 3.2 The preprocessing step in Principal Component Analysis (PCA): (a) seed points; (b) masked raw image; (c) normalized strain map.

3.3.1.2 Training

In the training phase, the algorithm learns the subspace from the given inputs, i.e., the eigenvalues and eigenvectors of the training set are extracted. The eigenvectors corresponding to the top eigenvalues are chosen; these represent the feature vectors that retain the most variation across the images in the training set for discriminative purposes. The training set should preferably contain images without many artifacts, such as spectacles, earrings, etc., and it should be a set of images with no duplicates. After extracting the most significant vectors (\(m\)), the images are projected into the eigenspace of \(m\) dimensions. Each image is represented as a linear combination of the \(m\) eigenvectors in the reduced dimension.

3.3.1.3 Testing

In the testing phase, the algorithm is provided a set of known/enrolled faces (strain maps in our case) known as the gallery set and a set of unknown faces/strain maps known as the probe set. The algorithm matches each probe to its possible identity
in the gallery by computing the Euclidean distance between each probe and each of the gallery images.

3.3.1.4 Analysis

Depending on the mode in which the biometric is operated, the performance of the technique is measured by its:

Verification rate. A verification system has to take the measurable features of the subject (p) and compare them against the known features of the person (g) whose identity is being claimed, which makes it a one-to-one matching problem. The performance of the system is measured using 2 statistics: first, the probability of verification (\(P_V\)), i.e., accepting that the probe p is actually the person g whom the subject claims to be (reporting p = g when p = g); and second, the probability of false alarm (\(P_F\)), i.e., reporting p = g when p ≠ g. The verification rate is computed by:

\[ P_V^{c,i} = \begin{cases} 0 & \text{if } |D_i| = 0, \\[4pt] \dfrac{\left|\{\, s_i(k) \le c \text{ given } p_k \in D_i \,\}\right|}{|D_i|} & \text{otherwise,} \end{cases} \]

and

\[ P_F^{c,i} = \begin{cases} 0 & \text{if } |F_i| = 0, \\[4pt] \dfrac{\left|\{\, s_i(k) \le c \text{ given } p_k \in F_i \,\}\right|}{|F_i|} & \text{otherwise,} \end{cases} \]

where \(s_i(k)\) is the similarity measure [95]. Thus a pair \((P_V, P_F)\) is generated for a given cutoff value c. The cutoff value is selected by varying it between the minimum and maximum distances obtained after projecting all the probe images. By varying c, different combinations of \((P_V, P_F)\) are produced. A plot of all \((P_V, P_F)\) pairs is called the Relative (or Receiver) Operating Characteristic (ROC).
Identification rate. Identification scores are reported using the same number of probe and gallery images. A plot of the percentage of correct matches on the vertical axis against the rank along the horizontal axis is called the Cumulative Match Score Curve (CMC). The top rank match is at Rank 1, which indicates the fraction of probes correctly identified. It has been shown in [91] that the CMC for small gallery sizes (50 in our case) dramatically underestimates the recognition performance for large gallery sizes. For this reason, we present our results as ROCs and similarity score distributions of genuine and impostor matches. An alternative to this is to employ methods that use similarity scores of experiments conducted on small datasets to estimate the performance of the algorithm on a large dataset having characteristics comparable to those of the smaller dataset [45, 111].

We use the Receiver Operating Characteristic (ROC) curve [83], the score distribution of genuine and impostor matches, as well as a likelihood ratio to quantify experimental results. One way to analyze the strength of a piece of forensic evidence is to examine the increase in the likelihood ratio [46] that a certain trace material originated from a suspect. For example, given an original suspect population of size R, if evidence A can reduce the suspect population to \(R_A\), then the strength of the evidence is evaluated as \(R / R_A\). In this study, \(R_A\) is the number of potential matches for a strain pattern given a distance threshold T in the PCA subspace. The higher the likelihood ratio, the stronger the evidence against a particular suspect. The threshold T is obtained by analyzing the ROC curve, which is a plot of the percentage of genuine matches, or True Acceptance Rate (TAR), against the percentage of impostor matches, or False Acceptance Rate (FAR), for varying thresholds.
The choice of T depends on the penalty for each type of error, namely false acceptance and false rejection. In an authentication system, where the penalty for incorrect access to a
confined facility is high, T is chosen such that the FAR is very low. On the other hand, in a forensic identification system, where the penalty is high for completely excluding the true match from a set of possible matches, T is chosen such that the TAR is 100%. We analyze the strength of the strain pattern by choosing a T on the ROC curve such that TAR = 100% and report the corresponding increase in the likelihood ratio.
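The analysis measures above can be made concrete with a short sketch: the first function computes a \((P_V, P_F)\) pair at a cutoff c from a probe-by-gallery distance matrix, and the second picks T so that TAR = 100% and evaluates the evidence strength R/R_A. The function names, the label-based bookkeeping, and the aggregation of all admitted impostors into a single R_A count are illustrative assumptions, not the exact procedure of this work.

```python
import numpy as np

def verification_rates(dist, probe_labels, gallery_labels, c):
    """(P_V, P_F) at cutoff c: a probe-gallery pair is accepted when its
    distance is <= c; genuine pairs share a label, impostor pairs do not."""
    dist = np.asarray(dist, dtype=float)
    genuine = np.equal.outer(probe_labels, gallery_labels)   # the sets D_i
    impostor = ~genuine                                      # the sets F_i
    accepted = dist <= c
    p_v = accepted[genuine].mean() if genuine.any() else 0.0
    p_f = accepted[impostor].mean() if impostor.any() else 0.0
    return p_v, p_f

def likelihood_ratio_at_full_tar(genuine_dists, impostor_dists, R):
    """Pick T as the largest genuine distance (the smallest T with
    TAR = 100%), then evaluate R / R_A, where R_A counts the candidates
    admitted at T (the true match plus the impostors falling below T)."""
    T = max(genuine_dists)
    admitted_impostors = sum(d <= T for d in impostor_dists)
    R_A = 1 + admitted_impostors
    return R / R_A, T
```

Sweeping c between the minimum and maximum projected distances and collecting the resulting (P_V, P_F) pairs traces out the ROC curve described above.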
CHAPTER 4

THE ALGORITHM AND COMPUTATIONAL METHODS

A video sequence of a subject undergoing a particular facial expression is the essential input to the system. The apparent facial motion is empirically estimated using an optical flow algorithm. Since optical flow traditionally fails on extensive motion, it is applied on subsequent frame-pairs instead of the start and end frames of the expression. The displacement vectors from frame-pairs are then combined using a backward-mapped warping approach. Based on a few points manually selected in the first frame and the displacement vectors obtained from the previous step, frames are registered to remove any rigid motion of the face (head). The strain map of the subject's facial expression is calculated using a finite difference method or a finite element method, based on whether we have a dense or a sparse motion field.

The strain images of all the subjects are then projected to a lower dimensional space using the Karhunen-Loève transform, or principal component analysis (PCA) [40]. The main motivation behind using this is the fact that the features in such a subspace provide more salient and richer information for recognition than the raw images themselves [70]. In this space, a strain image is efficiently represented as a vector of weights (feature vector) of low dimensionality. Subspace learning is the process of learning these weights using a set of training images. During testing, when a query face is to be identified with one of the faces in the database, distances to each of the training strain patterns are calculated in the projected subspace. This results in a matrix of similarity scores (known/enrolled or gallery images vs. unknown or probe images) which are popularly analyzed using
Receiver Operating Characteristic (ROC) curves [83, 49]. Figure 4.1 shows a flow chart of the presented method for person identification using strain patterns.

Figure 4.1 System flow of the presented approach for person identification using strain patterns.

4.1 Computation of Motion Fields

Numerous algorithms have been developed to solve Equation 3.14. These are usually accompanied by the local or global smoothness constraints explained in Section 3.2.3. The method adopted in this study is based on a robust estimation framework [18]. In [18], Black and Anandan considered the issues of robustness related to the recovery of optical flow with multiple motions in an image sequence. It is understood from the discussion in Section 3.2.3 that measurements are incorrect whenever information is used from a spatial neighborhood that spans a motion boundary. This holds for both the brightness constancy and velocity smoothness assumptions, and violations of these constraints cause problems for the least-squares formulation of optical flow in Equation 3.13.

The basic idea is to recast the least squared error formulations with a different error-norm function instead of the standard quadratic function. To increase robustness, the
error-norm function should be more forgiving of outlying measurements. One of the most common error-norm functions in computer vision is the truncated quadratic, where errors are weighted quadratically up to a fixed threshold but receive a constant value beyond that. Numerous other error-norm functions have been used in the literature, each with different motivations and efficacies [16]. The Lorentzian and Geman-McClure error-norm functions are used in [18], the motivation being that they have differentiable influence functions (an influence function associated with an error-norm function characterizes the bias that a particular measurement has on the solution) which provide a more gradual transition between inliers and outliers than does the truncated quadratic. The authors explored numerous optimization techniques to solve the robust formulations and found deterministic continuation methods [17] to be more efficient and practical as a minimization technique.

To cope with large motions, a coarse-to-fine strategy [5, 43] is employed in which a pyramid of spatially filtered and subsampled images is constructed. Beginning at the lowest spatial resolution with the flow c being zero, the change in the flow estimate dc is computed. The new flow field, c + dc, is then projected to the next level in the pyramid, and the first image at that level is warped toward the later image using the flow information. The warped image is then used to compute dc at this level. The process is repeated until the flow has been computed at the full resolution.

Given a pair of frames, the algorithm produces two motion components, u and v. Examples of the generated dense motion field are shown in Figure 4.2 (c, d). For the method described in Section 4.3, we use a sparse motion field as the input. Examples of the generated sparse motion field are shown in Figure 4.3.
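The coarse-to-fine strategy described above can be sketched as follows. The box-filter pyramid, the nearest-neighbor warping, and the pluggable single-level estimator `estimate_dc` are simplifying assumptions for illustration, not the components of [18].

```python
import numpy as np

def build_pyramid(img, n_levels):
    """Pyramid of 2x-subsampled images; a 2x2 box filter stands in for the
    spatial filtering (Gaussian filtering is the usual choice)."""
    pyr = [img.astype(float)]
    for _ in range(n_levels - 1):
        im = pyr[-1]
        h, w = (im.shape[0] // 2) * 2, (im.shape[1] // 2) * 2
        im = im[:h, :w]
        pyr.append((im[0::2, 0::2] + im[1::2, 0::2] +
                    im[0::2, 1::2] + im[1::2, 1::2]) / 4.0)
    return pyr  # pyr[0] is full resolution, pyr[-1] is coarsest

def warp(img, u, v):
    """Backward-map img by the flow (nearest-neighbor, for brevity)."""
    h, w = img.shape
    ys, xs = np.mgrid[0:h, 0:w]
    xs2 = np.clip(np.round(xs + u).astype(int), 0, w - 1)
    ys2 = np.clip(np.round(ys + v).astype(int), 0, h - 1)
    return img[ys2, xs2]

def coarse_to_fine(f1, f2, estimate_dc, n_levels=3):
    """Start with c = 0 at the coarsest level, estimate the increment dc,
    project c + dc to the next finer level (upsample and double), warp
    frame 1 toward frame 2 with it, and repeat to full resolution.
    estimate_dc(a, b) is any single-level flow routine."""
    p1, p2 = build_pyramid(f1, n_levels), build_pyramid(f2, n_levels)
    u = np.zeros_like(p1[-1])
    v = np.zeros_like(p1[-1])
    for lvl in range(n_levels - 1, -1, -1):
        a, b = p1[lvl], p2[lvl]
        if u.shape != a.shape:  # project the flow to the finer level
            u = 2.0 * np.kron(u, np.ones((2, 2)))[:a.shape[0], :a.shape[1]]
            v = 2.0 * np.kron(v, np.ones((2, 2)))[:a.shape[0], :a.shape[1]]
        warped = warp(a, u, v)          # warp frame 1 toward frame 2
        du, dv = estimate_dc(warped, b)  # the increment dc at this level
        u, v = u + du, v + dv
    return u, v
```

Any single-level estimator (for example, a Gauss-Seidel routine in the spirit of Equation 3.15) can be passed in as `estimate_dc`; the pyramid machinery is independent of it.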
Figure 4.2 Dense motion field generated by the optical flow method using two video frames between which the subject opened his mouth slightly: (a) video frame 12; (b) video frame 15; (c) horizontal motion; (d) vertical motion. Note the dominance of vertical motion, displayed with brighter intensities in (d).

Figure 4.3 Sparse motion field generated by the optical flow method using two video frames between which the subject opened his mouth slightly: (a) normal light; (b) low light; (c) shadow light; (d) camouflage.

4.2 Strain Computation using the Finite Difference Method

Given a motion field measured between two video frames, we can compute a corresponding strain image within the framework of classical elasticity theory. We first present a brief discussion of strain theory as defined in continuum mechanics, and then elaborate on the numerical method adopted for computing strain in this work.

An object under the influence of external forces can display different types of motion. While the rigid motion of a body accounts for its translation and rotation as a whole, the non-rigid motion (deformation) describes the displacements of points within a body relative to one another. Strain is used to quantify the second type of motion, or deformation. A strain tensor (\(\epsilon\)) that is capable of describing large deformations is defined as:

\[ \epsilon = \frac{1}{2}\left[ \nabla \mathbf{u} + (\nabla \mathbf{u})^T + (\nabla \mathbf{u})^T\, \nabla \mathbf{u} \right], \tag{4.1} \]

where \(\mathbf{u}(u, v, w)\) is the displacement vector, and \(\nabla\) is the gradient operator.
In the case of small deformations, the maximal eigenvalue of the strain tensor, which represents the maximal elongation of the principal axes, is significantly smaller than 1. In this case, the quadratic product term in (4.1), characterizing the geometric non-linearity, can be neglected and the strain tensor can be linearized:

\[ \epsilon = \frac{1}{2}\left[ \nabla \mathbf{u} + (\nabla \mathbf{u})^T \right]. \tag{4.2} \]

In 2D image coordinates, strain becomes:

\[ \epsilon = \begin{bmatrix} \frac{\partial u}{\partial x} & \frac{1}{2}\left(\frac{\partial u}{\partial y} + \frac{\partial v}{\partial x}\right) \\ \frac{1}{2}\left(\frac{\partial v}{\partial x} + \frac{\partial u}{\partial y}\right) & \frac{\partial v}{\partial y} \end{bmatrix}. \tag{4.3} \]

Strain can be computed as a spatial derivative using the finite difference method (FDM), a filtering method, or the Richardson extrapolation method. We chose the central difference method based on considerations of both execution speed and numerical accuracy [93]:

\[ \frac{\partial u}{\partial x} = \frac{u(x + \Delta x) - u(x - \Delta x)}{2\,\Delta x}, \qquad \frac{\partial v}{\partial y} = \frac{v(y + \Delta y) - v(y - \Delta y)}{2\,\Delta y}, \tag{4.4} \]

where \(\Delta x\) and \(\Delta y\) are preset distances (usually 1-3 pixels).

The next step is to integrate all strain components into a single feature. It has been observed that, when a subject opens his/her mouth, the motion is mostly vertical and the strain pattern is dominated by its normal components (\(\frac{\partial u}{\partial x}, \frac{\partial v}{\partial y}\)). Therefore, we compute a strain magnitude image (\(\epsilon_m\)) with the normal strains only:

\[ \epsilon_m = \sqrt{ \left(\frac{\partial u}{\partial x}\right)^2 + \left(\frac{\partial v}{\partial y}\right)^2 }. \tag{4.5} \]

A sample strain magnitude pattern, after being converted to a gray scale of 0-250, is shown in Figure 4.4.
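The central-difference strain magnitude of Equations 4.4 and 4.5 is straightforward to compute from a dense motion field; the following is a minimal NumPy sketch with illustrative function and parameter names, not the implementation used in this work.

```python
import numpy as np

def strain_magnitude(u, v, d=2):
    """Normal-strain magnitude (Eqs. 4.4 and 4.5) from a dense motion field.
    u, v: horizontal and vertical displacement images; d: the preset central
    difference distance in pixels (1-3 in the text). Boundary pixels wrap
    around here; a full implementation would treat the borders explicitly."""
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    # Central differences du/dx and dv/dy with spacing d (Eq. 4.4).
    dudx = (np.roll(u, -d, axis=1) - np.roll(u, d, axis=1)) / (2.0 * d)
    dvdy = (np.roll(v, -d, axis=0) - np.roll(v, d, axis=0)) / (2.0 * d)
    # Strain magnitude from the normal components only (Eq. 4.5); the result
    # can then be rescaled to the 0-250 gray range for visualization.
    return np.sqrt(dudx**2 + dvdy**2)
```

For a displacement field that grows linearly across the image, the interior of the result is the constant slope of that field, as expected from a first-order central difference.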
PAGE 58
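The central-difference computation in (4.4) and the magnitude in (4.5) can be sketched in a few lines of NumPy. This is an illustrative sketch, not the dissertation's implementation; the function name and the choice of offsets (2 pixels, within the 1-3 pixel range stated above) are assumptions.

```python
import numpy as np

def strain_magnitude(u, v, dx=2, dy=2):
    """Normal-strain magnitude from a dense motion field.

    u, v : 2D arrays of horizontal/vertical displacement per pixel.
    dx, dy : preset central-difference offsets (1-3 pixels in the text).
    """
    du_dx = np.zeros_like(u, dtype=float)
    dv_dy = np.zeros_like(v, dtype=float)
    # central differences, eq. (4.4); borders are left at zero
    du_dx[:, dx:-dx] = (u[:, 2 * dx:] - u[:, :-2 * dx]) / (2.0 * dx)
    dv_dy[dy:-dy, :] = (v[2 * dy:, :] - v[:-2 * dy, :]) / (2.0 * dy)
    # strain magnitude from the normal components only, eq. (4.5)
    return np.sqrt(du_dx ** 2 + dv_dy ** 2)
```

For a linear displacement field (u = x, v = 2y), the interior of the result is the constant sqrt(1^2 + 2^2), which is a quick sanity check on the differencing.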
(a) Horizontal motion (b) Vertical motion (c) Strain magnitude
Figure 4.4 Strain magnitude images computed from the optical flow data using the finite difference method.

In [118], Yoshida et al. empirically evaluated a similar strain computation method, and their results are consistent with our observations.

4.3 Strain Computation using the Finite Element Method

The FDM is efficient and accurate when the geometry of the problem is regular. However, the discretization of objects with irregular geometry becomes extremely involved, requiring extensive computational resources for data storage and system solving. Further, the FDM is restricted as a modeling platform because of its limitations with respect to material types and boundary conditions. Under these conditions, the finite element method (FEM) [130] becomes the state-of-the-art technique in physically based modeling: it treats deformable objects as a mechanical continuum with mass and energies distributed throughout the domain they occupy. Whereas the FDM approximates the differential operators by simple algebraic expressions, the FEM applies them as they are on subspaces of the admissible solution fields. The FDM approximates the object as a finite mesh of nodes and discretizes the equilibrium equation at the mesh nodes; the FEM divides the object into a set of elements and approximates the continuous equilibrium equation over each element. The main advantages the FEM provides over node-based discretization methods are more flexible node placement and a substantial reduction in the total number of degrees of freedom needed to achieve the required solution accuracy. Such an approach leads to more accurate strain estimation and seamless incorporation of the material constants associated with facial tissues. Through the FEM-based
approach, we could also reduce the noise introduced during the motion estimation stage by the optical flow algorithm. The displacement values computed for a subset of pixels where the motion estimate is highly reliable can be used to solve for the motion field in the remaining regions through the finite element model incorporating the material properties.

The geometry of the FE model was made generic: the dimensions of the model and the positions of the landmarks (eyes, nose, and mouth) were defined by averaging over the entire dataset. Based on face anatomy and the observation that material constants vary for different tissues, we divided the face into regions (forehead, eyes, nose, left cheek, right cheek, and chin) with varying material constants. Figure 4.5 (a) shows the geometry of the model (each region is represented by a unique color). Figure 4.5 (b) presents the undeformed meshed model.

(a) Geometry (b) Mesh
Figure 4.5 2D finite element face model.

In this work, the linear elastic approximation of soft tissue behavior was used. This has been shown to be adequate through experiments in earlier studies [65]. The equations governing a linear elastic boundary value problem are based on three tensor equations, namely
the equation of motion (Newton's second law), the strain-displacement equation, and the constitutive equations (Hooke's law). Given the sparse set of displacements computed using the optical flow method, the displacements in the other facial regions are computed using the equation of motion. The values of strain due to object deformation are computed using the strain-displacement equation. Since we do not explicitly use forces in our model, we do not utilize the constitutive equations (the stress-strain relation) in our method.

In our approach, each homogeneous and isotropic region of the face is characterized by its stiffness and compressibility, which are described by two elastic constants: the Young's modulus, E_i, and the Poisson's ratio, \nu_i, respectively. The Poisson's ratio theoretically ranges in \nu \in [0, 0.5]. Based on findings in the existing literature [42], a Poisson's ratio of 0.4 was found to be a good compromise between the requirements of the constitutive modeling and the computational performance of a pure displacement-based FEM. This value of \nu was used in all regions of the model for all individuals.

In contrast to the Poisson's ratio, the values of the Young's modulus, which describes the stiffness of the facial skin, vary among regions and individuals. We learn the values of E in different facial regions for each individual based on the observation that the resulting homogeneous system of equations is not sensitive to the absolute value of the Young's modulus if the boundary conditions are given in the form of prescribed displacements. Thus, we use the concept of relative stiffness, defined as the ratio

relE_i = E_i / E_0,  i = 0, 1, 2, ...,

where E_i is the absolute stiffness of the i-th material and E_0 is the absolute stiffness of the reference material. As long as no forces are given, the absolute stiffness E_i can be replaced by relE_i.
The central idea of the algorithm is to perform a global search over a few possible values of the relative stiffness and find the optimal settings from this set of combinations. The algorithm for computing the relative stiffness is as follows:

- Define a finite element model with the forehead as the reference material. The nose was modeled as a highly rigid region and the eyes were modeled as a region with varying stiffness (innermost area: highly elastic; outermost area: the stiffness of the region it shares a boundary with). To reduce the computational cost during the search, we assigned the same relative stiffness value to both the left and the right cheek.

- Define an optimization function:

\sum_{x,y} | u_{(x,y)}^{v} - u_{(x,y)}^{m} |,    (4.6)

where u_{(x,y)}^{v} is the displacement measured by the optical flow method at pixel (x, y) and u_{(x,y)}^{m} is the displacement of the node in the finite element mesh corresponding to the position (x, y).

- Use 1/4 of the sparse motion field to drive the model and the remaining displacements for validation. This step was done using videos obtained in normal lighting, where we can reasonably assume high reliability in the optical flow estimation. It should be noted that this step is offline and performed once per individual registered in the database.

- Compute the value of the optimization function for a preset number of combinations. Use the setting with the least error as the final learned parameter values for each patch in the finite element model.

- For the query sequences, use the learned parameter values.
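The grid search in the steps above can be sketched as follows. The finite element solve is replaced here by a toy stand-in (`toy_fem_displacements` is a placeholder, not the actual solver), so the sketch only illustrates how combinations of relative stiffness values are scored with the optimization function (4.6) and the best-scoring combination retained.

```python
import itertools
import numpy as np

def toy_fem_displacements(rel_stiffness, drive_disp):
    # Placeholder for the finite element solve. In the real method the
    # prescribed displacements drive a 2D FE mesh with per-region
    # relative stiffness values; here we merely scale the driving field.
    return drive_disp / np.mean(rel_stiffness)

def search_relative_stiffness(measured, drive, candidates, n_regions=3):
    """Exhaustive search over relative-stiffness combinations.

    Scores each combination with the optimization function (4.6):
    the sum of absolute differences between measured and predicted
    displacements, keeping the combination with the least error.
    """
    best, best_err = None, np.inf
    for combo in itertools.product(candidates, repeat=n_regions):
        predicted = toy_fem_displacements(np.array(combo), drive)
        err = np.sum(np.abs(measured - predicted))  # eq. (4.6)
        if err < best_err:
            best, best_err = combo, err
    return best, best_err
```

In practice the candidate list per region is small, since each combination costs one FEM solve; that cost is what motivates the offline, once-per-individual nature of this step.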
For the motion field shown in Figure 4.3, Figure 4.6 shows the strain maps of the same subject under different lighting conditions using the presented method.

(a) Normal Light (b) Low Light (c) Shadow Light (d) Camouflage
Figure 4.6 Strain magnitude images computed from the optical flow data using the finite element method.

4.4 Proof of Concept using Range Images

We carried out experiments using range images, which allow us to compute strain from 3D motion, providing a more complete description of the strain distribution. However, range scanners require longer acquisition times and demand more cooperation from subjects, which makes them less appealing for real deployments. In this study, experiments with range images are conducted as a proof of concept that the strain pattern has the desired discriminative value for forensic identification.

A Minolta VIVID 900 scanner was used to acquire the range data, which consist of both 3D positions (x, y, z) and 2D intensity images. The depth is measured from the scanner to the point where the laser intersects the subject. Range data and 2D intensity images are fully registered. During image acquisition, each subject showed two expressions: closed mouth and opened mouth. For each acquisition session, two lighting conditions were considered: normal light and low light. As a result, each subject has eight images (Table 4.1). A total of 50 subjects participated in the experiment, all with profile faces only. Images of profile faces present more reliable and continuous motion, while an opened mouth in
frontal view often leads to holes in the range images. Figure 4.7 shows a few sample images.

Table 4.1 Range and intensity images for each subject.

  Lighting Condition   Expression     Data Type
  Normal Light         Opened Mouth   Range
  Normal Light         Opened Mouth   2D Intensity Image
  Normal Light         Closed Mouth   Range
  Normal Light         Closed Mouth   2D Intensity Image
  Low Light            Opened Mouth   Range
  Low Light            Opened Mouth   2D Intensity Image
  Low Light            Closed Mouth   Range
  Low Light            Closed Mouth   2D Intensity Image

(a) Range images (b) Intensity images
Figure 4.7 Samples of range and 2D intensity images. Range and intensity images were acquired simultaneously and fully registered.

4.4.1 Computing Strain with Range Data

Given two range images of a subject (closed mouth and opened mouth) and the registered intensity images, we can determine feature motion in 3D space. We compute strain in local coordinates using the following steps:

1. Given a set of feature points in the closed mouth image, generate a triangle mesh using the Delaunay triangulation method.

2. Repeat step 1 for the opened mouth image.

3. In the closed mouth image, establish a fitted plane equation for each element using its three nodal points.
4. Choose one nodal point as the origin and define a local 2D coordinate system. This way we obtain the 2D locations of the three nodal points in the fitted plane.

5. Project the corresponding element in the opened mouth image onto the fitted plane in the closed mouth image.

6. Compute the displacement of the nodal points using the projections in the local coordinate system defined earlier.

7. Within each element, map nodal point motion to pixel motion by an interpolation method.

8. Compute strain for each element using the finite difference approximation of the Cauchy strain tensor.

The major difference between the strain from range data and the strain from 2D images is that the former is calculated locally on each triangle element (including the interpolation step), while the latter is obtained directly on the 2D image plane. Consequently, the strain from range data may possess slightly more characteristics of the "true 3D strain" than the strain from still images. Figure 4.8 shows a few strain samples from range data.

Figure 4.8 Samples of strain patterns calculated using range data, each for a different subject.
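The local coordinate construction in steps 3-4 can be sketched as follows, assuming the three nodal points are available as 3D NumPy vectors. As a simplification, the plane here is the exact plane through the three nodes rather than a fitted one, and the function name is illustrative.

```python
import numpy as np

def local_plane_coords(p0, p1, p2):
    """2D coordinates of a triangle's nodes in the plane they span.

    p0 is taken as the local origin, as in step 4 above; the returned
    pairs are the in-plane (e1, e2) coordinates of p0, p1, p2.
    """
    e1 = p1 - p0
    e1 = e1 / np.linalg.norm(e1)          # first in-plane axis
    n = np.cross(p1 - p0, p2 - p0)
    n = n / np.linalg.norm(n)             # plane normal
    e2 = np.cross(n, e1)                  # second in-plane axis

    def coords(p):
        d = p - p0
        return np.array([d @ e1, d @ e2])

    return coords(p0), coords(p1), coords(p2)
```

Projecting the corresponding opened-mouth element into the same (e1, e2) frame (step 5) then reduces the 3D displacement of each node to a 2D displacement, on which the Cauchy strain approximation of step 8 can be applied.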
4.4.2 PCA Results and Analysis

We designed one test using the range data (see Table 4.2). In the test, the gallery set consists of strain maps computed using images taken under normal light, and the probe set contains strain maps computed using images taken under low light.

Table 4.2 Experiment design for person identification using strain maps computed from range data.

          Gallery/Suspect Population        Probe/Traces
  Test1   50 (regular face, normal light)   50 (regular face, low light)

In other words, given an initial suspect population of 50 (R = 50), we investigate to what degree the strain pattern, as forensic evidence, can narrow down the number of suspects (R_Strain). All experiments were conducted using the principal component analysis algorithm with the Mahalanobis distance for computing the distance scores. More details about the PCA implementation used in our experiments can be found in [15].

Figure 4.9 shows the score (dissimilarity) distribution of the genuine and the impostor matches. The two distributions are well separated, a sign of a promising biometric source.

Figure 4.9 Intra-subject and inter-subject variation of strain maps computed from range data (Note: all values are in generic units).
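The PCA matching used throughout the experiments follows the implementation of [15]; a minimal NumPy stand-in might look like the sketch below, where whitening the subspace by the singular values makes Euclidean distance act as a Mahalanobis distance. The function name, interface, and subspace dimension are illustrative assumptions.

```python
import numpy as np

def pca_mahalanobis_matcher(gallery, probes, k=2):
    """Dissimilarity matrix between probe and gallery strain maps.

    gallery, probes : (n_samples, n_pixels) flattened strain maps.
    Returns a (n_probes, n_gallery) matrix of distances in a
    k-dimensional whitened PCA subspace.
    """
    mean = gallery.mean(axis=0)
    X = gallery - mean
    # PCA basis from the SVD of the centered gallery
    U, S, Vt = np.linalg.svd(X, full_matrices=False)
    # divide each component by its standard deviation -> Mahalanobis
    W = Vt[:k] * np.sqrt(len(gallery) - 1) / S[:k, None]
    g = X @ W.T
    p = (probes - mean) @ W.T
    return np.linalg.norm(p[:, None, :] - g[None, :, :], axis=2)
```

Ranking a probe then amounts to sorting its row of the returned matrix; the argmin gives the claimed identity.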
The ROC curve of the experiment is plotted in Figure 4.10. A verification rate of 99% is achieved at a false acceptance rate of 5%. With a slight increase of the false acceptance rate to 8%, the true acceptance rate reaches 100%. This amounts to an average increase in the likelihood ratio of 12.5 (R = 50, R_Strain = 4, R/R_Strain = 12.5), suggesting that the strain pattern can be very effective forensic evidence.

Figure 4.10 ROC curve of the PCA experiment using strain maps computed with range data (Note: all values are in generic units).
CHAPTER 5
RESULTS ON PERSON IDENTIFICATION

5.1 The Finite Difference Method

In this section, we present the results on person identification using the method described in Section 4.2.

5.1.1 Video Acquisition Conditions

We collected videos using a Canon Optura-20 digital camcorder at a capture speed of 30 frames per second and an image resolution of 720 x 480 pixels. Subjects sat about 2 meters in front of the camcorder against a grey board. In addition to normal indoor light, a point-light source was arranged above the subject's head to create a shadow effect. The following acquisition conditions were considered: normal light, low light, shadow light, regular face, and camouflaged face (see Figure 5.1).

(a) Normal light (b) Low light (c) Shadow light (d) Camouflage
Figure 5.1 Video acquisition conditions.

5.1.2 Strain Images

Given a pair of video frames, the optical flow algorithm produces two motion components, horizontal (u) and vertical (v). Since the optical flow data is already at the pixel
level, we use it directly to compute strain using (4.4) and (4.5). A sample strain magnitude pattern, after conversion to a gray scale of 0-250, is shown in Figure 5.2. To extract the region of interest influenced by the activity of the underlying masseter muscle, we created a rectangular mask (200 x 250 pixels) using the top of the nose and the center of the ear as normalization landmarks. We then applied the mask to crop the strain image (Figure 5.2 (c)), which was then used for the PCA experiments (Figure 5.2 (d)).

(a) Horizontal motion (b) Vertical motion (c) Strain magnitude (d) PCA input
Figure 5.2 Strain images computed from the finite difference method.

5.1.3 Discriminatory and Stability Properties

The deformation of a face is largely determined by the strength of its soft tissues. Since the appearance of a face allows us to establish its identity, it is natural to argue that the soft tissues that actually make up a face must contain unique information about the subject. This type of information can be quantified through a biomechanical property such as elasticity. The elasticity of a face can be adequately represented by its strain pattern, which can be harnessed by recognition algorithms provided that the Dirichlet boundary condition is satisfied. In [36], a similar indirect approach was employed for expression analysis, but its focus is on modal parameters, whereas we make explicit use of the strain pattern.

The strain pattern is discriminative in the sense that it enables us to "see through" the appearance of a face and analyze its identity from both anatomical and biomechanical perspectives. Figure 5.3 shows the strain patterns of six subjects under the same illumination condition. The discriminatory property of the strain pattern across subjects is clearly observable.

(a) Subject 1 (b) Subject 2 (c) Subject 3 (d) Subject 4 (e) Subject 5 (f) Subject 6
Figure 5.3 Samples of facial strain patterns (computed from the finite difference method) that illustrate inter-subject variability.

The facial strain pattern has the desirable attribute of being relatively stable. This is partially attributed to the fact that a strain image is derived from the intensity difference between two frames rather than the absolute intensity values of a single frame. If two frames are taken under a similar lighting condition, the impact of illumination change or of wearing makeup on the strain image is much less severe. Figure 5.4 shows strain images of the same person under different illumination and camouflage conditions.

(a) Normal light (b) Low light (c) Shadow light (d) Camouflage
Figure 5.4 Samples of facial strain patterns (computed from the finite difference method) that illustrate intra-subject consistency under different conditions.

5.1.4 Experiments and Results

In this section, we present a more rigorous empirical validation of the discriminatory and stability criteria through various experiments. A total of 60 subjects participated in the experiments. Results were obtained using the principal component analysis algorithm with the Mahalanobis distance for computing the metric scores. More details about the PCA implementation used in our experiments can be found in [15].
We designed two tests using videos (see Table 5.1). In Test1, the probe set consists of videos in which strong shadows are cast on the faces (Figure 5.1 (c)). In Test2, one probe set comprises videos in which the appearance of the faces is altered by camouflage (Figure 5.1 (d)).

Table 5.1 Experiment design for person identification using strain maps computed from the finite difference method.

            Gallery/Suspect Population        Probe/Traces
  Test1     60 (regular face, normal light)   60 (regular face, shadow light)
  Test2.1   60 (regular face, normal light)   10 (regular face, shadow light)
  Test2.2   60 (regular face, normal light)   10 (camouflaged face, shadow light)

5.1.4.1 Test1

Figure 5.5 shows the score distribution of the genuine and impostor matches, which have a good separation but also some overlap. The ROC curve (Figure 5.6) shows a verification rate of 90% at a false acceptance rate of 5%. A 100% TAR is achieved at a 16.67% FAR, which is equivalent to an average increase in the likelihood ratio of 6.0 (R = 60, R_Strain = 10, R/R_Strain = 6.0). It should be emphasized that shadows markedly change the look of a face and could cause a drastic performance degradation for an appearance-based method. The impact of shadows on the performance of the strain biometric, however, appears much less severe.

5.1.4.2 Test2.1 and Test2.2

Camouflage or makeup poses a serious challenge to forensic study that relies on traditional face identification methods. Wearing camouflage or makeup is common in criminal operations, by law enforcement, and in public spaces such as during a football game. To our knowledge, no research has been done on how to deal with these extreme yet realistic cases. We use Test2 to demonstrate that the facial strain pattern has the potential to help recognize a face in spite of disguise by camouflage or makeup.
Figure 5.5 Intra-subject and inter-subject variation of strain maps (computed from the finite difference method) in Test1 (regular face: normal vs. shadow lighting) (Note: all values are in generic units).

Figure 5.6 ROC curve of Test1 (regular face: normal vs. shadow lighting) (Note: all values are in generic units).

Because of the small number of camouflaged faces (10 subjects), we only present the score distributions (Figure 5.7). Two observations can be made:

- The overall score distributions of Test2.1 and Test2.2 are more or less the same, suggesting that camouflage does not affect the performance of the strain pattern very much.
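Operating points such as "90% verification rate at 5% FAR" can be read off the genuine and impostor score distributions with a short helper. This is a generic sketch for dissimilarity scores (lower means a better match), not the evaluation code used in the dissertation.

```python
import numpy as np

def tar_at_far(genuine, impostor, far):
    """True accept rate at a given false accept rate.

    genuine, impostor : dissimilarity scores (lower = better match).
    The acceptance threshold is set so that a fraction `far` of the
    impostor scores falls at or below it; the TAR is the fraction of
    genuine scores accepted at that threshold.
    """
    thresh = np.quantile(impostor, far)            # accept if score <= thresh
    return float(np.mean(np.asarray(genuine) <= thresh))
```

Sweeping `far` over [0, 1] traces the ROC curve; the likelihood-ratio figures quoted in the text come from a single operating point on that curve.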
Figure 5.7 Intra-subject and inter-subject variation of strain maps (computed from the finite difference method) in Test2.1 (regular face normal vs. regular face shadow lighting) and Test2.2 (regular face normal vs. camouflaged face shadow lighting) (Note: all values are in generic units).

- Test2.2 shows slightly better performance than Test2.1 in terms of the separation between the genuine and impostor matches. A possible explanation is that the camouflaged face images have more features and textures, which lead to more accurate optical flow results.

The increases in the likelihood ratio of the two tests are also consistent with the above observation. Test2.1 has an increase in the likelihood ratio of 7.5 (R = 60, R_Strain = 8, R/R_Strain = 7.5). Test2.2 has an increase in the likelihood ratio of 15.0 (R = 60, R_Strain = 4, R/R_Strain = 15.0).

5.2 The Finite Element Method

In this section, we present the results on person identification using the method described in Section 4.3.

5.2.1 Video Acquisition Conditions

All videos were acquired using a JVC HD camera recorder (GY-HD100) at a default speed of 30 frames per second and a spatial resolution of 1280 x 720 pixels. Subjects sat about 1 meter away from the camcorder against a plain white board. In addition
to normal indoor light, a point-light source was arranged above the subject's head to create a shadow effect. There were three illumination conditions (normal, low, and shadow lighting) and two appearance conditions (regular and camouflaged face). All experiments used frontal views with opening the mouth as the facial expression. The conditions used in this work are illustrated in Figure 5.8.

(a) Normal Light (b) Low Light (c) Shadow Light (d) Camouflage
Figure 5.8 Video acquisition conditions.

5.2.2 Strain Images

Given the sparse set of displacements computed using the optical flow method, the displacements in the other facial regions are computed using the equation of motion. The values of strain due to object deformation are computed using the strain-displacement equation. For the sparse motion field shown in Figure 5.9 (a), Figure 5.9 (b) shows the strain map computed using the method described in Section 4.3.

(a) Motion field (b) Strain pattern
Figure 5.9 Strain images computed from the finite element method.
As a natural extension of our finite difference based method, where the strain pattern was computed in the cheek region in profile views, we represented the cheek regions alone in the final strain maps used for matching. An important difference to be noted is that the presented FEM-based method requires only 1/25 of the motion vectors, as compared to the finite difference method, which mandates a dense motion computation.

5.2.3 Discriminatory and Stability Properties

Figure 5.10 shows the strain maps of the same subject under different lighting conditions using the presented method. It can be observed that the strain pattern remains fairly stable across adverse data conditions.

(a) Normal Light (b) Low Light (c) Shadow Light (d) Camouflage
Figure 5.10 Strain maps of the same individual under different lightings.

Figure 5.11 shows the strain maps of four subjects for the same expression under the same lighting. The discriminatory property can be visually observed from these images.

5.2.4 Experiments and Results

In this section, we present experiments with the goal of providing a more empirical evaluation of the discriminatory and stability criteria. A total of 20 subjects participated in the experiments. All subjects appeared in video sequences with regular faces; 5 subjects were involved in sessions with camouflaged faces (Figure 5.8 (d)). Results were obtained using the principal component analysis algorithm with the Mahalanobis distance
(a) Subject 1 (b) Subject 2 (c) Subject 3 (d) Subject 4
Figure 5.11 Strain maps of different individuals under normal lighting.

for computing the metric scores. More details about the PCA implementation used in our experiments can be found in [15].

We designed three tests using videos (see Table 5.2). In Test1, the probe set has videos in which the ambient lighting was very low (Figure 5.8 (b)). In Test2, the probe set consists of videos in which strong shadows are cast on the faces (Figure 5.8 (c)). In Test3, the probe set comprises videos in which the appearance of the faces is altered by camouflage (Figure 5.8 (d)).

Table 5.2 Experiment design for person identification using strain maps computed from the finite element method.

          Gallery/Suspect Population        Probe/Traces
  Test1   20 (regular face, normal light)   20 (regular face, low light)
  Test2   20 (regular face, normal light)   20 (regular face, shadow light)
  Test3   20 (regular face, normal light)   5 (camouflaged face, normal light)

5.2.4.1 Test1

In this setup, strain maps computed from normal lighting sequences were used as enrolled features and those computed from low lighting sequences were used as query features. Figure 5.12 shows the distribution of the intra-subject and inter-subject variation. There is a reasonable separation between the two distributions, a sign of a stable feature.
Figure 5.12 Intra-subject and inter-subject variation of strain maps (regular face: normal vs. low lighting) (Note: all values are in generic units).

5.2.4.2 Test2

In this experiment, strain maps computed from normal lighting sequences were used as enrolled features and those computed from shadow lighting sequences were used as query features. Figure 5.13 presents the intra-subject and inter-subject variation. One can observe that despite a drastic change in the video acquisition conditions between the sample and query sequences, there is only a minor decrease in performance when matching strain maps.

Figure 5.13 Intra-subject and inter-subject variation of strain maps (regular face: normal vs. shadow lighting) (Note: all values are in generic units).
Tests 1 and 2 reveal the stability of strain maps under the varying illumination conditions that often plague performance in many image analysis tasks. This can be explained by the fact that strain maps are computed from intensity differences between two image frames, as opposed to raw intensity information.

5.2.4.3 Test3

In this design, strain maps computed using regular faces in normal lighting sequences were used as catalogued features and those computed using camouflaged faces in normal lighting sequences were used as query features.

Figure 5.14 Intra-subject and inter-subject variation of strain maps (normal lighting: regular vs. camouflaged faces) (Note: all values are in generic units).

Based on the within-subject and between-subject variation shown in Figure 5.14, one can make two observations:

- The method can match the strain maps of an individual in spite of disguise through heavy makeup that entirely alters his/her appearance.

- The better separation between the intra-subject and inter-subject similarity distributions suggests that, in fact, there is a higher chance of a match when a subject manipulates his/her appearance. A possible explanation is that the camouflaged
face images have more features and texture, which result in more accurate optical flow results and, in turn, better strain map computation.
CHAPTER 6
TOWARDS EXPRESSION INVARIANT MODELING

Through the method described in Section 4.3, we made a first attempt at computing the Young's modulus of the facial skin from the motion observed during a particular facial expression. Through the work described in this chapter, we make the following advances:

- By using motion observed in multiple facial expressions, we extend the approach towards more accurate estimation of the elasticity parameters in all regions. The matching process is more scalable and allows the query expression to be different from the enrolled expressions.

- We use a much refined search method to better scan a bigger spectrum of values in the solution space. This provides a means for much finer estimation of the parameters.

- Having adopted a sophisticated search technique, we define much finer spatial subdivisions with different elasticity in our finite element model.

- Finally, the 2D geometry of the model is customized to each individual by appropriate localization of facial features such as the eyes and mouth.

In this chapter, we describe the individual aspects of our modeling approach and present results on two prototype image analysis tasks.
6.1 Algorithm for Modeling Young's Modulus from Multiple Expressions

The first few steps in our approach to solving for the material constants are the same as described in Section 4.3. We use the concept of relative stiffness and define a fitness function that computes the difference between the displacement measured by the optical flow method and the displacement of the corresponding node in the finite element mesh. As earlier, we compute the value of the objective function for a few combinations in this n-dimensional solution space. In Section 4.3, we used the combination with the least error as our final set of parameter values. Here, we use these data points as seeds to further explore the solution space without having to invoke the intensive finite element modeling step. The steps involved in the modeling algorithm are:

1. Define a finite element model with the forehead as the reference material. The nose was modeled as a highly rigid region and the eyes were modeled as a region with varying stiffness (innermost area: highly elastic; outermost area: the stiffness of the region it shares a boundary with).

2. Define an optimization function:

\sum_{x,y} | u_{(x,y)}^{v} - u_{(x,y)}^{m} |,    (6.1)

where u_{(x,y)}^{v} is the displacement measured by the optical flow method at pixel (x, y) and u_{(x,y)}^{m} is the displacement of the node in the finite element mesh corresponding to the position (x, y).

3. Use 1/4 of the sparse motion field to drive the model and the remaining displacements for validation.

4. Obtain the values of the fitness function for a few combinations of the relative stiffness values.
Run a searc h algorithm to explore this solution space and use the con v erged v alues as the nal learned parameter v alues for eac h patc h in the nite elemen t mo del. Rep eat Steps 1{4 for an y query sequence. Matc h templates based on enrolled parameter v alues and the query parameter v alues along the appropriate feature dimensions (dep ending on whic h patc hes in the mo del are c hosen for the matc hing step). 6.1.1 Automatic Landmark Detection As a step to w ards automating the system, reducing the computation b y just w orking on the region of in terest, and building an individual nite elemen t mo del that conforms with a sub ject's 2D geometry w e implemen ted a facial landmark detection algorithm to lo cate the face and the ey es of the sub ject in the frame. The detection algorithm used here w as one of the biggest milestones in the area of realtime ob ject detection in video. It w as the w ork of Viola and Jones [110 ] whic h w as later extended b y Lienhart and Ma ydt [72]. The ViolaJones detector detects ob jects of in terest using a cascade of pretrained classiers of increasing complexit y applied to rectangular Haarlik e binary w a v elet features ecien tly computed from video frames using the \in tegral image" prepro cessing tec hnique. More details of the algorithm can b e found in [110 72]. The In tel Op en Source Computer Vision Library (Op enCV) [1] implemen tation (V ersion 1.0rc1) of the Haar ob ject detection algorithm w as used for this task. Figure 6.1 sho ws the output of the landmark detection algorithm. 6.1.2 The F ace Mo del As men tioned earlier, w e describ ed a 2D nite elemen t mo del with m uc h ner spatial sub division of regions with dieren t elasticit y parameters. F urther, the geometry of 70
Figure 6.1 Results from the Viola-Jones landmark detection algorithm.

each individual's model was driven by the results from the landmark detection step described earlier. Figure 6.2(a) shows the geometry of a model (regions with different parameters are represented by unique colors). Figure 6.2(b) presents the undeformed meshed model.

Figure 6.2 FE face model with finer subdivisions: (a) geometry; (b) mesh.
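The fitness function of Equation 6.1 and the drive/validate split of the sparse field can be sketched as follows. This is a minimal illustration, not the actual implementation: displacements are represented by scalar magnitudes for brevity, and the dictionaries `flow_field` and `fem_displacement` (keyed by pixel position) are hypothetical stand-ins for the measured optical flow and the finite element solution.

```python
import random

def fitness(flow_field, fem_displacement):
    """Equation 6.1: sum of absolute differences between the optical-flow
    displacement and the FE-mesh displacement at corresponding positions."""
    return sum(abs(flow_field[p] - fem_displacement[p]) for p in flow_field)

def split_drive_validate(flow_field, drive_fraction=0.25, seed=0):
    """One-fourth of the sparse field drives the model; the rest validates."""
    pixels = sorted(flow_field)
    random.Random(seed).shuffle(pixels)
    cut = int(len(pixels) * drive_fraction)
    drive = {p: flow_field[p] for p in pixels[:cut]}
    validate = {p: flow_field[p] for p in pixels[cut:]}
    return drive, validate
```

In use, the fitness would be evaluated over the validation subset only, so that the model is scored on displacements it was not driven with.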
6.2 Search Algorithm for Computing Material Constants

Before describing our search algorithm, it is important to first analyze our solution space. Figure 6.3 shows the variation of the objective function for different values of the relative stiffness for Area2 shown in Figure 6.2.

Figure 6.3 Plot of the objective function for different values of relative stiffness for Area2 and Area3 shown in Figure 6.2; the three curves correspond to Area3 relative stiffness values of 0.01, 0.1, and 1.0 (Note: all values are in generic units).

From Figure 6.3, we can make the following two observations:

- The objective function is not smooth.
- There are multiple local optima.

In this situation, a derivative-based optimization method will confront multiple issues:

- Such methods often progress slowly when there is a large number of parameters. If the gradient is being evaluated numerically, then each iteration of the optimization requires as many function evaluations as there are parameters. The space to be searched is vast, so a large number of iterations will probably be required.
- If the objective function does not have a derivative, then clearly such a method cannot be used.

- Such methods make no allowance for multiple optima: they go to a local optimum near where they start. If there are likely to be only a few local optima, then using several random starts may be enough to solve the problem.

In light of this, a randomized optimization method seemed a reasonable choice. We use genetic algorithms because they efficiently address the issues posed by our solution space. There are numerous other stochastic methods; the choice of genetic algorithms was motivated by the work of Zhang [120], where such an approach was used in a similar domain for burn scar assessment.

6.2.1 Genetic Algorithm

A general outline of a genetic algorithm (GA) is as follows:

- (Start) Generate a random population of n chromosomes (suitable solutions for the problem).

- (Fitness) Evaluate the fitness f(x) of each chromosome x in the population.

- (New population) Create a new population by repeating the following steps until the new population is complete.

  - (Selection) Select two parent chromosomes from the population according to their fitness (the better the fitness, the bigger the chance of being selected).

  - (Crossover) With a crossover probability, cross over the parents to form new offspring (children). If no crossover is performed, the offspring are exact copies of the parents.

  - (Mutation) With a mutation probability, mutate the new offspring at each locus (position in the chromosome).
  - (Accepting) Place the new offspring in the new population.

- (Replace) Use the newly generated population for a further run of the algorithm.

- (Test) If the end condition is satisfied, stop, and return the best solution in the current population.

- (Loop) Otherwise, return to the fitness evaluation step.

6.2.1.1 Genetic Coding

Given the finite element model of the face, its Young's modulus distribution can be interpreted as a chromosome in a GA. The Young's modulus value of each region is encoded as a gene in the chromosome through a one-to-one mapping function (Figure 6.4). As a result, if the finite element model has N regions, the corresponding chromosome has N genes. Each chromosome in the population pool represents a possible Young's modulus distribution. If adaptive remeshing is used in the finite element model, more sophisticated encoding schemes have to be considered that allow the size and shape of chromosomes to change adaptively during the evolution.

6.2.1.2 Genetic Operators

To minimize the objective function of Equation 4.6, several important genetic operators are defined.

Mutation: A standard Gaussian mutation operator is used:

    g'_j = g_j + N(0, 1) · σ · g_j,    (6.2)

where g_j is the value of gene j before the mutation, g'_j is the value of gene j after the mutation, N(0, 1) is a random Gaussian number (mean 0, standard deviation 1), and σ is the mutation step size. The dynamic control of the mutation step size is
Figure 6.4 Genetic coding of the Young's modulus in a finite element model.

determined by a predefined decay rate γ:

    σ(k + 1) = γ · σ(k),

where k is the generation counter. γ is usually set in the range 0.99–1.0. A smaller value of γ may help speed up the convergence rate but carries the risk of premature convergence. Experiments were conducted with the method of dynamically setting the mutation probability (P_m) based on population statistics and the improvement in the fitness function. The reasoning behind having a relatively high mutation rate is to maintain population diversity and prevent premature convergence.

Crossover: It is found that the one-point crossover operator and the multiple-point crossover operator perform equally well, at least for this particular inverse problem setting [120]. The crossover probability (P_c) is fixed at 0.7. The one-point crossover function is implemented in the traditional fashion: children are generated by joining two parents at a randomly selected crossover position and then swapping the two sides.

Parent Selection and Replacement: The parent selection operator is implemented as tournament selection (k = 2). In [120], experiments were conducted with a wide range of replacement ratios (0.05–1.0), i.e., the percentage of the parent population to be replaced
by new chromosomes. Given a population of size S and a replacement ratio of r, the number of parents to be replaced is N = r · S. Smaller replacement ratios (< 0.3) did not yield satisfying results because of the lack of contributions from new chromosomes to the population diversity. However, no significant difference was observed with replacement ratios ranging from 0.5 to 1.0. On the other hand, a higher replacement ratio resulted in a longer simulation time. We used a replacement ratio of 0.8. An elitism strategy is also enabled during the replacement operation, in which some elite members of the old generation are chosen to survive to the next generation without competition (i.e., without potentially being replaced by a better offspring). Given the genetic parameters specified above, trial tests with different elite ratios (1%–15%) showed that 3% gave a slightly better result on average (although very marginal), so an elite ratio of 3% is chosen in all the experiments. Based on the experiments done by Zhang [120], the final settings used in our genetic algorithm are shown in Table 6.1.

Table 6.1 Parameter settings used in the genetic algorithm.

    Replacement ratio                  0.8
    Mutation decay rate                0.997
    Initial mutation step size         0.2
    One-point crossover probability    0.2
    Population size                    50

6.2.2 Training

Out of the six expressions for each subject in the Binghamton dataset, we use the motion fields from five expressions to compute the Young's modulus of individual regions. It is apparent that not all regions present sufficient non-rigid motion for a particular expression. It is trivial to optimize the material constant values in regions where there is not much non-rigid motion, because any setting of the stiffness matrix will conform with the observed motion (no motion or just rigid motion). Hence, we use a predefined threshold for the number of elements in a region that should
undergo some deformation in order to optimize the Young's modulus of that region. We chose a value of 40% as the threshold. In cases where none of the regions satisfied this threshold, we used the top two regions with maximal flow information. This was the case with almost all of the sad expressions, the majority of the fear expressions, and some of the angry expressions. Although we identified the regions with maximal flow information, we used the Young's modulus of all the regions in our genetic coding of the chromosome. The reasoning behind this is that, theoretically, in a composite elastic object, deformation in any one part of the object can influence its behavior in other parts of the body, depending on the underlying material characteristics. However, we observed that this was not the case for the domain we are investigating (the face). This might be because of the face anatomy itself and also because the regions are fairly large and their influence is confined to adjoining regions alone (meaning displacements observed in the cheek region do not have an explicit influence on the forehead region). This is supported by the observation that when we used the Young's modulus of only a subset of the regions (regions adjoining those where substantial flow information was computed), the converged values of the Young's modulus were within 5% of those obtained using all the regions in the genetic coding. For a particular expression, we only use the converged values of the Young's modulus for regions where there was substantial motion information. Finally, when we get multiple values of Young's modulus for a particular region computed from different expressions, we use the mean of the values as our final setting. It is worth noting that these multiple values were close enough to justify the use of their mean as the final value.
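The search procedure of Section 6.2.1 can be sketched as follows. This is a minimal illustration using the Table 6.1 settings (tournament selection with k = 2, one-point crossover, Gaussian mutation with a decaying step size, elitism, and a replacement ratio); the `fitness` argument is a hypothetical stand-in, since the actual objective requires a finite element evaluation that is not reproduced here.

```python
import random

def run_ga(fitness, n_genes, pop_size=50, generations=100,
           p_crossover=0.7, sigma=0.2, decay=0.997,
           replace_ratio=0.8, elite_ratio=0.03, seed=0):
    """Minimize `fitness` over chromosomes of n_genes positive gene values
    (here: relative Young's moduli of the FE-model regions)."""
    rng = random.Random(seed)
    pop = [[rng.uniform(0.01, 2.0) for _ in range(n_genes)]
           for _ in range(pop_size)]

    def tournament():
        a, b = rng.sample(pop, 2)          # tournament selection, k = 2
        return a if fitness(a) < fitness(b) else b

    for _ in range(generations):
        pop.sort(key=fitness)
        # Survivors: elite members plus the unreplaced share of parents.
        n_keep = max(1, int(elite_ratio * pop_size),
                     int((1 - replace_ratio) * pop_size))
        children = pop[:n_keep]
        while len(children) < pop_size:
            p1, p2 = tournament(), tournament()
            if rng.random() < p_crossover and n_genes > 1:
                cut = rng.randrange(1, n_genes)   # one-point crossover
                child = p1[:cut] + p2[cut:]
            else:
                child = p1[:]
            # Gaussian mutation, Equation 6.2: g' = g + N(0,1) * sigma * g
            child = [g + rng.gauss(0.0, 1.0) * sigma * g for g in child]
            children.append(child)
        pop = children
        sigma *= decay                     # decaying mutation step size
    return min(pop, key=fitness)
```

Because the elite survive unchanged, the best fitness found is monotone non-increasing across generations, which matches the role of elitism described above.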
6.3 Combination Rules in a Multi-Feature Classification System

Given that the estimation of the material constants may not be accurate in all regions if we were to use one particular expression, it is important to come up with
a strategy to match templates when we have just one query expression. One approach is to treat the Young's modulus in each patch as a separate feature in the matching process. The next question then is: "How to combine the information inherent in each of these features to come up with a unified decision?" We can find many ways to combine such information in the multi-biometrics literature. Based on where the information is actually combined, they can be broadly classified as sensor-level, feature-level, score-level, rank-level, and decision-level fusion techniques. Among these, fusion at the score level offers the best trade-off between information content and ease of fusion; hence, we adopt it in our experiments. Although a wide variety of score-level fusion techniques have been proposed in the literature, they can be grouped into three main categories, viz., density-based, transformation-based, and classifier-based schemes. The performance of each scheme depends on the amount and quality of the available training data. If a large number of match scores is available for training the fusion module, then density-based approaches such as the likelihood ratio test can be used. Estimating the genuine and impostor distributions may not always be feasible due to the limited number of training samples available; in such cases, transformation-based schemes are a viable alternative. The non-homogeneity of the similarity scores presented by the different modules raises a number of challenges, and suitable score normalization schemes are essential in order to transform these match scores into a comparable domain. The sum-of-scores fusion method with simple score normalization (such as the z-score) represents a commonly used transformation-based scheme.
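The transformation-based scheme just mentioned, z-score normalization followed by sum-of-scores fusion, can be sketched as follows; a minimal illustration in which each inner list holds one matcher's scores for the same set of candidates (the variable names are illustrative, not from the actual system):

```python
from statistics import mean, pstdev

def z_normalize(scores):
    """Transform raw matcher scores into a comparable domain (z-score)."""
    mu = mean(scores)
    sigma = pstdev(scores) or 1.0   # guard against constant score lists
    return [(s - mu) / sigma for s in scores]

def sum_of_scores(score_lists):
    """Sum-of-scores fusion: z-normalize each matcher's scores, then add
    the normalized scores candidate by candidate."""
    normalized = [z_normalize(scores) for scores in score_lists]
    return [sum(column) for column in zip(*normalized)]
```

Without the normalization step, a matcher producing numerically larger scores would dominate the sum, which is exactly the non-homogeneity problem described above.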
Classification-based fusion schemes consolidate the outputs of the different matchers into a single vector of scores, which is then fed into a trained classifier. The classifier then determines which class the vector belongs to. Some of the popular combination rules are described below.

Product Rule: This rule is a direct implication of the assumption of statistical independence between the R feature representations x_1, ..., x_R. Assuming equal priors for
all classes (intra- and inter-class variation), the product rule can be stated as: Assign X → ω_r if

    ∏_{j=1}^{R} P(ω_r | x_j)  ≥  ∏_{j=1}^{R} P(ω_k | x_j),    k = 1, ..., M,    (6.3)

where M is the total number of classes (subjects). One of the main limitations of the product rule is its sensitivity to errors in the estimation of the a posteriori probabilities: even if one of the classifiers outputs a probability close to zero, the product of the R a posteriori probabilities becomes very small, and this often leads to an incorrect classification decision.

Sum Rule: The sum rule is more effective than the product rule when the input X tends to be noisy, leading to errors in the estimation of the a posteriori probabilities. Assuming equal priors for all classes, the sum rule can be stated as: Assign X → ω_r if

    Σ_{j=1}^{R} P(ω_r | x_j)  ≥  Σ_{j=1}^{R} P(ω_k | x_j),    k = 1, ..., M.    (6.4)

This decision rule is also known as the mean or average decision rule because it is equivalent to assigning the input pattern to the class that has the maximum average a posteriori probability over all the R classifiers.

Max Rule: The max rule approximates the mean of the a posteriori probabilities by their maximum value, i.e.,

    (1/R) Σ_{j=1}^{R} P(ω_k | x_j)  ≈  max_{j=1..R} P(ω_k | x_j).    (6.5)

Assuming equal priors for all classes, the max rule can then be stated as: Assign X → ω_r if
    max_{j=1..R} P(ω_r | x_j)  ≥  max_{j=1..R} P(ω_k | x_j),    k = 1, ..., M.    (6.6)

There are a couple more variations on the above rules, such as the min rule and the median rule. We use the sum rule and the max rule in our experiments.

6.4 Experiments and Results

In this section, we present our experiments and results on the application of material constant modeling to two tasks, namely:

- Generating a dense motion field from a set of sparse displacement vectors.
- Expression-invariant matching of templates using Young's modulus values.

6.4.1 BU-4DFE Dataset

Besides continuing our efforts on collecting our in-house data of subjects with multiple facial expressions and challenging lighting conditions, we also adopted the facial expression dataset named BU-4DFE (Binghamton University 4D Facial Expression) developed by SUNY Binghamton University, NY [117]. It is a newly created high-resolution 3D dynamic facial expression database developed with the objective of analyzing facial behavior from a static 3D space to a dynamic 3D space. The 3D facial expressions are captured at video rate (25 frames per second). For each subject, there are six model sequences showing the six prototypic facial expressions (anger, disgust, happiness, fear, sadness, and surprise), respectively. Each expression sequence contains about 100 frames. The database contains 606 3D facial expression sequences captured from 101 subjects, with a total of approximately 60,600 frame models. Each 3D model of a 3D video sequence has a resolution of approximately 35,000 vertices. The texture video has a resolution of about 1040 × 1329 pixels per frame. The resulting database consists of 58 female and 43 male subjects with a variety of ethnic/racial ancestries, including Asian, Black, Hispanic/Latino, and White. Figure 6.5 shows sample expression model sequences from this dataset.

Figure 6.5 Sample expression model sequences (male and female).

6.4.2 Non-rigid Motion Tracking

In this experiment, the objective is to compute a dense motion field based on the sparse displacement observations and the underlying material characteristics of an individual. As mentioned earlier, given a subset of motion vectors, the displacement at the other points can be computed using the equation of motion. Here, the Young's modulus acts as an adaptive smoothness controller of the motion field. For evaluating the performance of this method, we use the dense optical flow generated by the Black and Anandan method as our ground truth. The idea behind this is to substantiate the fact that a nearly identical dense motion field can be computed without having to do a full-blown temporal matching. Further, we also compare the performance against an interpolation-based method (bicubic) to emphasize the value added by the modeling of material constants in the deformation domain. We use a simple error function to describe the error rate of the presented methods:

    Error(x, y) = || Displacement_Model − Displacement_Flow || / || Displacement_Flow ||,

    TotalError = Σ_{(x,y)} Error(x, y) / N,    (6.7)

where N is the total number of points in the computed motion field.
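The error metric of Equation 6.7 can be sketched as follows; a minimal illustration in which displacements are represented by scalar magnitudes, and the dictionaries mapping pixel positions to displacements are hypothetical stand-ins for the model-predicted field and the Black and Anandan flow field:

```python
def relative_error(model_disp, flow_disp):
    """Equation 6.7, per point: relative error of the model-predicted
    displacement against the optical-flow displacement (ground truth)."""
    return abs(model_disp - flow_disp) / abs(flow_disp)

def total_error(model_field, flow_field):
    """Equation 6.7, total: average of the per-point relative errors
    over all N points of the computed motion field."""
    errors = [relative_error(model_field[p], flow_field[p])
              for p in flow_field]
    return sum(errors) / len(errors)
```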
Table 6.2 Performance evaluation of model-based tracking and interpolation-based tracking.

Expression   Region                       Avg. Error %        Max Error %
                                          Model   Interp.     Model   Interp.
Angry        Forehead (Area1)             6.3     6.7         10.5    15.0
             Left upper cheek (Area2)     4.5     9.3         8.3     20.5
             Left middle cheek (Area3)    4.2     8.7         7.8     19.3
             Left lower cheek (Area4)     2.1     2.3         4.2     4.8
             Chin (Area5)                 1.2     1.1         1.5     1.3
             Right lower cheek (Area6)    2.4     2.2         4.0     4.5
             Right middle cheek (Area7)   4.0     8.1         7.3     18.8
             Right upper cheek (Area8)    4.3     9.1         8.1     21.8
Disgust      Forehead (Area1)             7.1     7.8         11.3    16.8
             Left upper cheek (Area2)     5.4     8.2         9.3     18.5
             Left middle cheek (Area3)    4.8     10.3        8.2     24.5
             Left lower cheek (Area4)     5.7     9.8         10.2    27.8
             Chin (Area5)                 3.0     5.1         5.8     11.5
             Right lower cheek (Area6)    5.8     9.5         10.6    28.2
             Right middle cheek (Area7)   4.5     10.6        8.1     22.5
             Right upper cheek (Area8)    5.0     8.1         9.1     17.8
Fear         Forehead (Area1)             6.2     6.5         10.6    14.8
             Left upper cheek (Area2)     1.7     1.9         3.1     3.6
             Left middle cheek (Area3)    3.0     4.5         5.1     8.8
             Left lower cheek (Area4)     3.7     4.8         5.8     9.3
             Chin (Area5)                 1.3     1.7         2.3     2.5
             Right lower cheek (Area6)    3.4     4.7         5.7     9.2
             Right middle cheek (Area7)   2.9     4.2         5.3     9.1
             Right upper cheek (Area8)    1.9     1.7         3.0     3.4
Happy        Forehead (Area1)             1.6     2.1         2.8     3.1
             Left upper cheek (Area2)     4.2     7.4         8.1     14.4
             Left middle cheek (Area3)    5.1     10.1        9.2     20.3
             Left lower cheek (Area4)     3.9     8.3         6.2     19.0
             Chin (Area5)                 1.9     2.3         2.6     3.1
             Right lower cheek (Area6)    4.1     8.7         6.5     19.1
             Right middle cheek (Area7)   4.9     9.8         9.5     20.1
             Right upper cheek (Area8)    4.0     7.0         7.9     14.1
Sad          Forehead (Area1)             3.2     4.1         5.4     7.8
             Left upper cheek (Area2)     1.0     1.3         1.8     2.3
             Left middle cheek (Area3)    1.2     1.1         1.9     2.7
             Left lower cheek (Area4)     1.5     1.9         2.4     3.1
             Chin (Area5)                 2.1     2.7         3.8     4.6
             Right lower cheek (Area6)    1.8     2.1         2.9     3.7
             Right middle cheek (Area7)   1.1     1.4         1.8     2.5
             Right upper cheek (Area8)    1.0     1.2         1.7     2.4
Surprise     Forehead (Area1)             3.7     6.3         6.2     15.3
             Left upper cheek (Area2)     1.8     2.1         3.4     3.9
             Left middle cheek (Area3)    4.9     8.4         8.4     19.1
             Left lower cheek (Area4)     5.7     10.5        10.8    27.6
             Chin (Area5)                 2.9     3.1         4.6     5.9
             Right lower cheek (Area6)    5.9     10.8        11.1    28.6
             Right middle cheek (Area7)   5.0     8.6         8.7     19.4
             Right upper cheek (Area8)    1.6     2.0         3.1     3.4
Table 6.2 shows the error rates of the model-based tracking method and the interpolation method for each of the expressions. We split the results by individual regions to emphasize the performance of the model-based method in regions where there are large deformations. We can make the following observations from Table 6.2:

- The average error is within 7% and the worst-case error is within 11% for all regions across all expressions. This shows that a sparse but reliable feature tracker will suffice for the generation of a dense motion field, allowing us to alleviate issues related to automatic motion estimation such as noise sensitivity, temporal matching, and motion discontinuity.

- The average error of the model-based approach is almost always less than that of the interpolation-based method, and the worst-case error is significantly lower than that of the interpolation method. This shows that the stiffness values approximating the underlying material behavior act as a regularization term that outperforms naive approaches such as mere interpolation.

6.4.3 Expression Invariant Matching

In this section, we present our results on the expression-invariant template matching algorithm described in Section 6.1. A total of 40 subjects (20 males and 20 females) were randomly chosen from the Binghamton dataset. Results were obtained using both the max-rule and the sum-rule decision principles (see Section 6.3), with the Euclidean distance used for computing the metric scores. We designed six tests using the six expressions available for each subject (see Table 6.3). In Test 1, the enrolled features consisted of Young's modulus values computed from the disgust, fear, happy, sad, and surprise expressions, and the query features comprised Young's modulus values computed from the angry expression. We designed
the rest of the tests in a similar fashion, where each of the six expressions was used as the query expression and the model was trained on the remaining five expressions.

Table 6.3 Experiment design for matching based on the Young's moduli modeled from multiple expressions.

          Enrolled features (training expressions)          Query feature
Test 1    40 (disgust, fear, happy, sad, and surprise)      40 (angry)
Test 2    40 (angry, fear, happy, sad, and surprise)        40 (disgust)
Test 3    40 (angry, disgust, happy, sad, and surprise)     40 (fear)
Test 4    40 (angry, disgust, fear, sad, and surprise)      40 (happy)
Test 5    40 (angry, disgust, fear, happy, and surprise)    40 (sad)
Test 6    40 (angry, disgust, fear, happy, and sad)         40 (surprise)

During the matching process, the metric scores were computed along only those dimensions (patches in the finite element model) where there was sufficient deformation. As earlier, we use a threshold of 40% for choosing the patches that qualify for the matching process. Figure 6.6 shows the intra-subject and inter-subject variation of the metric scores for each of the tests using the max-rule decision principle. Table 6.4 compares the accuracy of the max-rule and sum-rule decision principles at a false acceptance rate of 5%.

Table 6.4 Comparing the performance of the max-rule and sum-rule decision principles for the expression-invariant matching experiment at a 5% false acceptance rate.

          Max-rule    Sum-rule
Test 1    65.0%       57.5%
Test 2    87.5%       75.0%
Test 3    60.0%       57.5%
Test 4    80.0%       65.0%
Test 5    47.5%       50.0%
Test 6    82.5%       72.5%

From the results, we can make the following observations:

- Though obtained on a moderate-sized dataset of 40 subjects, the results show that it is indeed possible to match face templates based on their material constants and make a step towards an expression-invariant matching approach.
- We see that the method doesn't perform as well when the query expression is sad, fear, or angry as when the query expression is disgust, surprise, or
Figure 6.6 Intra-subject and inter-subject variation of Young's modulus using the max-rule decision principle; one panel per test, each trained on five expressions and queried with the remaining one. The elasticity model was constructed from five random expressions and the query sequence had a completely different expression (Note: all values are in generic units).

happy. A possible explanation is that there isn't much facial deformation during the sad, fear, or angry expressions, leading to a less accurate estimation of the Young's modulus in almost all of the patches in the model. It is interesting
to note that this is not a major cause for concern when these expressions are part of the training set, because the other expressions compensate for the lack of information. This alludes to the importance of using multiple expressions during the modeling step.

- The disparity in performance across expressions aligns with the findings in the existing literature that non-rigid motion plays an important role in facilitating subsequent recognition, both by humans and by computers.

- We observe that the max-rule performs better than the sum-rule in our experiments. This is probably because the max-rule picks the patch with maximal deformation (meaning the estimation of the Young's modulus is more reliable in this patch), whereas the sum-rule merges the metric scores, which can drown out the score from the patch with rich and reliable information. This is indirectly hinted at in the results, where the sum-rule performs as well as (or slightly better than) the max-rule on the fear and sad expressions, since none of the patches have reliable information due to the lack of sufficient deformation; in these cases, the information from the other patches aids the performance.
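The matching step discussed above can be sketched as follows. This is a simplified illustration, not the actual system: each patch contributes a single scalar Young's modulus, per-patch distances (absolute differences, i.e., the one-dimensional Euclidean distance) are turned into similarities by negation, and the max rule or sum rule combines them; the `gallery` and `query` names are hypothetical.

```python
def match_identity(query, gallery, rule="max"):
    """Identify a query by comparing its per-patch Young's moduli against
    the enrolled templates, combining per-patch similarities with either
    the max rule or the sum rule."""
    best_id, best_score = None, float("-inf")
    for subject_id, template in gallery.items():
        # Per-patch similarity: negated absolute modulus difference,
        # so that a smaller distance yields a larger score.
        sims = [-abs(q - t) for q, t in zip(query, template)]
        score = max(sims) if rule == "max" else sum(sims)
        if score > best_score:
            best_id, best_score = subject_id, score
    return best_id
```

Restricting `query` and `template` to the patches that pass the 40% deformation threshold would mirror the dimension selection used in the experiments.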
CHAPTER 7
DISCUSSION AND CONCLUSIONS

We described an effective and efficient way of characterizing the material properties of facial soft tissues using the strain pattern extracted from the non-rigid facial motion observed in video. Such a simple and adequate way of modeling deformable physical bodies has important implications for many image analysis applications, such as facial expression recognition, age estimation, and person identification from video. We discussed two techniques for computing the facial strain pattern. The finite difference method is efficient when the geometry of the problem is regular, but it is limited by its requirement for a dense motion computation of high reliability. We improved the system through better characterization of the facial soft tissue by means of a finite element modeling based approach incorporating the relative material properties of individuals. The FEM-based approach works with a sparse motion field and enforces regularization (a smoothness controller for the motion field), which makes the computational method accurate and robust. We presented initial results on matching strain maps on a moderate-sized database. The experiments show promising results that strain maps can be used as a discriminative and stable feature. Their value is further justified by their performance under unfavorable conditions such as faces with strong shadows and camouflage. In a subsequent effort, we modeled the material constants (Young's modulus) of the skin in subregions of the face from the motion observed in multiple facial expressions. On a public database consisting of subjects undergoing a set of facial motions, we presented an expression-invariant strategy for matching faces using the Young's modulus
of the skin. Such an efficient way of describing the underlying material properties from the displacements observed in video has important applications in the deformable modeling of physical objects, which is usually gauged by its simplicity and adequacy. One related issue is that, due to the limited population size (40 to 60 subjects), this study can only provide a baseline evaluation of the performance of the presented methods. A larger effort spanning a longer timeline and a bigger dataset is a potential topic for future studies.

7.1 Impact on Image Analysis Applications

In this section, we discuss a few image analysis applications on which the methods presented in this dissertation will have an impact.

7.1.1 Expression Spotting in Video

In a long video sequence comprising changing facial expressions, detecting the temporal boundaries of individual expressions is useful for subsequent analysis. Using the methods presented in Section 4.2, we can follow the strain magnitude across frames. The temporally changing pattern of the strain magnitude can then be probed for saddles using an adaptive local thresholding strategy. The hypothesis is that these saddle points correspond to expression boundaries, as they denote instances where the face restores to its undeformed shape (the neutral face). This falls in line with the general observation in the literature on facial expressions that the dynamics of a facial expression consist of three main phases (start, peak, and stop), where the start and the stop phases match in their facial deformation. Shreve et al. [103] present initial results on expression spotting using this approach.
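The saddle-probing idea can be sketched as follows. This is an illustrative simplification, not the method of Shreve et al.: a frame is flagged as a candidate expression boundary when its strain magnitude is the minimum of a sliding window and lies sufficiently below the local mean (a simple form of adaptive local thresholding); the `window` and `margin` parameters are assumptions for the sketch.

```python
def saddle_points(strain, window=5, margin=0.1):
    """Flag frames where the temporal strain magnitude dips to a local
    minimum, i.e., candidate expression boundaries (face near neutral).
    A frame qualifies if it is the minimum of its local window and lies
    below (1 - margin) times the local mean strain."""
    half = window // 2
    saddles = []
    for i in range(half, len(strain) - half):
        local = strain[i - half : i + half + 1]
        threshold = (sum(local) / len(local)) * (1.0 - margin)
        if strain[i] == min(local) and strain[i] < threshold:
            saddles.append(i)
    return saddles
```

On a strain series with two expression "humps", the dip between them is flagged as the boundary separating the two expressions.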
7.1.2 Micro-Expression Detection

A micro-expression is a short-lived, mostly involuntary facial expression shown when a person is trying to hide his or her true emotions. Micro-expressions usually last anywhere between one-fifth and one-twenty-fifth of a second before being recognized and suppressed. The identifying characteristics of a micro-expression are its duration and spatial locality: micro-expressions last for a very short period and usually occur in one part of the face, which makes them hard to detect with the naked eye. By dividing the face into regions similar to our finite element model and analyzing the strain magnitude in these regions using the methods presented in Section 4.2, we can detect unusual deformation in much smaller spatial and temporal extents. The hypothesis is that such unusual activity should correspond to the micro-expression boundaries. Shreve et al. [103] present initial results on micro-expression detection using this approach.

7.1.3 Age Estimation

Besides face matching and expression spotting, the methods presented in this work can be applied to more exploratory tasks such as age estimation. Such an application has significant impact on multimedia communication and human–computer interaction. One can find a large amount of literature on aging synthesis and rendering [39, 94]. However, due to the non-definitive nature of the classification step, there are relatively fewer works on age estimation: the individualized, uncontrolled nature of age progression cannot be accurately captured through the assorted information communicated by human faces. Using the methods presented in Chapter 6 for age estimation falls in the general paradigm of physical modeling. Existing methods in this category estimate age by modeling growth-related changes of face shape and textural patterns.
We can build a method on the more elementary fact that the elasticity of facial soft tissues varies among different age groups and will be reflected in their stiffness values. The
metho dology is v ery similar to face matc hing pro cess. In that task, classes corresp ond to individuals and on the other hand for the age estimation task, classes corresp ond to dieren t age groups. Our h yp othesis is substan tiated b y ndings in the existing literature that rep ort that as a p erson ages, his/her skin b ecomes less elastic. 7.2 Ideas for F uture W ork The ob jectiv e of this dissertation has b een on dev eloping accurate and robust computational strategies for c haracterizing facial skin motion prop erties and analyzing them from their discriminatory and stabilit y criteria. Some of the p ossible extensions to this are listed b elo w: F usion with In tensit y Information: One of the imp ortan t ideas to b e explored is the augmen tation of existing tec hnologies for p erson iden tication from video with the v alue pro vided b y strain pattern and material prop ert y c haracterization. The fact that the computational metho ds in v olv ed are stable in terms of p erformance under drastic conditions suc h as shado w ligh ting and camourage suggests that the metho d most certainly has the promise to supplemen t curren t tec hnology under suc h adv erse op erational conditions. It will b e in teresting to v alidate this b y fusing the orthogonal information pro vided b y the metho ds presen ted in this dissertation with the ra w in tensit y information and c hec k the p erformance impro v emen t in a traditional biometrics framew ork. In tegration of the optical ro w constrain ts in to the FE Mo del: The solution approac h tak en b y a nite elemen t metho d is in breaking up the problem domain in to smaller elemen ts, assem bling a global matrix that denes the geometric relationships b et w een elemen ts/no des, and nally solving for a large system of equations with the help of b oundary conditions. Ho w ev er, for the sp ecic problem that w e in v estigate, w e ha v e additional observ ations from video. It will b e in teresting to 90
investigate whether the two optical flow constraints (intensity conservation and the smoothness constraint) can be intelligently integrated into this system of equations. As a result, we will then have a novel way to estimate displacements on an object domain by not only looking at image correspondence but also incorporating the material constants of the object being deformed. Such an approach will be robust to noise, which is usually a major concern in motion estimation algorithms. Besides the above, the incorporation of these constraints will bring down the degrees of freedom for the domain, thus requiring fewer equations to solve the finite element problem.
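One plausible way to write down the integration suggested above, purely as an illustration (the weights $\lambda$, $\alpha$ and the exact discretization are assumptions of this sketch, not the dissertation's formulation), is to combine the finite element equilibrium system $K\mathbf{u}=\mathbf{f}$ with the two optical flow terms in a single quadratic energy over the nodal displacement vector $\mathbf{u}$:

```latex
E(\mathbf{u}) \;=\; \tfrac{1}{2}\,\mathbf{u}^{\mathsf T} K \mathbf{u}
\;-\; \mathbf{f}^{\mathsf T}\mathbf{u}
\;+\; \lambda \sum_{i} \bigl(\nabla I_i \cdot \mathbf{u}_i + I_{t,i}\bigr)^2
\;+\; \alpha \,\lVert \nabla \mathbf{u} \rVert^2 ,
```

where the sum over nodes $i$ is the intensity conservation (brightness constancy) term and the last term is the smoothness constraint. Setting $\partial E/\partial\mathbf{u}=0$ yields a modified linear system of the same sparse, symmetric form as the original FE problem,

```latex
\bigl(K + 2\lambda\, G^{\mathsf T} G + 2\alpha\, L\bigr)\,\mathbf{u}
\;=\; \mathbf{f} - 2\lambda\, G^{\mathsf T} \mathbf{i}_t ,
```

with $G$ stacking the nodal image gradients, $\mathbf{i}_t$ the temporal derivatives, and $L$ a discrete Laplacian, so the same solver machinery would apply while the material constants in $K$ regularize the flow estimate.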
REFERENCES

[1] The Intel Open Source Computer Vision Library. Online: http://www.intel.com/technology/computing/opencv/.

[2] G. Aggarwal, A. R. Chowdhury, and R. Chellappa. A System Identification Approach for Video-based Face Recognition. In International Conference on Pattern Recognition, volume 4, pages 175–178, 2004.

[3] J. Ahlberg. CANDIDE-3: An Updated Parametrized Face. Technical Report LiTH-ISY-R-2326, Linköping University, 2001.

[4] B. Allen, B. Curless, and Z. Popović. The Space of Human Body Shapes: Reconstruction and Parameterization from Range Scans. In ACM SIGGRAPH, pages 587–594, 2003.

[5] P. Anandan. A Computational Framework and an Algorithm for the Measurement of Visual Motion. International Journal of Computer Vision, 2:283–310, 1989.

[6] F. R. Bach and M. I. Jordan. Kernel Independent Component Analysis. Journal of Machine Learning Research, 3:1–48, 2003.

[7] R. Bajcsy and C. Broit. Matching of Deformed Images. In ICPR '82: 6th International Conference on Pattern Recognition, pages 351–353, 1982.

[8] R. Bajcsy and S. Kovačič. Multiresolution Elastic Matching. Computer Vision, Graphics, and Image Processing, 46(1):1–21, 1989.

[9] A. H. Barr. Global and Local Deformations of Solid Primitives. SIGGRAPH Computer Graphics, 18(3):21–30, 1984.

[10] R. H. Bartels, J. C. Beatty, and B. A. Barsky. An Introduction to Splines for Use in Computer Graphics and Geometric Modeling. Morgan Kaufmann Publishers Inc., San Francisco, CA, USA, 1987.

[11] M. S. Bartlett, J. R. Movellan, and T. J. Sejnowski. Face Recognition by Independent Component Analysis. IEEE Transactions on Neural Networks, 13(6):1450–1464, 2002.

[12] C. Basso, P. Paysan, and T. Vetter. Registration of Expressions Data using a 3D Morphable Model. In 7th International Conference on Automatic Face and Gesture Recognition, pages 205–210, 2006.
[13] P. N. Belhumeur, J. P. Hespanha, and D. J. Kriegman. Eigenfaces vs. Fisherfaces: Recognition using Class Specific Linear Projection. IEEE Transactions on Pattern Analysis and Machine Intelligence, 19(7):711–720, 1997.

[14] D. E. Beskos. Boundary Element Methods in Mechanics. North-Holland, Amsterdam, 1987.

[15] R. Beveridge and B. Draper. Evaluation of Face Recognition Algorithms. Online: http://www.cs.colostate.edu/evalfacerec/index.html.

[16] M. J. Black. Robust Incremental Optical Flow. PhD thesis, Yale University, 1992.

[17] M. J. Black and P. Anandan. A Framework for the Robust Estimation of Optical Flow. In International Conference on Computer Vision, pages 231–236, 1993.

[18] M. J. Black and P. Anandan. The Robust Estimation of Multiple Motions: Parametric and Piecewise Smooth Flow Fields. Computer Vision and Image Understanding, 63(1):75–104, 1996.

[19] V. Blanz and T. Vetter. A Morphable Model for the Synthesis of 3D Faces. In 26th Annual Conference on Computer Graphics and Interactive Techniques, pages 187–194, 1999.

[20] V. Blanz and T. Vetter. Face Recognition based on Fitting a 3D Morphable Model. IEEE Transactions on Pattern Analysis and Machine Intelligence, 25(9):1063–1074, 2003.

[21] V. Blanz and T. Vetter. Face Recognition based on Fitting a 3D Morphable Model. IEEE Transactions on Pattern Analysis and Machine Intelligence, 25(9):1063–1074, 2003.

[22] F. Bookstein. Principal Warps: Thin-Plate Splines and the Decomposition of Deformations. IEEE Transactions on Pattern Analysis and Machine Intelligence, 11(6):567–585, 1989.

[23] K. W. Bowyer, K. Chang, and P. J. Flynn. A Survey of Approaches and Challenges in 3D and Multimodal 3D + 2D Face Recognition. Computer Vision and Image Understanding, 101(1):1–15, 2006.

[24] M. Bro-Nielsen. Fast Finite Elements for Surgery Simulation. In Medicine Meets Virtual Reality 5, 1997.

[25] M. Bro-Nielsen and S. Cotin. Real-Time Volumetric Deformable Models for Surgery Simulation using Finite Elements and Condensation. In Computer Graphics Forum (Eurographics '96), volume 15, pages 57–66. Blackwell Publishers for Eurographics Association, 1996.

[26] A. M. Bronstein, M. M. Bronstein, and R. Kimmel. Expression-Invariant 3D Face Recognition. In International Conference on Audio- and Video-based Biometric Person Authentication, pages 62–69, 2003.
[27] Y.-K. Chang and A. P. Rockwood. A Generalized de Casteljau Approach to 3D Free-form Deformation. In SIGGRAPH '94: 21st Annual Conference on Computer Graphics and Interactive Techniques, pages 257–260, New York, NY, USA, 1994. ACM.

[28] R. Chellappa, C. L. Wilson, and S. Sirohey. Human and Machine Recognition of Faces: A Survey. Proceedings of the IEEE, 83(5):705–740, 1995.

[29] L. Chen, H. Liao, and J. Lin. Person Identification using Facial Motion. In International Conference on Image Processing, volume 2, pages 677–680, 2001.

[30] T. F. Cootes, G. J. Edwards, and C. J. Taylor. Active Appearance Models. IEEE Transactions on Pattern Analysis and Machine Intelligence, 23(6):681–685, 2001.

[31] T. F. Cootes, K. N. Walker, and C. J. Taylor. View-based Active Appearance Models. In International Conference on Automatic Face and Gesture Recognition, pages 227–232, 2000.

[32] S. Coquillart. Extended Free-form Deformation: A Sculpturing Tool for 3D Geometric Modeling. In SIGGRAPH '90: 17th Annual Conference on Computer Graphics and Interactive Techniques, pages 187–196, New York, NY, USA, 1990. ACM.

[33] S. Cotin, H. Delingette, and N. Ayache. Real Time Volumetric Deformable Models for Surgery Simulation. In VBC '96: 4th International Conference on Visualization in Biomedical Computing, pages 535–540, London, UK, 1996. Springer-Verlag.

[34] M. H. Davis. Deformable Matching of 3D Medical Images. PhD thesis, Southern Methodist University, 1996.

[35] P. Ekman and W. V. Friesen. Emotion in the Human Face. Prentice-Hall, 1972.

[36] I. A. Essa and A. P. Pentland. Coding, Analysis, Interpretation, and Recognition of Facial Expressions. IEEE Transactions on Pattern Analysis and Machine Intelligence, 19(7):757–763, 1997.

[37] D. N. Fenner. Engineering Stress Analysis: A Finite Element Approach with FORTRAN 77 Software. J. Wiley and Sons, 1987.

[38] M. Ferrant, S. K. Warfield, C. R. G. Guttmann, R. V. Mulkern, F. A. Jolesz, and R. Kikinis. 3D Image Matching using a Finite Element Based Elastic Deformation Model. In MICCAI '99: Second International Conference on Medical Image Computing and Computer-Assisted Intervention, pages 202–209, London, UK, 1999. Springer-Verlag.

[39] Y. Fu and N. Zheng. M-Face: An Appearance-Based Photorealistic Model for Multiple Facial Attributes Rendering. IEEE Transactions on Circuits and Systems for Video Technology, 16(7):830–842, 2006.
[40] K. Fukunaga. Introduction to Statistical Pattern Recognition. Academic Press Professional, Inc., second edition, 1990.

[41] S. Gibson and B. Mirtich. A Survey of Deformable Modeling in Computer Graphics. Technical Report TR97-19, Mitsubishi Electric Research Laboratory, 1997.

[42] E. Gladilin. Biomechanical Modeling of Soft Tissue and Facial Expressions for Craniofacial Surgery Planning. PhD thesis, Zuse Institute Berlin, 2002.

[43] F. Glazer. Hierarchical Motion Detection. PhD thesis, University of Massachusetts, 1987.

[44] D. O. Gorodnichy. On Importance of Nose for Face Tracking. In International Conference on Automatic Face and Gesture Recognition, pages 188–196, 2002.

[45] P. Grother and P. J. Phillips. Models of Large Population Recognition Performance. In International Conference on Computer Vision and Pattern Recognition, volume 2, pages 68–75, 2004.

[46] D. M. Grove. The Interpretation of Forensic Evidence using a Likelihood Ratio. Biometrika, 67(1):243–246, 1980.

[47] G. Guo, S. Z. Li, and K. L. Chan. Face Recognition by Support Vector Machines. In International Conference on Automatic Face and Gesture Recognition, pages 196–201, 2000.

[48] A. Hadid and M. Pietikäinen. An Experimental Investigation about the Integration of Facial Dynamics in Video-based Face Recognition. Electronic Letters on Computer Vision and Image Analysis, 5(1):1–13, 2005.

[49] J. A. Hanley and B. J. McNeil. The Meaning and Use of the Area under a Receiver Operating Characteristic (ROC) Curve. Radiology, 143:29–36, 1982.

[50] D. R. Heise, editor. Understanding Events. Cambridge University Press, 1979.

[51] B. Heisele, P. Ho, and T. Poggio. Face Recognition with Support Vector Machines: Global versus Component-based Approach. In International Conference on Computer Vision, pages 688–694, 2001.

[52] H. Hill and A. Johnston. Categorizing Sex and Identity from the Biological Motion of Faces. Current Biology, 11(11):880–885, 2001.

[53] B. K. P. Horn and B. G. Schunck. Determining Optical Flow. Artificial Intelligence, 17:185–204, 1981.

[54] X. Huang, S. Zhang, Y. Wang, D. Metaxas, and D. Samaras. A Hierarchical Framework for High Resolution Facial Expression Tracking. In 2004 Conference on Computer Vision and Pattern Recognition Workshop, volume 1, page 22, 2004.
[55] D. James and D. Pai. ARTDEFO – Accurate Real-Time Deformable Objects. SIGGRAPH Computer Graphics, pages 65–72, 1999.

[56] H. Jeffreys and B. S. Jeffreys. Methods of Mathematical Physics. Cambridge University Press, third edition, 1988.

[57] A. Kadyrov and M. Petrou. The Trace Transform and Its Applications. IEEE Transactions on Pattern Analysis and Machine Intelligence, 23(8):811–828, 2001.

[58] K. Kähler, J. Haber, and H.-P. Seidel. Geometry-based Muscle Modeling for Facial Animation. In Graphics Interface 2001, pages 37–46, 2001.

[59] K. Kähler, J. Haber, H. Yamauchi, and H.-P. Seidel. Head Shop: Generating Animated Head Models with Anatomical Structure. In 2002 ACM SIGGRAPH/Eurographics Symposium on Computer Animation, pages 55–63, 2002.

[60] P. Kalra, A. Mangili, N. M. Thalmann, and D. Thalmann. Simulation of Facial Muscle Actions based on Rational Free Form Deformations. In Computer Graphics Forum, volume 11, pages 59–69, 1992.

[61] M. Kass, A. Witkin, and D. Terzopoulos. Snakes: Active Contour Models. International Journal of Computer Vision, 1(4):321–331, 1988.

[62] E. Keeve, S. Girod, and B. Girod. Craniofacial Surgery Simulation. In VBC '96: 4th International Conference on Visualization in Biomedical Computing, volume 1131 of Lecture Notes in Computer Science, pages 541–546. Springer, 1996.

[63] T. D. Kemper, editor. A Social Interactional Theory of Emotions. Wiley Interscience, 1978.

[64] B. Knappmeyer, I. Thornton, and H. Bülthoff. The Use of Facial Motion and Facial Form during the Processing of Identity. Vision Research, 43(18):1921–1936, 2003.

[65] R. M. Koch, M. H. Gross, F. R. Carls, D. F. von Büren, G. Fankhauser, and Y. I. H. Parish. Simulating Facial Surgery using Finite Element Models. In SIGGRAPH '96: 23rd Annual Conference on Computer Graphics and Interactive Techniques, pages 421–428, 1996.

[66] Y. H. Kwon and N. da Vitoria Lobo. Age Classification from Facial Images. Computer Vision and Image Understanding, 74(1):1–21, 1999.

[67] K. C. Lee, J. Ho, M. H. Yang, and D. J. Kriegman. Video-based Face Recognition using Probabilistic Appearance Manifolds. In International Conference on Computer Vision and Pattern Recognition, volume 1, pages 313–320, 2003.

[68] Y. Lee, D. Terzopoulos, and K. Walters. Realistic Modeling for Facial Animation. In SIGGRAPH '95: 22nd Annual Conference on Computer Graphics and Interactive Techniques, pages 55–62, New York, NY, USA, 1995. ACM.
[69] B. Li and R. Chellappa. Face Verification through Tracking Facial Features. Journal of the Optical Society of America, 18:2969–2981, 2001.

[70] S. Z. Li and A. K. Jain, editors. Handbook of Face Recognition. Springer, 2005.

[71] Y. Li. Dynamic Face Models: Construction and Applications. PhD thesis, University of London, 2001.

[72] R. Lienhart and J. Maydt. An Extended Set of Haar-like Features for Rapid Object Detection. In International Conference on Image Processing, pages 900–903. IEEE, 2002.

[73] C. Liu and H. Wechsler. A Unified Bayesian Framework for Face Recognition. In International Conference on Image Processing, volume 1, pages 151–155, 1998.

[74] C. Liu and H. Wechsler. Comparative Assessment of Independent Component Analysis (ICA) for Face Recognition. In International Conference on Audio- and Video-based Biometric Person Authentication, pages 211–216, 1999.

[75] C. Liu and H. Wechsler. Evolutionary Pursuit and Its Application to Face Recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, 22(6):570–582, 2000.

[76] X. Liu and T. Chen. Video-based Face Recognition using Adaptive Hidden Markov Models. In International Conference on Computer Vision and Pattern Recognition, volume 1, pages 340–345, 2003.

[77] J. Lu, K. Plataniotis, and A. Venetsanopoulos. Face Recognition using LDA-based Algorithms. IEEE Transactions on Neural Networks, 14(1):195–200, 2003.

[78] V. Manohar, D. B. Goldgof, S. Sarkar, and Y. Zhang. Facial Strain Pattern as a Soft Forensic Evidence. In Eighth IEEE Workshop on Applications of Computer Vision, page 42, 2007.

[79] V. Manohar, M. Shreve, D. B. Goldgof, and S. Sarkar. Finite Element Modeling of Facial Deformation in Videos for Computing Strain Pattern. In 19th International Conference on Pattern Recognition, pages 1–4, 2008.

[80] A. M. Martínez and A. C. Kak. PCA versus LDA. IEEE Transactions on Pattern Analysis and Machine Intelligence, 23(2):228–233, 2001.

[81] T. McInerney and D. Terzopoulos. Deformable Models in Medical Image Analysis: A Survey. Medical Image Analysis, 1(2):91–108, 1996.

[82] D. N. Metaxas. Physics-Based Deformable Models: Applications to Computer Vision, Graphics, and Medical Imaging. Kluwer Academic Publishers, first edition, 1996.

[83] C. E. Metz. Basic Principles of ROC Analysis. Seminars in Nuclear Medicine, 8(4):283–298, 1978.
[84] S. Moaveni. Finite Element Analysis. Prentice Hall, 1999.

[85] B. Moghaddam, T. Jebara, and A. Pentland. Bayesian Face Recognition. Pattern Recognition, 33(11):1771–1782, 2000.

[86] H. Moon and P. J. Phillips. Computational and Performance Aspects of PCA-based Face Recognition Algorithms. Perception, 30:301–321, 2001.

[87] H. N. Ng and R. L. Grimsdale. Computer Graphics Techniques for Modeling Cloth. IEEE Computer Graphics and Applications, 16(5):28–41, 1996.

[88] A. J. O'Toole, D. A. Roark, and H. Abdi. Recognizing Moving Faces: A Psychological and Neural Synthesis. Trends in Cognitive Science, 6(6):261–266, 2002.

[89] S. Pamudurthy, E. Guan, K. Mueller, and M. Rafailovich. Dynamic Approach for Face Recognition using Digital Image Skin Correlation. In Fifth International Conference on Audio- and Video-based Biometric Person Authentication, pages 1010–1018, 2005.

[90] A. Pentland, B. Moghaddam, and T. Starner. View-based and Modular Eigenspaces for Face Recognition. In IEEE Conference on Computer Vision and Pattern Recognition, pages 84–91, 1994.

[91] P. J. Phillips, P. Grother, R. J. Micheals, D. M. Blackburn, E. Tabassi, and J. M. Bone. Face Recognition Vendor Test (FRVT) 2002: Overview and Summary. Online: http://www.frvt.org/FRVT2002/documents.htm.

[92] S. M. Platt and N. I. Badler. Animating Facial Expression. SIGGRAPH Computer Graphics, 15(3):245–252, 1981.

[93] M. Raffel, C. Willert, and M. Kompenhans. Particle Image Velocimetry. Springer, 1998.

[94] N. Ramanathan and R. Chellappa. Modeling Age Progression in Young Faces. In IEEE Conference on Computer Vision and Pattern Recognition, pages 387–394, 2006.

[95] S. A. Rizvi, P. J. Phillips, and H. Moon. The FERET Verification Testing Protocol for Face Recognition Algorithms. In International Conference on Automatic Face and Gesture Recognition, pages 48–53, 1998.

[96] D. A. Roark, S. E. Barrett, A. J. O'Toole, and H. Abdi. Learning the Moves: The Effect of Familiarity and Facial Motion on Person Recognition Across Large Changes in Viewing Format. Perception, 35:761–773, 2006.

[97] D. A. Roark, S. E. Barrett, M. A. Spence, H. Abdi, and A. J. O'Toole. Psychological and Neural Perspectives on the Role of Motion in Face Recognition. Behavioral and Cognitive Neuroscience Reviews, 2(1):15–46, 2003.
[98] K. Rohr, H. S. Stiehl, R. Sprengel, W. Beil, T. M. Buzug, J. Weese, and M. H. Kuhn. Point-based Elastic Registration of Medical Image Data using Approximating Thin-Plate Splines. In VBC '96: 4th International Conference on Visualization in Biomedical Computing, pages 297–306, London, UK, 1996. Springer-Verlag.

[99] S. H. M. Roth, M. H. Gross, S. Turello, and F. R. Carls. A Bernstein–Bézier Based Approach to Soft Tissue Simulation. In EUROGRAPHICS '98, volume 17, pages 285–294, 1998.

[100] M. Rydfalk. CANDIDE: A Parameterized Face. Technical Report LiTH-ISY-I-0866, Linköping University, 1987.

[101] A. Sarti, R. Gori, and C. Lamberti. A Physically Based Model to Simulate Maxillo-Facial Surgery from 3D CT Images. Future Generation Computer Systems, 15(2):217–221, 1999.

[102] T. W. Sederberg and S. R. Parry. Free-form Deformation of Solid Geometric Models. SIGGRAPH Computer Graphics, 20(4):151–160, 1986.

[103] M. Shreve, S. Godavarthy, V. Manohar, D. B. Goldgof, and S. Sarkar. Towards Macro- and Micro-Expression Spotting in Video using Strain Patterns. In IEEE Workshop on Applications of Computer Vision, in press, 2009.

[104] S. Soatto, G. Doretto, and Y. Wu. Dynamic Textures. In International Conference on Computer Vision, volume 2, pages 439–446, 2001.

[105] M. Sonka, V. Hlavac, and R. Boyle. Image Processing: Analysis and Machine Vision. O'Reilly, 1999.

[106] S. Srisuk, M. Petrou, W. Kurutach, and A. Kadyrov. A Face Authentication System using the Trace Transform. Pattern Analysis and Applications, 8(1):50–61, 2005.

[107] D. Terzopoulos and K. Waters. Physically-Based Facial Modeling, Analysis, and Animation. Journal of Visualization and Computer Animation, 1(2):73–80, 1990.

[108] D. Terzopoulos and K. Waters. Analysis and Synthesis of Facial Image Sequences using Physical and Anatomical Models. IEEE Transactions on Pattern Analysis and Machine Intelligence, 15(6):569–579, 1993.

[109] M. Turk and A. P. Pentland. Eigenfaces for Recognition. Journal of Cognitive Neuroscience, 3(1):71–86, 1991.

[110] P. Viola and M. Jones. Robust Real-time Object Detection. International Journal of Computer Vision, 2002.

[111] R. Wang and B. Bhanu. Learning Models for Predicting Recognition Performance. In International Conference on Computer Vision, pages 1613–1618, 2005.
[112] Y. Wang, M. Gupta, S. Zhang, S. Wang, X. Gu, D. Samaras, and P. Huang. High Resolution Tracking of Non-Rigid 3D Motion of Densely Sampled Data using Harmonic Maps. In Tenth IEEE International Conference on Computer Vision, volume 1, pages 388–395, 2005.

[113] K. Waters. A Muscle Model for Animating Three-dimensional Facial Expression. In SIGGRAPH '87: 14th Annual Conference on Computer Graphics and Interactive Techniques, pages 17–24, 1987.

[114] L. Wiskott, J. M. Fellous, N. Krüger, and C. Malsburg. Face Recognition by Elastic Bunch Graph Matching. IEEE Transactions on Pattern Analysis and Machine Intelligence, 19(7):775–779, 1997.

[115] M. H. Yang. Kernel Eigenfaces vs. Kernel Fisherfaces: Face Recognition using Kernel Methods. In International Conference on Automatic Face and Gesture Recognition, pages 215–220, 2002.

[116] A. Yilmaz, K. Shafique, and M. Shah. Estimation of Rigid and Non-Rigid Facial Motion using Anatomical Face Model. In ICPR '02: 16th International Conference on Pattern Recognition, Volume 1, page 10377, 2002.

[117] L. Yin, X. Wei, Y. Sun, J. Wang, and M. J. Rosato. A 3D Facial Expression Database for Facial Behavior Research. In Proceedings of the 7th International Conference on Automatic Face and Gesture Recognition, pages 211–216, 2006.

[118] J. Yoshida, Y. Fujino, and L. C. Sujeewa. Measurement Method for Continua by Image Processing. Journal of Structural Engineering, 130(8):1145–1156, 2004.

[119] L. Zhang, N. Snavely, B. Curless, and S. M. Seitz. Spacetime Faces: High Resolution Capture for Modeling and Animation. In ACM SIGGRAPH, pages 548–558, 2004.

[120] Y. Zhang. Robust Algorithms for Property Recovery in Motion Modeling, Medical Imaging, and Biometrics. PhD thesis, University of South Florida, 2005.

[121] Y. Zhang, S. J. Kundu, D. B. Goldgof, S. Sarkar, and L. V. Tsap. Elastic Face, An Anatomy-based Biometrics Beyond Visible Cue. In International Conference on Pattern Recognition, volume 2, pages 19–22, 2004.

[122] Y. Zhang, E. C. Prakash, and E. Sung. Real-Time Physically-based Facial Expression Animation using Mass-Spring System. In International Conference on Computer Graphics, pages 347–350, 2001.

[123] Y. Zhang, E. C. Prakash, and E. Sung. Efficient Modeling of an Anatomy-based Face and Fast 3D Facial Expression Synthesis. Computer Graphics Forum, 22(2):159–170, 2003.
[124] W. Zhao, R. Chellappa, and A. Krishnaswamy. Discriminant Analysis of Principal Components for Face Recognition. In International Conference on Automatic Face and Gesture Recognition, pages 336–341, 1998.

[125] W. Zhao, R. Chellappa, P. J. Phillips, and A. Rosenfeld. Face Recognition: A Literature Survey. ACM Computing Surveys, 35(4):399–458, 2003.

[126] S. K. Zhou and R. Chellappa. Probabilistic Human Recognition from Video. In European Conference on Computer Vision, volume 3, pages 681–697, 2002.

[127] S. K. Zhou and R. Chellappa. Multiple-Exemplar Discriminant Analysis for Face Recognition. In International Conference on Pattern Recognition, volume 4, pages 191–194, 2004.

[128] S. K. Zhou, R. Chellappa, and B. Moghaddam. Intra-Personal Kernel Space for Face Recognition. In International Conference on Automatic Face and Gesture Recognition, pages 235–240, 2004.

[129] S. K. Zhou, V. Krüger, and R. Chellappa. Face Recognition from Video: A CONDENSATION Approach. In International Conference on Automatic Face and Gesture Recognition, pages 221–228, 2002.

[130] O. C. Zienkiewicz and R. L. Taylor. The Finite Element Method, volume 1. McGraw-Hill, 1989.
ABOUT THE AUTHOR

Vasant Manohar received his B.E. (Honors) degree in Computer Science from the Birla Institute of Technology and Science, Pilani, India, and his M.S. degree in Computer Science from the University of South Florida (USF) in 2003 and 2006, respectively. He is currently a Ph.D. candidate in the Computer Science and Engineering Department at USF. His general research interests lie in the fields of computer vision, image processing, and pattern recognition. His specific topics of research include video-based face recognition, non-rigid motion analysis, and empirical evaluation techniques for object detection and tracking in video.

Mr. Manohar is a member of Sigma Xi and Tau Beta Pi.
