
  • 8/3/2019 Face Gabor

    1/130

FACE RECOGNITION USING GABOR WAVELET TRANSFORM

A THESIS SUBMITTED TO
THE GRADUATE SCHOOL OF NATURAL SCIENCES
OF
THE MIDDLE EAST TECHNICAL UNIVERSITY

BY

BURCU KEPENEKCI

IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE
OF
MASTER OF SCIENCE
IN
THE DEPARTMENT OF ELECTRICAL AND ELECTRONICS ENGINEERING

SEPTEMBER 2001


ABSTRACT

FACE RECOGNITION USING GABOR WAVELET TRANSFORM

Kepenekci, Burcu
M.S., Department of Electrical and Electronics Engineering
Supervisor: A. Aydın Alatan
Co-Supervisor: Gözde Bozdağı Akar

September 2001, 118 pages

Face recognition is emerging as an active research area with numerous commercial and law enforcement applications. Although existing methods perform well under certain conditions, illumination changes, out-of-plane rotations, and occlusions still remain challenging problems. The proposed algorithm deals with two of these problems, namely occlusion and illumination changes. In our method, the Gabor wavelet transform is used for facial feature vector construction due to its powerful representation of the behavior of receptive fields in the human visual system (HVS). The method is based on selecting peaks (high-energized points) of the Gabor wavelet responses as feature points. Compared to the predefined graph nodes of elastic graph matching, our approach exploits the representative capability of Gabor wavelets better. The feature points are automatically extracted using the local characteristics of each individual face in order to decrease the effect of occluded features. Since there is no training as in neural network approaches, a single frontal face image of each individual is enough as a reference. Experimental results on standard image libraries show that the proposed method performs better than graph matching and eigenface based methods.

Keywords: Automatic face recognition, Gabor wavelet transform, human face perception.


ÖZ

FACE RECOGNITION USING GABOR WAVELETS

Kepenekci, Burcu
M.S., Department of Electrical and Electronics Engineering
Supervisor: A. Aydın Alatan
Co-Supervisor: Gözde Bozdağı Akar

September 2001, 118 pages

Face recognition is a problem with a growing number of applications in both commercial and legal domains. Although existing face recognition methods give successful results in controlled environments, occlusion, orientation, and illumination changes are still unsolved problems in face recognition. The proposed method addresses two of these problems, illumination changes and the effect of occlusion. In this work, both the facial feature points and the feature vectors are found using the Gabor wavelet transform. The Gabor wavelet transform is used because it models the behavior of receptive fields in the human visual system. The proposed method is based on selecting the peaks (high-energy points) of the Gabor wavelet responses as feature points, instead of predefined graph nodes; in this way, the Gabor wavelets are used in the most efficient manner. The feature points are found automatically using the distinct local characteristics of each face, and as a result the effect of occluded features is reduced. Since there is no learning phase as in neural network approaches, a single frontal face image of each person is sufficient for recognition. Experiments show that the proposed method gives better results than the existing graph matching and eigenface methods.

Keywords: Face recognition, Gabor wavelet transform, human face perception, feature extraction, feature matching, pattern recognition.


    ACKNOWLEDGEMENTS

I would like to express my gratitude to Assoc. Prof. Dr. Gözde Bozdağı Akar and Assist. Prof. Dr. A. Aydın Alatan for their guidance, suggestions and insight throughout this research. I also thank F. Boray Tek for his support, useful discussions, and great friendship. To my family, I offer sincere thanks for their unshakable faith in me. Special thanks to my friends in the Signal Processing and Remote Sensing Laboratory for their reassurance during thesis writing. I would like to acknowledge Dr. Uğur Murat Leloğlu for his comments. Most importantly, I express my appreciation to Assoc. Prof. Dr. Mustafa Karaman for his guidance during my undergraduate studies and his encouraging me toward research.


TABLE OF CONTENTS

ABSTRACT
ÖZ
ACKNOWLEDGEMENTS
TABLE OF CONTENTS
LIST OF TABLES
LIST OF FIGURES

CHAPTER

1. INTRODUCTION
   1.1. Why Face Recognition?
   1.2. Problem Definition
   1.3. Organization of the Thesis

2. PAST RESEARCH ON FACE RECOGNITION
   2.1. Human Face Recognition
      2.1.1. Discussion
   2.2. Automatic Face Recognition
      2.2.1. Representation, Matching and Statistical Decision
      2.2.2. Early Face Recognition Methods
      2.2.3. Statistical Approaches to Face Recognition


         2.2.3.1. Karhunen-Loeve Expansion Based Methods
            2.2.3.1.1. Eigenfaces
            2.2.3.1.2. Face Recognition Using Eigenfaces
            2.2.3.1.3. Eigenfeatures
            2.2.3.1.4. Karhunen-Loeve Transform of the Fourier Transform
         2.2.3.2. Linear Discriminant Methods - Fisherfaces
            2.2.3.2.1. Fisher's Linear Discriminant
            2.2.3.2.2. Face Recognition Using Linear Discriminant Analysis
         2.2.3.3. Singular Value Decomposition Methods
            2.2.3.3.1. Singular Value Decomposition
            2.2.3.3.2. Face Recognition Using Singular Value Decomposition
      2.2.4. Hidden Markov Model Based Methods
      2.2.5. Neural Networks Approach
      2.2.6. Template Based Matching
      2.2.7. Feature Based Matching
      2.2.8. Current State of The Art

3. FACE COMPARISON AND MATCHING USING GABOR WAVELETS
   3.1. Face Representation Using Gabor Wavelets


      3.1.1. Gabor Wavelets
      3.1.2. 2D Gabor Wavelet Representation of Faces
      3.1.3. Feature Extraction
         3.1.3.1. Feature-point Localization
         3.1.3.2. Feature-vector Extraction
   3.2. Matching Procedure
      3.2.1. Similarity Calculation
      3.2.2. Face Comparison

4. RESULTS
   4.1. Simulation Setup
      4.1.1. Results for University of Stirling Face Database
      4.1.2. Results for Purdue University Face Database
      4.1.3. Results for The Olivetti and Oracle Research Laboratory (ORL) Face Database
      4.1.4. Results for FERET Face Database

5. CONCLUSIONS AND FUTURE WORK

REFERENCES


LIST OF TABLES

4.1 Recognition performances of eigenface, eigenhills and the proposed method on the Purdue face database
4.2 Performance results of well-known algorithms on the ORL database
4.3 Probe sets and their goal of evaluation
4.4 Probe sets for FERET performance evaluation
4.5 FERET performance evaluation results for various face recognition algorithms


LIST OF FIGURES

2.1 Appearance model of the Eigenface Algorithm
2.2 Discriminative model of the Eigenface Algorithm
2.3 Image sampling technique for HMM recognition
2.4 HMM training scheme
2.5 HMM recognition scheme
2.6 Auto-association and classification networks
2.7 Diagram of the Convolutional Neural Network System
2.8 A 2D image lattice (grid graph) on Marilyn Monroe's face
2.9 A bunch graph
2.10 Bunch graph matched to a face
3.1 An ensemble of Gabor wavelets
3.2 A small set of features can recognize faces uniquely, and receptive fields that are matched to the local features on the face
3.3 Gabor filters corresponding to 5 spatial frequencies and 8 orientations
3.4 Example of a facial image response to Gabor filters
3.5 Facial feature points found as the high-energized points of Gabor wavelet responses
3.6 Flowchart of the feature extraction stage of the facial images


3.7 Test faces vs. matching faces from the gallery
4.1 Examples of different facial expressions from the Stirling database
4.2 Example of different facial images for a person from the Purdue database
4.3 Whole set of face images of 40 individuals, 10 images per person
4.4 Examples of misclassified faces of the ORL database
4.5 Example of different facial images for a person from the ORL database that are placed in training or probe sets differently
4.6 FERET identification performances against fb probes
4.7 FERET identification performances against duplicate I probes
4.8 FERET identification performances against fc probes
4.9 FERET identification performances against duplicate II probes
4.10 FERET average identification performances
4.11 FERET current upper bound identification performances
4.12 FERET identification performance of the proposed method against fb probes
4.13 FERET identification performance of the proposed method against fc probes
4.14 FERET identification performance of the proposed method against duplicate I probes
4.15 FERET identification performance of the proposed method against duplicate II probes


CHAPTER 1

    INTRODUCTION

Machine recognition of faces is emerging as an active research area spanning several disciplines such as image processing, pattern recognition, computer vision and neural networks. Face recognition technology has numerous commercial and law enforcement applications. These applications range from static matching of controlled format photographs such as passports, credit cards, photo IDs, driver's licenses, and mug shots to real time matching of surveillance video images [82].

Humans seem to recognize faces in cluttered scenes with relative ease, having the ability to identify distorted images, coarsely quantized images, and faces with occluded details. Machine recognition is a much more daunting task. Understanding the human mechanisms employed to recognize faces constitutes a challenge for psychologists and neural scientists. In addition to the cognitive aspects, understanding face recognition is important, since the same underlying mechanisms could be used to build a system for the automatic identification of faces by machine.

A formal method of classifying faces was first proposed by Francis Galton in 1888 [53, 54]. During the 1980s work on face recognition remained largely dormant. Since the 1990s, the research interest in face recognition has grown significantly as a result of the following facts:

1. The increase in emphasis on civilian/commercial research projects,
2. The re-emergence of neural network classifiers with emphasis on real time computation and adaptation,
3. The availability of real time hardware,
4. The increasing need for surveillance related applications due to drug trafficking, terrorist activities, etc.

Still, most access control methods, with all their legitimate applications in an expanding society, have a bothersome drawback. Except for face and voice recognition, these methods require the user to remember a password, to enter a PIN code, to carry a badge, or, in general, require a human action in the course of identification or authentication. In addition, the corresponding means (keys, badges, passwords, PIN codes) are prone to being lost or forgotten, whereas fingerprints and retina scans suffer from low user acceptance. Modern face recognition has reached an identification rate greater than 90% with well-controlled pose and illumination conditions. While this is a high rate for face recognition, it is not comparable to methods using keys, passwords or badges.


1.1. Why Face Recognition?

Within today's environment of increased importance of security and organization, identification and authentication methods have developed into a key technology in various areas: entrance control in buildings; access control for computers in general or for automatic teller machines in particular; day-to-day affairs like withdrawing money from a bank account or dealing with the post office; or in the prominent field of criminal investigation. Such requirements for reliable personal identification in computerized access control have resulted in an increased interest in biometrics.

Biometric identification is the technique of automatically identifying or verifying an individual by a physical characteristic or personal trait. The term "automatically" means the biometric identification system must identify or verify a human characteristic or trait quickly with little or no intervention from the user. Biometric technology was developed for use in high-level security systems and law enforcement markets. The key element of biometric technology is its ability to identify a human being and enforce security [83].

Biometric characteristics and traits are divided into behavioral and physical categories. Behavioral biometrics encompasses such behaviors as signature and typing rhythms. Physical biometric systems use the eye, finger, hand, voice, and face for identification.

A biometric-based system was developed by Recognition Systems Inc., Campbell, California, as reported by Sidlauskas [73]. The system was called ID3D Handkey and used the three dimensional shape of a person's hand to distinguish people. The side and top views of a hand positioned in a controlled capture box were used to generate a set of geometric features. Capturing takes less than two seconds and the data could be stored efficiently in a 9-byte feature vector. This system could store up to 20000 different hands.

Another well-known biometric measure is that of fingerprints. Various institutions around the world have carried out research in the field. Fingerprint systems are unobtrusive and relatively cheap to buy. They are used in banks and to control entrance to restricted access areas. Fowler [51] has produced a short summary of the available systems.

Fingerprints are unique to each human being. It has been observed that the iris of the eye, like fingerprints, displays patterns and textures unique to each human and that it remains stable over decades of life, as detailed by Siedlarz [74]. Daugman designed a robust pattern recognition method based on 2-D Gabor transforms to classify human irises.

Speech recognition also offers one of the most natural and least obtrusive biometric measures, whereby a user is identified through his or her spoken words. AT&T have produced a prototype that stores a person's voice on a memory card, details of which are described by Mandelbaum [67].

While appropriate for bank transactions and entry into secure areas, such technologies have the disadvantage that they are intrusive both physically and socially. They require the user to position their body relative to the sensor, then pause for a second to declare himself or herself. This pause-and-declare interaction is unlikely to change because of the fine-grain spatial sensing required. Moreover, since people cannot recognize other people using this sort of data, these types of identification do not have a place in normal human interactions and social structures.

While the pause-and-present interaction is useful in high security applications, it is exactly the opposite of what is required when building a store that recognizes its best customers, an information kiosk that remembers you, or a house that knows the people who live there.

A face recognition system would allow a user to be identified by simply walking past a surveillance camera. Human beings often recognize one another by unique facial characteristics. One of the newest biometric technologies, automatic facial recognition, is based on this phenomenon. Facial recognition is the most successful form of human surveillance. Facial recognition technology, used to improve human efficiency when recognizing faces, is one of the fastest growing fields in the biometric industry. Interest in facial recognition is being fueled by the availability and low cost of video hardware, the ever-increasing number of video cameras being placed in the workspace, and the noninvasive aspect of facial recognition systems.

Although facial recognition is still in the research and development phase, several commercial systems are currently available, and research organizations, such as Harvard University and the MIT Media Lab, are working on the development of more accurate and reliable systems.


1.2. Problem Definition

A general statement of the problem can be formulated as follows: given still or video images of a scene, identify one or more persons in the scene using a stored database of faces.

The environment surrounding a face recognition application can cover a wide spectrum, from a well-controlled environment to an uncontrolled one. In a controlled environment, frontal and profile photographs are taken, complete with a uniform background and identical poses among the participants. These face images are commonly called mug shots. Each mug shot can be manually or automatically cropped to extract a normalized subpart called a canonical face image. In a canonical face image, the size and position of the face are normalized approximately to predefined values and the background region is minimal.

General face recognition, a task that is done by humans in daily activities, comes from a virtually uncontrolled environment. Systems which automatically recognize faces from an uncontrolled environment must first detect faces in images. The face detection task is to report the location, and typically also the size, of all the faces in a given image; it is a completely different problem from face recognition.

Face recognition is a difficult problem due to the generally similar shape of faces combined with the numerous variations between images of the same face. Recognition of faces from an uncontrolled environment is a very complex task: lighting conditions may vary tremendously; facial expressions also vary from time to time; faces may appear at different orientations; and a face can be partially occluded. Further, depending on the application, handling facial features over time (aging) may also be required.

Although existing methods perform well under constrained conditions, the problems of illumination changes, out-of-plane rotations and occlusions remain unsolved. The proposed algorithm deals with two of these three important problems, namely occlusion and illumination changes.

Since the techniques used in the best face recognition systems may depend on the application of the system, one can identify at least two broad categories of face recognition systems [19]:

1. Finding a person within a large database of faces (e.g. in a police database). (Often only one image is available per person. It is usually not necessary for recognition to be done in real time.)
2. Identifying particular people in real time (e.g. a location tracking system). (Multiple images per person are often available for training, and real time recognition is required.)

In this thesis, we are primarily interested in the first case. Detection of the face is assumed to be done beforehand. We aim to provide the correct label (e.g. a name label) associated with a face from all the individuals in the database, in the presence of occlusions and illumination changes. The database of faces stored in a system is called the gallery. In the gallery, there exists only one frontal view of each individual. We do not consider cases with high degrees of rotation, i.e. we assume that a minimal preprocessing stage is available if required.


1.3. Organization of the Thesis

Over the past 20 years extensive research has been conducted by psychophysicists, neuroscientists and engineers on various aspects of face recognition by humans and machines. In Chapter 2, we summarize the literature on both human and machine recognition of faces.

Chapter 3 introduces the proposed approach based on the Gabor wavelet representation of face images. The algorithm is explained explicitly.

The performance of our method is examined on four standard face databases with different characteristics. Simulation results and their comparisons to well-known face recognition methods are presented in Chapter 4.

In Chapter 5, concluding remarks are stated. Future work, which may follow this study, is also presented.


CHAPTER 2

PAST RESEARCH ON FACE RECOGNITION

The task of recognizing faces has attracted much attention both from neuroscientists and from computer vision scientists. This chapter reviews some of the well-known approaches from both of these fields.

2.1. Human Face Recognition: Perceptual and Cognitive Aspects

The major research issues of interest to neuroscientists include the human capacity for face recognition, the modeling of this capability, and the apparent modularity of face recognition. In this section, some findings reached as the result of experiments on the human face recognition system, and potentially relevant to the design of face recognition systems, will be summarized.

One of the basic issues that has been argued by several scientists is the existence of a dedicated face processing system [82, 3]. Physiological evidence indicates that the brain possesses specialized face recognition hardware in the form of face detector cells in the inferotemporal cortex and regions in the frontal right hemisphere; impairment in these areas leads to a syndrome known as prosopagnosia. Interestingly, prosopagnosics, although unable to recognize familiar faces, retain their ability to visually recognize non-face objects. As a result of many studies, scientists have come to the conclusion that face recognition is not like other object recognition [42].

Hence, the question is what features humans use for face recognition. The results of the related studies are very valuable in the algorithm design of some face recognition systems. It is interesting that when all facial features like the nose, mouth, eyes etc. are contained in an image, but in a different order than the ordinary one, recognition is not possible for humans. Explaining face perception as the result of holistic or feature analysis alone is not possible, since both are true: in humans, both global and local features are used in a hierarchical manner [82]. Local features provide a finer classification system for face recognition. Simulations show that the most difficult faces for humans to recognize are those which are neither attractive nor unattractive [4]. Distinctive faces are more easily recognized than typical ones. Information contained in low frequency bands is used to determine the sex of the individual, while the higher frequency components are used in recognition. The low frequency components contribute to the global description, while the high frequency components contribute to the finer details required in the identification task [8, 11, 13]. It has also been found that the upper part of the face is more useful for recognition than the lower part [82].
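As an illustration of this two-band view (low frequencies carrying the global description, high frequencies the identification detail), here is a minimal low-pass/high-pass split; the box blur is only a crude stand-in for whatever band decomposition the visual system actually performs, and the random array stands in for a grayscale face image:

```python
import numpy as np

def box_blur(img, k=5):
    """Separable box blur: a crude low-pass filter (illustration only)."""
    kernel = np.ones(k) / k
    # Convolve each row, then each column, keeping the original size.
    blurred = np.apply_along_axis(lambda r: np.convolve(r, kernel, mode="same"), 1, img)
    return np.apply_along_axis(lambda c: np.convolve(c, kernel, mode="same"), 0, blurred)

rng = np.random.default_rng(0)
face = rng.random((32, 32))   # stand-in for a grayscale face image

low = box_blur(face)          # coarse structure: "global description" band
high = face - low             # fine detail: "identification" band

# The two bands sum back to the original image.
assert np.allclose(low + high, face)
```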

In [42], Bruce describes an experiment realized by superimposing the low spatial frequency components of Margaret Thatcher's face on the high spatial frequency components of Tony Blair's face. Although when viewed close up only Tony Blair was seen, when viewed from a distance Blair disappears and Margaret Thatcher becomes visible. This demonstrates that the important information for recognizing familiar faces is contained within a particular range of spatial frequencies.

Another important finding is that the human face recognition system is disrupted by changes in lighting direction and also by changes of viewpoint. Although some scientists tend to explain the human face recognition system based on the derivation of 3D models of faces using shape-from-shading derivatives, it is difficult to understand why face recognition appears so viewpoint dependent [1]. The effects of lighting change on face identification and matching suggest that representations for face recognition are crucially affected by changes in low level image features.

Bruce and Langton found that negation (inverting both hue and luminance values of an image) badly affects the identification of familiar faces [124]. They also observed that negation has no significant effect on the identification and matching of surface images that lack any pigmented and textured features; this led them to attribute the negation effect to the alteration of the brightness information about pigmented areas. A negative image of a dark-haired Caucasian, for example, will appear to be a blonde with dark skin. Kemp et al. [125] showed that the hue values of these pigmented regions do not themselves matter for face identification. Familiar faces presented in hue-negated versions, with preserved luminance values, were recognized as well as those with original hue values maintained, though there was a decrement in recognition memory for pictures of faces when hue was altered in this way [126]. This suggests that episodic memory for pictures of unfamiliar faces can be sensitive to hue, though the representations of familiar faces seem not to be. This distinction between memory for pictures and memory for faces is important. It is clear that recognition of familiar and unfamiliar faces is not the same for humans. It is likely that unfamiliar faces are processed in order to recognize a picture, whereas familiar faces are fed into the face recognition system of the human brain. A detailed discussion of recognizing familiar and unfamiliar faces can be found in [41].

Young children typically recognize unfamiliar faces using unrelated cues such as glasses, clothes, hats, and hairstyle. By the age of twelve, these paraphernalia are usually reliably ignored. Curiously, when children as young as five years are asked to recognize familiar faces, they do quite well at ignoring paraphernalia. Several other interesting studies related to how children perceive inverted faces are summarized in [6, 7].

Humans recognize people from their own race better than people from another race. Humans may encode an average face; these averages may be different for different races, and recognition may suffer from prejudice and unfamiliarity with the class of faces from another race or gender [82]. The poor identification of other races is not a psychophysical problem but more likely a psychosocial one. One of the interesting results of the studies that quantify the role of gender in face recognition is that, in the Japanese population, the majority of women's facial features are more heterogeneous than men's features. It has also been found that white women's faces are slightly more variable than men's, but that the overall variation is small [9, 10].

    2.1.1. Discussion

The recognition of familiar faces plays a fundamental role in our social interactions. Humans are able to identify reliably a large number of faces, and psychologists are interested in understanding the perceptual and cognitive mechanisms at the base of the face recognition process. This research also informs the work of computer vision scientists.

We can summarize the findings of studies on the human face recognition system as follows:

1. The human capacity for face recognition is a dedicated process, not merely an application of the general object recognition process. Thus artificial face recognition systems should also be face specific.
2. Distinctive faces are more easily recognized than typical ones.
3. Both global and local features are used for representing and recognizing faces.
4. Humans recognize people from their own race better than people from another race. Humans may encode an average face.


5. Certain image transformations, such as intensity negation, strange viewpoint changes, and changes in lighting direction can severely disrupt human face recognition.

Using present technology it is impossible to completely model the human recognition system and reach its performance. However, the human brain has its shortcomings in the total number of persons that it can accurately remember. The benefit of a computer system would be its capacity to handle large datasets of face images.

The observations and findings about the human face recognition system are a good starting point for automatic face recognition methods. As mentioned above, an automated face recognition system should be face specific. It should effectively use features that discriminate a face from others and, moreover, as in caricatures, it should preferably amplify such distinctive characteristics of a face [5, 13].

The difference between recognition of familiar and unfamiliar faces must also be noticed. First of all, we should find out what makes us familiar with a face. Does seeing a face in many different conditions (different illuminations, rotations, expressions, etc.) make us familiar with that face, or can we become familiar with a face just by frequently looking at the same face image? Seeing a face in many different conditions is something related to training; however, the interesting point is how, using only the same 2D information, we can pass from unfamiliarity to familiarity. Methods which recognize faces from a single view should pay attention to this familiarity subject.


Some of the early scientists were inspired by watching bird flight and built their vehicles with mobile wings. Although a single underlying principle, the Bernoulli effect, explains both biological and man-made flight, we note that no modern aircraft has flapping wings. Designers of face recognition algorithms and systems should be aware of relevant psychophysics and neurophysiological studies, but should be prudent in using only those that are applicable or relevant from a practical/implementation point of view.

2.2. Automatic Face Recognition

Although humans perform face recognition in an effortless manner, the underlying computations within the human visual system are of tremendous complexity. The seemingly trivial task of finding and recognizing faces is the result of millions of years of evolution, and we are far from fully understanding how the brain performs this task.

To date, no complete solution has been proposed that allows the automatic recognition of faces in real images. In this section we will review existing face recognition systems in five categories: early methods, neural network approaches, statistical approaches, template based approaches, and feature based methods. Finally, the current state of the art of face recognition technology will be presented.


2.2.1. Representation, Matching and Statistical Decision

The performance of face recognition depends on the solution of two problems: representation and matching.

At an elementary level, the image of a face is a two dimensional (2-D) array of pixel gray levels,

x = \{\, x_{i,j},\ (i,j) \in S \,\},   (2.1)

where S is a square lattice. However, in some cases it is more convenient to express the face image, x, as a one-dimensional (1-D) column vector of concatenated rows of pixels,

x = [x_1, x_2, \ldots, x_n]^T,   (2.2)

where n = |S| is the total number of pixels in the image. Therefore x \in R^n, the n-dimensional Euclidean space.

For a given representation, two properties are important: discriminating power and efficiency, i.e. how far apart the faces are under the representation and how compact the representation is.

While many previous techniques represent faces in their most elementary forms of (2.1) or (2.2), many others use a feature vector, F(x) = [f_1(x), f_2(x), ..., f_m(x)]^T, where f_1(.), f_2(.), ..., f_m(.) are linear or nonlinear functionals. Feature-based representations are usually more efficient since generally m is much smaller than n.


A simple way to achieve good efficiency is to use an alternative orthonormal basis of R^n. Specifically, suppose e_1, e_2, ..., e_n form an orthonormal basis. Then x can be expressed as

x = Σ_{i=1}^{n} x̃_i e_i, (2.3)

where x̃_i = ⟨x, e_i⟩ (inner product), and x can be equivalently represented by x̃ = [x̃_1, x̃_2, ..., x̃_n]^T. Two examples of orthonormal bases are the natural basis used in (2.2), with e_i = [0,...,0,1,0,...,0]^T where the one is in the i-th position, and the Fourier basis, with e_i = n^{-1/2} [1, e^{j2πi/n}, e^{j2π2i/n}, ..., e^{j2π(n-1)i/n}]^T. If, for a given orthonormal basis, the x̃_i are small when i ≥ m, then the face vector x can be compressed into an m-dimensional vector, x̃ ≈ [x̃_1, x̃_2, ..., x̃_m]^T.

It is important to notice that an efficient representation does not necessarily have good discriminating power.
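The compression idea in (2.3) can be sketched numerically; a minimal NumPy illustration, assuming nothing beyond an arbitrary orthonormal basis built via QR decomposition (the dimensions are illustrative, not tied to face images):

```python
import numpy as np

rng = np.random.default_rng(0)

# Build an arbitrary orthonormal basis of R^n via QR decomposition;
# the columns of E are the basis vectors e_i of Eq. (2.3).
n, m = 16, 4
E, _ = np.linalg.qr(rng.normal(size=(n, n)))

x = rng.normal(size=n)
x_tilde = E.T @ x              # coefficients <x, e_i> in the new basis

# Keeping only the first m coefficients gives an m-dimensional code;
# x_hat reconstructs x from those m basis vectors alone.
x_hat = E[:, :m] @ x_tilde[:m]

# A change of orthonormal basis preserves norms (and hence distances).
assert np.isclose(np.linalg.norm(x_tilde), np.linalg.norm(x))
```

For a random basis the discarded coefficients are not small, which is exactly the point of the remark above: compression is only efficient for a basis (such as KL) that concentrates the energy in the first m coefficients.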

In the matching problem, an incoming face is recognized by identifying it with a prestored face. For example, suppose the input face is x and there are K prestored faces c_k, k = 1, 2, ..., K. One possibility is to assign x to c_{k_0} if

k_0 = arg min_{1≤k≤K} ||x − c_k||, (2.4)


where ||·|| represents the Euclidean distance in R^n. If c_k is normalized so that ||c_k|| = c for all k, the minimum distance matching in (2.4) simplifies to correlation matching

k_0 = arg max_{1≤k≤K} ⟨x, c_k⟩. (2.5)

Since distance and inner product are invariant to a change of orthonormal basis, minimum distance and correlation matching can be performed using any orthonormal basis and the recognition performance will be the same. To do this, simply replace x and c_k in (2.4) or (2.5) by x̃ and c̃_k. Similarly, (2.4) and (2.5) can also be used with feature vectors.
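As a concrete sketch of (2.4) and (2.5), with toy random vectors standing in for face templates (the sizes and noise level are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(1)

# K stored faces c_k as rows, all normalised to the same length,
# so that (2.4) and (2.5) pick the same match.
K, n = 5, 64
C = rng.normal(size=(K, n))
C /= np.linalg.norm(C, axis=1, keepdims=True)

x = C[2] + 0.05 * rng.normal(size=n)   # noisy copy of face 2

k_dist = np.argmin(np.linalg.norm(C - x, axis=1))   # minimum distance (2.4)
k_corr = np.argmax(C @ x)                           # correlation (2.5)
assert k_dist == k_corr
```

Because the templates share a common norm, minimizing ||x − c_k|| and maximizing ⟨x, c_k⟩ are equivalent, as the text states.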

Due to factors such as viewing angle, illumination, facial expression, distortion, and noise, the face images for a given person can have random variations and therefore are better modeled as a random vector. In this case, maximum likelihood (ML) matching is often used,

k_0 = arg max_{1≤k≤K} log p(x | c_k), (2.6)

where p(x | c_k) is the density of x conditioned on its being the k-th person. The ML criterion minimizes the probability of recognition error when, a priori, the incoming face is equally likely to be that of any of the K persons. Furthermore, if we assume that variations in face vectors are caused by additive white Gaussian noise (AWGN),

x_k = c_k + w_k, (2.7)


where w_k is a zero-mean AWGN with power σ², then the ML matching becomes the minimum distance matching of (2.4).
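This equivalence is easy to check numerically; a small sketch under the AWGN assumption (toy vectors, and a hypothetical noise level sigma):

```python
import numpy as np

rng = np.random.default_rng(2)
K, n, sigma = 4, 32, 0.1
C = rng.normal(size=(K, n))            # toy class templates c_k
x = C[1] + sigma * rng.normal(size=n)  # observation from class 1

# Gaussian log-likelihood log p(x|c_k) up to additive constants
# is -||x - c_k||^2 / (2 sigma^2), so argmax ML = argmin distance.
dists = np.linalg.norm(C - x, axis=1)
loglik = -dists**2 / (2 * sigma**2)
k_ml = np.argmax(loglik)               # ML matching (2.6)
k_md = np.argmin(dists)                # minimum distance matching (2.4)
assert k_ml == k_md
```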

2.2.2. Early face recognition methods

The initial work in automatic face processing dates back to the end of the 19th century, as reported by Benson and Perrett [39]. In his lecture on personal identification at the Royal Institution on 25 May 1888, Sir Francis Galton [53], an English scientist, explorer and a cousin of Charles Darwin, explained that he had frequently chafed under the sense of inability to verbally explain hereditary resemblance and types of features. In order to relieve himself from this embarrassment, he took considerable trouble and made many experiments. He described how French prisoners were identified using four primary measures (head length, head breadth, foot length and middle digit length of the foot and hand, respectively). Each measure could take one of three possible values (large, medium, or small), giving a total of 81 possible primary classes. Galton felt it would be advantageous to have an automatic method of classification. For this purpose, he devised an apparatus, which he called a mechanical selector, that could be used to compare measurements of face profiles. Galton reported that most of the measures he had tried were fairly efficient.

Early face recognition methods were mostly feature based. Galton's proposed method, and a lot of work to follow, focused on detecting important facial features such as eye corners, mouth corners, nose tip, etc. By measuring the relative distances between these facial features, a feature vector can be constructed to describe each face. By comparing the feature vector of an unknown face to the feature vectors of known faces from a database, the closest match can be determined.

One of the earliest works is reported by Bledsoe [84]. In this system, a human operator located the feature points on the face and entered their positions into the computer. Given a set of feature point distances of an unknown person, nearest neighbor or other classification rules were used for identifying the test face. Since feature extraction is done manually, this system could accommodate wide variations in head rotation, tilt, image quality, and contrast.

In Kanade's work [62], a series of fiducial points is detected using relatively simple image processing tools (edge maps, signatures, etc.) and the Euclidean distances are then used as a feature vector to perform recognition. The face feature points are located in two stages. The coarse-grain stage simplifies the succeeding differential operation and feature finding algorithms. Once the eyes, nose and mouth are approximately located, more accurate information is extracted by confining the processing to four smaller groups, scanning at higher resolution, and using the best beam intensity for the region. The four regions are the left and right eye, nose, and mouth. The beam intensity is based on the local area histogram obtained in the coarse-grain stage. A set of 16 facial parameters, which are ratios of distances, areas, and angles to compensate for the varying size of the pictures, is extracted. To eliminate scale and dimension differences, the components of the resulting vector are normalized. A simple distance measure is used to check similarity between two face images.


2.2.3. Statistical approaches to face recognition

2.2.3.1. Karhunen-Loeve Expansion Based Methods

2.2.3.1.1. Eigenfaces

A face image, I(x,y), of size N×N is simply a matrix with each element representing the intensity at that particular pixel. I(x,y) may also be considered as a vector of length N², or a single point in an N²-dimensional space. So a 128×128 pixel image can be represented as a point in a 16,384-dimensional space. Facial images in general will occupy only a small sub-region of this high dimensional image space and thus are not optimally represented in this coordinate system.

As mentioned in Section 2.2.1, alternative orthonormal bases are often used to compress face vectors. One such basis is the Karhunen-Loeve (KL) basis.

The Eigenfaces method proposed by Turk and Pentland [20] is based on the Karhunen-Loeve expansion and is motivated by the earlier work of Sirovitch and Kirby [63] for efficiently representing pictures of faces. Eigenface recognition derives its name from the German prefix eigen, meaning own or individual. The Eigenface method of facial recognition is considered the first working facial recognition technology.

The eigenfaces method presented by Turk and Pentland finds the principal components (Karhunen-Loeve expansion) of the face image distribution, or the eigenvectors of the covariance matrix of the set of face images. These eigenvectors can be thought of as a set of features which together characterize the variation between face images.


Let a face image I(x,y) be a two-dimensional array of intensity values, or a vector of dimension n. Let the training set of images be I_1, I_2, ..., I_N. The average face image of the set is defined by Ψ = (1/N) Σ_{i=1}^{N} I_i. Each face differs from the average by the vector Φ_i = I_i − Ψ. This set of very large vectors is subject to principal component analysis, which seeks a set of K orthonormal vectors v_k, k = 1, ..., K, and their associated eigenvalues λ_k which best describe the distribution of the data.

Vectors v_k and scalars λ_k are the eigenvectors and eigenvalues of the covariance matrix

C = (1/N) Σ_{i=1}^{N} Φ_i Φ_i^T = A A^T, (2.9)

where the matrix A = [Φ_1, Φ_2, ..., Φ_N]. Finding the eigenvectors of the n×n matrix C is computationally intensive. However, the eigenvectors of C can be determined by first finding the eigenvectors of a much smaller matrix of size N×N and taking a linear combination of the resulting vectors.

C v_k = λ_k v_k, (2.10)

v_k^T C v_k = λ_k v_k^T v_k. (2.11)

Since the eigenvectors v_k are orthogonal and normalized, v_k^T v_k = 1, so

v_k^T C v_k = λ_k. (2.12)

Substituting (2.9) into (2.12),

λ_k = v_k^T C v_k = (1/N) Σ_{i=1}^{N} v_k^T Φ_i Φ_i^T v_k = (1/N) Σ_{i=1}^{N} (v_k^T Φ_i)² = (1/N) Σ_{i=1}^{N} (v_k^T I_i − v_k^T Ψ)² = var(v_k^T I). (2.13)

Thus eigenvalue λ_k represents the variance of the representative facial image set along the axis described by eigenvector v_k.

The space spanned by the eigenvectors v_k, k = 1, ..., K, corresponding to the largest K eigenvalues of the covariance matrix C, is called the face space. The eigenvectors of matrix C, which are called eigenfaces, form a basis set for the face images. A new face image I is transformed into its eigenface components (projected onto the face space) by

w_k = ⟨v_k, Φ⟩ = v_k^T (I − Ψ), (2.14)

for k = 1, ..., K. The projections w_k form the feature vector Ω = [w_1, w_2, ..., w_K]^T, which describes the contribution of each eigenface in representing the input image.

Given a set of face classes E_q and the corresponding feature vectors Ω_q, the simplest method for determining which face class provides the best description of an input face image is to find the face class that minimizes the Euclidean distance in the feature space:

ε_q = ||Ω − Ω_q||. (2.15)

A face is classified as belonging to class E_q when the minimum ε_q is below some threshold and also

E_q = arg min_q {ε_q}. (2.16)

Otherwise, the face is classified as unknown.
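The whole eigenface pipeline, including the small-matrix trick of (2.10)-(2.13) and the matching rule (2.15)-(2.16), can be sketched as follows; random vectors stand in for face images, and the dimensions are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(3)

# Toy data: N "face" vectors of dimension n (in practice n >> N).
N, n, Kdim = 8, 100, 4
I = rng.normal(size=(N, n))

psi = I.mean(axis=0)                # average face (Psi)
A = (I - psi).T                     # n x N matrix of difference vectors Phi_i

# Small-matrix trick: eigenvectors of the N x N matrix A^T A, multiplied
# by A, give eigenvectors of the n x n covariance A A^T.
lam, W = np.linalg.eigh(A.T @ A)    # eigenvalues in ascending order
V = A @ W[:, ::-1][:, :Kdim]        # top-K eigenfaces as columns
V /= np.linalg.norm(V, axis=0)      # normalise each eigenface

def project(face):
    """Feature vector Omega of eigenface coefficients, Eq. (2.14)."""
    return V.T @ (face - psi)

omegas = np.array([project(f) for f in I])
test = I[5] + 0.01 * rng.normal(size=n)            # noisy copy of face 5
q = np.argmin(np.linalg.norm(omegas - project(test), axis=1))  # (2.15)-(2.16)
assert q == 5
```

A real system would additionally apply the distance threshold of (2.16) to reject unknown faces; that step is omitted here for brevity.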

Turk and Pentland [20] test how their algorithm performs under changing conditions by varying the illumination, size and orientation of the faces. They found that their system had the most trouble with faces scaled larger or smaller than the original data set. To overcome this problem they suggest using a multi-resolution method in which faces are compared to eigenfaces of varying sizes to compute the best match. They also note that the image background can have a significant effect on performance, which they minimize by multiplying input images with a 2-D Gaussian to diminish the contribution of the background and highlight the central facial features. The system performs face recognition in real time. Turk and Pentland's paper was seminal in the field of face recognition, and their method is still quite popular due to its ease of implementation.

Murase and Nayar [85] extended the capabilities of the eigenface method to general 3D-object recognition under different illumination and viewing conditions. Given N object images taken under P views and L different illumination conditions, a universal image set is built which contains all the available data. In this way a single parametric space describes the object identity as well as the viewing or illumination conditions. The eigenface decomposition of this space was used for feature extraction and classification. However, in order to ensure discrimination between different objects, the number of eigenvectors used in this method was increased compared to the classical eigenface method.

Later, based on the eigenface decomposition, Pentland et al. [86] developed a view-based eigenspace approach for human face recognition under general viewing conditions. Given N individuals under P different views, recognition is performed over P separate eigenspaces, each capturing the variation of the individuals in a common view. The view-based approach is essentially an extension of the eigenface technique to multiple sets of eigenvectors, one for each face orientation. In order to deal with multiple views, in the first stage of this approach, the orientation of the test face is determined and the eigenspace which best describes the input image is selected. This is accomplished by calculating the residual description error (distance from feature space: DFFS) for each view space. Once the proper view is determined, the image is projected onto the appropriate view space and then recognized. The view-based approach is computationally more intensive than the parametric approach because P different sets of V projections are required (V is the number of eigenfaces selected to represent each eigenspace). Naturally, the view-based representation can yield a more accurate representation of the underlying geometry.


2.2.3.1.2. Face Recognition using Eigenfaces

There are two main approaches to recognizing faces by using eigenfaces.

Appearance model:

1- A database of face images is collected.

2- A set of eigenfaces is generated by performing principal component analysis (PCA) on the face images. Approximately 100 eigenvectors are enough to code a large database of faces.

3- Each face image is represented as a linear combination of the eigenfaces.

4- A given test image is approximated by a combination of eigenfaces. A distance measure is used to compare the similarity between two images.

Figure 2.1: Appearance model



Figure 2.2: Discriminative model

Discriminative model:

1- Two data sets, Ω_I and Ω_E, are obtained by computing intrapersonal differences (by matching two views of each individual in the data set) and extrapersonal differences (by matching different individuals in the data set), respectively.

2- Two data sets of eigenfaces are generated by performing PCA on each class.



3- The similarity score between two images is derived by calculating S = P(Ω_I | Δ), where Δ is the difference between a pair of images. Two images are determined to be of the same individual if S > 0.5.

Although the recognition performance is lower than that of the correlation method, the substantial reduction in computational complexity of the eigenface method makes this method very attractive. The recognition rates increase with the number of principal components (eigenfaces) used and, in the limit, as more principal components are used, performance approaches that of correlation. In [20] and [86], the authors reported that the performance levels off at about 45 principal components.

It has been shown that removing the first three principal components results in better recognition performance (the authors reported an error rate of 20% when using the eigenface method with 30 principal components on a database strongly affected by illumination variations, and only a 10% error rate after removing the first three components). The recognition rates in this case were better than the recognition rates obtained using the correlation method. This was argued based on the fact that the first components are more influenced by variations in lighting conditions.

2.2.3.1.3. Eigenfeatures

Pentland et al. [86] discussed the use of facial features for face recognition. This can be either a modular or a layered representation of the face, where a coarse (low-resolution) description of the whole head is augmented by additional (high-resolution) details in terms of salient facial features. The eigenface technique was extended to detect facial features. For each of the facial features, a feature space is built by selecting the most significant eigenfeatures (eigenvectors corresponding to the largest eigenvalues of the feature correlation matrix).

After the facial features in a test image are extracted, a score of similarity between the detected features and the features corresponding to the model images is computed. A simple approach for recognition is to compute a cumulative score in terms of equal contribution by each of the facial feature scores. More elaborate weighting schemes can also be used for classification. Once the cumulative score is determined, a new face is classified such that this score is maximized.

The performance of the eigenfeatures method is close to that of eigenfaces; however, a combined representation of eigenfaces and eigenfeatures shows higher recognition rates.

2.2.3.1.4. The Karhunen-Loeve Transform of the Fourier Spectrum

Akamatsu et al. [87] illustrated the effectiveness of the Karhunen-Loeve Transform of the Fourier Spectrum in the Affine Transformed Target Image (KL-FSAT) for face recognition. First, the original images were standardized with respect to position, size, and orientation using an affine transform so that three reference points satisfy a specific spatial arrangement. The position of these points is related to the position of some significant facial features. The eigenface method discussed in Section 2.2.3.1.1 is applied to the magnitude of the Fourier spectrum of the standardized images (KL-FSAT). Due to the shift invariance property of the magnitude of the Fourier spectrum, KL-FSAT performed better than the classical eigenfaces method under variations in head orientation and shifting. However, the computational complexity of the KL-FSAT method is significantly greater than that of the eigenface method due to the computation of the Fourier spectrum.

2.2.3.2. Linear Discriminant Methods - Fisherfaces

In [88], [89], the authors proposed a new method for reducing the dimensionality of the feature space by using Fisher's Linear Discriminant (FLD) [90]. The FLD uses the class membership information and develops a set of feature vectors in which variations of different faces are emphasized, while different instances of faces due to illumination conditions, facial expression and orientation are de-emphasized.

2.2.3.2.1. Fisher's Linear Discriminant

Given c classes with a priori probabilities P_i, let N_i be the number of samples of class i, i = 1, ..., c. Then the following positive semi-definite scatter matrices are defined:

S_B = Σ_{i=1}^{c} P_i (μ_i − μ)(μ_i − μ)^T, (2.17)

S_w = Σ_{i=1}^{c} P_i Σ_{j=1}^{N_i} (x_j^i − μ_i)(x_j^i − μ_i)^T, (2.18)

where x_j^i denotes the j-th n-dimensional sample vector belonging to class i, and μ_i is the mean of class i:

μ_i = (1/N_i) Σ_{j=1}^{N_i} x_j^i, (2.19)

and μ is the overall mean of the sample vectors:

μ = (1/N) Σ_{i=1}^{c} Σ_{j=1}^{N_i} x_j^i, with N = Σ_{i=1}^{c} N_i. (2.20)

S_w is the within-class scatter matrix and represents the average scatter of the sample vectors of class i; S_B is the between-class scatter matrix and represents the scatter of the mean μ_i of class i around the overall mean vector μ. If S_w is nonsingular, Linear Discriminant Analysis (LDA) selects a matrix V_opt ∈ R^{n×k} with orthonormal columns which maximizes the ratio of the determinant of the between-class scatter matrix of the projected samples to that of the within-class scatter matrix of the projected samples,

V_opt = arg max_V ( |V^T S_B V| / |V^T S_w V| ) = [v_1, v_2, ..., v_k], (2.21)


where {v_i | i = 1, ..., k} is the set of generalized eigenvectors of S_B and S_w corresponding to the set of decreasing eigenvalues {λ_i | i = 1, ..., k}, i.e.

S_B v_i = λ_i S_w v_i. (2.22)

As shown in [91], the upper bound of k is c − 1. The matrix V_opt describes the Optimal Linear Discriminant Transform, or the Foley-Sammon Transform. While the Karhunen-Loeve Transform performs a rotation onto a set of axes along which the projections of the sample vectors differ most in the autocorrelation sense, the Linear Discriminant Transform performs a rotation onto a set of axes [v_1, v_2, ..., v_k] along which the projections of the sample vectors show maximum discrimination.

2.2.3.2.2. Face Recognition Using Linear Discriminant Analysis

Let a training set of N face images represent c different subjects. The face images in the training set are two-dimensional arrays of intensity values, represented as vectors of dimension n. Different instances of a person's face (variations in lighting, pose or facial expression) are defined to be in the same class, and faces of different subjects are defined to be from different classes.

The scatter matrices S_B and S_w are defined in Equations (2.17) and (2.18). However, the matrix V_opt cannot be found directly from Equation (2.21), because in general the matrix S_w is singular. This stems from the fact that the rank of S_w is at most N − c, while in general the number of pixels in each image, n, is much larger than the number of images in the learning set, N. Many solutions have been presented in the literature to overcome this problem [92, 93]. In [88], the authors propose a method which is called Fisherfaces. The problem of S_w being singular is avoided by projecting the image set onto a lower dimensional space so that the resulting within-class scatter matrix is nonsingular. This is achieved by using Principal Component Analysis (PCA) to reduce the dimension of the feature space to N − c and then applying the standard linear discriminant defined in Equation (2.21) to reduce the dimension to c − 1. More formally, V_opt is given by

V_opt = V_fld V_pca, (2.23)

where

V_pca = arg max_V |V^T C V|, (2.24)

and

V_fld = arg max_V ( |V^T V_pca^T S_B V_pca V| / |V^T V_pca^T S_w V_pca V| ), (2.25)

where C is the covariance matrix of the set of training images, computed from Equation (2.9). The columns of V_opt are orthogonal vectors which are called Fisherfaces. Unlike the eigenfaces, the Fisherfaces do not correspond to face-like patterns. All example face images E_q, q = 1, ..., Q, in the example set S are projected onto the vectors corresponding to the columns of V_fld, and a set of features is extracted for each example face image. These feature vectors are used directly for classification.
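A minimal sketch of the two-stage Fisherface procedure (PCA down to N − c dimensions, then FLD), assuming toy Gaussian classes rather than real face data and equal class priors:

```python
import numpy as np

rng = np.random.default_rng(4)

# Toy set: c classes, Ni samples each, in n dimensions (n > N, so Sw is singular).
c, Ni, n = 3, 4, 50
N = c * Ni
means = rng.normal(scale=3.0, size=(c, n))
X = np.vstack([m + rng.normal(size=(Ni, n)) for m in means])
y = np.repeat(np.arange(c), Ni)

# Stage 1: PCA down to N - c dimensions so that the within-class
# scatter becomes nonsingular (the Fisherface trick of [88]).
mu = X.mean(axis=0)
_, _, Vt = np.linalg.svd(X - mu, full_matrices=False)
Vpca = Vt[: N - c].T
Z = (X - mu) @ Vpca

# Stage 2: standard FLD, Eq. (2.21), in the reduced space.
d = Z.shape[1]
Sw = np.zeros((d, d))
Sb = np.zeros((d, d))
for i in range(c):
    Zi = Z[y == i]
    mi = Zi.mean(axis=0)
    Sw += (Zi - mi).T @ (Zi - mi)
    Sb += Ni * np.outer(mi - Z.mean(axis=0), mi - Z.mean(axis=0))

# Generalized eigenproblem Sb v = lambda Sw v; at most c-1 useful directions.
lam, V = np.linalg.eig(np.linalg.solve(Sw, Sb))
Vfld = np.real(V[:, np.argsort(-np.real(lam))[: c - 1]])
features = Z @ Vfld
assert features.shape == (N, c - 1)
```

Here the scatter matrices use raw sums (equal priors absorbed into a common factor); the resulting c − 1 = 2 features are what a nearest-neighbour classifier would operate on.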


Having extracted a compact and efficient feature set, the recognition task can be performed by using the Euclidean distance in the feature space. However, in [89], a weighted mean absolute/square distance is proposed as the measure in the feature space, with weights obtained based on the reliability of each decision axis:

D(Ω, Ω^q) = Σ_{v=1}^{K} (ω_v − ω_v^q)² / σ_v². (2.26)

Therefore, for a given face image, the best match E_0 is given by

E_0 = arg min_{Ω^q ∈ S} { D(Ω, Ω^q) }. (2.27)

The confidence measure is defined as

Conf(Ω) = 1 − D(Ω, Ω^0) / D(Ω, Ω^1), (2.28)

where E_1 is the second best candidate.

In [87], Akamatsu et al. applied LDA to the Fourier spectrum of the intensity image. The results reported by the authors showed that LDA in the Fourier domain is significantly more robust to variations in lighting than LDA applied directly to the intensity images. However, the computational complexity of this method is significantly greater than that of the classical Fisherface method due to the computation of the Fourier spectrum.


2.2.3.3. Singular Value Decomposition Methods

2.2.3.3.1. Singular value decomposition

Methods based on the Singular Value Decomposition (SVD) for face recognition use the general result stated by the following theorem:

Theorem: Let I_{p×q} be a real rectangular matrix with rank(I) = r. Then there exist two orthonormal matrices U_{p×p}, V_{q×q} and a diagonal matrix Σ_{p×q} such that

I = U Σ V^T = Σ_{i=1}^{r} σ_i u_i v_i^T, (2.29)

where

U = (u_1, u_2, ..., u_r, u_{r+1}, ..., u_p),
V = (v_1, v_2, ..., v_r, v_{r+1}, ..., v_q),
Σ = diag(σ_1, σ_2, ..., σ_r, 0, ..., 0), with σ_1 ≥ σ_2 ≥ ... ≥ σ_r > 0.

The σ_i², i = 1, ..., r, are the nonzero eigenvalues of I I^T and I^T I, and u_i, v_j, i = 1, ..., p, j = 1, ..., q, are the eigenvectors corresponding to the eigenvalues of I I^T and I^T I, respectively.

2.2.3.3.2. Face Recognition Using Singular Value Decomposition

Let a face image I(x,y) be a two-dimensional (m×n) array of intensity values and [σ_1, σ_2, ..., σ_r] be its singular value (SV) vector. In [93], Hong revealed the importance of using the SVD for human face recognition by proving several important properties of the SV vector, such as: the stability of the SV vector to small perturbations caused by stochastic variation in the intensity image, the proportional variance of the SV vector to proportional variance of the pixels in the intensity image, and the invariance of the SV feature vector to rotation, translation and mirror transforms. The above properties of the SV vector provide the theoretical basis for using singular values as image features. However, it has been shown that compressing the original SV vector into a low dimensional space by means of various mathematical transforms leads to higher recognition performance. Among the various transformations for compressing dimensionality, the Foley-Sammon transform based on the Fisher criterion, i.e. optimal discriminant vectors, is the most popular one. Given N face images, which represent c different subjects, the SV vectors are extracted from each image. According to Equations (2.17) and (2.18), the scatter matrices S_B and S_w of the SV vectors are constructed. It has been shown that it is difficult to obtain the optimal discriminant vectors in the case of a small number of samples, i.e. when the number of samples is less than the dimensionality of the SV vector, because the scatter matrix S_w is singular in this case. Many solutions have been proposed to overcome this problem. Hong [93] circumvented the problem by adding a small singular value perturbation to S_w, resulting in S_w(t) such that S_w(t) becomes nonsingular. However, the perturbation of S_w introduces an arbitrary parameter, and the range to which the authors restricted the perturbation is not appropriate to ensure that the inversion of S_w(t) is numerically stable. Cheng et al. [92] solved the problem by rank decomposition of S_w. This is a generalization of Tian's method [94], who substituted S_w^{-1} by the positive pseudo-inverse S_w^{+}.


After the set of optimal discriminant vectors {v_1, v_2, ..., v_k} has been extracted, the feature vectors are obtained by projecting the SV vectors onto the space spanned by {v_1, v_2, ..., v_k}.

When a test image is acquired, its SV vector is projected onto the space spanned by {v_1, v_2, ..., v_k}, and classification is performed in the feature space by measuring the Euclidean distance in this space and assigning the test image to the class of images for which the minimum distance is achieved.
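Two of the SV-vector properties cited above (mirror/transpose invariance and stability under small perturbations) can be checked directly; a sketch on a toy 8×8 "image":

```python
import numpy as np

rng = np.random.default_rng(5)

# Toy 8x8 image; its singular values form the SV feature vector.
img = rng.normal(size=(8, 8))
sv = np.linalg.svd(img, compute_uv=False)   # descending singular values

# Transposition (a mirror along the diagonal) leaves the SV vector unchanged.
sv_t = np.linalg.svd(img.T, compute_uv=False)
assert np.allclose(sv, sv_t)

# Small perturbations of the image perturb the singular values by at most
# the spectral norm of the perturbation (Weyl's inequality).
sv_pert = np.linalg.svd(img + 1e-6 * rng.normal(size=(8, 8)),
                        compute_uv=False)
assert np.max(np.abs(sv - sv_pert)) < 1e-4
```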

Another method to reduce the feature space of the SV feature vectors was described by Cheng et al. [95]. The training set used consisted of a small sample of face images of the same person. If I_j^i represents the j-th face image of person i, then the average image of person i is given by (1/N) Σ_{j=1}^{N} I_j^i. Eigenvalues and eigenvectors are determined for this average image using the SVD. The eigenvalues are thresholded to disregard the values close to zero. Average eigenvectors (called feature vectors) for all the average face images are calculated. A test image is then projected onto the space spanned by the eigenvectors. The Frobenius norm is used as a criterion to determine to which person the test image belongs.

2.2.4. Hidden Markov Model Based Methods

Hidden Markov Models (HMMs) are a set of statistical models used to characterize the statistical properties of a signal. Rabiner [69][96] provides an extensive and complete tutorial on HMMs. An HMM is made of two interrelated processes:

- an underlying, unobservable Markov chain with a finite number of states, a state transition probability matrix and an initial state probability distribution;

- a set of probability density functions associated with each state.

The elements of an HMM are:

N, the number of states in the model. If S is the set of states, then S = {S_1, S_2, ..., S_N}. The state of the model at time t is given by q_t ∈ S, 1 ≤ t ≤ T, where T is the length of the observation sequence (number of frames).

M, the number of different observation symbols. If V is the set of all possible observation symbols (also called the codebook of the model), then V = {V_1, V_2, ..., V_M}.

A, the state transition probability matrix, A = {a_ij}, where

a_ij = P[q_t = S_j | q_{t−1} = S_i], 1 ≤ i, j ≤ N, (2.30)

0 ≤ a_ij ≤ 1, Σ_{j=1}^{N} a_ij = 1, 1 ≤ i ≤ N. (2.31)

B, the observation symbol probability matrix, B = {b_j(k)}, where

b_j(k) = P[O_t = v_k | q_t = S_j], 1 ≤ j ≤ N, 1 ≤ k ≤ M, (2.32)

and O_t is the observation symbol at time t.

π, the initial state distribution, π = {π_i}, where

π_i = P[q_1 = S_i], 1 ≤ i ≤ N. (2.33)

Using a shorthand notation, an HMM is defined as


λ = (A, B, π). (2.34)

The above characterization corresponds to a discrete HMM, where the observations are characterized as discrete symbols chosen from a finite alphabet V = {v_1, v_2, ..., v_M}. In a continuous density HMM, the states are characterized by continuous observation density functions. The most general representation of the model probability density function (pdf) is a finite mixture of the form

b_i(O) = Σ_{k=1}^{M} c_ik N(O, μ_ik, U_ik), 1 ≤ i ≤ N, (2.35)

where c_ik is the mixture coefficient for the k-th mixture in state i. Without loss of generality, N(O, μ_ik, U_ik) is assumed to be a Gaussian pdf with mean vector μ_ik and covariance matrix U_ik.

HMMs have been used extensively for speech recognition, where data is naturally one-dimensional (1-D) along the time axis. However, an equivalent fully connected two-dimensional HMM would lead to a very hard computational problem [97]. Attempts have been made to use multi-model representations that lead to pseudo 2-D HMMs [98]. These models are currently used in character recognition [99][100].


Figure 2.3: Image sampling technique for HMM recognition

In [101], Samaria et al. proposed the use of the 1-D continuous HMM for face recognition. Assuming that each face is in an upright, frontal position, features will occur in a predictable order. This ordering suggests the use of a top-to-bottom model, where only transitions between adjacent states in a top-to-bottom manner are allowed [102]. The states of the model correspond to the facial features forehead, eyes, nose, mouth and chin [103]. The observation sequence O is generated from an X×Y image using an X×L sampling window with an X×M pixel overlap (Figure 2.3). Each observation vector is a block of L lines, and there is an M-line overlap between successive observations. The overlapping allows the features to be captured in a manner which is independent of vertical position, while a disjoint partitioning of the image could result in the truncation of features occurring across block boundaries. In [104], the effect of different sampling parameters has been discussed. With no overlap, if a small height of the sampling window is used, the segmented data do not correspond to significant facial features. However, as the window height increases, there is a higher probability of cutting across the features.
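The overlapping-block sampling can be sketched as follows; here the rows of the array play the role of image lines, and L and M are the window height and overlap (the function name is hypothetical):

```python
import numpy as np

def sample_observations(image, L, M):
    """Split an image into vertical blocks of L lines with an M-line
    overlap between successive blocks (cf. Figure 2.3)."""
    Y = image.shape[0]              # number of image lines
    step = L - M                    # lines advanced per observation
    blocks = [image[t:t + L].ravel()
              for t in range(0, Y - L + 1, step)]
    return np.array(blocks)

img = np.arange(10 * 4).reshape(10, 4)    # 10 lines of width 4
O = sample_observations(img, L=4, M=2)
# windows cover lines 0-3, 2-5, 4-7, 6-9: four overlapping observations
assert O.shape == (4, 16)
```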

Given c face images for each subject of the training set, the goal of the training stage is to optimize the parameters λi = (A, B, π) to best describe the observations O = {o1, o2, ..., oT}, in the sense of maximizing P(O|λi). The general HMM training scheme is illustrated in Figure 2.4 and is a variant of the K-means iterative procedure for clustering data:

1. The training images are collected for each subject in the database and are sampled to generate the observation sequences.

2. A common prototype (state) model is constructed with the purpose of specifying the number of states in the HMM and the allowed state transitions, A (model initialization).

3. A set of initial parameter values is computed iteratively using the training data and the prototype model. The goal of this stage is to find a good estimate for the observation model probability matrix B. In [96], it has been shown that a good initial estimate of the parameters is essential for rapid and proper convergence (to the global maximum of the likelihood function) of the re-estimation formulas. On the first cycle, the data is uniformly segmented, matched with each model state, and the initial model parameters are extracted. On successive cycles, the set of training observation sequences is segmented into states via the Viterbi algorithm [50]. The result of segmenting each of the training sequences is, for each of the N states, a maximum likelihood estimate of


the set of observations that occur within each state according to the current model.

4. Following the Viterbi segmentation, the model parameters are re-estimated using the Baum-Welch re-estimation procedure. This procedure adjusts the model parameters so as to maximize the probability of observing the training data, given each corresponding model.

5. The resulting model is then compared to the previous model (by computing a distance score that reflects the statistical similarity of the HMMs). If the model distance score exceeds a threshold, then the old model λ is replaced by the new model λ~, and the overall training loop is repeated. If the model distance score falls below the threshold, then model convergence is assumed and the final parameters are saved.
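The loop in steps 1-5 can be illustrated with a simplified, discrete-observation stand-in (a segmental k-means style sketch: the Baum-Welch re-estimation of step 4 is replaced by direct count-based re-estimation, and all names and toy data are ours, not from the thesis):

```python
import numpy as np

def viterbi(seq, A, B):
    """Most likely state path for a discrete top-to-bottom HMM."""
    n_states = A.shape[0]
    logA, logB = np.log(A), np.log(B)
    delta = np.full(n_states, -np.inf)
    delta[0] = logB[0, seq[0]]                  # must start in the top state
    psi = []
    for o in seq[1:]:
        scores = delta[:, None] + logA          # scores[i, j]: i -> j
        psi.append(scores.argmax(axis=0))
        delta = scores.max(axis=0) + logB[:, o]
    path = [int(delta.argmax())]
    for back in reversed(psi):
        path.append(int(back[path[-1]]))
    return np.array(path[::-1])

def train_top_bottom_hmm(seqs, n_states, n_symbols, n_iter=10):
    """Steps 1-5 in segmental k-means form: uniform segmentation,
    count-based estimation of A and B, Viterbi re-segmentation, repeated
    until the segmentation stops changing (the convergence test)."""
    labels = [np.minimum(np.arange(len(s)) * n_states // len(s),
                         n_states - 1) for s in seqs]      # uniform split
    A = B = None
    for _ in range(n_iter):
        B = np.full((n_states, n_symbols), 1e-3)           # smoothed counts
        A = np.full((n_states, n_states), 1e-6)
        for s, lab in zip(seqs, labels):
            for o, q in zip(s, lab):
                B[q, o] += 1.0
            for q0, q1 in zip(lab[:-1], lab[1:]):
                A[q0, q1] += 1.0
        B /= B.sum(axis=1, keepdims=True)
        A /= A.sum(axis=1, keepdims=True)
        new_labels = [viterbi(s, A, B) for s in seqs]      # re-segmentation
        if all((n == o).all() for n, o in zip(new_labels, labels)):
            break                                          # converged
        labels = new_labels
    return A, B

seqs = [np.array([0, 0, 1, 1, 2, 2]) for _ in range(3)]    # toy sequences
A, B = train_top_bottom_hmm(seqs, n_states=3, n_symbols=3)
print(B.argmax(axis=1))    # each state locks onto one observation symbol
```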

Recognition is carried out by matching the test image against each of the trained models (Figure 2.5). In order to achieve this, the image is converted to an observation sequence and then the model likelihoods P(Otest|λi) are computed for each λi, i = 1, ..., c. The model with the highest likelihood reveals the identity of the unknown face, as

V = arg max 1≤i≤c [ P(Otest|λi) ]. (2.36)
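The rule of Eq. (2.36) can be sketched as a forward-algorithm likelihood evaluation followed by an arg max over the per-subject models (illustrative Python with discrete observations; the function names and toy models are ours):

```python
import numpy as np

def log_likelihood(seq, A, B, pi):
    """Forward algorithm for log P(O | lambda) with lambda = (A, B, pi)."""
    alpha = np.log(pi) + np.log(B[:, seq[0]])
    for o in seq[1:]:
        m = alpha.max()                               # log-sum-exp trick
        alpha = m + np.log(np.exp(alpha - m) @ A) + np.log(B[:, o])
    m = alpha.max()
    return m + np.log(np.exp(alpha - m).sum())

def identify(seq, models):
    """Eq. (2.36): the model with the highest likelihood wins."""
    return int(np.argmax([log_likelihood(seq, *lam) for lam in models]))

A = np.array([[0.9, 0.1], [0.1, 0.9]])
pi = np.array([0.5, 0.5])
B0 = np.array([[0.9, 0.1], [0.9, 0.1]])   # subject 0: mostly emits symbol 0
B1 = np.array([[0.1, 0.9], [0.1, 0.9]])   # subject 1: mostly emits symbol 1
models = [(A, B0, pi), (A, B1, pi)]
print(identify(np.array([0, 0, 0, 0]), models))    # -> 0
```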


Figure 2.4: HMM training scheme

The HMM based method showed significantly better performance for face recognition compared to the eigenface method. This is due to the fact that the HMM based method offers a solution to facial feature detection as well as face recognition.

However, the 1-D continuous HMM is computationally more complex than the eigenface method. A solution for reducing the running time of this method is the use of discrete HMMs. Extremely encouraging preliminary results (error rates below 5%) were reported in [105] when pseudo 2-D HMMs are used. Furthermore, the authors suggested that a Fourier representation of the images can lead to better recognition performance, as frequency and frequency-space representations can lead to better data separation.



Figure 2.5: HMM recognition scheme

2.2.5. Neural Networks Approach

In principle, the popular back-propagation (BP) neural network [106] can be trained to recognize face images directly. However, such a network can be very complex and difficult to train. A typical image recognition network requires N = m x n input neurons, one for each of the pixels in an m x n image. For example, if the images are 128 x 128, the number of inputs of the network would be 16,384. In order to reduce the complexity, Cottrell and Fleming [107] used two BP nets (Figure 2.6). The first net operates in the auto-association mode [108] and extracts



    features for the second net, which operates in the more common classification

    mode.

The auto-association net has n inputs, n outputs and p hidden layer nodes. Usually p is much smaller than n. The network takes a face vector x as an input and is trained to produce an output y that is a best approximation of x. In this way, the hidden layer output h constitutes a compressed version of x, or a feature vector, and can be used as the input to the classification net.
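The two-net arrangement can be sketched at the level of forward passes (the weights here are random placeholders standing in for back-propagation-trained ones; all names are ours, not from [107]):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def autoassoc_features(x, W1):
    """Hidden-layer output h: the compressed feature vector that is fed to
    the second net. W1 would come from back-propagation training of the
    auto-association net; here it is a random placeholder."""
    return sigmoid(W1 @ x)

def classify(h, Wc):
    """Classification net: one output unit per known face."""
    return int(np.argmax(sigmoid(Wc @ h)))

rng = np.random.default_rng(1)
n, p, n_classes = 64, 8, 5                 # n inputs, p << n hidden units
x = rng.standard_normal(n)                 # a face vector
W1 = 0.1 * rng.standard_normal((p, n))     # placeholder encoder weights
Wc = rng.standard_normal((n_classes, p))   # placeholder classifier weights
h = autoassoc_features(x, W1)
print(h.shape)                             # -> (8,)
```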

Figure 2.6: Auto-association and classification networks

    Bourland and Kamp [108] showed that under the best circumstances,

    when the sigmoidal functions at the network nodes are replaced by linear



functions (when the network is linear), the feature vector is the same as that produced by the Karhunen-Loeve basis, or the eigenfaces. When the network is nonlinear, the feature vector can deviate from this optimum. The problem here turns out to be an application of the singular value decomposition.

Specifically, suppose that for each training face vector xk (n-dimensional), k = 1, 2, ..., N, the outputs of the hidden layer and output layer of the auto-association net are hk (p-dimensional, usually p much smaller than n) and yk (n-dimensional), respectively, with hk = F(W1 xk) and yk = W2 hk. Training minimizes the total reconstruction error over the xk, and the minimum-error linear mapping is the rank-p projection W2 W1 = Up^T Up,


where

Up = [u1, u2, ..., up]^T,
Vp = [v1, v2, ..., vp]^T

are formed from the first p left and right singular vectors in the SVD of X = [x1, x2, ..., xN], respectively, which are also the first p eigenvectors of XX^T and X^TX.

One way to achieve this optimum is to have a linear F(.) and to set the weights to

W1 = W2^T = Up. (2.41)

Since Up contains the first p eigenvectors of XX^T, we have, for any input x,

h = W1 x = Up x, (2.42)

which is the same as the feature vector in the eigenface approach. However, it must be noted that the auto-association net, when it is trained by the BP algorithm with a nonlinear F(.), generally cannot achieve this optimal performance.
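The equivalence between the optimal linear auto-association net and the eigenface projection can be verified numerically (a sketch assuming centered data and distinct singular values; the comparison is up to the usual sign ambiguity of eigenvectors):

```python
import numpy as np

rng = np.random.default_rng(0)
N, n, p = 40, 12, 3
X = rng.standard_normal((n, N))            # columns: centered face vectors

U, s, Vt = np.linalg.svd(X, full_matrices=False)
W1 = U[:, :p].T                            # encoder: rows u1..up (Eq. 2.41)
x = X[:, 0]
h = W1 @ x                                 # hidden features (Eq. 2.42)

# the same vector via the first p eigenvectors of X X^T (the eigenfaces)
evals, evecs = np.linalg.eigh(X @ X.T)
P = evecs[:, ::-1][:, :p]                  # leading eigenvectors
print(np.allclose(np.abs(P.T @ x), np.abs(h)))   # -> True, up to sign
```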

In [109], the first 50 principal components of the images are extracted and reduced to 5 dimensions using an auto-associative neural network. The resulting representation is classified using a standard multi-layer perceptron.

In a different approach, a hierarchical neural network, which is grown automatically and not trained with gradient descent, was used for face recognition by Weng and Huang [110].

The most successful face recognition with neural networks is the recent work of Lawrence et al. [19], which combines local image sampling, a self-organizing map neural network, and a convolutional neural network. In the


corresponding work, two different methods of representing local image samples have been evaluated. In each method, a window is scanned over the image. The first method simply creates a vector from a local window on the image using the intensity values at each point in the window. If the local window is a square with sides 2W+1 long, centered on xij, then the vector associated with this window is simply [xi-w,j-w, xi-w,j-w+1, ..., xij, ..., xi+w,j+w-1, xi+w,j+w]. The second method creates a representation of the local sample by forming a vector out of the intensity of the center pixel and the differences in intensity between the center pixel and all other pixels within the square window. The vector is then given by [xij - xi-w,j-w, xij - xi-w,j-w+1, ..., wij xij, ..., xij - xi+w,j+w-1, xij - xi+w,j+w], where wij is the weight of the center pixel value xij. The resulting representation becomes partially invariant to variations in intensity of the complete sample, and the degree of invariance can be modified by adjusting the weight wij.
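The two local-sample representations can be sketched as follows (illustrative Python; the function names and toy image are ours). The second representation is unchanged, apart from its center entry, when a constant is added to the whole image:

```python
import numpy as np

def window_vector(img, i, j, w):
    """First representation: raw intensities of the (2w+1)-square window."""
    return img[i - w:i + w + 1, j - w:j + w + 1].ravel()

def centered_vector(img, i, j, w, w_ij=1.0):
    """Second representation: center-minus-neighbor differences, with the
    center entry replaced by w_ij times the center intensity."""
    win = img[i - w:i + w + 1, j - w:j + w + 1].astype(float)
    v = img[i, j] - win.ravel()        # differences to every window pixel
    v[v.size // 2] = w_ij * img[i, j]  # the weighted center entry
    return v

img = np.arange(25).reshape(5, 5)
a = window_vector(img, 2, 2, 1)
b = centered_vector(img, 2, 2, 1)
# adding a constant to the image changes only the center entry of b
b_shift = centered_vector(img + 10, 2, 2, 1)
print(a)
```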

Figure 2.7: The diagram of the Convolutional Neural Network System



The self-organizing map (SOM), introduced by Teuvo Kohonen [111, 112], is an unsupervised learning process which learns the distribution of a set of patterns without any class information. The SOM defines a mapping from an input space R^n onto a topologically ordered set of nodes, usually in a lower dimensional space. For classification, a convolutional network is used, as it achieves some degree of shift and deformation invariance using three ideas: local receptive fields, shared weights, and spatial subsampling. The diagram of the system is shown in Figure 2.7.

2.2.6. Template Based Methods

The most direct of the procedures used for face recognition is the matching between the test images and a set of training images, based on measuring the correlation. The matching technique is based on the computation of the normalized cross correlation coefficient CN defined by

CN = ( E{IG IT} - E{IG} E{IT} ) / ( σIG σIT ), (2.43)

where IG is the gallery image which must be matched to the test image IT, IG IT is the pixel by pixel product, E is the expectation operator and σ is the standard deviation over the area being matched. This normalization rescales the energy distributions of the test and gallery images so that their variances and averages match. However, correlation based methods are highly sensitive to illumination, rotation and scale changes. The best results for the reduction of the illumination changes


were obtained using the intensity of the gradient, computed from the horizontal and vertical image derivatives IGX and IGY. The correlation method is computationally expensive, so the dependency of the recognition on the resolution of the image has been investigated.
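Eq. (2.43) can be sketched directly (illustrative Python; note the invariance to gain and offset changes that the normalization buys):

```python
import numpy as np

def ncc(I_T, I_G):
    """Normalized cross correlation coefficient of Eq. (2.43):
    (E{I_G I_T} - E{I_G} E{I_T}) / (sigma_IG * sigma_IT)."""
    t, g = I_T.astype(float).ravel(), I_G.astype(float).ravel()
    return ((g * t).mean() - g.mean() * t.mean()) / (g.std() * t.std())

rng = np.random.default_rng(2)
patch = rng.standard_normal((8, 8))
print(round(ncc(patch, patch), 6))              # -> 1.0: identical patches
print(round(ncc(patch, 3.0 * patch + 5.0), 6))  # -> 1.0: gain/offset invariant
```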

In [16], Brunelli and Poggio describe a correlation based method for face recognition in which templates corresponding to facial features of relevant significance, such as the eyes, nose and mouth, are matched. In order to reduce complexity, in this method the positions of those features are detected first. Detection of facial features has also been the subject of many studies [36, 81, 113, 114]. The method proposed by Brunelli and Poggio uses a set of templates to detect the eye position in a new image, by looking for the maximum absolute values of the normalized correlation coefficient of these templates at each point in the test image. In order to handle scale variations, five eye templates at different scales were used. However, this method is computationally expensive, and it must also be noted that the eyes of different people can be markedly different. Such difficulties can be reduced by using a hierarchical correlation [115].

After facial features are detected for a test face, they are compared to those of the gallery faces, returning a vector of matching scores (one per feature) computed through normalized cross correlation.

The similarity scores of different features can be integrated to obtain a global score. This cumulative score can be computed in several ways: choose the score of the most similar feature, sum the feature scores, or sum the feature scores using weights. After the cumulative scores are computed, a test face is assigned to the face class for which this score is maximized.
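The three integration rules can be compared on made-up feature scores (illustrative values, not results from [16]):

```python
import numpy as np

# Per-feature matching scores (e.g. eyes, nose, mouth) against each gallery
# face; the numbers are invented for illustration.
scores = np.array([
    [0.91, 0.80, 0.75],    # gallery face 0
    [0.60, 0.95, 0.55],    # gallery face 1
    [0.70, 0.72, 0.71],    # gallery face 2
])

best_feature = scores.max(axis=1)        # rule 1: most similar feature
plain_sum = scores.sum(axis=1)           # rule 2: unweighted sum
weights = np.array([0.5, 0.3, 0.2])      # rule 3: e.g. trust the eyes most
weighted = scores @ weights

print(int(np.argmax(best_feature)),      # -> 1
      int(np.argmax(plain_sum)),         # -> 0
      int(np.argmax(weighted)))          # -> 0
```

Different rules can clearly pick different identities, which is why the choice of integration scheme matters.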


The recognition rate reported in [16] is higher than 96%. The correlation method as described above requires a feature detection algorithm that is robust with respect to variations in scale, illumination and rotation. Moreover, the computational complexity of this method is quite high.

Beymer [116] extended the correlation based approach to a view based approach for recognizing faces under varying orientations, including rotations in depth. In the first stage, three facial features (the eyes and the nose lobe) are detected to determine the face pose. Although feature detection is similar to the previously described correlation method, to handle rotations, templates from different views and different people are used. After the face pose is determined, the matching procedure takes place with the corresponding view of the gallery faces. In this case, as the number of model views for each person in the database is increased, the computational complexity is also increased.

2.2.7. Feature Based Methods

Since most face recognition algorithms are minimum distance classifiers in some sense, it is important to consider more carefully how a distance should be defined. In the previous examples (eigenface, neural nets, etc.) the distance between an observed face x and a gallery face c is the common Euclidean distance d(x,c) = ||x - c||, and this distance is sometimes computed using an alternative orthonormal basis as d(x,c) = ||x~ - c~||.


While such an approach is easy to compute, it also has some shortcomings. When there is an affine transformation between two faces (shift and dilation), d(x,c) will not be zero; in fact, it can be quite large. As another example, when there are local transformations and deformations (x is a smiling version of c), again d(x,c) will not be zero. Moreover, it is very useful to store information only about the key points of the face. Feature based approaches can be a solution to the above problems.

Manjunath et al. [36] proposed a method that recognizes faces by using topological graphs that are constructed from feature points obtained from the Gabor wavelet decomposition of the face. This method reduces the storage requirements by storing only the facial feature points detected using the Gabor wavelet decomposition. Comparison of two faces begins with the alignment of the two graphs by matching the centroids of the features. Hence, this method has some degree of robustness to rotation in depth, but only under restrictively controlled conditions. Moreover, illumination changes and occluded faces are not taken into account.

In the method proposed by Manjunath et al. [36], the identification process utilizes the information present in a topological graph representation of the feature points. The feature points are represented by nodes Vi, i = {1, 2, 3, ...}, in a consistent numbering technique. The information about a feature point is contained in {S, q}, where S represents the spatial location and q is the feature vector defined by

qi = [ Qi(x, y, 1), ..., Qi(x, y, Ni) ], (2.44)


corresponding to the ith feature point. The vector qi is a set of spatial and angular distances from feature point i to its nearest neighbors, denoted by Qi(x, y, j), where j is the jth of the N neighbors, and Ni represents the set of neighbors. The neighbors satisfying both the maximum number N and the minimum Euclidean distance dij between two points Vi and Vj are said to be of consequence for the ith feature point.

In order to identify an input graph with a stored one, which might differ either in the total number of feature points or in the locations of the respective features, two cost values are evaluated [36]. One is a topological cost and the other is a similarity cost. If i, j refer to nodes in the input graph I and i', j', m, n refer to nodes in the stored graph O, then the two graphs are matched as follows [36]:

1. The centroids of the feature points of I and O are aligned.

2. Let Vi be the ith feature point of I. Search for the best matching feature point Vi' in O using the criterion

Sii' = min over m [ 1 - (qi . qm) / (||qi|| ||qm||) ]. (2.45)

3. After matching, the total cost is computed, taking into account the topology of the graphs. Let nodes i and j of the input graph match nodes i' and j' of the stored graph, and let j be in Ni (i.e., Vj is a neighbor of Vi). Let

δii'jj' = min( dij / di'j' , di'j' / dij ).

The topology cost is given by

Δii'jj' = 1 - δii'jj'. (2.46)


4. The total cost is computed as

C1 = Σi [ Sii' + λt Σ j in Ni Δii'jj' ], (2.47)

where λt is a scaling parameter assigning relative importance to the two cost functions.

5. The total cost is scaled appropriately to reflect the possible difference in the total number of feature points between the input and the stored graph. If nI and nO are the numbers of feature points in the input and stored graph, respectively, then the scaling factor is sf = max( nI / nO , nO / nI ) and the scaled cost is C(I,O) = sf C1(I,O).

6. The best candidate is the one with the least cost,

C(I, O*) = min over O of C(I, O). (2.48)

The recognized face is the one that has the minimum combined cost value. In this method [36], since the comparison of two face graphs begins with centroid alignment, occluded cases will cause a great performance decrease. Moreover, directly using the number of feature points of faces can result in wrong classifications, since the number of feature points can change due to exterior factors (glasses, etc.).

Another feature based approach is the elastic matching algorithm proposed by Lades et al. [117], which has roots in aspect-graph matching. Let S be the


original two-dimensional image lattice (Figure 2.8). The face template is defined as a vector field by introducing a new type of representation,

c = { ci, i in S1 }, (2.49)

where S1 is a lattice embedded in S and ci is a feature vector at position i. S1 is much coarser and smaller than S. c should contain only the most critical information about the face, since ci is composed of the magnitudes of the Gabor filter responses at position i in S1. The Gabor features ci provide multi-scale edge strengths at position i.

Figure 2.8: A 2-D image lattice (grid graph) on Marilyn Monroe's face.

An observed face image is defined as a vector field on the original image lattice S,

x = { xj, j in S }, (2.50)

where xj is the same type of feature vector as ci but defined on the fine-grid lattice S.


In the elastic graph matching approach [23, 27, 117], the distance d(c,x) is defined through the best match between x and c. A match between x and c can be described uniquely through a mapping between S1 and S,

M : S1 -> S. (2.51)

However, without restriction, the total number of such mappings is ||S||^||S1||.

In recognizing a face, the best match should preserve both features and local geometry. If i is in S1 and j = M(i), then feature preservation means that xj is not much different from ci. In addition, if i1 and i2 are close in S1, then preserving local geometry means that j1 = M(i1) should be close to j2 = M(i2). Such a match is called elastic since the preservation can be approximate rather than exact; the lattice S1 can be stretched unevenly.

Finding the best match is based on minimizing the following energy function [117]:

E(M) = Σ i in S1 || ci - xM(i) ||^2 + λ Σ (i1,i2) [ (M(i1) - M(i2)) - (i1 - i2) ]^2. (2.52)

Due to the large number of possible matches, this might be difficult to obtain in an acceptable time. Hence, an approximate solution has to be found. This is achieved in two stages: rigid matching and deformable matching [117]. In rigid matching, c is moved around in x as in conventional template matching, and at each position ||c - x'|| is calculated, where x' is the part of x that, after being matched with c, does not lead to any deformation of S1. In deformable matching, the lattice S1 is


stretched through random local perturbations to further reduce the energy function.
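The energy of Eq. (2.52) and the rigid matching stage can be sketched on a toy lattice (illustrative data layout, ours rather than the implementation of [117]; a real system would use Gabor features and the deformable stage as well):

```python
import numpy as np

def elastic_energy(c, x_field, M, lam=1.0):
    """Energy of Eq. (2.52): feature term plus lam times the distortion of
    neighboring grid edges. c maps grid positions to features, x_field maps
    image positions to features, M maps grid positions to image positions."""
    nodes = list(c)
    e_feat = sum(np.sum((c[i] - x_field[M[i]]) ** 2) for i in nodes)
    e_geom = 0.0
    for i1 in nodes:
        for i2 in nodes:
            # neighboring grid nodes (4-connected), each edge counted once
            if i1 < i2 and abs(i1[0] - i2[0]) + abs(i1[1] - i2[1]) == 1:
                d = np.subtract(M[i1], M[i2]) - np.subtract(i1, i2)
                e_geom += np.sum(d ** 2)
    return e_feat + lam * e_geom

rng = np.random.default_rng(4)
x_field = {(i, j): rng.standard_normal(3) for i in range(6) for j in range(6)}
grid = [(0, 0), (0, 1), (1, 0), (1, 1)]                # a tiny lattice S1
c = {g: x_field[(g[0] + 2, g[1] + 3)] for g in grid}   # template cut at (2, 3)

def shift(off):
    """Rigid matching candidate: the whole grid translated by off."""
    return {g: (g[0] + off[0], g[1] + off[1]) for g in grid}

offsets = [(a, b) for a in range(5) for b in range(5)]
best = min(offsets, key=lambda o: elastic_energy(c, x_field, shift(o)))
print(best)    # -> (2, 3): rigid matching recovers the template position
```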

There is still the question of face template construction. An automated scheme has been shown to work quite well. The procedure is as follows [117]:

1. Pick the face image of an individual and generate a face vector field using Gabor filters. Denote this vector field by x.

2. Place a coarse grid lattice S1 by hand on x such that the vertices are close to important facial features such as the eyes.

3. Collect the feature vectors at the vertices of S1 to form the template c.

4. Pick another face image (of a different person) and generate a face vector field, denoted y.

5. Perform template matching between c and y using rigid matching.

6. When the best match is found, collect the feature vectors at the vertices of S1 to form the template of y.

7. Repeat steps 4-6.

Malsburg et al. [23, 27] developed a system based on the graph matching approach, with several major modifications. First, they suggested using not only the magnitude but also the phase information of the Gabor filter responses. Due to phase rotation, vectors taken from image points only a few pixels apart from each other have very different coefficients, although they represent almost the same local feature. Therefore, phase should either be ignored or compensated for its variation explicitly.


As mentioned in [23], using phase information has two potential advantages:

1. Phase information is required to discriminate between patterns with similar magnitudes.

2. Since phase varies quickly with location, it provides a means for accurate vector localization in an image.

Malsburg et al. proposed to compensate for phase shifts by estimating small relative displacements between two feature vectors, using a phase sensitive similarity function [23].

The second modification is the use of bunch graphs (Figures 2.9 and 2.10). The face bunch graph is able to represent a wide variety of faces, which allows matching on face images of previously unseen individuals. These improvements make it possible to extract an image graph from a new face image in one matching process. Computational efficiency and the ability to deal with different poses explicitly are the major advantages of the system in [23] compared to [117].


(a) (b)

Figure 2.9: A bunch graph from a) the artistic point of view ("Unfinished Portrait" by Tullio Pericoli, 1985), and b) the scientific point of view.

Figure 2.10: Bunch graph matched to a face.


2.2.8. Current State of the Art

Comparing the recognition results of different face recognition systems is a complex task, because experiments are generally carried out on different data sets. However, the following five questions can help while reviewing different face recognition methods:

1. Were expression, head orientation and lighting conditions controlled?

2. Were subjects allowed to wear glasses and to have beards or other facial masks?

3. Was the subject sample balanced? Were gender, age and ethnic origin spanned evenly?

4. How many subjects were there in the database? How many images were used for training and testing?

5. Were the face features located manually?

Answering the above questions contributes to building a better description of the constraints within which each approach operated. This helps to make a fairer comparison between different sets of experimental results. However, the most direct and reliable comparison between different approaches is obtained by experimenting with the same database.

By 1993, there were several algorithms claiming to have accurate performance in minimally constrained environments. For a better comparison of those algorithms, DARPA and the Army Research Laboratory established the FERET program (see Section 4.1.4) with the goals of both evaluating their performance and encouraging advances in the technology [83].


Today, there are three algorithms that have demonstrated the highest level of recognition accuracy on large databases (1196 people or more) under double-blind testing conditions. These are the algorithms from the University of Southern California (USC), the University of Maryland (UMD), and the MIT Media Lab [83, 128]. The MIT, Rockefeller and UMD algorithms all use a version of the eigenface transform followed by discriminative modeling (Section 2.2.3.1.2). However, the USC system uses a very different approach. It begins by computing the Gabor wavelet transform of the image and does the comparison between images using a

    grap