A maximum uncertainty LDA-based approach for limited sample size problems — with application to face recognition


A critical issue in applying Linear Discriminant Analysis (LDA) is the singularity and instability of the within-class scatter matrix. In practice, particularly in image recognition applications such as face recognition, a large number of pixels or pre-processed features is often available, but the total number of training patterns is limited and commonly smaller than the dimension of the feature space. In this study, a new LDA-based method is proposed. It is based on a straightforward stabilisation approach for the within-class scatter matrix. To evaluate its effectiveness, experiments on face recognition using the well-known ORL and FERET face databases were carried out, and the results were compared with those of other LDA-based methods. The classification results indicate that our method improves the LDA classification performance when the within-class scatter matrix is not only singular but also poorly estimated, with or without an intermediate Principal Component Analysis (PCA) step, and while using fewer linear discriminant features. Since statistical discrimination methods are suitable not only for classification but also for characterising differences between groups of patterns, further experiments were carried out to extend the new LDA-based method to the visual analysis of the most discriminating hyper-plane separating two populations. The additional results, based on frontal face images, indicate that the new LDA-based mapping provides an intuitive interpretation of the two-group classification tasks performed, highlighting the group differences captured by the proposed multivariate statistical approach.
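The singularity problem described above can be illustrated with a short numerical sketch. The abstract does not spell out the stabilisation procedure, so the code below assumes one plausible scheme: eigenvalues of the pooled within-class covariance that fall below the average eigenvalue are raised to that average. The function names `within_class_scatter` and `stabilise_scatter`, the synthetic data, and the chosen sizes are all hypothetical and serve only as an illustration.

```python
import numpy as np

def within_class_scatter(X, y):
    """Sum, over classes, of the scatter of samples around their class mean."""
    n_features = X.shape[1]
    S_w = np.zeros((n_features, n_features))
    for label in np.unique(y):
        Xc = X[y == label]
        centered = Xc - Xc.mean(axis=0)
        S_w += centered.T @ centered
    return S_w

def stabilise_scatter(S_w, n_samples, n_classes):
    """Assumed stabilisation: floor each eigenvalue at the average eigenvalue."""
    # Pooled covariance estimate derived from the scatter matrix.
    S_p = S_w / (n_samples - n_classes)
    # eigh is appropriate because S_p is symmetric.
    eigvals, eigvecs = np.linalg.eigh(S_p)
    floor = eigvals.mean()
    new_eigvals = np.maximum(eigvals, floor)
    # Rebuild the matrix from the same eigenvectors and the floored eigenvalues.
    S_p_star = (eigvecs * new_eigvals) @ eigvecs.T
    return S_p_star * (n_samples - n_classes)

# Synthetic data with far fewer samples (12) than features (50).
rng = np.random.default_rng(0)
n_classes, per_class, n_features = 3, 4, 50
X = rng.normal(size=(n_classes * per_class, n_features))
y = np.repeat(np.arange(n_classes), per_class)

S_w = within_class_scatter(X, y)
S_w_star = stabilise_scatter(S_w, len(X), n_classes)

rank_before = np.linalg.matrix_rank(S_w)
rank_after = np.linalg.matrix_rank(S_w_star)
print("rank before:", rank_before, "of", n_features)
print("rank after: ", rank_after, "of", n_features)
```

With N samples and g classes, the rank of the within-class scatter matrix cannot exceed N - g, so it is singular whenever the feature dimension is larger than that. Flooring the small eigenvalues leaves the larger, better-estimated ones untouched while making the matrix invertible.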




Rights and permissions

Open Access. This article is distributed under the terms of the Creative Commons Attribution 2.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

About this article

Cite this article

Thomaz, C.E., Kitani, E.C. & Gillies, D.F. A maximum uncertainty LDA-based approach for limited sample size problems — with application to face recognition. J Braz Comp Soc 12, 7–18 (2006).

  • Linear Discriminant Analysis (LDA)
  • small sample size
  • face recognition