
Statistical learning approaches for discriminant features selection


Supervised statistical learning includes important models such as Support Vector Machines (SVM) and Linear Discriminant Analysis (LDA). In this paper we describe the idea of using the discriminant weights given by the SVM and LDA separating hyperplanes to select the features that best separate the sample groups. Our method, called here Discriminant Feature Analysis (DFA), is not restricted to any particular probability density function, and the number of meaningful discriminant features is not limited to the number of groups. To evaluate the selected discriminant features, we investigate two case studies using face image and breast lesion data sets. In both case studies, our experimental results show that the DFA approach provides an intuitive interpretation of the differences between the groups, highlighting and reconstructing the most important statistical changes between the sample groups analyzed.
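The idea of ranking features by their discriminant weights can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a two-class Fisher LDA direction, w = Sw^-1 (mean_a - mean_b), a small ridge term to keep the within-class scatter invertible, and ranking by absolute weight. The function names (`lda_weights`, `rank_features`) and the toy data are hypothetical and chosen only for illustration.

```python
import random

def mean_vector(rows):
    """Column-wise mean of a list of feature vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def scatter_matrix(rows, mean):
    """Sum of outer products of the centered samples."""
    d = len(mean)
    s = [[0.0] * d for _ in range(d)]
    for row in rows:
        diff = [x - m for x, m in zip(row, mean)]
        for i in range(d):
            for j in range(d):
                s[i][j] += diff[i] * diff[j]
    return s

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def lda_weights(group_a, group_b, ridge=1e-6):
    """Fisher discriminant direction for two groups (ridge is an assumption)."""
    mean_a, mean_b = mean_vector(group_a), mean_vector(group_b)
    sa = scatter_matrix(group_a, mean_a)
    sb = scatter_matrix(group_b, mean_b)
    d = len(mean_a)
    sw = [[sa[i][j] + sb[i][j] + (ridge if i == j else 0.0) for j in range(d)]
          for i in range(d)]
    diff = [a - b for a, b in zip(mean_a, mean_b)]
    return solve(sw, diff)

def rank_features(weights):
    """Feature indices ordered from largest to smallest absolute weight."""
    return sorted(range(len(weights)), key=lambda i: abs(weights[i]), reverse=True)

# Toy data: three unit-variance features; only feature 1 differs between groups.
random.seed(0)

def sample(n, shift):
    return [[random.gauss(0, 1), random.gauss(shift, 1), random.gauss(0, 1)]
            for _ in range(n)]

group_a = sample(50, 0.0)
group_b = sample(50, 4.0)
weights = lda_weights(group_a, group_b)
ranking = rank_features(weights)
print(ranking)  # feature 1 carries the group difference by construction
```

Comparing weight magnitudes in this way is only meaningful when the features are on comparable scales, which the toy data guarantees by construction. A linear SVM weight vector could be ranked with the same `rank_features` helper.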




Rights and permissions

Open Access. This article is distributed under the terms of the Creative Commons Attribution 2.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


About this article

Cite this article

Giraldi, G.A., Rodrigues, P.S., Kitani, E.C. et al. Statistical learning approaches for discriminant features selection. J Braz Comp Soc 14, 7–22 (2008).



Keywords

  • Supervised statistical learning
  • Discriminant features selection
  • Separating hyperplanes