Volume 4 (1), June 2021, Pages 60-90

Mehshan Ahad, Muhammad Fayyaz

COMSATS University, Islamabad, Pakistan


Human gender recognition is one of the most challenging tasks in computer vision, especially for pedestrians, because of wide variation in human pose, video acquisition, illumination, occlusion, and clothing. This article addresses pedestrian gender recognition, which is important for video surveillance. To automate gender recognition, we propose a novel technique based on features extracted through several methods. The technique consists of four steps: (a) preprocessing, (b) feature extraction, (c) feature fusion, and (d) classification. In the first step, the region of interest, which is the full body, is separated from each image. The images are then divided into two parts in a 2:3 ratio to obtain upper-body and lower-body sets. In the second step, three handcrafted feature extractors, HOG, Gabor, and granulometry, extract feature vectors using different score values. These feature vectors are fused into one strong feature vector, which is then used for classification with SVM and KNN classifiers. Experiments are performed on full-body datasets to find the best configuration of features, with a different number of features taken from each extractor. Results are evaluated on five performance measures: accuracy, precision, sensitivity, specificity, and area under the curve (AUC). The best results are obtained on the upper body: 88.7% accuracy and 0.96 AUC. Compared with existing methods, the proposed method achieves significantly higher results.
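The data flow described above can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation. Every function name is hypothetical, the 2:3 split is assumed to mean upper:lower by image rows, fusion is assumed to be plain concatenation, and the HOG, Gabor, and granulometry extractors and the SVM and KNN classifiers are left out entirely. Only the split, the fusion step, and the five reported measures are shown.

```python
from typing import Dict, List, Sequence, Tuple


def split_upper_lower(rows: Sequence, upper: int = 2, lower: int = 3) -> Tuple[Sequence, Sequence]:
    """Split image rows top to bottom in an upper:lower ratio.

    Assumption: the 2:3 ratio in the text is read as upper:lower.
    """
    cut = len(rows) * upper // (upper + lower)
    return rows[:cut], rows[cut:]


def fuse(*vectors: Sequence[float]) -> List[float]:
    """Fuse feature vectors by concatenation (an assumed fusion rule)."""
    fused: List[float] = []
    for vec in vectors:
        fused.extend(vec)
    return fused


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, sensitivity, and specificity from 0/1 labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    def ratio(num: int, den: int) -> float:
        # Return 0.0 instead of dividing by zero when a class is absent.
        return num / den if den else 0.0

    return {
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "precision": ratio(tp, tp + fp),
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
    }


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair. Both classes must be present.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

With a 10-row image, `split_upper_lower` returns 4 upper rows and 6 lower rows. In a real pipeline the three extractor outputs for each body part would be passed to `fuse`, and the fused vector would go to the classifier.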


Handcrafted Features, Feature Ensembles, Pedestrian Gender Recognition, and Visual Surveillance.




