A Review on Deep Learning Algorithms for Speech and Facial Emotion Recognition

Charlyn Pushpa Latha, Mohana Priya

Abstract


Deep learning is a recent machine learning technique that models high-level abstractions in data using multiple processing layers with complex structures. It is also known as deep structured learning, hierarchical learning, or deep machine learning. The term “deep learning” refers to the methods used to train multi-layered neural networks. Deep learning has achieved remarkable success in face recognition, with a reported accuracy of 97.5%. Facial electromyogram (FEMG) signals are used to detect different human emotions. The deep learning techniques discussed in this paper include the Deep Boltzmann Machine (DBM), Deep Belief Networks (DBN), Convolutional Neural Networks (CNN), and Stacked Autoencoders. This paper reviews deep learning techniques that various researchers have used to improve the classification accuracy of FEMG signals and speech signals.







DOI: https://doi.org/10.11591/APTIKOM.J.CSIT.118



Copyright (c) 2019 APTIKOM Journal on Computer Science and Information Technologies



ISSN: 2722-323X, e-ISSN: 2722-3221


This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.