Deep Learning-based Automated Lip-Reading: A Survey

Souheil Fenghour, Daqing Chen, Kun Guo, Perry Xiao

Research output: Contribution to journal › Review article › peer-review

38 Citations (Scopus)
2 Downloads (Pure)

Abstract

A survey of automated lip-reading approaches is presented in this paper, with the main focus on deep learning-related methodologies, which have proven to be more fruitful for both feature extraction and classification. This survey also compares the different components that make up automated lip-reading systems, including the audio-visual databases, feature extraction techniques, classification networks and classification schemas. The main contributions and unique insights of this survey are: 1) A comparison of Convolutional Neural Networks with other neural network architectures for feature extraction; 2) A critical review of the advantages of Attention-Transformers and Temporal Convolutional Networks over Recurrent Neural Networks for classification; 3) A comparison of the different classification schemas used for lip-reading, including ASCII characters, phonemes and visemes; and 4) A review of the most up-to-date lip-reading systems up until early 2021.
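To make the pipeline described in the abstract concrete, the sketch below shows a minimal deep lip-reading model of the kind the survey compares: a 3D-CNN front-end extracts per-frame visual features from mouth-region crops, and a temporal back-end (here, either a recurrent GRU or a small temporal-convolutional stack) classifies the sequence. This is an illustrative assumption-based example, not the authors' method or any specific system reviewed in the paper; all layer sizes, class counts and names are chosen for illustration only.

# Minimal lip-reading pipeline sketch (illustrative; all sizes/names are assumptions)
import torch
import torch.nn as nn

class LipReadingModel(nn.Module):
    def __init__(self, num_classes: int = 500, backend: str = "tcn"):
        super().__init__()
        # 3D-CNN front-end: consumes (batch, 1, frames, height, width) grayscale
        # mouth crops and produces one feature vector per frame.
        self.frontend = nn.Sequential(
            nn.Conv3d(1, 32, kernel_size=(3, 5, 5), padding=(1, 2, 2)),
            nn.ReLU(),
            nn.MaxPool3d(kernel_size=(1, 2, 2)),
            nn.Conv3d(32, 64, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool3d((None, 1, 1)),  # keep the time axis, pool space away
        )
        if backend == "rnn":
            # Recurrent back-end: bidirectional GRU over the per-frame features.
            self.backend = nn.GRU(64, 128, batch_first=True, bidirectional=True)
            backend_dim = 256
        else:
            # Temporal-convolutional back-end: dilated 1D convolutions over time.
            self.backend = nn.Sequential(
                nn.Conv1d(64, 128, kernel_size=3, padding=1, dilation=1),
                nn.ReLU(),
                nn.Conv1d(128, 128, kernel_size=3, padding=2, dilation=2),
                nn.ReLU(),
            )
            backend_dim = 128
        self.backend_type = backend
        self.classifier = nn.Linear(backend_dim, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, 1, frames, H, W) -> per-frame features (batch, frames, 64)
        feats = self.frontend(x).squeeze(-1).squeeze(-1).transpose(1, 2)
        if self.backend_type == "rnn":
            out, _ = self.backend(feats)                              # (batch, frames, 256)
        else:
            out = self.backend(feats.transpose(1, 2)).transpose(1, 2) # (batch, frames, 128)
        # Average over time and predict a single label (e.g. a word class).
        return self.classifier(out.mean(dim=1))

# Example: a batch of 2 clips, each 29 grayscale frames of 64x64 mouth crops.
clips = torch.randn(2, 1, 29, 64, 64)
logits = LipReadingModel(num_classes=500, backend="tcn")(clips)
print(logits.shape)  # torch.Size([2, 500])

Swapping backend="tcn" for backend="rnn" switches between the two temporal back-ends the survey contrasts; real systems additionally differ in classification schema (characters, phonemes or visemes) and in sequence-level decoding, which this word-level sketch omits.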
Original language: English
Article number: 9522117
Pages (from-to): 121184-121205
Number of pages: 22
Journal: IEEE Access
Volume: 9
DOIs
Publication status: Published - 25 Aug 2021

Keywords

  • Visual speech recognition
  • deep learning
  • natural language processing
  • feature extraction
  • computer vision
  • lip-reading
  • classification
