Chinese sentence level lip language recognition method combining DenseNet and resBi-LSTM

A sentence-level recognition method, applied in speech recognition, character and pattern recognition, and speech analysis. It addresses the lack of large-scale public datasets for sentence-level Chinese lip language recognition, and it encourages feature reuse, reduces recognition difficulty, and enhances feature propagation.

Active Publication Date: 2019-12-31
HUAQIAO UNIVERSITY

AI Technical Summary

Problems solved by technology

[0005] To our knowledge, the public sentence-level datasets for lip language recognition are currently available only in English, such as LRS and LRS3, and no large-scale public sentence-level Chinese lip language recognition dataset exists.

Embodiment Construction

[0038] The present invention is further described below with reference to the accompanying drawings:

[0039] The present invention uses a self-built Chinese sentence-level lip recognition dataset (collected from news broadcasts and the Luo Ji Thinking program) to conduct lip recognition research. The flowchart of the proposed method is shown in Figure 1. The method is divided into two models: the pinyin prediction model (Figure 2) and the language translation model (Figure 3). The pinyin prediction model is in turn divided into three steps: visual feature extraction, feature sequence processing, and time series data classification.
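The two-model split described above can be pictured as a simple composition in which the only thing passed between the stages is a pinyin sequence. The following is a minimal sketch of that data flow only, not the patented models: the names `recognize_sentence`, `toy_pinyin_model`, and `toy_translation_model` are hypothetical, and both toy stages are hard-coded stand-ins rather than trained networks.

```python
from typing import Callable, List, Sequence

# Hypothetical type aliases for illustration; the source does not define them.
Frame = List[List[float]]                              # one lip image, H x W
PinyinModel = Callable[[Sequence[Frame]], List[str]]   # frames -> pinyin syllables
TranslationModel = Callable[[List[str]], str]          # pinyin -> Chinese text


def recognize_sentence(
    frames: Sequence[Frame],
    pinyin_model: PinyinModel,
    translation_model: TranslationModel,
) -> str:
    """Run the two stages in order: pinyin prediction, then language translation."""
    pinyin = pinyin_model(frames)      # stage 1: lip images -> pinyin sequence
    return translation_model(pinyin)   # stage 2: pinyin sequence -> sentence


def toy_pinyin_model(frames: Sequence[Frame]) -> List[str]:
    """Stand-in for the pinyin prediction model; ignores its input."""
    return ["ni", "hao"]


def toy_translation_model(pinyin: List[str]) -> str:
    """Stand-in for the language translation model; a one-entry lookup table."""
    table = {("ni", "hao"): "你好"}
    return table.get(tuple(pinyin), "")


if __name__ == "__main__":
    dummy_frames = [[[0.0]]]  # a single 1 x 1 frame, just to exercise the call
    print(recognize_sentence(dummy_frames, toy_pinyin_model, toy_translation_model))
```

Keeping the intermediate representation as plain pinyin means either stage could be swapped or evaluated on its own, which is one way to read the claim that the split reduces recognition difficulty.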

[0040] Step 1, visual feature extraction:

[0041] The input of the pinyin prediction model is a sequence of lip images. Assuming the input sequence has size T×H×W (time × height × width), spatiotemporal convolution is first applied to extract spatiotemporal features and capture the short-term motion of the lip region. The use of 64 thre...
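To make the T×H×W bookkeeping concrete, the sketch below computes how a spatiotemporal (3D) convolution changes the size of each axis, using the standard formula floor((n + 2p - k) / s) + 1 per dimension. The helper name `conv3d_output_shape` and all numeric values in the example (input size, kernel size, stride, padding) are assumptions chosen for illustration; the source text is truncated before it states the actual kernel configuration.

```python
from typing import Tuple

Shape3D = Tuple[int, int, int]  # (time, height, width)


def conv3d_output_shape(
    input_shape: Shape3D,
    kernel: Shape3D,
    stride: Shape3D = (1, 1, 1),
    padding: Shape3D = (0, 0, 0),
) -> Shape3D:
    """Return the (T, H, W) size after one 3D convolution.

    Applies floor((n + 2p - k) / s) + 1 independently to each axis.
    The number of kernels only sets the channel count, so it is not modeled here.
    """
    out = []
    for n, k, s, p in zip(input_shape, kernel, stride, padding):
        if k > n + 2 * p:
            raise ValueError("kernel is larger than the padded input")
        out.append((n + 2 * p - k) // s + 1)
    return (out[0], out[1], out[2])


if __name__ == "__main__":
    # Hypothetical values: 16 frames of 112 x 112 lip crops, a 5 x 7 x 7 kernel,
    # stride 1 in time and 2 in space, with "same"-style padding.
    shape = conv3d_output_shape((16, 112, 112), (5, 7, 7), (1, 2, 2), (2, 3, 3))
    print(shape)
```

With these assumed values the temporal length is preserved while the spatial resolution is halved, which is a common arrangement when the temporal axis must stay aligned with the output sequence.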

Abstract

The invention discloses a Chinese sentence-level lip language recognition method combining DenseNet and resBi-LSTM. The method divides lip language recognition into two parts, pinyin prediction and language translation, which reduces the difficulty of recognition. Visual features are extracted with DenseNet, which makes full use of shallow features, effectively alleviates the vanishing gradient problem, and reduces the number of network parameters. A 1 × 1 convolution replaces the fully connected layer to reduce the feature dimension while preserving the spatial information in the features, which plays an important role in lip language recognition research. The resBi-LSTM then processes the visual features to obtain composite features that combine visual and semantic information, so that the loss of useful information is reduced and recognition accuracy is improved.
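The point about 1 × 1 convolution can be illustrated with a small pure-Python sketch: a 1 × 1 convolution applies the same linear map over channels at every pixel, so the channel count changes while the height and width stay intact. A fully connected layer, by contrast, would flatten the spatial grid. The function name `conv1x1` and the example weights are hypothetical and not taken from the patent; bias terms and nonlinearities are omitted.

```python
from typing import List

FeatureMap = List[List[List[float]]]  # channels x height x width


def conv1x1(features: FeatureMap, weights: List[List[float]]) -> FeatureMap:
    """Apply a 1 x 1 convolution.

    `weights` has shape out_channels x in_channels. Each output pixel is a
    weighted sum of the input channels at the same (y, x) location.
    """
    in_channels = len(features)
    height = len(features[0])
    width = len(features[0][0])
    if any(len(row) != in_channels for row in weights):
        raise ValueError("each weight row must have one entry per input channel")
    return [
        [
            [
                sum(w[c] * features[c][y][x] for c in range(in_channels))
                for x in range(width)
            ]
            for y in range(height)
        ]
        for w in weights
    ]


if __name__ == "__main__":
    # Three input channels on a 2 x 2 grid, reduced to two output channels.
    features = [
        [[1.0, 2.0], [3.0, 4.0]],
        [[10.0, 20.0], [30.0, 40.0]],
        [[100.0, 200.0], [300.0, 400.0]],
    ]
    weights = [
        [1.0, 0.0, 0.0],  # output channel 0 copies input channel 0
        [0.0, 1.0, 1.0],  # output channel 1 sums input channels 1 and 2
    ]
    reduced = conv1x1(features, weights)
    print(len(reduced), len(reduced[0]), len(reduced[0][0]))
```

In this example the channel dimension drops from three to two while the 2 × 2 layout is unchanged, which is the dimension-reduction behavior the abstract attributes to the 1 × 1 convolution.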

Description

Technical field

[0001] The invention relates to the field of artificial intelligence, in particular to a Chinese sentence-level lip recognition method combining DenseNet and resBi-LSTM.

Background technique

[0002] With the development of information technologies such as big data, cloud computing, the Internet, and the Internet of Things, and driven by ubiquitous sensing data and computing platforms such as graphics processors, artificial intelligence technology represented by deep neural networks is developing rapidly, and artificial intelligence is becoming the decisive force driving humanity into the age of intelligence. The influence of artificial intelligence on society is becoming increasingly prominent. It has had a positive impact in fields such as image classification, speech recognition, question answering, human-machine games, and autonomous driving, ushering in a new wave of explosive growth...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06K9/00, G06N3/04, G10L15/25
CPC: G10L15/25, G06V40/171, G06V40/172, G06N3/045
Inventors: 杜吉祥, 陈雪娟, 张洪博, 雷庆
Owner: HUAQIAO UNIVERSITY