Sentence-level lip language recognition method based on channel attention and time convolutional network

A convolutional neural network technology, applied in the field of machine learning and artificial intelligence, that addresses problems such as unfixed sentence structure and variable sentence length.

Active Publication Date: 2022-07-01
HEFEI UNIV OF TECH

AI Technical Summary

Problems solved by technology

For CMLR, the diversity of Chinese grammatical structures means that sentence structure is not fixed, sentence lengths vary, and individual Chinese characters occur with different frequencies, which makes the CMLR data set more challenging.

Embodiment Construction

[0047] In this example, a sentence-level lip language recognition method based on channel attention and a temporal convolutional neural network identifies what a speaker is saying from the movement of the speaker's lip region in a video and maps it to text, thereby realizing lip reading based on deep learning. First, the sentence-level lip language recognition datasets GRID and CMLR are downloaded, and images of the speaker's lip region are obtained after facial feature detection. A complete lip language recognition model is then built, and model training is accelerated through batch normalization and optimization algorithms. A channel attention mechanism is integrated to improve the model's performance, and the Adam optimization algorithm is used to update the model parameters. Finally, the data set used for prediction is fed into the trained model, which extracts features from the movement of the speaker's lips in the video, and then the ...
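The excerpt above does not say which form of channel attention the model uses. As a rough illustration only, the sketch below assumes a squeeze-and-excitation style gate applied to a feature map with C channels and T time steps. The function name `channel_attention` and the weight matrices `w1` and `w2` are hypothetical placeholders; in a real model the weights would be learned during training.

```python
import math

def channel_attention(features, w1, w2):
    """Reweight the channels of a [C][T] feature map (SE-style sketch).

    features: list of C channels, each a list of T activations.
    w1: [C_reduced][C] bottleneck weights (hypothetical, normally learned).
    w2: [C][C_reduced] expansion weights (hypothetical, normally learned).
    """
    # Squeeze: average each channel over time, one descriptor per channel.
    squeezed = [sum(ch) / len(ch) for ch in features]
    # Excitation, layer 1: linear projection followed by ReLU.
    hidden = [max(0.0, sum(w * s for w, s in zip(row, squeezed))) for row in w1]
    # Excitation, layer 2: linear projection followed by a sigmoid gate in (0, 1).
    gates = [
        1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
        for row in w2
    ]
    # Scale: multiply every activation in a channel by that channel's gate.
    scaled = [[g * x for x in ch] for g, ch in zip(gates, features)]
    return scaled, gates

# Toy input: C = 3 channels, T = 2 time steps, bottleneck width 1.
features = [[1.0, 3.0], [0.0, 0.0], [2.0, 2.0]]
w1 = [[1.0, 0.0, 0.0]]
w2 = [[1.0], [0.0], [-1.0]]
scaled, gates = channel_attention(features, w1, w2)
```

With these toy weights the first channel receives the largest gate and the third the smallest, so the output emphasizes the first channel while keeping the shape of the input unchanged.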

Abstract

The invention discloses a sentence-level lip language recognition method based on channel attention and a temporal convolutional network. The method comprises the following steps: downloading the GRID data set and the more challenging CMLR data set for training the model, and preprocessing them; building the lip language recognition network disclosed by the invention, feeding the preprocessed data into the network for training, and adjusting the network parameters to obtain an optimal lip language recognition model; and performing lip language recognition on video with the trained model. The method performs multi-scale feature extraction on the lip language video in both the time domain and the space domain and combines it with an attention mechanism to obtain a high-quality feature map, so that recognition accuracy can be improved and the corresponding evaluation index on the more challenging CMLR data set is excellent.
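The abstract does not give the layer configuration behind its multi-scale feature extraction. The sketch below illustrates only the time-domain half of the idea under stated assumptions: one feature channel is convolved over time with kernels of several widths, and each width yields one branch. The function names, the kernel sizes (3 and 5), and the fixed averaging kernels are all hypothetical; a trained network would learn its kernels.

```python
def conv1d_same(signal, kernel):
    """Slide a kernel over time with zero padding.

    Assumes an odd kernel length so the output has the same length
    as the input.
    """
    pad = len(kernel) // 2
    padded = [0.0] * pad + list(signal) + [0.0] * pad
    return [
        sum(k * padded[t + i] for i, k in enumerate(kernel))
        for t in range(len(signal))
    ]

def multi_scale_temporal_features(signal, kernel_sizes=(3, 5)):
    """Return one temporally filtered copy of the signal per kernel size."""
    branches = []
    for size in kernel_sizes:
        # Averaging kernel as a stand-in for a learned temporal filter.
        kernel = [1.0 / size] * size
        branches.append(conv1d_same(signal, kernel))
    return branches

# Toy input: a single activation spike in the middle of five frames.
signal = [0.0, 0.0, 3.0, 0.0, 0.0]
short_scale, long_scale = multi_scale_temporal_features(signal)
```

The narrow kernel keeps the spike localized to neighbouring frames, while the wide kernel spreads it across the whole clip, which is the sense in which the branches capture motion at different time scales.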

Description

Technical field

[0001] The invention belongs to the technical field of machine learning and artificial intelligence, and mainly relates to a lip language recognition method based on deep neural networks.

Background technique

[0002] Lip reading plays a vital role in human communication and speech comprehension. Studies have shown that humans have poor lip reading ability, and that hearing-impaired people achieve less than 30% accuracy. Good lip language recognition technology can therefore be used to improve hearing aids and to improve the acquisition of language information in silent, security, and noisy environments. Before the advent of deep learning, most work on lip reading relied on hand-designed features, which are computationally expensive and yield low accuracy. In recent years, with the progress of deep learning, lip language recognition methods based on deep learning have received extensive attention. The use of deep neural network...

Application Information

IPC(8): G06V40/20, G06N3/04, G06N3/08, G06V10/82
CPC: G06N3/049, G06N3/084, G06N3/045, Y02D10/00
Inventor: 薛峰, 郭昊, 李宏博, 储德军, 谢胤岑
Owner HEFEI UNIV OF TECH