Sign language recognition method and system based on two-stream spatio-temporal graph convolutional neural network

A technology combining convolutional neural networks with a recognition method, applied in the cross-disciplinary field of machine translation. It addresses problems such as weak feature robustness, interference from visual information in the scene, and the inability to describe temporal-domain information.

Active Publication Date: 2020-06-23
NANJING UNIV OF POSTS & TELECOMM
  • Summary
  • Abstract
  • Description
  • Claims
  • Application Information

AI Technical Summary

Problems solved by technology

At present, although the two mainstream methods have made important progress, video-based human behavior recognition is affected by many factors, such as varying lighting conditions, diverse viewing angles, complex backgrounds, and large intra-class variation, making human behavior recognition one of the thorniest and most challenging research directions in image and video understanding tasks.
[0003] Sign language is the primary language of the deaf. Despite its widespread use as a "language," this particular group has difficulty communicating with those who do not understand sign language.
In the current related literature, the RGB, depth-map and other modal data used in sign language recognition tasks are easily disturbed by visual information in the scene. Especially in complex scenes, feature extraction from data such as RGB images or depth maps suffers in two respects: on the one hand, the large amount of computation cannot meet real-time requirements; on the other hand, the extracted features are not robust enough and lack representational power, and in particular cannot describe temporal-domain information.



Embodiment Construction

[0072] The technical scheme of the present invention is described in detail below in conjunction with the accompanying drawings:

[0073] As shown in figure 1, the sign language recognition method based on a two-stream spatio-temporal graph convolutional network disclosed in the present invention uses a bottom-up human body pose estimation method and a hand keypoint model to detect sign language action videos and extract human skeleton joint-point information, constructing human skeleton keypoint graph data. A spatio-temporal graph convolutional neural network is then used to extract the global spatio-temporal feature sequence and the local spatio-temporal feature sequence of the video sequence from the upper-torso skeleton graph data and the hand graph data respectively, and feature splicing yields the global-local spatio-temporal feature sequence. A self-attention encoder-decoder network then performs serialized modeling of the spatio-temporal features; finally, the word with the maximum classification probability is obtained for each video clip from the decoder output.
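The two-stream step above can be sketched in a minimal, framework-free way. This is an illustrative numpy sketch, not the patent's implementation: the joint counts (14 upper-torso joints, 42 hand joints), the chain adjacency, and the single graph-convolution layer with a 3-frame temporal average are all simplifying assumptions standing in for a full ST-GCN.

```python
import numpy as np

def normalize_adjacency(A):
    # Symmetric normalization D^-1/2 (A + I) D^-1/2, common in graph convolutions
    A_hat = A + np.eye(A.shape[0])
    d = A_hat.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    return D_inv_sqrt @ A_hat @ D_inv_sqrt

def st_gcn_layer(x, A_norm, W):
    # x: (T, V, C_in) per-frame joint features.
    # Spatial graph convolution per frame, then a crude temporal smoothing
    # (3-frame moving average) standing in for the temporal convolution.
    y = np.einsum('uv,tvc,cd->tud', A_norm, x, W)  # aggregate neighbors, project
    y = np.maximum(y, 0.0)                          # ReLU
    pad = np.pad(y, ((1, 1), (0, 0), (0, 0)), mode='edge')
    return (pad[:-2] + pad[1:-1] + pad[2:]) / 3.0

rng = np.random.default_rng(0)
T, C_in, C_out = 16, 3, 8
V_body, V_hand = 14, 42            # assumed joint counts (illustrative only)
x_body = rng.normal(size=(T, V_body, C_in))   # upper-torso skeleton stream
x_hand = rng.normal(size=(T, V_hand, C_in))   # hand keypoint stream

def chain(V):
    # Toy chain skeleton: joint i connected to joint i+1
    A = np.zeros((V, V))
    for i in range(V - 1):
        A[i, i + 1] = A[i + 1, i] = 1.0
    return A

W_b = rng.normal(size=(C_in, C_out))
W_h = rng.normal(size=(C_in, C_out))
f_global = st_gcn_layer(x_body, normalize_adjacency(chain(V_body)), W_b)
f_local  = st_gcn_layer(x_hand, normalize_adjacency(chain(V_hand)), W_h)

# Pool over joints per frame, then splice the two streams into the
# global-local spatio-temporal feature sequence.
seq = np.concatenate([f_global.mean(axis=1), f_local.mean(axis=1)], axis=-1)
print(seq.shape)  # (16, 16): T frames, C_out global + C_out local features
```

The key design point the sketch preserves is that the two streams never share an adjacency matrix: the hand graph is much denser in joints than the torso graph, so each stream learns its own spatial aggregation before the frame-wise splice.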


Abstract

The invention discloses a sign language recognition method and system based on a two-stream spatio-temporal graph convolutional neural network. The method comprises the steps of: first segmenting a sign language motion video into video frames, extracting the upper-body and hand skeleton points of the person in each sign language video segment, and constructing global and local graph data; extracting global and local spatio-temporal features respectively with the two-stream spatio-temporal graph convolutional network, and obtaining global-local features through feature splicing; meanwhile, encoding the texts corresponding to the videos into word vectors after word segmentation, mapping the word vectors and the video features to the same latent space through feature transformation, and training the model with a dynamic time warping algorithm; and, for the global-local feature sequence, performing serialized modeling with a self-attention encoder-decoder network and applying a softmax classifier to the decoder output to obtain the word corresponding to each video clip, forming the corresponding text sentences. The method can improve the accuracy of text sentence generation and has important application value in scenarios such as caption generation and human-computer interaction.
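The final stage described in the abstract, serialized modeling with self-attention followed by a softmax classifier over the decoder output, can be sketched as below. This is a minimal single-head illustration in numpy, not the patent's network: the sequence length, feature dimension, vocabulary size, and random weights are all placeholder assumptions, and the encoder-decoder is collapsed into one self-attention layer for brevity.

```python
import numpy as np

def softmax(z, axis=-1):
    # Numerically stable softmax
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    # Scaled dot-product self-attention over a feature sequence X: (T, D)
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[-1])
    return softmax(scores, axis=-1) @ V

rng = np.random.default_rng(1)
T, D, vocab = 16, 16, 100        # assumed: 16 clips, 16-dim features, 100 words
X = rng.normal(size=(T, D))      # global-local spatio-temporal feature sequence
Wq, Wk, Wv = (rng.normal(size=(D, D)) for _ in range(3))
W_cls = rng.normal(size=(D, vocab))

H = self_attention(X, Wq, Wk, Wv)   # serialized modeling of the clip sequence
probs = softmax(H @ W_cls)          # softmax classifier per video clip
words = probs.argmax(axis=-1)       # word index with maximum probability per clip
print(words.shape)  # (16,): one predicted word per video clip
```

Each row of `probs` is a distribution over the vocabulary for one clip, so taking the argmax per row yields the word sequence that is then assembled into the output text sentence.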

Description

technical field
[0001] The invention belongs to the cross-disciplinary field of behavior recognition in computer vision and machine translation in natural language processing, and specifically relates to a sign language recognition method and system based on a two-stream spatio-temporal graph convolutional neural network.
Background technique
[0002] Human behavior recognition is a high-level task built on target detection, recognition, and tracking. Building a robust human behavior recognition system with a wide range of applications remains extremely challenging. Research on computer-vision-based human behavior recognition covers rich content, involving image processing, computer vision, pattern recognition, artificial intelligence and many other areas of knowledge. At present, computer-vision-based human behavior recognition mainly comprises traditional methods based on handcrafted features and deep learning methods based on convolutional neural networks. ...

Claims


Application Information

IPC(8): G06K9/00 G06K9/62 G06N3/04
CPC: G06V40/28 G06N3/045 G06F18/24
Inventor: 刘天亮, 王焱章, 鲍秉坤, 谢世朋, 戴修斌
Owner NANJING UNIV OF POSTS & TELECOMM