Sign language recognition method and system based on two-stream spatio-temporal graph convolutional neural network

A technology combining convolutional neural networks with a recognition method, applied in the cross-disciplinary field of machine translation. It addresses problems such as weak feature robustness, interference from visual information in the scene, and the inability to describe temporal-domain information.

Active Publication Date: 2020-06-23
NANJING UNIV OF POSTS & TELECOMM
  • Summary
  • Abstract
  • Description
  • Claims
  • Application Information

AI Technical Summary

Problems solved by technology

At present, although the two mainstream methods have made important progress, video-based human behavior recognition is affected by many factors, such as varying lighting conditions, diverse viewing angles, complex backgrounds, and large intra-class variation, making human behavior recognition one of the thorniest and most challenging research directions in image and video understanding tasks.
[0003] Sign language is the primary language of the deaf. Despite its widespread use as a "language," this particular group has difficulty communicating with those who do not understand sign language.
In the current related literature, the RGB, depth-map and other modal data used in sign language recognition tasks are easily disturbed by visual information in the scene. Especially in complex scenes, feature extraction from data such as RGB images or depth maps suffers in two respects: on the one hand, the large amount of computation cannot meet real-time requirements; on the other hand, the extracted features are not robust enough and lack representational power, and in particular cannot describe temporal-domain information.



Embodiment Construction

[0072] The technical scheme of the present invention is described in detail below in conjunction with the accompanying drawings:

[0073] As shown in figure 1, the sign language recognition method based on a two-stream spatio-temporal graph convolutional network disclosed in the present invention uses a bottom-up human body pose estimation method and a hand keypoint model to detect sign language action videos and extract human skeleton joint-point information, constructing human skeleton keypoint graph data. A spatio-temporal graph convolutional neural network is then used to extract the global spatio-temporal feature sequence and the local spatio-temporal feature sequence of the video sequence from the upper-torso skeleton graph data and the hand graph data respectively, and feature splicing yields the global-local spatio-temporal feature sequence. A self-attention encoder-decoder network then performs serialized modeling of the spatio-temporal features; finally, the word with the maximum classification probability is obtained for each video clip from the decoder output.
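The two-stream step above can be sketched in a minimal, framework-free way. This is an illustrative numpy sketch, not the patent's implementation: the joint counts (14 upper-torso joints, 42 hand joints), the chain adjacency, and the single graph-convolution layer with a 3-frame temporal average are all simplifying assumptions standing in for a full ST-GCN.

```python
import numpy as np

def normalize_adjacency(A):
    # Symmetric normalization D^-1/2 (A + I) D^-1/2, common in graph convolutions
    A_hat = A + np.eye(A.shape[0])
    d = A_hat.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    return D_inv_sqrt @ A_hat @ D_inv_sqrt

def st_gcn_layer(x, A_norm, W):
    # x: (T, V, C_in) per-frame joint features.
    # Spatial graph convolution per frame, then a crude temporal smoothing
    # (3-frame moving average) standing in for the temporal convolution.
    y = np.einsum('uv,tvc,cd->tud', A_norm, x, W)  # aggregate neighbors, project
    y = np.maximum(y, 0.0)                          # ReLU
    pad = np.pad(y, ((1, 1), (0, 0), (0, 0)), mode='edge')
    return (pad[:-2] + pad[1:-1] + pad[2:]) / 3.0

rng = np.random.default_rng(0)
T, C_in, C_out = 16, 3, 8
V_body, V_hand = 14, 42            # assumed joint counts (illustrative only)
x_body = rng.normal(size=(T, V_body, C_in))   # upper-torso skeleton stream
x_hand = rng.normal(size=(T, V_hand, C_in))   # hand keypoint stream

def chain(V):
    # Toy chain skeleton: joint i connected to joint i+1
    A = np.zeros((V, V))
    for i in range(V - 1):
        A[i, i + 1] = A[i + 1, i] = 1.0
    return A

W_b = rng.normal(size=(C_in, C_out))
W_h = rng.normal(size=(C_in, C_out))
f_global = st_gcn_layer(x_body, normalize_adjacency(chain(V_body)), W_b)
f_local  = st_gcn_layer(x_hand, normalize_adjacency(chain(V_hand)), W_h)

# Pool over joints per frame, then splice the two streams into the
# global-local spatio-temporal feature sequence.
seq = np.concatenate([f_global.mean(axis=1), f_local.mean(axis=1)], axis=-1)
print(seq.shape)  # (16, 16): T frames, C_out global + C_out local features
```

The key design point the sketch preserves is that the two streams never share an adjacency matrix: the hand graph is much denser in joints than the torso graph, so each stream learns its own spatial aggregation before the frame-wise splice.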


Abstract

The invention discloses a sign language recognition method and system based on a two-stream spatio-temporal graph convolutional neural network. The method comprises the steps of: first segmenting a sign language motion video into video frames, extracting the upper-body and hand skeleton points of the person in each sign language video segment, and constructing global and local graph data; extracting global and local spatio-temporal features respectively with the two-stream spatio-temporal graph convolutional network, and obtaining global-local features through feature splicing; meanwhile, encoding the texts corresponding to the videos into word vectors after word segmentation, mapping the word vectors and the video features to the same latent space through feature transformation, and training the model with a dynamic time warping algorithm; and, for the global-local feature sequence, performing serialized modeling with a self-attention encoder-decoder network and applying a softmax classifier to the decoder output to obtain the word corresponding to each video clip, forming the corresponding text sentences. The method can improve the accuracy of text sentence generation and has important application value in scenarios such as caption generation and human-computer interaction.
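The final stage described in the abstract, serialized modeling with self-attention followed by a softmax classifier over the decoder output, can be sketched as below. This is a minimal single-head illustration in numpy, not the patent's network: the sequence length, feature dimension, vocabulary size, and random weights are all placeholder assumptions, and the encoder-decoder is collapsed into one self-attention layer for brevity.

```python
import numpy as np

def softmax(z, axis=-1):
    # Numerically stable softmax
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    # Scaled dot-product self-attention over a feature sequence X: (T, D)
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[-1])
    return softmax(scores, axis=-1) @ V

rng = np.random.default_rng(1)
T, D, vocab = 16, 16, 100        # assumed: 16 clips, 16-dim features, 100 words
X = rng.normal(size=(T, D))      # global-local spatio-temporal feature sequence
Wq, Wk, Wv = (rng.normal(size=(D, D)) for _ in range(3))
W_cls = rng.normal(size=(D, vocab))

H = self_attention(X, Wq, Wk, Wv)   # serialized modeling of the clip sequence
probs = softmax(H @ W_cls)          # softmax classifier per video clip
words = probs.argmax(axis=-1)       # word index with maximum probability per clip
print(words.shape)  # (16,): one predicted word per video clip
```

Each row of `probs` is a distribution over the vocabulary for one clip, so taking the argmax per row yields the word sequence that is then assembled into the output text sentence.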

Description

technical field
[0001] The invention belongs to the cross-disciplinary field of behavior recognition in computer vision and machine translation in natural language processing, and specifically relates to a sign language recognition method and system based on a two-stream spatio-temporal graph convolutional neural network.
Background technique
[0002] Human behavior recognition is a high-level task built on target detection, recognition, and tracking. Building a robust human behavior recognition system with a wide range of applications remains extremely challenging. Research on computer-vision-based human behavior recognition covers rich content, involving image processing, computer vision, pattern recognition, artificial intelligence and many other areas of knowledge. At present, computer-vision-based human behavior recognition mainly comprises traditional methods based on handcrafted features and deep learning methods based on convolutional neural networks. ...

Claims


Application Information

IPC(8): G06K9/00 G06K9/62 G06N3/04
CPC: G06V40/28 G06N3/045 G06F18/24
Inventor: 刘天亮, 王焱章, 鲍秉坤, 谢世朋, 戴修斌
Owner NANJING UNIV OF POSTS & TELECOMM