Gesture recognition method based on 3D-CNN and convolutional LSTM

A gesture recognition technology based on 3D-CNN, applied in character and pattern recognition, neural learning methods, instruments, etc. It addresses the problems of extracting only short-term spatio-temporal features, background interference, and a low recognition rate, with the effects of reducing overfitting, reducing processing time, and improving recognition accuracy.

Status: Inactive
Publication Date: 2017-12-08
BEIJING UNION UNIVERSITY
2 Cites, 93 Cited by

AI Technical Summary

Problems solved by technology

[0005] To overcome the shortcomings of the prior art, the present invention provides a gesture recognition method based on 3D-CNN and convolutional LSTM. It addresses several problems of current traditional gesture recognition methods: extracting only short-term or only long-term spatio-temporal features, interference from complex backgrounds, long processing time, and low recognition rates.



Examples


Embodiment approach

[0080] 1. Use the public IsoGD gesture dataset for network training, and preprocess the RGB and Depth images in the training set according to the requirements of step S1. The IsoGD dataset contains 249 gestures, with 35878 RGB and Depth videos for training, 5784 for validation, and 6271 for testing. Downsampling is used to normalize each image frame to 112*112 pixels. The RGB and Depth videos in the training set are length-normalized with the temporal jitter strategy, so that every gesture video is 32 frames long. For example, for a gesture video with 170 frames in total, the first frame of the preprocessed video is given by Idx_1 = 170 / 32 * (1 + random(-1, 1) / 2). If random(-1, 1) = 0.5, then Idx_1 = 170 / 32 * (1 + 0.5 / 2) = 6.64, which rounds to 7, so the seventh frame of the original video is selected, and so on. It should be noted that uniform sampling with a temporal jitter strategy is used during tr...
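The frame-index formula above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the text gives the formula for the first frame only, so the general form Idx_i = L / T * (i + random(-1, 1) / 2), the rounding to the nearest frame, and the clamping to the valid range are assumptions, and the names `jittered_indices` and `jitter` are hypothetical.

```python
import random


def jittered_indices(num_frames, target_len=32, jitter=None, rng=None):
    """Return 1-based frame indices chosen by temporal jitter sampling.

    Assumed general form (the source only states the i = 1 case):
        Idx_i = num_frames / target_len * (i + random(-1, 1) / 2)
    `jitter` fixes the random term for reproducibility; None draws a
    fresh value in [-1, 1] for every output frame.
    """
    rng = rng or random.Random()
    step = num_frames / target_len
    indices = []
    for i in range(1, target_len + 1):
        j = jitter if jitter is not None else rng.uniform(-1.0, 1.0)
        idx = round(step * (i + j / 2))
        # Assumption: clamp so the index always refers to an existing frame.
        indices.append(min(max(idx, 1), num_frames))
    return indices


# Worked example from the text: 170 frames, 32 outputs, random term 0.5.
first = jittered_indices(170, 32, jitter=0.5)[0]
print(first)  # 7
```

Passing `jitter=0` reduces the sketch to plain uniform sampling, which may be convenient when a deterministic selection is wanted.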


Abstract

The invention discloses a gesture recognition method based on 3D-CNN and convolutional LSTM. The method comprises the following steps: the length of each video input to the 3D-CNN is normalized through a temporal jitter strategy; the normalized video is fed to the 3D-CNN to learn the short-term spatio-temporal features of a gesture; based on the short-term spatio-temporal features extracted by the 3D-CNN, the long-term spatio-temporal features of the gesture are learned through a two-layer convolutional LSTM network to eliminate the influence of complex backgrounds on gesture recognition; the dimension of the extracted long-term spatio-temporal features is reduced through a spatial pyramid pooling layer (SPP layer), and at the same time the extracted multi-scale features are fed into the fully connected layer of the network; and finally, through a late multi-modal fusion method, the prediction results of the individual networks are averaged and fused to obtain a final prediction score. According to the invention, the short-term and long-term spatio-temporal features of the gesture are learned simultaneously and combined through different networks; the network is trained with batch normalization; and the efficiency and accuracy of gesture recognition are improved.
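The final averaging step of the abstract can be illustrated with a minimal Python sketch. It assumes each network outputs one score per gesture class and that fusion is an unweighted mean over networks; the function name `late_fusion` and the example score vectors are hypothetical and do not come from the patent.

```python
def late_fusion(score_lists):
    """Average per-class scores from several networks (late fusion).

    score_lists: one list of class scores per network, all of equal length.
    Returns (fused_scores, predicted_class_index).
    """
    if not score_lists:
        raise ValueError("need scores from at least one network")
    num_classes = len(score_lists[0])
    if any(len(scores) != num_classes for scores in score_lists):
        raise ValueError("all networks must score the same classes")
    # Unweighted mean across networks, class by class (an assumption).
    fused = [sum(column) / len(score_lists) for column in zip(*score_lists)]
    best = max(range(num_classes), key=fused.__getitem__)
    return fused, best


# Illustrative scores for three classes from two modality-specific networks.
rgb_scores = [0.6, 0.3, 0.1]
depth_scores = [0.2, 0.7, 0.1]
fused, predicted = late_fusion([rgb_scores, depth_scores])
print(predicted)  # 1
```

Averaging at the score level keeps the networks independent, so each modality can be trained and evaluated separately before fusion.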

Description

Technical field

[0001] The invention relates to the technical field of machine vision and pattern recognition, in particular to a gesture recognition method based on 3D-CNN and convolutional LSTM.

Background

[0002] Gestures, as a form of human body language, play a very important role in daily life. In many computer vision applications, such as human-computer interaction, sign language recognition, and virtual reality, gesture recognition technology will undoubtedly have a large impact. Vision-based gesture recognition aims to recognize and understand meaningful movements of the human body through machine vision, and it remains a very challenging problem.

[0003] Traditional gesture recognition methods can be roughly divided into methods based on hand-crafted features and traditional machine learning methods. Most of these methods design a feature map that can be used to describe hand movements, and then extract these features fro...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06K9/00, G06N3/04, G06N3/08
CPC: G06N3/08, G06V40/28, G06N3/045
Inventors: 袁家政, 刘宏哲, 张宏源
Owner: BEIJING UNION UNIVERSITY