
A multi-modal action recognition method based on a deep neural network

A technology combining deep neural networks and action recognition, applied in the field of multi-modal action recognition based on deep neural networks. It addresses problems such as the loss of temporal information, and achieves the effects of improved accuracy and precision and reduced computing time.

Publication Date: 2019-03-12 (status: Inactive)
SOUTH CHINA UNIV OF TECH

AI Technical Summary

Problems solved by technology

Although this method improves on single-stream approaches by explicitly capturing local temporal motion, mid- and long-term temporal information is still lost in the learned features, since the video-level prediction is obtained by averaging the prediction scores of the sampled clips.
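To make the criticism concrete, here is a minimal numpy sketch of that averaging scheme (the clip count and class count are illustrative assumptions, not from the patent). Because the mean is invariant to the order of the clips, any mid- or long-term temporal structure across clips cannot influence the video-level prediction.

```python
import numpy as np

rng = np.random.default_rng(0)
clip_scores = rng.random((10, 51))                     # 10 sampled clips, 51 action classes
clip_scores /= clip_scores.sum(axis=1, keepdims=True)  # per-clip prediction scores
video_score = clip_scores.mean(axis=0)                 # video-level score: plain average

# Reversing the clip order changes nothing: temporal order is discarded.
assert np.allclose(video_score, clip_scores[::-1].mean(axis=0))
print(int(video_score.argmax()))                       # video-level predicted class
```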



Examples


Embodiment

[0039] As shown in Figure 1, this embodiment discloses a multi-modal action recognition method based on a deep neural network.

[0040] The deep neural network used in this embodiment has three branches at its lower layers: a convolutional neural network for extracting temporal features, a convolutional neural network for extracting spatial features, and a fully connected network for processing skeleton path-integral features. At the higher layers, the three branches are merged into one through feature fusion, and the classification label of the video action is predicted by a softmax activation function. In the image branch, a pooling structure based on an attention mechanism is introduced; it helps the network focus on the features conducive to recognizing actions without changing the existing network structure, thereby reducing interference from irrelevant features and improving the existing network structure. …
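A minimal PyTorch sketch of the three-branch structure described in [0040]. The framework, layer depths, feature widths, the exact form of the attention pooling, and concatenation as the fusion step are illustrative assumptions; the patent text only fixes the three branch modalities, the attention-based pooling in the image branch, and the fusion-then-softmax head.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AttentionPool(nn.Module):
    """Attention-weighted pooling over spatial positions (assumed form)."""
    def __init__(self, channels):
        super().__init__()
        self.score = nn.Conv2d(channels, 1, kernel_size=1)  # per-location score

    def forward(self, x):                    # x: (B, C, H, W)
        w = F.softmax(self.score(x).flatten(2), dim=-1)  # (B, 1, H*W) weights
        v = x.flatten(2)                     # (B, C, H*W)
        return (v * w).sum(dim=-1)           # (B, C) attention-weighted average

class ThreeBranchNet(nn.Module):
    def __init__(self, num_classes, sig_dim=64, feat=128):
        super().__init__()
        # Spatial branch: CNN over RGB frames, pooled by attention.
        self.spatial = nn.Sequential(
            nn.Conv2d(3, feat, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(feat, feat, 3, stride=2, padding=1), nn.ReLU())
        self.spatial_pool = AttentionPool(feat)
        # Temporal branch: CNN over optical-flow fields (x/y => 2 channels).
        self.temporal = nn.Sequential(
            nn.Conv2d(2, feat, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(feat, feat, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten())
        # Skeleton branch: fully connected net over path-integral features.
        self.skeleton = nn.Sequential(
            nn.Linear(sig_dim, feat), nn.ReLU(),
            nn.Linear(feat, feat), nn.ReLU())
        # Fusion head: concatenate the three branch features, then classify.
        self.head = nn.Linear(3 * feat, num_classes)

    def forward(self, rgb, flow, sig):
        s = self.spatial_pool(self.spatial(rgb))   # (B, feat)
        t = self.temporal(flow)                    # (B, feat)
        k = self.skeleton(sig)                     # (B, feat)
        fused = torch.cat([s, t, k], dim=1)        # feature fusion
        return F.log_softmax(self.head(fused), dim=1)
```

Concatenation is only one way to realize the patent's "feature fusion"; summation or a learned projection would fit the description equally well.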



Abstract

The invention discloses a multi-modal action recognition method based on a deep neural network. The method comprehensively utilizes multi-modal information such as video images, optical flow diagrams, and the human skeleton. The specific steps are as follows: first, a series of preprocessing and compression operations are performed on the video; an optical flow graph is obtained from adjacent frames of the video; the human skeleton is extracted from the video frames using a pose estimation algorithm, and the path-integral features of the skeleton sequence are calculated. The resulting optical flow graphs, skeleton path-integral features, and original video images are input into a deep neural network with a multi-branch structure to learn an abstract spatio-temporal representation of human motion and to correctly judge its action category. In addition, a pooling layer based on an attention mechanism is attached to the video image branch, which enhances the abstract features closely related to the final action classification result and reduces irrelevant interference. The invention comprehensively utilizes multi-modal information and has the advantages of strong robustness and a high recognition rate.
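The two less standard preprocessing steps in the abstract are the optical-flow maps and the skeleton path-integral (path-signature) features. The sketch below is hedged: the Farneback call is standard OpenCV, while the signature function computes only the level-1 and level-2 terms of the path signature via the usual piecewise-linear formula. The truncation level and the (frames, coordinates) skeleton layout are assumptions; the patent does not specify them.

```python
import cv2
import numpy as np

def optical_flow(prev_gray, next_gray):
    """Dense Farneback optical flow between two grayscale frames."""
    return cv2.calcOpticalFlowFarneback(
        prev_gray, next_gray, None,
        0.5, 3, 15, 3, 5, 1.2, 0)   # standard Farneback parameters

def path_signature_level2(path):
    """Level-1 and level-2 path-integral (signature) features of a
    skeleton trajectory given as an (N, d) array: N frames, d coords."""
    inc = np.diff(path, axis=0)      # (N-1, d) segment increments
    dev = path[:-1] - path[0]        # deviation from the start point
    s1 = path[-1] - path[0]          # level 1: total displacement
    # Level 2: S[i, j] = integral of (X^i - X^i_0) dX^j over the
    # piecewise-linear path, evaluated segment by segment.
    s2 = dev.T @ inc + 0.5 * (inc.T @ inc)
    return np.concatenate([s1, s2.ravel()])
```

For a skeleton with J joints in 2-D, d = 2J coordinates, so this yields a feature vector of length d + d²; higher truncation levels would capture finer temporal detail at exponential cost in dimension.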

Description

technical field

[0001] The invention relates to the technical field of image processing, in particular to a multi-modal action recognition method based on a deep neural network.

Background technique

[0002] Action recognition has recently become a very popular research direction. By recognizing human body actions in videos, it can serve as a new interactive input to processing devices and can be widely used in everyday applications such as games and movies. The task of action recognition is to identify different actions from video clips, where an action may run through the entire video. It is a natural extension of the image classification task: image recognition is performed on multiple frames of a video, and the final action prediction is then computed from the per-frame results.

[0003] Traditional video action recognition techniques often rely on hand-designed feature extractors to extract the spatiotemporal features of actions. With the advent of deep lea…


Application Information

Patent Type & Authority: Application (China)
IPC(8): G06K9/00; G06N3/04
CPC: G06V40/25; G06V40/20; G06V40/28; G06V20/42; G06V20/46; G06N3/045
Inventors: 许泽珊; 余卫宇
Owner: SOUTH CHINA UNIV OF TECH