
Video behavior category identification method based on time domain inference graph

A recognition method using time-domain technology, applied in character and pattern recognition, instruments, biological neural network models, and related fields. It addresses problems such as existing methods being poorly suited to capturing action structures, and achieves the effect of improving the accuracy of category recognition.

Active Publication Date: 2020-04-17
CHENGDU KOALA URAN TECH CO LTD

AI Technical Summary

Problems solved by technology

[0006] In practice, these two types of existing methods readily identify actions with strong spatial dependence, but they are not well suited to capturing action structures dominated by temporal action changes and dependencies.



Examples


Embodiment 1

[0054] A video behavior category recognition method based on time domain inference graphs. According to the action dependencies between video frames, a multi-head time domain adjacency matrix spanning multiple time domain inference graphs is constructed to infer the implicit relationships between successive actions; at the same time, a semantic fusion device is constructed to extract the time domain features of actions with different dependency relationships at multiple time scales and fuse them into a single strongly semantic feature for video behavior category recognition.
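The patent gives no code, but the description above maps naturally onto a small graph module. Below is a minimal PyTorch sketch of one way the multi-head time domain adjacency matrix and the graph reasoning over it could look; the class name TemporalInferenceGraph, the learnable per-head adjacency, and the mean-over-heads fusion are all my own assumptions, not the patented design.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TemporalInferenceGraph(nn.Module):
    """Hypothetical sketch: H learnable T x T temporal adjacency matrices
    (the "multi-head time domain adjacency matrix A") with graph
    convolution over per-frame features."""

    def __init__(self, feat_dim: int, num_frames: int, num_heads: int = 4):
        super().__init__()
        self.num_heads = num_heads
        # One learnable adjacency per head; different heads can capture
        # different dependency ranges (short- vs long-range).
        self.adjacency = nn.Parameter(torch.randn(num_heads, num_frames, num_frames))
        self.proj = nn.ModuleList(nn.Linear(feat_dim, feat_dim) for _ in range(num_heads))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, T, feat_dim) spatial features of the sampled frames
        outs = []
        for h in range(self.num_heads):
            A = F.softmax(self.adjacency[h], dim=-1)  # row-normalized graph
            outs.append(A @ self.proj[h](x))          # graph convolution: A X W
        # Stand-in for the patent's "semantic fusion device": a plain
        # mean over heads (the real fusion is described as multi-scale).
        return torch.stack(outs).mean(dim=0)
```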

[0055] A basic behavior has both long-range and short-range dependencies, and the frame-to-frame dependencies in a video can often be abstracted into multiple relationships. For example, consider a video of the human behavior "throwing a ball into the air, then catching it": this behavior contains many short-range and long-range basic dependencies. First, there are the short-range relationships "throw", "throw into the air", "drop", and "catch"; there are also...

Embodiment 2

[0061] As shown in Figure 2, a video behavior category recognition method based on a time domain inference graph specifically includes the following steps (a minimal end-to-end sketch follows the list):

[0062] Step S1: sample the video;

[0063] Step S2: use a convolutional network to extract the spatial features X of the video frame sequence;

[0064] Step S3: construct the multi-head time domain adjacency matrix A that encodes the action dependencies;

[0065] Step S4: use a time domain graph convolutional network for reasoning;

[0066] Step S5: carry out supervised training of the entire network;

[0067] Step S6: perform test classification on the video.
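For orientation only, here is one way steps S1 to S6 might be wired together in PyTorch. The ResNet-50 backbone, the TemporalInferenceGraph module from the sketch in Embodiment 1, and every hyperparameter below are assumptions; S1 sampling is taken as already done on the raw frames.

```python
import torch
import torch.nn as nn
import torchvision.models as models

class VideoBehaviorClassifier(nn.Module):
    """Hypothetical pipeline covering S2-S4; S5/S6 are sketched below."""

    def __init__(self, num_frames: int = 8, num_classes: int = 174):
        super().__init__()
        backbone = models.resnet50(weights=None)   # S2: 2D spatial backbone
        backbone.fc = nn.Identity()                # keep the 2048-d features
        self.backbone = backbone
        self.graph = TemporalInferenceGraph(2048, num_frames)  # S3 + S4
        self.classifier = nn.Linear(2048, num_classes)

    def forward(self, frames: torch.Tensor) -> torch.Tensor:
        # frames: (batch, T, 3, H, W), the output of the S1 sampling step
        b, t = frames.shape[:2]
        x = self.backbone(frames.flatten(0, 1)).view(b, t, -1)  # S2
        x = self.graph(x)                                       # S3/S4
        return self.classifier(x.mean(dim=1))                   # video logits

# S5: one supervised training step with cross-entropy
model = VideoBehaviorClassifier()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
logits = model(torch.randn(2, 8, 3, 224, 224))
loss = nn.functional.cross_entropy(logits, torch.tensor([0, 3]))
loss.backward()
optimizer.step()
# S6: at test time, the predicted category is logits.argmax(dim=1)
```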

Embodiment 3

[0069] As shown in Figures 1 to 4, a video behavior category recognition method based on a time domain inference graph includes the following steps:

[0070] Step S1: Sampling the video.

[0071] A video usually has a large number of frames. Using all of them as input for subsequent computation would incur a huge computational cost, and much of the information they carry is similar and redundant, so the video first needs to be sampled.

[0072] In this embodiment, there are two sampling methods: global sparse sampling when the feature map is extracted by a 2D convolutional network, and local dense sampling when the feature map is extracted by a 3D convolutional network.
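As a concrete illustration, the two strategies could be implemented as frame-index selectors like the following; the TSN-style segment scheme for the sparse case and the clip length of 16 are my assumptions, not the patent's exact procedure.

```python
import random

def global_sparse_sample(num_frames: int, num_segments: int = 8) -> list[int]:
    """Split the whole video into equal segments and pick one frame from
    each: cheap global coverage, suited to a 2D convolutional backbone."""
    seg_len = num_frames / num_segments
    return [int(seg_len * i + random.random() * seg_len) for i in range(num_segments)]

def local_dense_sample(num_frames: int, clip_len: int = 16) -> list[int]:
    """Take a contiguous run of consecutive frames from a random start:
    dense local motion, suited to a 3D convolutional backbone."""
    start = random.randint(0, max(0, num_frames - clip_len))
    return [min(start + i, num_frames - 1) for i in range(clip_len)]
```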

[0073] Step S2: Use a convolutional network to extract the spatial features X of the video frame sequence.

[0074] For the sampled video frames, a convolutional network is used for feature extraction, such as a 2D Inception network or a ResNet-50 based on 3D inflation...
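For the 3D route, a short sketch of extracting the features X with an off-the-shelf 3D backbone; torchvision's r3d_18 stands in for the inflated ResNet-50 the text names, purely for illustration.

```python
import torch
from torchvision.models.video import r3d_18

# Substitute backbone: the text describes a ResNet-50 inflated to 3D,
# but torchvision ships r3d_18, which is used here as a stand-in.
backbone = r3d_18(weights=None)
backbone.fc = torch.nn.Identity()        # drop the classification head

clip = torch.randn(2, 3, 16, 112, 112)   # (batch, C, T, H, W), densely sampled
X = backbone(clip)                       # clip-level features X: shape (2, 512)
print(X.shape)
```

Note that r3d_18 pools over time before its final layer, so to feed per-frame features into the temporal graph one would tap the network before that pooling step.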



Abstract

The invention belongs to the technical field of machine recognition, and particularly relates to a video behavior category identification method based on a time domain inference graph. In this method, a multi-head time domain adjacency matrix spanning a plurality of time domain inference graphs is constructed according to the inter-frame action dependency relationships of a video, in order to infer the implicit relationships between the successive actions of a behavior. Meanwhile, a semantic fusion device is constructed to extract the action time domain features of different dependency relationships at a plurality of time scales and fuse them into a single strongly semantic feature for video behavior category identification. According to the invention, the category identification accuracy of video behaviors is improved through sequential modeling.

Description

Technical Field

[0001] The invention belongs to the technical field of video behavior recognition, and specifically relates to a behavior category recognition method that infers action dependencies across the time domain of a video.

Background Technique

[0002] In the era of the mobile Internet, videos are very easy to obtain and share. Analyzing video content can not only help prevent crime but also drive recommendations that improve user experience. Behavior recognition in video, as a research direction in this field, has important academic significance as well as broad commercial potential, for example in the behavior analysis of traffic, building, and school surveillance video.

[0003] The goal of video behavior recognition is to identify the type of behavior that occurs in a video. Deep-network-based video behavior analysis methods commonly use two network structures:

[0004] 1) 2D convolutional neural networks for spatial modeling of video frame...


Application Information

Patent Type & Authority: Application (China)
IPC(8): G06K9/00; G06K9/62; G06N3/04
CPC: G06V20/40; G06N3/045; G06F18/254; G06F18/214; Y02D10/00
Inventors: 徐行, 张静然, 沈复民, 贾可, 申恒涛
Owner: CHENGDU KOALA URAN TECH CO LTD