Depth convolution network model of multi-motion streams for video prediction

A deep convolutional network model technology, applied in the fields of artificial intelligence and video analysis, that addresses problems such as blurred prediction results, blurred predicted images, and the difficulty of achieving clear and accurate long-term prediction, achieving clear prediction results, improved robustness, and an extended frame rate.

Active Publication Date: 2018-12-21
PEKING UNIV SHENZHEN GRADUATE SCHOOL

AI Technical Summary

Problems solved by technology

However, since such methods generate all pixel values directly from the features extracted by the convolutional neural network, the quality of the prediction depends directly on the quality of feature extraction, and the predicted images are therefore prone to blurring.
In addition, although such methods can in theory produce relatively long-term predictions, error accumulation makes clear and accurate long-term prediction difficult to achieve.
[0005] It can be seen that existing video prediction methods suffer from blurred, insufficiently clear prediction results, and struggle to achieve long-term prediction.

Method used




Embodiment Construction

[0036] The present invention is described further below through embodiments in conjunction with the accompanying drawings, which do not limit the scope of the invention in any way.

[0037] The present invention proposes a multi-motion-stream deep convolutional network model method for video prediction (MMF for short), which predicts several future video frames from several input video frames. Figure 1 and Figure 2 show, respectively, the network structure of the multi-motion-stream video prediction deep convolutional network model provided by the present invention and the processing flow of the multi-motion-stream mechanism and the base-image method. The method mainly includes the following steps:

[0038] 1) Using a convolutional auto-encoding network, the input frames are fed sequentially into the encoder, feature maps are extracted, and the feature map of the previous frame is input into the LSTM...
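Step 1) above can be sketched in PyTorch. This is a minimal illustration, not the patent's actual network: the encoder depth, channel counts, and frame size are assumptions, and the LSTM is realized as a ConvLSTM cell (LSTM gates computed by convolution), a common choice for propagating temporal information over spatial feature maps.

```python
import torch
import torch.nn as nn

class ConvLSTMCell(nn.Module):
    """Minimal ConvLSTM cell: the four LSTM gates are computed with a
    single convolution so the hidden state keeps the feature map's
    spatial layout."""
    def __init__(self, channels, hidden, kernel=3):
        super().__init__()
        self.gates = nn.Conv2d(channels + hidden, 4 * hidden,
                               kernel, padding=kernel // 2)

    def forward(self, x, state):
        h, c = state
        i, f, o, g = torch.chunk(self.gates(torch.cat([x, h], dim=1)), 4, dim=1)
        c = torch.sigmoid(f) * c + torch.sigmoid(i) * torch.tanh(g)
        h = torch.sigmoid(o) * torch.tanh(c)
        return h, c

# Hypothetical two-layer convolutional encoder (channel counts assumed).
encoder = nn.Sequential(
    nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
    nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
)
cell = ConvLSTMCell(channels=64, hidden=64)

frames = torch.randn(4, 3, 64, 64)   # 4 input frames, 64x64 RGB
h = torch.zeros(1, 64, 16, 16)       # spatial size after two stride-2 convs
c = torch.zeros_like(h)
for t in range(frames.shape[0]):     # feed frames sequentially, as in step 1)
    feat = encoder(frames[t:t + 1])  # per-frame feature map
    h, c = cell(feat, (h, c))        # hidden state carries temporal context
```

After the loop, `h` is a 64-channel, 16x16 hidden feature map summarizing the motion observed so far, which a decoder could turn into motion streams.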



Abstract

The invention discloses a multi-motion-stream deep convolutional network model method for video prediction, comprising the following steps: constructing a new convolutional auto-encoding network framework that fuses long short-term memory (LSTM) network modules; proposing the motion stream as the motion transformation from input frame to output frame, and generating multiple motion streams simultaneously to learn finer motion information, which effectively improves the prediction quality; proposing the base image as a pixel-level complement to the motion-stream method, which improves the robustness of the model and the overall prediction quality; and using bilinear interpolation to apply the multiple motion streams to the input frame, yielding multiple motion prediction maps that are then linearly combined with the base image according to a weight matrix to obtain the final prediction. By adopting this technical scheme, the temporal information in a video sequence can be extracted and propagated more fully, thereby achieving longer-term, clearer, and more accurate video prediction.
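The final stage of the abstract, bilinearly warping the input frame by each motion stream and then mixing the warped maps with the base image under a per-pixel weight matrix, can be sketched as follows. This is an illustrative reconstruction, not the patent's implementation: the number of streams, the softmax weight normalization, and the use of `grid_sample` for bilinear interpolation are all assumptions.

```python
import torch
import torch.nn.functional as F

def warp(frame, flow):
    """Warp `frame` by a dense motion stream `flow` of per-pixel (dx, dy)
    offsets, using bilinear sampling."""
    n, _, h, w = frame.shape
    ys, xs = torch.meshgrid(torch.arange(h), torch.arange(w), indexing="ij")
    grid = torch.stack([xs, ys], dim=-1).float()         # (h, w, 2) absolute coords
    grid = grid.unsqueeze(0) + flow.permute(0, 2, 3, 1)  # add predicted offsets
    # normalize to [-1, 1], as grid_sample expects
    grid[..., 0] = 2 * grid[..., 0] / (w - 1) - 1
    grid[..., 1] = 2 * grid[..., 1] / (h - 1) - 1
    return F.grid_sample(frame, grid, mode="bilinear", align_corners=True)

n, k, hgt, wid = 1, 3, 16, 16           # k motion streams (count assumed)
frame = torch.rand(n, 3, hgt, wid)      # last input frame
flows = torch.zeros(n, k, 2, hgt, wid)  # k motion streams (zero = identity warp)
base = torch.rand(n, 3, hgt, wid)       # base image, pixel-level complement
# per-pixel weight matrix over k warped maps + the base image
wgt = torch.softmax(torch.rand(n, k + 1, hgt, wid), dim=1)

preds = [warp(frame, flows[:, i]) for i in range(k)] + [base]
out = sum(wgt[:, i:i + 1] * p for i, p in enumerate(preds))  # final prediction
```

With zero flows every warped map reproduces the input frame, so `out` degenerates to a blend of the input frame and the base image; in the trained model the decoder would instead emit non-trivial flows, weights, and base image.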

Description

technical field

[0001] The invention belongs to the fields of artificial intelligence and video analysis, and in particular relates to motion flows for video prediction and a deep convolutional network model method that performs video prediction by generating multiple motion flows.

Background technique

[0002] Video prediction is an important and challenging task in computer vision. Compared with deep learning, which rose to prominence in 2012, video prediction has a long history: motion estimation in traditional video codecs is already a prototype of video prediction. However, as the wave of deep learning swept the world, the development of artificial intelligence gave video prediction new meaning and new requirements. Video prediction in the field of artificial intelligence usually refers to using deep learning methods to predict and generate several frames based on the motion information in several input frames. Generally speaking, video...

Claims


Application Information

Patent Type & Authority: Application (China)
IPC(8): G06T7/579; G06T7/207; G06N3/04; H04N5/14
CPC: H04N5/145; G06T7/207; G06T7/579; G06T2207/20084; G06T2207/20081; G06T2207/10016; G06N3/045
Inventor: 王文敏, 吴倩, 陈雄涛, 王荣刚, 李革, 高文
Owner PEKING UNIV SHENZHEN GRADUATE SCHOOL