Video retrieval method based on multi-mode and self-supervised representation learning

A multimodal video retrieval technology, applied in the fields of computer technology and image processing, that achieves high accuracy and recall, high information-carrying capacity and robustness, and reduced complexity.

Pending Publication Date: 2022-01-18

ZHEJIANG UNIV


Problems solved by technology

[0004] At the same time, we have also observed that a large number of videos steal content from others, apply secondary edits in violation of regulations, and obtain huge illegal profits at low cost.


Embodiment Construction

[0022] The method of the present invention will be further described below in conjunction with the accompanying drawings.

[0023] The present invention proposes a video retrieval method based on multimodal and self-supervised representation learning. It does not rely on task-oriented labeled data; the representation network can be trained using only image data collected from within the platform or from the Internet. Given a query video, videos with similar images or similar events can be found in a database of tens of millions of videos. This technology can serve as a solution for problems such as news event aggregation, copyright-infringement retrieval, and multimodal retrieval on short-video platforms.
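The retrieval step described above can be sketched as follows. This is a minimal illustration, not the patented implementation: mean pooling of frame features, cosine similarity, and brute-force search are assumptions, and the names `video_representation`, `search`, and the toy three-dimensional features are hypothetical. A library of tens of millions of videos would need an approximate nearest-neighbor index rather than the linear scan shown here.

```python
import math
from typing import Dict, List, Tuple

Vector = List[float]


def l2_normalize(v: Vector) -> Vector:
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def video_representation(frame_features: List[Vector]) -> Vector:
    """Assumed aggregation: mean-pool the frame features, then L2-normalize."""
    dim = len(frame_features[0])
    count = len(frame_features)
    mean = [sum(f[i] for f in frame_features) / count for i in range(dim)]
    return l2_normalize(mean)


def search(query: Vector, library: Dict[str, Vector], k: int = 2) -> List[Tuple[str, float]]:
    """Brute-force nearest-neighbor search by cosine similarity.

    All vectors are unit length, so the dot product equals the cosine.
    """
    scored = [
        (video_id, sum(q * x for q, x in zip(query, vec)))
        for video_id, vec in library.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical frame features; a real system would take them from the
# trained image feature extraction network.
library = {
    "news_clip": video_representation([[1.0, 0.0, 0.0], [0.9, 0.1, 0.0]]),
    "reupload": video_representation([[0.8, 0.2, 0.0], [1.0, 0.0, 0.1]]),
    "cooking": video_representation([[0.0, 0.0, 1.0], [0.1, 0.0, 0.9]]),
}
query = video_representation([[1.0, 0.05, 0.0], [0.95, 0.1, 0.0]])
results = search(query, library, k=2)
for video_id, score in results:
    print(video_id, round(score, 4))
```

Normalizing every representation once, when the library is built, reduces each comparison to a dot product, which is the form most nearest-neighbor indexes expect.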

[0024] The specific implementation steps of the video retrieval method based on multimodal and self-supervised representation learning are as follows:

[0025] Step 1: Collect a sufficient number of images and corresponding text information. The text information includes title...


Abstract

The invention discloses a video retrieval method based on multimodal and self-supervised representation learning. The method is applied to the field of video retrieval: given a query video, videos with similar pictures or events can be found in a library of tens of millions of videos. The method can be used to address news event aggregation, copyright-infringement retrieval, multimodal retrieval, and similar problems on short-video platforms. The method mainly comprises the following steps: 1, constructing a supervision data set from unlabeled picture data and picture-text pair data, and training a picture feature extraction network on it; 2, constructing a feature frequency library by extracting features from video frames and calculating domain density; and 3, extracting video representations, constructing a video library, and performing video retrieval with a neighbor retrieval method. The proposed method achieves relatively high accuracy and recall on a test data set and has good robustness.
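Step 2 of the abstract can be illustrated with a small sketch. The abstract does not say how the density is computed, so everything specific here is an assumption: "domain density" is read as a neighborhood density (the number of other frame features within a cosine-similarity threshold), and the weighting `1 / (1 + density)` is a hypothetical way of turning that count into a frequency-style weight that plays down very common frames. The function names and toy features are also hypothetical.

```python
import math
from typing import List

Vector = List[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def neighborhood_density(features: List[Vector], threshold: float = 0.9) -> List[int]:
    """For each feature, count the other features whose similarity is >= threshold."""
    return [
        sum(1 for j, g in enumerate(features) if j != i and cosine(f, g) >= threshold)
        for i, f in enumerate(features)
    ]


def frequency_weights(densities: List[int]) -> List[float]:
    """Assumed weighting: frames in dense neighborhoods get smaller weights."""
    return [1.0 / (1 + d) for d in densities]


# Hypothetical frame features: three near-duplicate frames and one distinctive frame.
frames = [
    [1.0, 0.0],
    [0.99, 0.05],
    [0.98, 0.02],
    [0.0, 1.0],
]
densities = neighborhood_density(frames)
weights = frequency_weights(densities)
print(densities)
print([round(w, 3) for w in weights])
```

Under this reading, frames that recur across many videos (title cards, black frames, watermarks) contribute less to a video's representation than frames that are distinctive to it.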

Description

Technical field

[0001] The invention belongs to the fields of computer technology and image processing, and in particular relates to a video retrieval method based on multimodal and self-supervised representation learning.

Background art

[0002] Before 2015, image retrieval and image-text search were among the most important technologies on the Internet. Searching for pictures by text or by picture on search engines, and searching for product pictures on e-commerce platforms, were very important. Search technology now urgently needs to move quickly from image-and-text demand to video demand.

[0003] Video retrieval is a very important yet challenging problem. In recent years, we have witnessed a dramatic increase in the amount of video generated over the Internet, accelerated by the rapid development of social media applications and video sharing platforms. Due to the large number of videos released by users in a very short period of time on video platforms, thes...


Application Information


Patent Type & Authority: Applications (China)

IPC (8): G06F16/73, G06F16/783, G06V20/40, G06V10/70, G06N20/00

CPC: G06F16/73, G06F16/7844, G06F16/7847, G06N20/00, Y02D10/00

Inventor: 丁勇, 朱子奇, 徐晓舒, 汤峻

Owner ZHEJIANG UNIV
