Method and system for generating video from cross-modal text based on dual learning

A technology for cross-modal video generation, applied in the fields of electronic digital data processing, digital data information retrieval, instruments, etc. It solves the problems of an unstable learning process and a lack of diversity in generated videos, and achieves reduced information loss, good temporal continuity, and stable performance.

Active Publication Date: 2020-01-21
TSINGHUA UNIV

AI Technical Summary

Problems solved by technology

Such a learning process is unstable, and the generated videos are usually similar to one another, lacking diversity.


Detailed Description of the Embodiments

[0029] Embodiments of the present invention are described in detail below, examples of which are shown in the drawings, wherein the same or similar reference numerals designate the same or similar elements or elements having the same or similar functions throughout. The embodiments described below by referring to the figures are exemplary and are intended to explain the present invention and should not be construed as limiting the present invention.

[0030] The following describes the method and system for generating video from cross-modal text based on dual learning according to embodiments of the present invention with reference to the accompanying drawings, beginning with the method.

[0031] Figure 1 is a flow chart of a method for generating video from cross-modal text based on dual learning according to an embodiment of the present invention.

[0032] As shown in Figure 1, the method for generating video from cross-modal text based on dual learning includes the following steps:

[0033...
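The step list is truncated above, so the following is only a minimal sketch of the loop the abstract describes: generate an initial video from the preset text, map it back to new text, judge whether it matches, and repair the video until it does. The stand-in linear models, the cosine-similarity match test, and the gradient-based repair strategy are all illustrative assumptions, not the patent's actual procedure.

import torch

TEXT_DIM, VIDEO_DIM = 128, 1024            # assumed feature sizes
G = torch.nn.Linear(TEXT_DIM, VIDEO_DIM)   # stand-in for the trained generation model
M = torch.nn.Linear(VIDEO_DIM, TEXT_DIM)   # stand-in for the trained mapping model

def repair_video(preset_text, threshold=0.9, steps=50, lr=0.05):
    """Generate an initial video, map it back to text, and nudge the video
    until the mapped-back text matches the preset text closely enough."""
    video = G(preset_text).detach().requires_grad_(True)   # initial video
    opt = torch.optim.Adam([video], lr=lr)
    for _ in range(steps):
        new_text = M(video)                                # video -> new text
        match = torch.cosine_similarity(new_text, preset_text, dim=-1).mean()
        if match >= threshold:                             # judged as matching
            break
        (1.0 - match).backward()                           # mismatch drives the repair
        opt.step()
        opt.zero_grad()
    return video.detach()

final_video = repair_video(torch.randn(1, TEXT_DIM))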


Abstract

The invention discloses a method and system for generating video from cross-modal text based on dual learning. The method comprises the following steps: constructing a text-to-video generation model; constructing a video-to-text mapping model; jointly training the generation model and the mapping model with a dual-learning mechanism to obtain a trained model; inputting preset text into the trained model to generate a corresponding initial video; and mapping the initial video to new text with the mapping model, feeding the new text back to the generation model to judge whether it matches the preset text, and then repairing the initial video to obtain the final video. Because the bidirectional mapping between text information and video information is considered, text-to-video generation is better realized, and the generated video is of higher quality and better matched to user requirements.
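Below is a minimal sketch of the joint training the abstract outlines: a text-to-video generation model and a video-to-text mapping model trained together under a dual-learning mechanism that also penalizes the text-to-video-to-text round trip. The architectures, feature dimensions, losses, and loss weight are illustrative assumptions, not the patent's actual design.

import torch
import torch.nn as nn

TEXT_DIM, VIDEO_DIM = 128, 1024  # assumed feature sizes

class Generator(nn.Module):
    """Text -> video generation model (the primal direction)."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(TEXT_DIM, 512), nn.ReLU(),
                                 nn.Linear(512, VIDEO_DIM))

    def forward(self, t):
        return self.net(t)

class Mapper(nn.Module):
    """Video -> text mapping model (the dual direction)."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(VIDEO_DIM, 512), nn.ReLU(),
                                 nn.Linear(512, TEXT_DIM))

    def forward(self, v):
        return self.net(v)

G, M = Generator(), Mapper()
opt = torch.optim.Adam(list(G.parameters()) + list(M.parameters()), lr=1e-4)

def train_step(text_emb, video_feat, dual_weight=1.0):
    """One joint update over a paired (text, video) batch."""
    fake_video = G(text_emb)         # text -> video
    new_text = M(video_feat)         # video -> text
    cycle_text = M(fake_video)       # text -> video -> text round trip
    loss = (nn.functional.mse_loss(fake_video, video_feat)      # generation loss
            + nn.functional.mse_loss(new_text, text_emb)        # mapping loss
            + dual_weight * nn.functional.mse_loss(cycle_text, text_emb))
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()

# Usage with dummy paired data:
print(train_step(torch.randn(8, TEXT_DIM), torch.randn(8, VIDEO_DIM)))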

Description

Technical Field

[0001] The invention relates to the technical field of multi-modal generative models, and in particular to a method and system for generating video from cross-modal text based on dual learning.

Background

[0002] Currently, user experience is very important in language and visual interaction scenarios between users and machines. A user inputs text or speech, and the machine generates a corresponding video according to that input, but problems remain as to whether the generated video is realistic and whether it is consistent with the user's input. For example, existing methods for generating video from text consider only the one-way mapping from text to video: they map text data and video data into the same latent space, and then reconstruct the video from points in that latent space. At the technical level, the specific steps are to map the text to the lat...
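For contrast with the dual approach, here is a minimal sketch of the one-way pipeline the background describes: encode the text into a shared latent space, then reconstruct a video from that latent point, with no reverse video-to-text path to check the result. Module names and shapes are illustrative assumptions.

import torch
import torch.nn as nn

class TextEncoder(nn.Module):
    """Maps a text embedding into the shared latent space."""
    def __init__(self, text_dim=128, latent_dim=64):
        super().__init__()
        self.net = nn.Linear(text_dim, latent_dim)

    def forward(self, t):
        return self.net(t)

class VideoDecoder(nn.Module):
    """Reconstructs a (flattened) video from a point in the latent space."""
    def __init__(self, latent_dim=64, video_dim=1024):
        super().__init__()
        self.net = nn.Linear(latent_dim, video_dim)

    def forward(self, z):
        return self.net(z)

encoder, decoder = TextEncoder(), VideoDecoder()
video = decoder(encoder(torch.randn(1, 128)))  # one-way only: no video -> text check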


Application Information

Patent Type & Authority Applications(China)
IPC IPC(8): G06F16/435G06F16/438G06F40/289
CPCG06F16/435G06F16/438
Inventor 朱文武刘月王鑫
Owner TSINGHUA UNIV