
System and method for generation of unseen composite data objects

A composite data and image technology, applied in the field of image and video generation using generative models. It addresses the difficulty of generating high-resolution images and the limited attention paid to how well generative models adapt (e.g., generalize) to unseen scene compositions, with the effects of reducing training cost and avoiding combinatorial explosion.

Active Publication Date: 2022-02-08
ROYAL BANK OF CANADA
2 Cites, 0 Cited by

AI Technical Summary

Benefits of technology

The patent describes a "zero-shot" generation approach that produces new data objects without requiring every possible combination of characteristics to be observed during training. This is useful for generating complex data objects such as human activity videos, because it reduces training cost and allows faster training cycles.

Problems solved by technology

While most approaches focus on the expressivity and controllability of the underlying generative models, their ability to adapt (e.g., generalize) to unseen scene compositions has not received as much attention.
However, image datasets cannot incorporate the dynamics of interactions with objects in videos, which is a more realistic setting.
Nonetheless, even such relatively large datasets would involve only a small subset of the objects that humans interact with in their everyday lives.
However, training instability in GANs makes it difficult to generate high-resolution images, which has motivated models such as WGAN and LSGAN.
Extending existing generative modeling efforts (both GANs and VAEs) to videos is not straightforward, since generating a video would involve modeling of both spatial and temporal variations.
While these models predict future frames, they have limited accuracy in the case of long-duration sequences that possess high spatio-temporal variability.
As described in some embodiments herein, although an approach conditions the generation of the video on an observed frame, the problem is substantially different, since the input frame is used to provide background information to the networks during video generation instead of predicting a few future frames.
Additionally, these methods are designed for cases in which the missing areas are small, and they have limited capacity when a full frame or a sequence of frames is missing from the video.
However, these methods are heavily driven by the spatio-temporal content of the given video (with missing frames / regions).
Training for unseen compositions is an important technical problem to be solved in machine learning, as there will not always be training data that covers a particular composition.
This problem is compounded as the number of possible dimensions for a composition increases (e.g., “wash” “eggplant” in “kitchen” during “afternoon”), or where training data is expensive to generate or obtain.
Generating new composite data objects that represent an unseen composition is technically challenging for a machine learning system.
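
To make the unseen-composition setting concrete, here is a minimal Python sketch. It is an illustration only, not the method of the patent: the function names `split_compositions` and `num_compositions` and the toy category values are assumptions. It shows how a pair such as (“wash”, “eggplant”) can be held out while every action and every object is still observed individually in training, and how the number of possible compositions grows as dimensions are added.

```python
from itertools import product


def split_compositions(actions, objects, held_out):
    """Split all (action, object) pairs into seen and unseen sets.

    Hypothetical helper: raises ValueError if holding out `held_out` would
    leave some action or object without any training pair, because the
    zero-shot setting assumes each category is observed individually.
    """
    all_pairs = set(product(actions, objects))
    unseen = set(held_out)
    if not unseen <= all_pairs:
        raise ValueError("held-out pairs must come from actions x objects")
    seen = all_pairs - unseen
    seen_actions = {action for action, _ in seen}
    seen_objects = {obj for _, obj in seen}
    if seen_actions != set(actions) or seen_objects != set(objects):
        raise ValueError("every action and object must appear in a seen pair")
    return seen, unseen


def num_compositions(*category_sizes):
    """Number of distinct compositions: the product of the category sizes."""
    total = 1
    for size in category_sizes:
        total *= size
    return total


# Toy example with assumed category values.
seen, unseen = split_compositions(
    actions=["wash", "cut"],
    objects=["eggplant", "tomato"],
    held_out=[("wash", "eggplant")],
)
print(sorted(seen))    # three training pairs
print(sorted(unseen))  # the single held-out pair

# 10 actions x 20 objects x 5 places x 4 times of day
print(num_compositions(10, 20, 5, 4))  # 4000
```

With only four dimensions of modest size, the full set of compositions already reaches the thousands, which is why collecting training data for every combination quickly becomes impractical.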



Embodiment Construction

[0082] Despite the promising success of generative models in the field of image and video generation, the capability of video generation models is limited to constrained settings. Task-oriented generation of realistic videos is a natural next challenge for video generation models. Human activity videos are a good example of realistic videos and serve as a proxy to evaluate action recognition models.

[0083] Current action recognition models are limited to the predetermined categories in the dataset. Thus, it is valuable to be able to generate video corresponding to unseen categories, thereby enhancing the generalizability of action recognition models even with limited data collection. Embodiments described herein are not limited to videos, and rather extend to other types of composites generated based on unseen combinations of categories.

[0084] FIG. 1 is an example generative adversarial network system, according to some embodiments. The generative adversarial network system 100 is ad...



Abstract

A computer implemented system for generating one or more data structures is described, the one or more data structures representing an unseen composition based on a first category and a second category observed individually in a training data set. During training of a generator, a proposed framework utilizes at least one of the following discriminators—three pixel-centric discriminators, namely, frame discriminator, gradient discriminator, video discriminator; and one object-centric relational discriminator. The three pixel-centric discriminators ensure spatial and temporal consistency across the frames, and the relational discriminator leverages spatio-temporal scene graphs to reason over the object layouts in videos ensuring the right interactions among objects.
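
As a rough illustration of how a generator might be trained against several discriminators at once, the following Python sketch combines per-discriminator scores into a single generator loss. The abstract does not give a loss formula, so everything here is an assumption: the function name `generator_adversarial_loss`, the non-saturating loss form `-log D(G(z))`, and the default equal weights are all chosen for illustration only.

```python
import math

# Discriminator names taken from the abstract.
DISCRIMINATORS = ("frame", "gradient", "video", "relational")


def generator_adversarial_loss(scores, weights=None, eps=1e-8):
    """Combine per-discriminator scores into one generator loss.

    `scores` maps a discriminator name to the probability (between 0 and 1)
    that it assigns to a generated sample being real. Only discriminators
    present in `scores` contribute, mirroring the "at least one of" wording.
    The loss form and the weighting are assumptions, not from the patent.
    """
    if not scores:
        raise ValueError("at least one discriminator score is required")
    unknown = set(scores) - set(DISCRIMINATORS)
    if unknown:
        raise ValueError(f"unknown discriminators: {sorted(unknown)}")
    weights = weights or {}
    total = 0.0
    for name, prob_real in scores.items():
        weight = weights.get(name, 1.0)
        # Non-saturating generator loss: small when the discriminator is fooled.
        total += weight * -math.log(max(prob_real, eps))
    return total


# Example: the generator fools the frame discriminator more than the others.
loss = generator_adversarial_loss(
    {"frame": 0.9, "gradient": 0.5, "video": 0.4, "relational": 0.3}
)
print(round(loss, 3))
```

In this sketch, dropping a key from `scores` disables that discriminator, which makes it simple to compare configurations that use only a subset of the four.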

Description

CROSS-REFERENCE

[0001] This application is a non-provisional of, and claims all benefit, including priority, of U.S. Application No. 62/822,517, filed 22 Mar. 2019, entitled “SYSTEM AND METHOD FOR GENERATION OF UNSEEN COMPOSITE DATA OBJECTS”, incorporated herein by reference in its entirety.

INTRODUCTION

[0002] Recent successes in the field of image and video generation using generative models are promising. Visual imagination and prediction are components of human intelligence. Arguably, the ability to create realistic renderings from symbolic representations is considered a prerequisite for broad visual understanding.

[0003] While most approaches focus on the expressivity and controllability of the underlying generative models, their ability to adapt (e.g., generalize) to unseen scene compositions has not received as much attention. However, such ability to adapt is an important cornerstone of robust visual imagination as it demonstrates the capacity to reason over elements of a scene.

[00...


Application Information

Patent Type & Authority: Patents (United States)
IPC(8): G06K9/62, G06N3/08
CPC: G06K9/6256, G06K9/6209, G06K9/6228, G06N3/08, H04N5/222, G06V20/44, G06V20/41, G06V10/82, G06V10/776, G06V10/774, G06N3/045, G06F18/214, G06F18/211
Inventors: NAWHAL, MEGHA; ZHAI, MENGYAO; SIGAL, LEONID; MORI, GREGORY; LEHRMANN, ANDREAS STEFFEN MICHAEL
Owner: ROYAL BANK OF CANADA