
Text image generation method based on fusion compensation generative adversarial network

A text-to-image generation technology, applied to 2D image generation, biological neural network models, and image data processing. It can reduce space and time complexity, lower the difficulty and duration of training, and avoid expensive computation.

Active Publication Date: 2021-07-20
UNIV OF ELECTRONIC SCI & TECH OF CHINA


Problems solved by technology

The main difficulties in generating images from text are: (1) the visual quality of the images is low, including clarity, naturalness, and recognizability; (2) the semantic similarity between the images and the given text is low, that is, the generated images cannot accurately reflect the visual semantic details of the text descriptions; (3) the model is complex, leading to training difficulties, including unstable training and long training times.
[0004] However, compared with the early foundational single-architecture models, stacking models and hierarchically nested models are more complex and rely on additional network structures to improve the semantic richness of the synthesized image. For example, AttnGAN and DM-GAN respectively use a cross-modal attention mechanism and a memory network to introduce word-level fine-grained text vectors and improve semantic fineness, but this further increases the model's parameter count and computational cost.
In addition, the above models do not consider the fusion of text and image features: the two are simply concatenated as the generator's input, and only one text vector is used. The generator's feed-forward process continuously loses information, so the final synthesized image contains fewer semantic details.
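The contrast drawn above can be sketched numerically. The following is a minimal NumPy illustration, not the patented model: part (a) shows the concatenation-only conditioning the text criticizes (the text vector enters the generator exactly once), and part (b) shows the alternative of re-offering the same text vector at every stage. All dimensions and the `stage` function are hypothetical placeholders.

```python
import numpy as np

rng = np.random.default_rng(1)
z = rng.standard_normal(100)   # noise vector (hypothetical size)
t = rng.standard_normal(128)   # sentence embedding (hypothetical size)

# (a) Concatenation-only conditioning, as in the early stacked models:
# the text is seen only here, at the input layer.
g_input = np.concatenate([z, t])          # shape (228,)

# (b) Repeated injection: the same t is supplied again at every stage,
# so information lost in the feed-forward pass can be compensated.
def stage(h, t, w):
    """One hypothetical generator stage that re-reads the text vector."""
    return np.tanh(w @ np.concatenate([h, t]))

widths = [228, 256, 256]                  # input width of each stage
hs = [g_input]
for w_in in widths:
    w = rng.standard_normal((256, w_in + 128)) * 0.05
    hs.append(stage(hs[-1], t, w))

print(g_input.shape, hs[-1].shape)
```

In scheme (a) any text information dropped by an intermediate layer is gone for good; in scheme (b) every stage can recover it from `t` directly.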




Embodiment Construction

[0028] To help those skilled in the art understand the technical content of the present invention, the invention is further explained below in conjunction with the accompanying drawings.

[0029] As shown in figure 1, the text-to-image generation method based on a fusion compensation generative adversarial network of the present invention comprises the following steps:

[0030] S1. Establish a data set and perform preprocessing;

[0031] The dataset used by the text-to-image task consists of multiple text-image pairs, where each text is a natural language description of the subject in the image. One image can correspond to more than ten different text descriptions, with each sentence using different words to describe the image from a different angle. For example, the image shown in figure 2 corresponds to text descriptions from the following 10 different angles:

[0032] 1. The medium sized bird has a dark gray color, a black downward...
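The text-image pair structure described above can be sketched with a toy preprocessing routine. This is an illustrative assumption about step S1, not the patent's actual pipeline: the filename, captions, vocabulary scheme, and the fixed caption length of 12 are all placeholders.

```python
def build_vocab(captions):
    """Map every lowercased word to an integer id (0 is reserved for padding)."""
    words = sorted({w for c in captions for w in c.lower().split()})
    return {w: i + 1 for i, w in enumerate(words)}

def encode(caption, vocab, max_len=12):
    """Tokenize, then pad or truncate a caption to a fixed length."""
    ids = [vocab[w] for w in caption.lower().split()][:max_len]
    return ids + [0] * (max_len - len(ids))

# One image paired with several captions describing it from different angles.
dataset = {
    "bird_0001.jpg": [
        "the medium sized bird has a dark gray color",
        "a gray bird with a black downward curved beak",
    ],
}

captions = [c for caps in dataset.values() for c in caps]
vocab = build_vocab(captions)
encoded = {img: [encode(c, vocab) for c in caps]
           for img, caps in dataset.items()}
print(len(vocab), len(encoded["bird_0001.jpg"][0]))
```

In a real pipeline the integer id sequences would then be fed to a pretrained text encoder to produce the sentence vectors the generator consumes.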



Abstract

The invention discloses a text-to-image generation method based on a fusion compensation generative adversarial network. It is applied to the field of conditional image generation and aims to solve three problems of the prior art: the model is complex, the resolution of the synthesized image is low, and the fusion of text and image features is not considered. In the fusion compensation generative adversarial network model built by the invention, each up-sampling block of the generator contains an affine modulation fusion block, into which the text vector is introduced multiple times as input through a conditional convolution layer. The text condition information is thus reused repeatedly during the generator's feed-forward process and fused into the generated image features, compensating for information lost in the feed-forward pass of the neural network. As a result, the model can generate a 256×256 resolution image in a single framework in one pass, avoiding the introduction of extra networks with high computational cost.
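The abstract's core mechanism, an affine modulation fusion block that re-injects the text vector at every up-sampling stage, can be sketched along the lines of FiLM-style conditioning (per-channel scale and shift predicted from the condition). This is a minimal NumPy sketch under assumed dimensions, not the patented architecture: the linear conditioning layers stand in for the conditional convolution layer, and nearest-neighbour upsampling stands in for a learned up-block.

```python
import numpy as np

rng = np.random.default_rng(0)

def affine_modulation_fuse(feat, text_vec, w_gamma, b_gamma, w_beta, b_beta):
    """Condition image features on text via a per-channel affine transform.
    feat: (C, H, W) image features; text_vec: (T,) sentence embedding."""
    gamma = w_gamma @ text_vec + b_gamma        # (C,) channel-wise scale
    beta = w_beta @ text_vec + b_beta           # (C,) channel-wise shift
    return feat * gamma[:, None, None] + beta[:, None, None]

def upsample2x(feat):
    """Nearest-neighbour 2x upsampling, standing in for a learned up-block."""
    return feat.repeat(2, axis=1).repeat(2, axis=2)

T, C = 16, 8                                    # hypothetical sizes
text_vec = rng.standard_normal(T)
feat = rng.standard_normal((C, 4, 4))           # 4x4 seed feature map

# Hypothetical conditioning parameters for each of the six up-blocks.
params = [(rng.standard_normal((C, T)) * 0.1, np.ones(C),
           rng.standard_normal((C, T)) * 0.1, np.zeros(C))
          for _ in range(6)]

# Re-inject the same text vector at every up-sampling block: 4 -> 256.
for w_g, b_g, w_b, b_b in params:
    feat = affine_modulation_fuse(feat, text_vec, w_g, b_g, w_b, b_b)
    feat = upsample2x(feat)

print(feat.shape)  # (8, 256, 256)
```

Six doublings take the 4×4 seed map to 256×256 in a single generator, matching the single-framework, one-pass claim; because every block reads `text_vec` afresh, text information lost earlier in the feed-forward pass is compensated at each stage.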

Description

Technical field

[0001] The invention belongs to the field of conditional image generation, and in particular relates to an image generation technology that supplements text condition information multiple times during the generation process.

Background technique

[0002] The text-to-image task originated in 2016. The task is to convert a natural language description written by a human, such as "this bird is black and white, with a short beak", into an image that conforms to the semantics of the text. Its essence is conditional image generation, that is, image generation with text information as the control, supervision, or guidance condition. The main difficulties in generating images from text are: (1) the visual quality of the images is low, including clarity, naturalness, and recognizability; (2) the semantic similarity between the images and the given text is low, that is, the generated images cannot accurately reflect the vis...

Claims


Application Information

IPC(8): G06T11/00, G06K9/62, G06F40/211, G06N3/04
CPC: G06T11/00, G06T11/001, G06F40/211, G06N3/048, G06N3/044, G06N3/045, G06F18/253
Inventor: 罗俊海, 吴蔓, 王芝燕
Owner: UNIV OF ELECTRONIC SCI & TECH OF CHINA