Image caption generation method based on multi-attention generative adversarial network

A multi-attention generative adversarial network technique, applied to biological neural network models, image communication, neural learning methods, and related fields, that addresses the failure of extracted image features to capture global information.

Pending Publication Date: 2019-08-16
CHINA UNIV OF PETROLEUM (EAST CHINA)
3 Cites, 17 Cited by
AI Technical Summary

Problems solved by technology

[0006] The purpose of the present invention is to solve the problem that the features extracted by image caption generation methods based on generative adversarial networks contain only local information and fail to capture global information.

Embodiment Construction

[0080] The accompanying drawings are for illustrative purposes only and should not be construed as limiting the patent.

[0081] The present invention will be further elaborated below in conjunction with the accompanying drawings and embodiments.

[0082] Figure 1 is a schematic diagram of the multi-attention generative adversarial network architecture. As shown in Figure 1, the network includes two multi-attention generators (XE-Generator and RL-Generator) and a multi-attention discriminator. The cross-entropy generator (XE-Generator) and the reinforcement-learning generator (RL-Generator) share the same proposed multi-attention generator structure and differ only in their training strategy.
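
The contrast between the two training strategies can be illustrated with a minimal Python sketch. It is not the patent's implementation: the function names `xe_loss` and `rl_loss`, the scalar `reward`, and the `baseline` are hypothetical names introduced here, and the REINFORCE-style objective is a common reading of "reinforcement learning" training rather than something this excerpt specifies. Where the reward comes from (for example, a discriminator score) is likewise an assumption.

```python
import math

def xe_loss(token_probs):
    """Cross-entropy objective (assumed form for XE-Generator training).

    token_probs: probabilities the generator assigns to each ground-truth
    caption token. Returns the mean negative log-likelihood.
    """
    return -sum(math.log(p) for p in token_probs) / len(token_probs)

def rl_loss(sampled_token_probs, reward, baseline=0.0):
    """REINFORCE-style surrogate loss (assumed form for RL-Generator training).

    sampled_token_probs: probabilities of the tokens in a caption sampled
    from the generator. reward: a scalar score for that caption (its source
    is an assumption, not stated in the excerpt). baseline: optional
    variance-reduction term.
    """
    log_prob = sum(math.log(p) for p in sampled_token_probs)
    return -(reward - baseline) * log_prob

# Same generator outputs, two different training signals.
probs = [0.5, 0.5]
print(xe_loss(probs))
print(rl_loss(probs, reward=1.0, baseline=0.5))
```

In this sketch both functions consume the outputs of one shared generator, which mirrors the statement that the two generators have the same structure and differ only in how they are trained.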

[0083] Figure 2 is a schematic diagram of the multi-attention mechanism network structure. As shown in Figure 2, the top of the figure ...

Abstract

The invention discloses an image caption generation method based on a multi-attention generative adversarial network. It belongs to the technical field of image caption generation and solves the problem that the features extracted by image caption generation methods based on generative adversarial networks contain only local information and fail to capture global information. A multi-attention mechanism based on both local and global information is proposed for the first time for image caption generation, and on this basis a multi-attention generative adversarial image caption generation network is proposed, comprising a multi-attention generator and a multi-attention discriminator. The multi-attention generator is used to generate more accurate sentences, and the multi-attention discriminator is used to judge whether a sentence was written by a human or generated by a machine. The proposed framework is verified through extensive experiments on the MSCOCO benchmark data set, and highly competitive results are obtained on the MSCOCO captioning challenge evaluation server.

Description

Technical field

[0001] The invention relates to the technical fields of computer vision and natural language processing, in particular to an image caption generation method based on a multi-attention generative adversarial network.

Background

[0002] The goal of image captioning is to generate human-readable descriptive sentences for a given image. Image captioning has attracted intense research interest in academia and is widely used in video retrieval, early childhood education, and other fields. Unlike other computer vision tasks (image classification, object detection, etc.), training an effective image captioning model is more challenging because it requires a comprehensive understanding of the basic entities in an image and the relationships between them. The traditional image captioning model is built around an encoder-decoder framework, which uses a convolutional neural network-based encoder to encode pixel-level information int...

Application Information

IPC(8): G06N3/04, G06N3/08, H04N5/278, H04N21/488, H04N21/81
CPC: G06N3/049, G06N3/08, H04N5/278, H04N21/4884, H04N21/8133, G06N3/045
Inventors: 曹海文, 魏燚伟, 吴春雷, 王雷全, 邵明文
Owner: CHINA UNIV OF PETROLEUM (EAST CHINA)