
Image paragraph description method based on relation coding and hierarchical attention mechanism

A technique of attention and coding, applied in the field of image processing

Active Publication Date: 2022-03-15
BEIJING UNIV OF POSTS & TELECOMM

AI Technical Summary

Problems solved by technology

However, there is a serious problem with this simple fusion method



Embodiment Construction

[0070] In order to enable those skilled in the art to better understand the technical solution of the present invention, the present invention will be further described in detail below in conjunction with the accompanying drawings and embodiments.

[0071] The details of the image paragraph description method of the present invention (DualRel), which is based on relation coding and a hierarchical attention mechanism, are shown in Figure 2. The DualRel model consists of two main modules: a relation encoding module and a hierarchical attention decoding module. The relation encoding module takes the region features V, the region positions B and the region categories O as input, and generates the spatial relation encoding features V_p and the semantic relation encoding features V_s through the spatial relation encoder and the semantic relation encoder, respectively. In addition, in order to supervise the model to learn prior knowledge about semantic relation...
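To make the data flow of the relation encoding module more concrete, the following is a minimal sketch of one way a spatial relation encoder could combine region features with region positions. It is not the encoder defined in the patent, whose details are not given in this excerpt. The function names (box_center, softmax, encode_spatial), the (x1, y1, x2, y2) box format, and the distance-based weighting are all assumptions made for illustration; the semantic relation encoder and its supervised classifier are omitted.

```python
import math


def box_center(box):
    """Center point of a box given as (x1, y1, x2, y2); format is an assumption."""
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2.0, (y1 + y2) / 2.0)


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def encode_spatial(features, boxes):
    """Hypothetical spatial relation encoding.

    Each region is re-expressed as a weighted average of all region
    features, where regions whose box centers are closer receive larger
    weights. This weighting scheme is an illustrative assumption.
    """
    centers = [box_center(b) for b in boxes]
    encoded = []
    for i, ci in enumerate(centers):
        # Score each region j by the negative distance between centers.
        scores = [-math.dist(ci, cj) for cj in centers]
        weights = softmax(scores)
        dim = len(features[i])
        encoded.append([
            sum(w * features[j][d] for j, w in enumerate(weights))
            for d in range(dim)
        ])
    return encoded
```

In this sketch, features plays the role of the region features V and boxes the role of the region positions B; the returned list stands in for the spatial relation encoding features.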


Abstract

The invention discloses an image paragraph description method based on relation coding and a hierarchical attention mechanism. The model is composed of a relation encoding module and a hierarchical attention decoding module. The relation encoding module captures and encodes spatial relation information and semantic relation information through two encoders; when encoding semantic relations, prior knowledge of the semantic relations is learned by training a supervised semantic classifier. The hierarchical attention decoding module uses hierarchical attention with a relational gate and a visual gate to dynamically fuse relation information and object region features. The relational gate switches between spatial relation information and semantic relation information, and the visual gate determines whether visual information is embedded and used. During paragraph generation, the model fuses visual information using a coarse-to-fine strategy, moving from coarse-grained regions to fine-grained spatial and semantic relations. Extensive experiments on the Stanford paragraph description data set show that the method clearly outperforms existing methods on multiple evaluation metrics in the field.
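The role of the two gates described in the abstract can be illustrated with a small sketch. This is not the patent's formulation, which is not given in this excerpt. It assumes each gate is a scalar sigmoid of a dot product between the decoder hidden state and a weight vector, that the relational gate interpolates between a spatial and a semantic relation context, and that the visual gate scales the combined visual context. The names gated_fusion, w_rel and w_vis are hypothetical.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def gated_fusion(hidden, region_ctx, spatial_ctx, semantic_ctx, w_rel, w_vis):
    """Hypothetical fusion of region and relation contexts with two gates.

    hidden       : decoder hidden state (list of floats)
    region_ctx   : attended object region features
    spatial_ctx  : attended spatial relation features
    semantic_ctx : attended semantic relation features
    w_rel, w_vis : gate weight vectors (illustrative parameters)
    """
    # Relational gate: near 1 favors spatial relations, near 0 semantic ones.
    g_rel = sigmoid(dot(w_rel, hidden))
    # Visual gate: near 0 suppresses visual information altogether.
    g_vis = sigmoid(dot(w_vis, hidden))

    relation_ctx = [
        g_rel * p + (1.0 - g_rel) * s
        for p, s in zip(spatial_ctx, semantic_ctx)
    ]
    fused = [g_vis * (r + c) for r, c in zip(region_ctx, relation_ctx)]
    return fused, g_rel, g_vis
```

With a zero hidden state both gates evaluate to 0.5, so the relation context is the average of the spatial and semantic contexts and the fused vector is half of the summed visual context. In a trained model the gate weights would be learned, so the decoder could rely more on relations for some words and skip visual input for others.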

Description

Technical field

[0001] The invention relates to the technical field of image processing, in particular to an image paragraph description method based on relational coding and hierarchical attention mechanism.

Background technique

[0002] Image description is the task of automatically generating a descriptive sentence for a given image, also called image single-sentence description. This fundamental cross-modal task may have diverse applications such as image / video retrieval, early childhood education, and helping the visually impaired understand image content. Therefore, this task has attracted great attention in the artificial intelligence community.

[0003] Over the past few years, many studies have made impressive progress on the task of generating one-sentence image descriptions. However, due to the limitation of describing an image in one sentence, it is usually not enough to summarize various details in an image in one sentence, because "a picture is worth a thousa...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F40/30, G06N3/04, G06N3/08
CPC: G06F40/30, G06N3/049, G06N3/08, G06N3/045
Inventor: 李睿凡, 刘云, 石祎晖, 冯方向, 马占宇, 王小捷
Owner: BEIJING UNIV OF POSTS & TELECOMM