An Image Paragraph Description Method Based on Relational Coding and Hierarchical Attention Mechanism

A technique of attention and encoding, applied in the field of image processing

Active Publication Date: 2022-08-02
BEIJING UNIV OF POSTS & TELECOMM

AI Technical Summary

Problems solved by technology

However, there is a serious problem with this simple fusion method.


Embodiment Construction

[0070] In order to make those skilled in the art better understand the technical solutions of the present invention, the present invention will be further described in detail below with reference to the accompanying drawings and embodiments.

[0071] The details of the image paragraph description model of the present invention (DualRel), which is based on relational coding and a hierarchical attention mechanism, are shown in Figure 2. The DualRel model contains two main modules: a relational encoding module and a hierarchical attention decoding module. The relational encoding module takes the region features V, the region positions B, and the region categories O as input, and generates the spatial relation encoding features V_P and the semantic relation encoding features V_s through the spatial relation encoder and the semantic relation encoder, respectively. To supervise the model in learning prior knowledge about semantic relations, we propose a novel semantic ...
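The paragraph above says the spatial relation encoder works from the region positions B, but the excerpt does not say how. As a minimal sketch only, the snippet below computes a log-scaled relative-geometry vector for every pair of bounding boxes, which is one common choice of input for such an encoder. The function names `box_relation` and `pairwise_relations`, the (x_min, y_min, x_max, y_max) box layout, and the four-term encoding are all assumptions made for illustration, not details taken from the patent.

```python
import math


def box_relation(box_a, box_b, eps=1e-3):
    """Relative geometry of box_b with respect to box_a.

    Boxes are (x_min, y_min, x_max, y_max) with positive width and height.
    The four log-scaled terms are an assumed encoding, not the patent's.
    """
    xa, ya = (box_a[0] + box_a[2]) / 2, (box_a[1] + box_a[3]) / 2
    xb, yb = (box_b[0] + box_b[2]) / 2, (box_b[1] + box_b[3]) / 2
    wa, ha = box_a[2] - box_a[0], box_a[3] - box_a[1]
    wb, hb = box_b[2] - box_b[0], box_b[3] - box_b[1]
    return [
        math.log(max(abs(xb - xa), eps) / wa),  # horizontal center offset
        math.log(max(abs(yb - ya), eps) / ha),  # vertical center offset
        math.log(wb / wa),                      # width ratio
        math.log(hb / ha),                      # height ratio
    ]


def pairwise_relations(boxes):
    """Return an N x N grid of relation vectors for N region boxes."""
    return [[box_relation(a, b) for b in boxes] for a in boxes]


if __name__ == "__main__":
    B = [(0, 0, 2, 2), (2, 0, 4, 2)]  # two side-by-side regions
    relations = pairwise_relations(B)
    print(relations[0][1])
```

In a full model these pairwise vectors would be fed to a learned encoder together with the region features V; that step is omitted here because the excerpt does not describe it.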


Abstract

The invention discloses an image paragraph description method based on relational coding and a hierarchical attention mechanism. The model is composed of a relational encoding module and a hierarchical attention decoding module. The relational encoding module captures spatial relational information and semantic relational information through two encoders; during semantic relational encoding, prior knowledge of semantic relations is learned by training a supervised semantic classifier. The hierarchical attention decoding module uses hierarchical attention with relational gates and visual gates to dynamically fuse relational information and object region features. The relational gate switches between spatial relational information and semantic relational information, and the visual gate decides whether to embed visual information. During paragraph generation, the model fuses visual information with a coarse-to-fine strategy that moves from coarse-grained regions to fine-grained spatial and semantic relations. Extensive experiments on the Stanford paragraph description dataset show that the method of the present invention is significantly better than existing methods on multiple evaluation metrics in the field.
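The gating described in the abstract can be illustrated with a small sketch. It assumes dot-product attention, a single scalar sigmoid gate for each of the relational and visual gates, and additive fusion of the region and relation contexts; none of these specifics are stated in the abstract. The names `attend`, `hierarchical_fusion`, `w_rel`, and `w_vis` are hypothetical.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def attend(query, features):
    """Dot-product attention: softmax-weighted average of feature vectors."""
    scores = [dot(query, f) for f in features]
    top = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    dim = len(features[0])
    return [sum(w * f[i] for w, f in zip(weights, features)) / total
            for i in range(dim)]


def hierarchical_fusion(hidden, regions, spatial, semantic, w_rel, w_vis):
    """One decoding step of coarse-to-fine fusion (assumed formulation)."""
    # Coarse level: attend over object region features.
    region_ctx = attend(hidden, regions)
    # Fine level: attend over spatial and semantic relation features.
    spatial_ctx = attend(hidden, spatial)
    semantic_ctx = attend(hidden, semantic)
    # Relational gate: near 1 favors spatial, near 0 favors semantic.
    g_rel = sigmoid(dot(w_rel, hidden))
    relation_ctx = [g_rel * p + (1.0 - g_rel) * s
                    for p, s in zip(spatial_ctx, semantic_ctx)]
    # Visual gate: scales how much visual context is passed on.
    g_vis = sigmoid(dot(w_vis, hidden))
    fused = [g_vis * (r + c) for r, c in zip(region_ctx, relation_ctx)]
    return fused, g_rel, g_vis


if __name__ == "__main__":
    fused, g_rel, g_vis = hierarchical_fusion(
        hidden=[1.0, 0.0],
        regions=[[1.0, 0.0]],
        spatial=[[2.0, 0.0]],
        semantic=[[0.0, 4.0]],
        w_rel=[0.0, 0.0],
        w_vis=[0.0, 0.0],
    )
    print(fused, g_rel, g_vis)
```

With zero gate weights both gates sit at 0.5, so the spatial and semantic contexts are mixed equally and half of the combined visual context is passed on. A soft sigmoid gate is used here in place of a hard switch so the step stays differentiable; whether the patented model does the same is not stated in this excerpt.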

Description

Technical Field

[0001] The invention relates to the technical field of image processing, in particular to an image paragraph description method based on relational coding and hierarchical attention mechanism.

Background

[0002] Image captioning is the task of automatically generating a descriptive sentence for a given image, also known as image single-sentence captioning. This basic cross-modality task may have multiple applications, such as image/video retrieval, early childhood education, and helping visually impaired people understand image content. Therefore, this task has attracted a lot of attention from the AI community.

[0003] In the past few years, many studies have made impressive progress on the task of generating one-sentence image descriptions. However, due to the limitation of describing an image in one sentence, it is usually not enough to summarize various details in an image, because "a picture is worth a thousand words". To address the lim...


Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F40/30, G06N3/04, G06N3/08
CPC: G06F40/30, G06N3/049, G06N3/08, G06N3/045
Inventor: 李睿凡, 刘云, 石祎晖, 冯方向, 马占宇, 王小捷
Owner: BEIJING UNIV OF POSTS & TELECOMM