Image description generation method fusing visual common sense and enhancing multilayer global features

An image description technique based on global features, applied in the field of computer vision, that addresses problems such as insufficient mining of visual semantic relations and redundant information in multi-layer global features.

Active Publication Date: 2021-09-10

CHONGQING NORMAL UNIVERSITY

9 Cites, 2 Cited by

Problems solved by technology

[0005] The purpose of the present invention is to provide an image description generation method that fuses visual common sense and enhances multi-layer global features, addressing the technical problem that the extracted multi-layer global features contain redundant information.

Embodiment Construction

[0033] Embodiments of the present invention are described in detail below, examples of which are shown in the drawings, wherein the same or similar reference numerals designate the same or similar elements or elements having the same or similar functions throughout. The embodiments described below by referring to the figures are exemplary and are intended to explain the present invention and should not be construed as limiting the present invention.

[0034] In describing the present invention, it should be understood that the orientations or positional relationships indicated by terms such as "length", "width", "upper", "lower", "front", "rear", "left", "right", "vertical", "horizontal", "top", "bottom", "inner", and "outer" are based on the orientations or positional relationships shown in the drawings. They are used only for convenience in describing the present invention and simplifying the description, rather than indicating or implying that a referenced device or element...

Abstract

The invention relates to the technical field of computer vision and discloses an image description generation method that fuses visual common sense and enhances multi-layer global features. The method comprises: fusing the visual common sense features extracted by a VCR-CNN with the local features extracted by a Faster R-CNN to obtain fused features; mining the visual semantic relationships between objects with an X-linear attention mechanism to obtain high-level local features and multi-layer global features; enhancing the multi-layer global features with an AoA (attention on attention) mechanism and applying a linear mapping to obtain fused global features; screening the fused global features with a visual-selection long short-term memory (LSTM) network, and adaptively selecting and weighting the relevant information in the high-level local features with the X-linear attention mechanism; and finally using a semantic decoding gated linear unit to generate the output word sequence. The method addresses two problems: image description generation models based on local features mine visual semantic relationships insufficiently, and the multi-layer global features extracted by the attention mechanism contain redundant information.
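The fusion and global-feature enhancement steps in the abstract can be illustrated with a small sketch. This is not the patented implementation: it assumes that fusion is a per-region concatenation, that each layer's global feature acts as an AoA query over all layers, and that biases can be omitted. The names `fuse_region_features`, `attention_on_attention`, and `enhance_and_fuse_globals`, the dimensions, and the random weights are all hypothetical and chosen only for illustration.

```python
import math
import random

def matvec(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def random_matrix(rows, cols, rng):
    """Small random weights standing in for learned parameters."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

def fuse_region_features(local_feats, commonsense_feats):
    """Assumed fusion: concatenate the two feature vectors of each region."""
    return [loc + cs for loc, cs in zip(local_feats, commonsense_feats)]

def attend(query, keys, values):
    """Scaled dot-product attention for a single query vector."""
    scale = math.sqrt(len(query))
    weights = softmax([sum(q * k for q, k in zip(query, key)) / scale
                       for key in keys])
    dim = len(values[0])
    return [sum(w * v[j] for w, v in zip(weights, values)) for j in range(dim)]

def attention_on_attention(query, keys, values, W_info, W_gate):
    """AoA: a sigmoid gate filters an information vector, both computed
    from the concatenation [query; attended result] (biases omitted)."""
    joint = query + attend(query, keys, values)
    info = matvec(W_info, joint)
    gate = [sigmoid(g) for g in matvec(W_gate, joint)]
    return [g * i for g, i in zip(gate, info)]

def enhance_and_fuse_globals(global_feats, W_info, W_gate, W_out):
    """Enhance each layer's global feature with AoA, then linearly map
    the concatenated result to one fused global feature."""
    enhanced = [attention_on_attention(g, global_feats, global_feats,
                                       W_info, W_gate)
                for g in global_feats]
    concatenated = [x for vec in enhanced for x in vec]
    return matvec(W_out, concatenated)

rng = random.Random(0)

# Toy stand-ins for Faster R-CNN local features and VCR-CNN common sense features.
num_regions, d_local, d_cs = 3, 4, 2
local = [[rng.random() for _ in range(d_local)] for _ in range(num_regions)]
commonsense = [[rng.random() for _ in range(d_cs)] for _ in range(num_regions)]
fused = fuse_region_features(local, commonsense)

# Toy stand-ins for the global features produced by each attention layer.
num_layers, d = 3, 6
global_feats = [[rng.random() for _ in range(d)] for _ in range(num_layers)]
W_info = random_matrix(d, 2 * d, rng)
W_gate = random_matrix(d, 2 * d, rng)
W_out = random_matrix(d, num_layers * d, rng)
fused_global = enhance_and_fuse_globals(global_feats, W_info, W_gate, W_out)

print(len(fused), len(fused[0]), len(fused_global))  # prints: 3 6 6
```

The gate lies strictly between 0 and 1, so AoA can only attenuate components of the information vector. That is one plausible reading of how the mechanism suppresses redundant information in the multi-layer global features.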

Description

Technical field

[0001] The invention relates to the technical field of computer vision, in particular to an image description generation method that fuses visual common sense and enhances multi-layer global features.

Background

[0002] Image description generation is one of the high-level tasks in the field of computer vision; its purpose is to enable a computer to automatically generate a natural language description of a given image. Compared with low-level and mid-level tasks such as image classification and object detection, it must not only recognize the salient objects in the image and their attributes and understand the relationships between them, but also express them in accurate and fluent natural language, which makes it a very challenging task. When humans acquire information, the visual system actively focuses on the target area of interest and extracts the relevant important information. Inspired by the human visual system, attention mechanisms have be...

Application Information

Patent Type & Authority: Applications (China)

IPC(8): G06K9/62, G06K9/46, G06N3/04, G06N3/08

CPC: G06N3/08, G06N3/044, G06N3/045, G06F18/2411, G06F18/253

Inventors: 杨有, 方小龙, 尚晋, 胡峻滔, 姚露, 边雅琳

Owner: CHONGQING NORMAL UNIVERSITY
