
Image description method and system based on vision and semantic attention combined strategy

A visual attention and image description technology, applied in neural learning methods, biological neural network models, instruments, etc., that addresses the heavy dependence of each generated word on the previously generated word, which can degrade sentence structure and accuracy.

Inactive Publication Date: 2018-01-09
CHINA UNIV OF PETROLEUM (EAST CHINA)
Cites: 2; Cited by: 45

AI Technical Summary

Problems solved by technology

During sentence generation, the word generated at the current time step depends heavily on the word generated at the previous time step. When the previous word is inaccurate, it affects the structure and accuracy of the entire sentence.



Examples


Embodiment Construction

[0074] It should be pointed out that the following detailed description is exemplary and intended to provide further explanation to the present application. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs.

[0075] It should be noted that the terminology used here is only for describing specific implementations and is not intended to limit the exemplary implementations according to the present application. As used herein, unless the context clearly dictates otherwise, the singular is intended to include the plural. It should also be understood that when the terms "comprising" and/or "including" are used in this specification, they indicate the presence of features, steps, operations, devices, components, and/or combinations thereof.

[0076] The purpose of this invention is to reduce the dependence on the word generated at the previous time for ...



Abstract

The invention discloses an image description method and system based on a combined visual and semantic attention strategy. The steps are: use a convolutional neural network (CNN) to extract image features from the image to be described; process the image features with a visual attention model and feed the attended features to a first LSTM network to generate a word; process the generated word and predefined labels with a semantic attention model to obtain semantic information; process that semantic information with a second LSTM network to obtain the word generated by the semantic attention model; repeat the above steps; and finally concatenate all the obtained words to generate the image description. The method uses not only a summary of the input image but also enriched visual and semantic information, so that the generated sentence reflects the content of the image more faithfully.
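The loop described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the CNN features are random vectors, `toy_recurrent_step` is a hypothetical stand-in for an LSTM step (it is not a real LSTM), dot-product soft attention is assumed, and the choice of which recurrent state queries each attention model is also an assumption, since the abstract does not specify it. All function and variable names are invented for illustration.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def attend(query, items):
    """Soft attention: weight each item vector by its similarity to the query."""
    weights = softmax([dot(query, item) for item in items])
    dim = len(items[0])
    context = [sum(w * item[d] for w, item in zip(weights, items))
               for d in range(dim)]
    return context, weights


def toy_recurrent_step(state, inp):
    """Hypothetical stand-in for an LSTM step: blends state and input."""
    return [math.tanh(0.5 * s + 0.5 * x) for s, x in zip(state, inp)]


def nearest_word(state, vocab):
    """Pick the vocabulary word whose embedding best matches the state."""
    return max(vocab, key=lambda w: dot(state, vocab[w]))


def describe(region_feats, labels, vocab, steps=3):
    dim = len(region_feats[0])
    h1 = [0.0] * dim  # state of the first (visual) network
    h2 = [0.0] * dim  # state of the second (semantic) network
    words = []
    for _ in range(steps):
        # Visual attention over the image features (query choice is assumed).
        visual_ctx, _ = attend(h2, region_feats)
        # First network generates a word from the attended features.
        h1 = toy_recurrent_step(h1, visual_ctx)
        w1 = nearest_word(h1, vocab)
        # Semantic attention over the generated word and predefined labels.
        candidates = [vocab[w1]] + [vocab[label] for label in labels]
        semantic_ctx, _ = attend(h1, candidates)
        # Second network generates a word from the semantic information.
        h2 = toy_recurrent_step(h2, semantic_ctx)
        w2 = nearest_word(h2, vocab)
        words.extend([w1, w2])
    # Concatenate all obtained words into the description.
    return " ".join(words)


rng = random.Random(0)
DIM = 4
vocab = {w: [rng.uniform(-1, 1) for _ in range(DIM)]
         for w in ["a", "dog", "runs", "on", "grass"]}
region_feats = [[rng.uniform(-1, 1) for _ in range(DIM)] for _ in range(5)]
caption = describe(region_feats, ["dog", "grass"], vocab, steps=3)
print(caption)
```

Because the embeddings are random, the printed caption is not meaningful; the sketch only shows how the two attention stages and the two recurrent networks alternate within each step. In a trained system, the stand-ins would be replaced by a real CNN encoder and learned LSTM networks.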

Description

Technical field

[0001] The invention relates to computer vision and natural language processing technology, in particular to an image description method and system based on a strategy combining visual and semantic attention.

Background technique

[0002] Research on image description has attracted much attention in machine learning and computer vision. The research is significant not only because it has important practical applications, but also because image understanding is a major challenge in computer vision. Generating a meaningful language description of an image requires the computer to have a certain understanding of the image, which is far more complicated than image classification or object detection. Image description combines two main technologies in artificial intelligence: natural language processing and computer vision.

[0003] Th...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06N3/04, G06N3/08, G06K9/46
Inventor: 王雷全, 褚晓亮, 魏燚伟, 吴春雷, 崔学荣
Owner: CHINA UNIV OF PETROLEUM (EAST CHINA)