Visual dialogue generation method based on a context-aware graph neural network

A context-aware visual dialogue technique, applied in the field of computer vision, that addresses shortcomings of existing methods such as ignoring word-level semantics, ignoring interdependence, and failing to learn the semantic dependencies between visual objects.

Active Publication Date: 2019-12-24
HEFEI UNIV OF TECH
Cites: 6; Cited by: 35

AI Technical Summary

Problems solved by technology

[0005] For example, in 2017, Jiasen Lu et al. proposed an image attention method based on the dialogue history in the article "Best of Both Worlds: Transferring Knowledge from Discriminative Learning to a Generative Visual Dialog Model", published at the top international Conference and Workshop on Neural Information Processing Systems (NIPS 2017). This method first applies sentence-level attention to the dialogue history and then performs attention learning on the image features based on the processed text features. However, when processing the text of the current question, it considers only sentence-level semantics and not word-level semantics, whereas in an actual question usually only a few keywords are most relevant to the predicted answer. Therefore, this method has certain limitations in practical application.
[0006] 2. Existing methods do not learn the semantic dependencies between visual objects when processing image information.
Although the method proposed in this paper effectively models the semantic dependencies between different dialogue segments, it considers interdependence only at the text level and does not consider the interdependence between different visual objects in the image. As a result, visual semantic information cannot be learned at a finer granularity, which limits the quality of the final predicted answer.

Examples

Embodiment Construction

[0086] In this example, as shown in Figure 1, a visual dialogue generation method based on a context-aware graph neural network is carried out as follows:

[0087] Step 1. Preprocessing of text input in visual dialogue and construction of word list:

[0088] Step 1.1. Obtain a visual dialogue dataset from the Internet. The main publicly available dataset at present is the VisDial dataset, collected by researchers at the Georgia Institute of Technology. The visual dialogue dataset contains sentence text and images;

[0089] Perform word segmentation processing on all sentence texts in the visual dialogue dataset to obtain segmented words;

[0090] Step 1.2. From the segmented words, select all words whose frequency is greater than a threshold (the threshold can be set to 4) and build the word index table Voc. The word index table Voc is created as follows: the word table can contain words and punctuation marks; Count the number of words an...
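The frequency filtering in Steps 1.1 and 1.2 can be illustrated with a minimal Python sketch. It is not the patent's procedure, whose full description is truncated above. The function names `tokenize`, `build_voc`, and `encode`, the regular-expression tokenizer, and the special tokens `<pad>` and `<unk>` are all assumptions introduced for illustration; only the "frequency greater than the threshold" rule and the default threshold of 4 come from the text.

```python
import re
from collections import Counter


def tokenize(sentence):
    """Split a sentence into lowercase word tokens and single punctuation tokens.

    Assumption: a simple regex tokenizer stands in for the word segmentation
    step, which the source does not specify.
    """
    return re.findall(r"[a-z0-9']+|[^\sa-z0-9']", sentence.lower())


def build_voc(sentences, threshold=4, specials=("<pad>", "<unk>")):
    """Build a word index table from tokens whose frequency exceeds `threshold`.

    Assumption: the special tokens and the alphabetical index order are
    illustrative choices, not taken from the source.
    """
    counts = Counter(tok for s in sentences for tok in tokenize(s))
    kept = sorted(w for w, c in counts.items() if c > threshold)
    voc = {tok: i for i, tok in enumerate(specials)}
    for word in kept:
        voc[word] = len(voc)
    return voc


def encode(sentence, voc):
    """Map a sentence to indices, sending filtered-out words to <unk>."""
    unk = voc["<unk>"]
    return [voc.get(tok, unk) for tok in tokenize(sentence)]


if __name__ == "__main__":
    corpus = ["is the dog brown ?"] * 5 + ["is the cat brown ?"] * 4
    voc = build_voc(corpus, threshold=4)
    print(voc)
    print(encode("is the cat brown ?", voc))
```

In this toy corpus "cat" occurs exactly 4 times, so with a threshold of 4 it is excluded from the table (the rule is strictly "greater than") and is encoded as the hypothetical `<unk>` index.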

Abstract

The invention discloses a visual dialogue generation method based on a context-aware graph neural network. The method comprises the following steps: 1, preprocessing the text input of the visual dialogue and constructing a word list; 2, extracting the features of the dialogue image and of the dialogue text; 3, obtaining a context feature vector of the dialogue history; 4, constructing a context-aware graph; 5, iteratively updating the context-aware graph; 6, applying attention to the nodes of the context-aware graph based on the current question; 7, performing multi-modal semantic fusion and decoding to generate an answer feature sequence; 8, optimizing the parameters of the visual dialogue generation network model based on the context-aware graph neural network; 9, generating a predicted answer. By constructing a context-aware graph neural network for visual dialogue, the method can use finer-grained textual semantic information to reason about the implicit relationships between different objects in the image, thereby improving the reasonableness and accuracy of the answers the agent predicts for questions.
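Step 6 of the abstract, attention over graph nodes conditioned on the current question, can be illustrated with a generic sketch. This excerpt does not give the patent's scoring function, so the scaled dot-product score used here is an assumption, as are the names `softmax` and `attend_nodes` and the toy two-dimensional features. The sketch only shows the general idea of weighting node features by their relevance to a question vector.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend_nodes(question_vec, node_feats):
    """Question-guided attention over graph node features.

    Assumption: relevance is a scaled dot product between the question
    vector and each node feature; the source does not state its formula.
    Returns the attention weights and the weighted sum of node features.
    """
    d = len(question_vec)
    scores = [
        sum(q * x for q, x in zip(question_vec, node)) / math.sqrt(d)
        for node in node_feats
    ]
    weights = softmax(scores)
    pooled = [
        sum(w * node[j] for w, node in zip(weights, node_feats))
        for j in range(d)
    ]
    return weights, pooled


if __name__ == "__main__":
    question = [1.0, 0.0]
    nodes = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
    weights, pooled = attend_nodes(question, nodes)
    print(weights)
    print(pooled)
```

Nodes whose features align with the question vector receive larger weights, so the pooled vector is dominated by the visual objects most relevant to the question.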

Description

Technical field

[0001] The invention belongs to the technical field of computer vision, relates to technologies such as pattern recognition, natural language processing, and artificial intelligence, and specifically relates to a visual dialogue generation method based on a context-aware graph neural network.

Background

[0002] Visual dialogue is a form of human-computer interaction whose purpose is to enable a machine agent and a human to conduct a reasonable and correct natural dialogue, in question-and-answer form, about a given image of an everyday scene. Therefore, the key to the visual dialogue task is how to make the agent correctly understand the multi-modal semantic information composed of images and text so that it can give reasonable answers to the questions raised by humans. Visual dialogue is currently one of the hot research topics in the field of computer vision, and its application scenarios are also very extensive, including: helping visually impaired...

Application Information

IPC(8): G06F16/332, G06F16/583, G06F17/27, G06N3/04, G06N3/08
CPC: G06F16/3329, G06F16/5846, G06N3/08, G06N3/044
Inventors: 郭丹, 王辉, 汪萌
Owner: HEFEI UNIV OF TECH