Multimode recurrent neural network picture description method based on FCN feature extraction

A technology combining recurrent neural networks and image description, applied in neural learning methods, biological neural network models, neural architectures, etc. It addresses the loss of small-region information during image feature extraction and the resulting inability to generate complete image descriptions.

Status: Inactive. Publication Date: 2017-06-13

SYSU CMU SHUNDE INT JOINT RES INST +1

2 Cites, 45 Cited by


AI Technical Summary
This helps you quickly interpret patents by identifying the three key elements:
Problems solved by technology
Method used
Benefits of technology

Problems solved by technology

[0003] Although M-RNN achieves good results under various evaluation metrics, the model can only generate descriptions for targets that occupy a large area of the image.

For regions that occupy only a small area of the image, the information is already lost when the convolutional neural network extracts image features, so descriptions of these regions cannot be generated.

As a result, the model ignores finer details in the image and cannot generate a more complete image description.



Examples


Embodiment 1

[0049] As shown in Figures 1-2, a multimodal recurrent neural network image description method based on FCN feature extraction includes the following steps:

[0050] S1 Construct and train the fully convolutional network (FCN)

[0051] S1.1 Acquire images: download the PASCAL VOC dataset, a standard benchmark dataset for image recognition and image classification, and use it to fine-tune and test the model;

[0052] S1.2 Adapt the existing pretrained convolutional neural network model AlexNet to obtain a preliminary fully convolutional network model;

[0053] S1.3 Remove the classification layer of the AlexNet convolutional neural network and convert its fully connected layers into convolutional layers;
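The conversion in S1.3 rests on the fact that a fully connected layer applied to a flattened C x H x W feature map computes the same values as a convolution whose kernel covers the whole C x H x W window. The following is a minimal sketch of that equivalence in plain Python. The function names, the toy 1 x 2 x 2 shapes and the integer weights are illustrative assumptions and do not come from the patent or from AlexNet.

```python
# Sketch: a fully connected layer reinterpreted as a convolution.
# All shapes and values are toy assumptions, not AlexNet's real dimensions.

def fc_forward(weights, bias, feature_map):
    """Fully connected layer applied to a flattened C x H x W feature map."""
    flat = [v for channel in feature_map for row in channel for v in row]
    return [sum(w * x for w, x in zip(w_row, flat)) + b
            for w_row, b in zip(weights, bias)]

def fc_to_conv_kernels(weights, channels, height, width):
    """Reshape each FC weight row into one C x H x W convolution kernel."""
    return [[[[w_row[(c * height + i) * width + j] for j in range(width)]
              for i in range(height)]
             for c in range(channels)]
            for w_row in weights]

def conv_forward(kernels, bias, feature_map):
    """Stride-1 convolution without padding; returns K x H_out x W_out."""
    channels = len(feature_map)
    in_h, in_w = len(feature_map[0]), len(feature_map[0][0])
    k_h, k_w = len(kernels[0][0]), len(kernels[0][0][0])
    out = []
    for kernel, b in zip(kernels, bias):
        plane = []
        for y in range(in_h - k_h + 1):
            row = []
            for x in range(in_w - k_w + 1):
                total = b
                for c in range(channels):
                    for i in range(k_h):
                        for j in range(k_w):
                            total += kernel[c][i][j] * feature_map[c][y + i][x + j]
                row.append(total)
            plane.append(row)
        out.append(plane)
    return out

# Hypothetical FC layer with 2 outputs over a 1 x 2 x 2 input.
weights = [[1, 2, 3, 4], [0, -1, 0, 1]]
bias = [0, 1]
kernels = fc_to_conv_kernels(weights, channels=1, height=2, width=2)

patch = [[[1, 0], [2, 1]]]                    # same size as the FC input
larger = [[[1, 0, 1], [2, 1, 0], [0, 1, 1]]]  # bigger than the FC input

fc_out = fc_forward(weights, bias, patch)        # one value per output unit
conv_out = conv_forward(kernels, bias, patch)    # same values, as 1 x 1 maps
dense_out = conv_forward(kernels, bias, larger)  # a 2 x 2 map per output unit
```

On an input of the original size both layers agree. On a larger input the convolutional form yields a spatial map of scores rather than a single vector, which is what lets a fully convolutional network keep location information.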

[0054] S1.4 Upsample the convolution output of the topmost pooling layer, pooling layer 5, by a factor of 2 to obtain the upsampled prediction of pooling layer 5; this prediction carries only coarse image information. Th...
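To make the 2x upsampling in S1.4 concrete, here is a small sketch that doubles the height and width of a coarse score map. The text does not say which upsampling operator is used, so nearest-neighbour replication is swapped in here as the simplest stand-in; a learned or bilinear upsampling layer would be an equally plausible reading. The function name and toy values are assumptions.

```python
# Sketch: 2x upsampling of a coarse H x W score map.
# Nearest-neighbour replication is an assumption; the source does not
# specify the interpolation used for the upsampling step.

def upsample_2x(score_map):
    """Repeat every value twice along both axes (H x W -> 2H x 2W)."""
    upsampled = []
    for row in score_map:
        wide = [v for v in row for _ in range(2)]  # double the width
        upsampled.append(wide)
        upsampled.append(list(wide))               # double the height
    return upsampled

coarse = [[1, 2],
          [3, 4]]            # hypothetical prediction from the last pooling stage
fine = upsample_2x(coarse)   # 4 x 4 map with each value covering a 2 x 2 block
```

The output has twice the resolution but no new detail, which matches the description of the result as carrying only coarse image information.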


Abstract

The invention provides a multimodal recurrent neural network image description method based on FCN feature extraction. A multimodal model composed of three parts, namely a recurrent neural network (RNN), a fully convolutional network (FCN) and a multimodal layer, is trained on a large number of images annotated with text descriptions, so that a text description can be generated automatically for any input test image. The method extracts image features effectively, retains more of the image's detail, and better establishes the relation between the words of the text description and the image. It has significant advantages in the semantic description of salient targets or scenes in an image.
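The abstract names three parts (RNN, FCN and multimodal layer) without giving the fusion formula. One common way to realize such a multimodal layer, assumed here rather than taken from the patent text, is to project the current word embedding, the RNN state and the image feature into a shared space, sum them, apply a nonlinearity, and score the next word with a softmax. All names, dimensions and weights below are hypothetical.

```python
import math

# Sketch of a multimodal fusion layer. The additive form and every
# dimension and weight here are assumptions made for illustration only.

def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]

def multimodal_layer(word_vec, rnn_state, image_feat, V_w, V_r, V_i):
    """Project the three inputs into a shared space, sum them, apply tanh."""
    projected = [matvec(V_w, word_vec),
                 matvec(V_r, rnn_state),
                 matvec(V_i, image_feat)]
    return [math.tanh(sum(parts)) for parts in zip(*projected)]

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Toy inputs: 2-d word embedding, 2-d RNN state, 3-d image feature.
word_vec = [1.0, 0.0]
rnn_state = [0.5, -0.5]
image_feat = [0.2, 0.1, 0.0]

V_w = [[1.0, 0.0], [0.0, 1.0]]            # word -> shared space
V_r = [[0.5, 0.5], [-0.5, 0.5]]           # RNN state -> shared space
V_i = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]  # image feature -> shared space
U = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # shared space -> 3-word vocabulary

fused = multimodal_layer(word_vec, rnn_state, image_feat, V_w, V_r, V_i)
next_word_probs = softmax(matvec(U, fused))
```

In a caption generator, the image feature would come from the FCN and the loop would feed the chosen word back in as the next `word_vec` until an end token is produced.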

Description

Technical field

[0001] The present invention relates to the field of artificial intelligence, and more specifically to a multimodal recurrent neural network image description method based on FCN feature extraction.

Background

[0002] In recent years, the recurrent neural network (RNN) and the convolutional neural network (CNN) have achieved success in natural language processing and image classification respectively, which has led to machine learning methods that combine the two for automatic image description. Automatic generation of image descriptions is an important branch of artificial intelligence, with wide application in image retrieval, navigation for the blind, and so on, and it has therefore attracted the attention of more and more researchers. In 2011, Mikolov et al. proposed a recurrent neural network model for natural language processing, which achieved the best res...


Application Information


IPC(8): G06F17/30, G06N3/04, G06N3/08

CPC: G06F16/51, G06F16/5866, G06N3/084, G06N3/045

Inventor: 胡海峰, 王伟轩, 张俊轩, 杨梁, 王腾

Owner: SYSU CMU SHUNDE INT JOINT RES INST
