Fine-grained cross-media retrieval method based on multi-model network

A fine-grained cross-media retrieval technology, applied in the fields of multimedia retrieval, fine-grained recognition, natural language processing, and computer vision. It addresses media heterogeneity, small inter-class differences, and large intra-class differences, with the effect of narrowing the heterogeneity gap and improving accuracy.

Pending Publication Date: 2020-10-16
NANJING UNIV OF SCI & TECH

AI Technical Summary

Problems solved by technology

[0003] Research on fine-grained cross-media retrieval still has deficiencies, two of which matter most. The first is the heterogeneity gap between media: the feature representations of data samples from different media types differ greatly, so directly measuring the similarity between them is very difficult.
The second is that existing research does not fully account for the difficulties of the fine-grained level: small inter-class differences (different fine-grained categories are very similar, such as gray-winged gulls, red birds) and large intra-class differences (objects of the same category differ noticeably in pose, lighting, and so on). This makes retrieval at the fine-grained level more challenging than at the coarse-grained level.


Examples

Embodiment Construction

[0061] As shown in Figure 1, a fine-grained cross-media retrieval method based on a multi-model network includes the following steps:

[0062] Step 1. Obtain the PKU FG-XMedia dataset, which is currently the only fine-grained cross-media dataset. It contains 200 fine-grained categories of birds across four media types: image, video, text, and audio. Preprocess the dataset to obtain the cross-media data.

[0063] Specifically, the preprocessing is as follows: images and texts need no processing; for each video, 25 frames are sampled at equal intervals as the video data; for audio, a short-time Fourier transform is applied to obtain a spectrogram as the audio data.
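The preprocessing in [0063] can be sketched roughly as below. This is a minimal illustration, not the patent's implementation: the function names, the choice of segment midpoints for the 25 frames, the Hann window, and the `frame_len`/`hop` values are all assumptions, since the excerpt gives only the frame count and the use of a short-time Fourier transform.

```python
import numpy as np


def sample_frame_indices(num_frames: int, num_samples: int = 25) -> np.ndarray:
    """Return `num_samples` frame indices spread evenly over a video.

    Assumption: the video is split into equal-length segments and the
    midpoint of each segment is taken; the source only says "equally spaced".
    """
    if num_frames <= 0:
        raise ValueError("video has no frames")
    positions = (np.arange(num_samples) + 0.5) * num_frames / num_samples
    # Clamp so short videos (fewer frames than samples) repeat frames safely.
    return np.minimum(positions.astype(int), num_frames - 1)


def stft_spectrogram(signal, frame_len: int = 256, hop: int = 128) -> np.ndarray:
    """Log-magnitude spectrogram from a short-time Fourier transform.

    Assumption: Hann window, hypothetical frame_len/hop values.
    Returns an array of shape (frequency bins, time frames).
    """
    signal = np.asarray(signal, dtype=np.float64)
    if signal.size < frame_len:
        # Zero-pad very short clips up to one full analysis frame.
        signal = np.pad(signal, (0, frame_len - signal.size))
    window = np.hanning(frame_len)
    n_frames = 1 + (signal.size - frame_len) // hop
    frames = np.stack(
        [signal[i * hop : i * hop + frame_len] * window for i in range(n_frames)]
    )
    magnitude = np.abs(np.fft.rfft(frames, axis=1))
    return np.log1p(magnitude).T
```

The resulting spectrogram is a 2-D array, so it can be treated as an image-like input by a downstream feature extractor.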

[0064] Step 2. Extract the private (media-specific) features of each media type, specifically as follows:

[0065] A feature extractor based on a bilinear CNN extracts the image and video features. The process is:

[0066] As shown in Figure 2, two CNN networks can be regar...
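Because the description is cut off here, the exact bilinear design is not visible in this excerpt. As a hedged illustration only, the sketch below shows bilinear pooling as it is commonly done in bilinear CNNs: the outputs of two CNN streams are combined by an outer product at every spatial location, pooled over locations, then passed through a signed square root and L2 normalization. The function name, the averaging over locations, and the normalization steps are assumptions, not details confirmed by the source.

```python
import numpy as np


def bilinear_pool(feat_a: np.ndarray, feat_b: np.ndarray) -> np.ndarray:
    """Combine two CNN feature maps into one bilinear descriptor.

    feat_a: (H, W, Ca) and feat_b: (H, W, Cb) are the outputs of two
    (hypothetical) CNN streams on the same input. Returns a vector of
    length Ca * Cb.
    """
    h, w, ca = feat_a.shape
    if feat_b.shape[:2] != (h, w):
        raise ValueError("both feature maps must share the same spatial size")
    a = feat_a.reshape(h * w, ca)
    b = feat_b.reshape(h * w, -1)
    # Outer product at each location, averaged over all H*W locations.
    pooled = (a.T @ b) / (h * w)
    vec = pooled.reshape(-1)
    # Signed square root followed by L2 normalization (assumed, common practice).
    vec = np.sign(vec) * np.sqrt(np.abs(vec))
    norm = np.linalg.norm(vec)
    return vec / norm if norm > 0 else vec
```

For video, one plausible use (again an assumption) is to compute this descriptor per sampled frame and aggregate the per-frame descriptors.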


Abstract

The invention discloses a fine-grained cross-media retrieval method based on a multi-model network. The method comprises: obtaining a cross-media dataset and preprocessing it to obtain cross-media data; extracting the private (media-specific) features of each media type; extracting the public (common) features of the media data; performing a weighted summation of the private and public features of the cross-media data to obtain the final joint features; and measuring the similarity between features of different media with the cosine distance and sorting the media features by similarity. The invention constructs a media private network and a public network. The private network comprises one feature extractor per media type and extracts the private features of each media type. The public network comprises a unified network that can learn all four media types at the same time and extracts the public features of all media. Combining the two networks eliminates the heterogeneity gap among the media while retaining the characteristics of each media type to the greatest extent, thereby achieving effective cross-media retrieval. The method has broad application prospects.
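The last two steps of the abstract, weighted summation and cosine-based ranking, can be sketched as follows. This is an illustrative sketch under assumptions: the function names and the single weight `alpha` are hypothetical (the excerpt does not state the weights), and the private and public features are assumed to already have the same dimensionality.

```python
import numpy as np


def joint_feature(private: np.ndarray, public: np.ndarray, alpha: float = 0.5) -> np.ndarray:
    """Weighted sum of private and public features.

    `alpha` is a hypothetical weight; the source does not give its value.
    """
    return alpha * np.asarray(private, dtype=float) + (1.0 - alpha) * np.asarray(public, dtype=float)


def cosine_rank(query: np.ndarray, gallery: np.ndarray):
    """Rank gallery items by cosine similarity to the query.

    query: (D,) joint feature of the query sample (any media type).
    gallery: (N, D) joint features of candidates (any media type).
    Returns (indices sorted from most to least similar, sorted similarities).
    Assumes no feature vector is all zeros.
    """
    q = query / np.linalg.norm(query)
    g = gallery / np.linalg.norm(gallery, axis=1, keepdims=True)
    sims = g @ q
    order = np.argsort(-sims, kind="stable")
    return order, sims[order]
```

Because every media type is mapped into the same joint feature space, the same ranking function serves any query/gallery pairing, for example an audio query against an image gallery.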

Description

Technical field

[0001] The invention belongs to the technical fields of computer vision, natural language processing, fine-grained recognition, multimedia retrieval, and the like, and specifically relates to a fine-grained cross-media retrieval method based on a multi-model network.

Background

[0002] In recent years, with the rapid growth of multimedia data, images, text, audio, and video have become the main forms through which people understand the world. Research on multimedia data has been going on for several years. Past research usually focused on a single media type, that is, the query and the retrieved results belong to the same media type. At present, the correlations among massive multimedia data keep growing stronger, and users' demands for multimedia retrieval have become very flexible and are no longer satisfied by single-media retrieval, so how to realize cross-media retrieval is the key problem that needs to b...


Application Information

Patent Type & Authority: Applications (China)
IPC (8): G06F16/483, G06K9/62, G06N3/04, G06N3/08, G06F40/284, G06F17/14
CPC: G06F16/483, G06N3/08, G06F40/284, G06F17/14, G06N3/044, G06N3/045, G06F18/22, G06F18/253
Inventor: 王琼, 柏洁咪, 姚亚洲, 唐振民
Owner NANJING UNIV OF SCI & TECH