A supervised fast discrete multimodal hash retrieval method and system

A supervised multimodal technique, applied in the field of cross-modal retrieval, that addresses time-consuming learning and high computational complexity, with the effect of enhancing discriminative ability, improving learning efficiency, and avoiding high computational complexity.

Status: Inactive; Publication Date: 2019-03-08

SHANDONG NORMAL UNIV

0 Cites, 21 Cited by


AI Technical Summary

Problems solved by technology

Therefore, the hash codes obtained by these methods contain only limited semantic information.

[0005] 2) High computational complexity

This means that such methods have to learn the hash code bit by bit, which can be time-consuming when dealing with large datasets.

Method used



Examples


Embodiment 1

[0054] This embodiment discloses a supervised fast discrete multimodal hash retrieval method, comprising the following steps:

[0055] Step 1: Obtain the multimodal training dataset O_train, where each sample contains paired multimodal data features, such as image and text;

[0056] Step 2: Using the joint multimodal feature map, project the multimodal training dataset O_train into a joint multimodal intermediate representation;

[0057] Step 2 specifically comprises:

[0058] First, the data features of each modality in the multimodal training dataset O_train are transformed into a nonlinear embedding φ_m(x^(m)):

[0059]

[0060] Here, {x^(m)}, m = 1, ..., M, is the training data of the m-th modality, with M modalities in total; the anchor point set is formed by randomly selecting a subset of the training samples of the corresponding modality; N is the total number of training samples of the modality; P is t...
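The formula in paragraph [0059] is not preserved in this text, so the exact form of the embedding is unknown here. As a rough illustration only, the sketch below assumes a Gaussian (RBF) kernel over randomly selected anchor points, a common choice in anchor-based hashing. The function names, the bandwidth `sigma`, and the toy data are all hypothetical and not taken from the patent.

```python
import math
import random


def select_anchors(samples, num_anchors, seed=0):
    """Randomly pick anchor points from one modality's training samples."""
    rng = random.Random(seed)
    return rng.sample(samples, num_anchors)


def rbf_anchor_embedding(x, anchors, sigma):
    """Map one feature vector to a P-dimensional nonlinear embedding.

    Entry p is exp(-||x - a_p||^2 / (2 * sigma^2)) for anchor a_p.
    The RBF form and sigma are assumptions, not the patent's formula.
    """
    embedding = []
    for a in anchors:
        sq_dist = sum((xi - ai) ** 2 for xi, ai in zip(x, a))
        embedding.append(math.exp(-sq_dist / (2.0 * sigma ** 2)))
    return embedding


# Toy data for a single modality: N = 6 samples with 2 features each.
train_image_features = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0],
                        [1.0, 1.0], [2.0, 2.0], [3.0, 1.0]]
anchors = select_anchors(train_image_features, num_anchors=3)  # P = 3
phi = [rbf_anchor_embedding(x, anchors, sigma=1.0)
       for x in train_image_features]
```

Each sample is turned into a vector of length P whose entries lie in (0, 1], with larger values for samples closer to the corresponding anchor. Repeating this per modality would give one embedding per modality, from which a joint representation could be built.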

Embodiment 2

[0117] The purpose of this embodiment is to provide a computer system.

[0118] A computer system comprising a memory, a processor, and a computer program stored in the memory and executable on the processor; when the processor executes the program, the following steps are performed:

[0119] Receive a multimodal training dataset, where each sample contains pairs of multimodal data features;

[0120] Using the joint multimodal feature map, project the multimodal training dataset into a joint multimodal intermediate representation;

[0121] For the joint multimodal intermediate representation of the multimodal training data set, construct a supervised fast discrete multimodal hash objective function; solve the objective function to obtain a hash function;

[0122] Receive the multimodal retrieval dataset and the multimodal test dataset, project their samples into the joint multimodal intermediate representation, and then project them into Hamming space according to the hash function to obtain the h...
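The last step, projecting into Hamming space and retrieving by hash code, can be sketched as follows. This is a minimal illustration that assumes the learned hash function is a linear projection followed by a sign threshold. The source does not state its form in the text available here, and the matrix `W` and the toy representations are made up for the example.

```python
def hash_code(z, W):
    """Project representation z with W (one row per bit) and binarize by sign."""
    return [1 if sum(w * v for w, v in zip(row, z)) >= 0 else 0 for row in W]


def hamming_distance(a, b):
    """Number of bit positions where two equal-length codes differ."""
    return sum(bit_a != bit_b for bit_a, bit_b in zip(a, b))


def retrieve(query_code, database_codes, top_k):
    """Return indices of the top_k database codes closest to the query."""
    ranked = sorted(range(len(database_codes)),
                    key=lambda i: hamming_distance(query_code, database_codes[i]))
    return ranked[:top_k]


# Hypothetical 3-bit hash function over a 2-dimensional joint representation.
W = [[1.0, 0.0], [0.0, 1.0], [1.0, -1.0]]
retrieval_set = [[0.9, 0.2], [-0.5, 0.7], [-0.8, -0.3], [0.4, 0.6]]
database_codes = [hash_code(z, W) for z in retrieval_set]
query_code = hash_code([0.8, 0.1], W)  # one sample from the test set
nearest = retrieve(query_code, database_codes, top_k=2)
print(nearest)  # [0, 3]
```

The query shares all three bits with item 0 and differs from item 3 in one bit, so those two items are returned first.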

Embodiment 3

[0124] The purpose of this embodiment is to provide a computer-readable storage medium.

[0125] A computer-readable storage medium, on which a computer program is stored, and when the program is executed by a processor, the following steps are performed:

[0126] Receive a multimodal training dataset, where each sample contains pairs of multimodal data features;

[0127] Using the joint multimodal feature map, project the multimodal training dataset into a joint multimodal intermediate representation;

[0128] For the joint multimodal intermediate representation of the multimodal training data set, construct a supervised fast discrete multimodal hash objective function; solve the objective function to obtain a hash function;

[0129] Receive the multimodal retrieval dataset and the multimodal test dataset, project their samples into the joint multimodal intermediate representation, and then project them into Hamming space according to the hash function to obtain the hash co...


Abstract

The invention discloses a supervised fast discrete multimodal hash retrieval method and system. The method includes: receiving a multimodal training dataset, in which each sample contains paired multimodal data features; projecting the multimodal training dataset into a joint multimodal intermediate representation using a joint multimodal feature map; constructing a supervised fast discrete multimodal hash objective function for the joint multimodal intermediate representation of the training dataset; solving the objective function to obtain a hash function; receiving a multimodal retrieval dataset and a multimodal test dataset, projecting their samples into the joint multimodal intermediate representation, and then projecting them into Hamming space according to the hash function to obtain hash codes; and, based on the hash codes, retrieving, for the samples in the multimodal test dataset, matching samples from the multimodal retrieval dataset. The invention learns discrete hash codes for heterogeneous multimodal data while ensuring both learning efficiency and retrieval precision.

Description

Technical field

[0001] The invention belongs to the technical field of cross-modal retrieval, and in particular relates to a supervised fast discrete multimodal hash retrieval method and system.

Background

[0002] Because of its fast similarity computation and low storage cost, hashing can significantly speed up large-scale data retrieval. Many researchers have therefore devoted themselves to hash learning techniques, especially their application to single-modal and cross-modal retrieval.

[0003] In multimedia retrieval, target data objects are usually described by heterogeneous multimodal features, where the features of each modality have their own attributes and reflect distinct characteristics of the data from different aspects. For example, an image is usually represented by heterogeneous image and text features. A video can be fully represented by multiple features (such as image, text, audio, and temporal channels). In order to support large-scale multime...
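The background's point about fast similarity computation and low storage cost can be made concrete with a small sketch. It assumes binary codes are packed into machine integers and compared with XOR plus a bit count, which is a standard way to compute Hamming distance. The 8-bit codes and helper names below are illustrative only.

```python
def pack_bits(bits):
    """Pack a list of 0/1 bits into one integer, most significant bit first."""
    value = 0
    for bit in bits:
        value = (value << 1) | bit
    return value


def hamming_packed(a, b):
    """Hamming distance between two packed codes: XOR, then count set bits."""
    return bin(a ^ b).count("1")


code_a = pack_bits([1, 0, 1, 1, 0, 0, 1, 0])  # fits in a single byte
code_b = pack_bits([1, 1, 1, 0, 0, 0, 1, 1])
distance = hamming_packed(code_a, code_b)
print(distance)  # 3
```

An 8-bit code occupies one byte regardless of how many dimensions the original features had, and comparing two codes takes one XOR and one bit count.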

Claims


Application Information


Patent Type & Authority: Applications (China)

IPC(8): G06F16/43, G06K9/62

CPC: G06F18/214

Inventor: 张化祥, 芦旭, 李静, 朱磊, 刘丽, 王振华, 郭培莲

Owner: SHANDONG NORMAL UNIV
