Cross-modal retrieval method based on multilevel feature representation alignment

A multi-level global-feature technology applied in the field of cross-modal retrieval. It addresses the problems that existing cross-modal retrieval methods capture insufficient cross-modal correlation and yield insufficiently precise representations, improving retrieval accuracy; the method has broad market demand and application prospects.

Pending Publication Date: 2021-12-14
JIAXING UNIV


Problems solved by technology

[0006] In order to solve the above-mentioned problems existing in the prior art, the present invention provides a cross-modal retrieval method based on multi-level feature representation alignment, which can accurately measure the similarity between images and texts through cross-modal multi-level representation associations.




Embodiment Construction

[0066] In order to make the purpose, technical solutions, and advantages of the present invention clearer, the present invention is described in detail below in conjunction with specific embodiments (but not limited to the cited embodiments) and the accompanying drawings. Obviously, the described embodiments are only a part of the embodiments of the present invention, not all of them. Based on the embodiments of the present invention, all other embodiments obtained by persons of ordinary skill in the art without creative effort fall within the protection scope of the present invention.

[0067] The embodiments of the present invention may be applied in various scenarios; the implementation environment involved may be an input/output scenario on a single server, or an interaction scenario between a terminal and a server. When the implementation environment is the input/output scenario of a single server, the image data and text data are acquired and sto...



Abstract

The invention discloses a cross-modal retrieval method based on multilevel feature representation alignment, and relates to the technical field of cross-modal retrieval. In the cross-modal fine-grained alignment stage, the global similarity, local similarity, and relation similarity between the two modalities of image and text are computed separately and fused into a comprehensive image-text similarity. In the neural network training stage, a corresponding loss function is designed to mine cross-modal structural constraint information and to constrain and supervise the parameter learning of the retrieval model from multiple angles. Finally, the retrieval result for a test query sample is obtained according to the comprehensive image-text similarity. By introducing fine-grained associations between the two modalities of image and text, the method effectively improves the accuracy of cross-modal retrieval, and it has broad market demand and application prospects in fields such as image-text retrieval and pattern recognition.
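The abstract describes fusing three similarity levels (global, local, relation) into one image-text score. The sketch below illustrates one plausible reading of that fusion step; it is not the patented implementation. The cosine metric, the best-match aggregation for local similarity, and the equal fusion weights are all assumptions, since the abstract only states that the three similarities are computed and fused.

```python
import math

def cosine(a, b):
    """Cosine similarity between two feature vectors (lists of floats)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb + 1e-8)

def local_similarity(img_regions, txt_words):
    """Match each word feature with its best-scoring image-region feature
    and average the matches -- one common way to score fine-grained local
    alignment (the patent's exact aggregation is not disclosed here)."""
    return sum(max(cosine(w, r) for r in img_regions)
               for w in txt_words) / len(txt_words)

def comprehensive_similarity(img_global, txt_global,
                             img_regions, txt_words,
                             img_relation, txt_relation,
                             weights=(1.0 / 3, 1.0 / 3, 1.0 / 3)):
    """Fuse global, local, and relation similarities into one image-text
    score by a weighted sum. Equal weights are an assumption."""
    w_g, w_l, w_r = weights
    return (w_g * cosine(img_global, txt_global)
            + w_l * local_similarity(img_regions, txt_words)
            + w_r * cosine(img_relation, txt_relation))
```

In retrieval, candidate texts (or images) would then be ranked by this comprehensive score against the query, with the top-ranked candidates returned as results.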

Description

Technical field

[0001] The invention relates to the technical field of cross-modal retrieval, and in particular to a cross-modal retrieval method based on multi-level feature representation alignment.

Background technique

[0002] With the rapid development of new-generation Internet technologies such as the mobile Internet and social networks, multi-modal data such as text, images, and videos have grown explosively. Cross-modal retrieval technology aims to retrieve across data of different modalities by mining and exploiting the association information between them; its core is the measurement of similarity between cross-modal data. In recent years, cross-modal retrieval has become a research hotspot at home and abroad and has attracted extensive attention from academia and industry. It is one of the important research fields of cross-modal intelligence and an important direction for the future development of information...


Application Information

IPC (IPC8): G06F16/953; G06F16/2458; G06F40/30; G06K9/46; G06K9/62; G06N3/04; G06N3/08
CPC: G06F16/953; G06F16/2465; G06F40/30; G06N3/084; G06N3/044; G06N3/045; G06F18/22
Inventor: 张卫锋, 周俊峰, 王小江
Owner: JIAXING UNIV