Cross-modal retrieval method based on multilevel feature representation alignment

A multi-level global-feature technology applied in the field of cross-modal retrieval. It addresses the problems that existing cross-modal retrieval methods capture insufficient cross-modal correlation and yield insufficiently precise representations, improving retrieval accuracy; the method has broad market demand and application prospects.

Pending Publication Date: 2021-12-14
JIAXING UNIV


Problems solved by technology

[0006] In order to solve the above-mentioned problems existing in the prior art, the present invention provides a cross-modal retrieval method based on multi-level feature representation alignment, which can accurately measure the similarity between images and texts through cross-modal multi-level representation associations.




Embodiment Construction

[0066] In order to make the purpose, technical solutions, and advantages of the present invention clearer, the present invention is described in detail below in conjunction with specific embodiments (but not limited to the cited embodiments) and the accompanying drawings. Obviously, the described embodiments are only a part of the embodiments of the present invention, not all of them. Based on the embodiments of the present invention, all other embodiments obtained by persons of ordinary skill in the art without creative effort fall within the protection scope of the present invention.

[0067] The embodiments of the present invention may be applied in various scenarios; the implementation environment involved may be an input/output scenario on a single server, or an interaction scenario between a terminal and a server. When the implementation environment is the input/output scenario of a single server, the image data and text data are acquired and sto...



Abstract

The invention discloses a cross-modal retrieval method based on multilevel feature representation alignment, and relates to the technical field of cross-modal retrieval. In the cross-modal fine-grained alignment stage, the global similarity, local similarity, and relation similarity between the two modalities of image and text are computed separately and fused into a comprehensive image-text similarity. In the neural network training stage, a corresponding loss function is designed to mine cross-modal structural constraint information and to constrain and supervise the parameter learning of the retrieval model from multiple angles. Finally, the retrieval result for a test query sample is obtained according to the comprehensive image-text similarity. By introducing fine-grained associations between the two modalities of image and text, the method effectively improves the accuracy of cross-modal retrieval, and it has broad market demand and application prospects in fields such as image-text retrieval and pattern recognition.
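The abstract describes fusing three similarity levels (global, local, relation) into one image-text score. The sketch below illustrates one plausible reading of that fusion step; it is not the patented implementation. The cosine metric, the best-match aggregation for local similarity, and the equal fusion weights are all assumptions, since the abstract only states that the three similarities are computed and fused.

```python
import math

def cosine(a, b):
    """Cosine similarity between two feature vectors (lists of floats)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb + 1e-8)

def local_similarity(img_regions, txt_words):
    """Match each word feature with its best-scoring image-region feature
    and average the matches -- one common way to score fine-grained local
    alignment (the patent's exact aggregation is not disclosed here)."""
    return sum(max(cosine(w, r) for r in img_regions)
               for w in txt_words) / len(txt_words)

def comprehensive_similarity(img_global, txt_global,
                             img_regions, txt_words,
                             img_relation, txt_relation,
                             weights=(1.0 / 3, 1.0 / 3, 1.0 / 3)):
    """Fuse global, local, and relation similarities into one image-text
    score by a weighted sum. Equal weights are an assumption."""
    w_g, w_l, w_r = weights
    return (w_g * cosine(img_global, txt_global)
            + w_l * local_similarity(img_regions, txt_words)
            + w_r * cosine(img_relation, txt_relation))
```

In retrieval, candidate texts (or images) would then be ranked by this comprehensive score against the query, with the top-ranked candidates returned as results.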

Description

Technical field

[0001] The invention relates to the technical field of cross-modal retrieval, and in particular to a cross-modal retrieval method based on multi-level feature representation alignment.

Background technique

[0002] With the rapid development of new-generation Internet technologies such as the mobile Internet and social networks, multi-modal data such as text, images, and videos have grown explosively. Cross-modal retrieval technology aims to retrieve across data of different modalities by mining and exploiting the association information between them; its core is the measurement of similarity between cross-modal data. In recent years, cross-modal retrieval has become a research hotspot at home and abroad and has attracted extensive attention from academia and industry. It is one of the important research fields of cross-modal intelligence and an important direction for the future development of information...


Application Information

IPC (IPC8): G06F16/953; G06F16/2458; G06F40/30; G06K9/46; G06K9/62; G06N3/04; G06N3/08
CPC: G06F16/953; G06F16/2465; G06F40/30; G06N3/084; G06N3/044; G06N3/045; G06F18/22
Inventor: 张卫锋, 周俊峰, 王小江
Owner: JIAXING UNIV