Pedestrian identification method based on local feature perception image-text cross-modal model and model training method

A local-feature-aware image-text cross-modal model and training method in the field of pattern recognition. The method addresses the problems of complex feature extraction pipelines, insufficient accuracy, and difficulty of deployment in practical application scenarios, and achieves high accuracy with a simple structure.

Pending Publication Date: 2022-07-12
NANJING UNIV OF INFORMATION SCI & TECH

AI Technical Summary

Problems solved by technology

However, the above methods still suffer from complex feature extraction pipelines and insufficient accuracy, and are therefore difficult to deploy in practical application scenarios.



Examples


Embodiment 1

[0052] This embodiment provides a training method for a local-feature-aware image-text cross-modal model. The model is built on the PyTorch deep learning framework and is used to mine feature information from pedestrian images and their text descriptions. It comprises a visual feature extraction module and a text feature extraction module: the visual feature extraction module uses a PCB (Part-based Convolutional Baseline) structure to extract local image features, and the text feature extraction module uses a multi-branch convolutional structure to extract text features, with each branch of the multi-branch structure aligned to one of the local image parts.
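The patent gives no source code; the following is a minimal PyTorch sketch of the two-module structure described above, under stated assumptions. The stripe count K=6, the feature dimensions, and all class and layer names (`VisualPCB`, `TextMultiBranch`, the tiny conv stack standing in for the real backbone) are hypothetical choices for illustration, not the patented implementation.

```python
# Hypothetical sketch: PCB-style visual branch + K-branch text convolution,
# producing K aligned local features per modality. All sizes are assumptions.
import torch
import torch.nn as nn

K = 6  # number of horizontal stripes / text branches (assumed value)

class VisualPCB(nn.Module):
    """Backbone stand-in + PCB head: pool the feature map into K horizontal stripes."""
    def __init__(self, out_dim=256):
        super().__init__()
        # A tiny conv stack stands in for the real backbone (e.g. a ResNet).
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.pool = nn.AdaptiveAvgPool2d((K, 1))  # one pooled cell per stripe
        self.embed = nn.ModuleList(nn.Linear(64, out_dim) for _ in range(K))

    def forward(self, img):                        # img: (B, 3, H, W)
        f = self.pool(self.backbone(img))          # (B, 64, K, 1)
        parts = f.squeeze(-1).permute(0, 2, 1)     # (B, K, 64)
        return torch.stack([self.embed[i](parts[:, i]) for i in range(K)], dim=1)

class TextMultiBranch(nn.Module):
    """K parallel 1-D conv branches; branch i is aligned with image stripe i."""
    def __init__(self, vocab=1000, emb=64, out_dim=256):
        super().__init__()
        self.word = nn.Embedding(vocab, emb)
        self.branches = nn.ModuleList(
            nn.Conv1d(emb, out_dim, kernel_size=3, padding=1) for _ in range(K))

    def forward(self, tokens):                     # tokens: (B, T) word ids
        w = self.word(tokens).transpose(1, 2)      # (B, emb, T)
        feats = [b(w).max(dim=2).values for b in self.branches]  # max over time
        return torch.stack(feats, dim=1)           # (B, K, out_dim)

v = VisualPCB()(torch.randn(2, 3, 128, 64))
t = TextMultiBranch()(torch.randint(0, 1000, (2, 20)))
print(v.shape, t.shape)  # both (2, 6, 256): K aligned local features per modality
```

Because both modules emit tensors of shape (B, K, out_dim), stripe i of the image and branch i of the text can be compared directly, which is the alignment the paragraph describes.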

[0053] Specifically, as shown in Figure 1, the training method of the local-feature-aware image-text cross-modal model is as follows.

[0054] 1. Prepare the image-text dataset

[0055] Construct an image-text dataset, which includes a training s...

Embodiment 2

[0113] This embodiment provides a pedestrian recognition method based on the local-feature-aware image-text cross-modal model. As shown in Figures 3 and 4, the pedestrian recognition method includes:

[0114] acquiring the image-text data of a pedestrian; and

[0115] inputting the pedestrian's image-text data into the pre-trained local-feature-aware image-text cross-modal model for feature extraction, and outputting the pedestrian recognition result.
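The recognition step can be read as cross-modal retrieval: rank gallery images by similarity between their extracted features and the query's features. The sketch below assumes the extractors have already produced K aligned local features per sample (as in Embodiment 1) and fuses per-part cosine similarities by averaging; the fusion rule and all names here are illustrative assumptions, not the patent's specified scoring function.

```python
# Hypothetical retrieval step: score a gallery of pedestrian images against a
# text query using K aligned local features per modality (shapes are assumed).
import torch
import torch.nn.functional as F

def match_scores(text_feats, img_feats):
    """text_feats: (K, D) query; img_feats: (N, K, D) gallery.
    Average cosine similarity over the K aligned local parts."""
    t = F.normalize(text_feats, dim=-1)      # unit-norm per part
    g = F.normalize(img_feats, dim=-1)
    sims = (g * t.unsqueeze(0)).sum(-1)      # (N, K) per-part cosine similarity
    return sims.mean(dim=1)                  # (N,) fused score per gallery image

torch.manual_seed(0)
gallery = torch.randn(5, 6, 256)                  # 5 gallery images, K=6, D=256
query = gallery[3] + 0.1 * torch.randn(6, 256)    # noisy copy of gallery item 3
scores = match_scores(query, gallery)
print(int(scores.argmax()))                       # retrieves index 3
```

The highest-scoring gallery identity is returned as the recognition result; in practice the full ranked list is usually kept for retrieval evaluation.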

[0116] The construction and training of the local-feature-aware image-text cross-modal model are described in Embodiment 1 and are not repeated here.

[0117] As will be appreciated by one skilled in the art, the embodiments of the present application may be provided as a method, a system, or a computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furt...



Abstract

The invention discloses a pedestrian recognition method based on a local-feature-aware image-text cross-modal model, together with a method for training the model, and belongs to the technical field of pattern recognition. The pedestrian recognition method comprises the following steps: acquiring image-text data of a pedestrian, inputting the image-text data into a pre-trained local-feature-aware image-text cross-modal model for feature extraction, and outputting a pedestrian recognition result. The model comprises a visual feature extraction module and a text feature extraction module: PCB local feature learning is introduced into visual feature extraction, and a multi-branch convolutional structure is introduced into text feature extraction, so that local image-text features can be extracted efficiently without resorting to semantic segmentation, attribute learning, or similar techniques. Cross-modal matching is carried out at three levels (shallow features, local features, and global features), gradually pulling the image and text feature distributions together. The method has a simple structure and high accuracy, and can promote the application of image-text cross-modal pedestrian retrieval in real-world scenarios.
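The abstract describes matching at three levels (shallow, local, global) that pulls the image and text feature distributions together, but does not state the loss. One plausible instantiation is a symmetric InfoNCE-style contrastive loss applied per level and summed; the loss choice, temperature, and feature shapes below are assumptions for illustration only.

```python
# Hedged sketch: one contrastive matching loss per feature level (shallow,
# local, global), summed. The loss form is an assumption, not the patent's.
import torch
import torch.nn.functional as F

def level_loss(img, txt, tau=0.1):
    """Symmetric InfoNCE over a batch of paired features (B, D): matched
    image-text pairs sit on the diagonal of the similarity matrix."""
    i = F.normalize(img, dim=-1)
    t = F.normalize(txt, dim=-1)
    logits = i @ t.T / tau                      # (B, B) pairwise similarities
    labels = torch.arange(img.size(0))          # pair k matches pair k
    return (F.cross_entropy(logits, labels) +
            F.cross_entropy(logits.T, labels)) / 2

def total_loss(pairs):
    """pairs: [(img_feats, txt_feats)] for the shallow, local, global levels."""
    return sum(level_loss(i, t) for i, t in pairs)

torch.manual_seed(0)
B, D = 4, 128
pairs = [(torch.randn(B, D), torch.randn(B, D)) for _ in range(3)]  # 3 levels
loss = total_loss(pairs)
print(loss.item())  # a positive scalar to minimize during training
```

Minimizing such a loss at each level draws matched image and text features together while pushing mismatched pairs apart, which is one standard way to realize the "gradually pulling in" described above.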

Description

Technical Field

[0001] The invention relates to a pedestrian recognition method based on a local-feature-aware image-text cross-modal model, and to a method for training the model, belonging to the technical field of pattern recognition.

Background Art

[0002] Manually reviewing surveillance footage to find a target pedestrian suffers from high time cost, easy omission, and low reliability. Moreover, in some scenarios intelligent retrieval via technologies such as pedestrian re-identification or face recognition is impossible; for example, a witness may have no photograph of the target and can only describe the pedestrian's appearance verbally.

[0003] Existing related technologies are as follows: (1) A text-based pedestrian retrieval self-supervised visual representation learning system and method, application number CN202010590313.2: the algorithm constructs auxiliary tasks (gender judgment and pedestrian similarity ...


Application Information

IPC(8): G06V40/10; G06V10/40; G06V10/74; G06V10/774; G06K9/62; G06V10/82; G06N3/04; G06N3/08
CPC: G06N3/049; G06N3/08; G06N3/045; G06N3/044; G06F18/22; G06F18/214
Inventor: 陈裕豪, 张国庆
Owner: NANJING UNIV OF INFORMATION SCI & TECH