A Monocular 3D Object Detection Method Based on Lightweight Feature Pyramid Structure

A feature pyramid and object detection technology, applied in neural learning methods, neural architectures, character and pattern recognition, etc. It addresses the problems that existing methods cannot fully meet perception needs, hinder real-time deployment of algorithms, and increase model inference latency. Its effects are shorter model inference latency and higher algorithm accuracy without sacrificing algorithm efficiency.

Active Publication Date: 2021-10-08

TSINGHUA UNIV

6 Cites, 0 Cited by

AI Technical Summary

Problems solved by technology

[0004] Anchor-free detection is the latest research hotspot in object detection. Recently proposed keypoint-based, vision-only 3D object detection algorithms (such as CenterNet, SMOKE and RTM3D) are highly efficient (CenterNet: 30 ms, SMOKE: 30 ms, RTM3D: 50 ms) and can meet the requirements of real-time deployment and engineering implementation on edge computing platforms for autonomous driving. However, their low accuracy still cannot fully meet the perception needs of autonomous driving scenarios.
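To put the latencies quoted in [0004] in terms of throughput, the short sketch below converts per-frame latency to frames per second. It assumes frames are processed one at a time with no batching or pipelining, which the source does not state; the function name `frames_per_second` is illustrative only.

```python
def frames_per_second(latency_ms: float) -> float:
    """Throughput implied by a per-frame latency, assuming strictly serial processing."""
    return 1000.0 / latency_ms


# Latencies as quoted in paragraph [0004].
latencies_ms = {"CenterNet": 30, "SMOKE": 30, "RTM3D": 50}

for name, ms in latencies_ms.items():
    print(f"{name}: {ms} ms per frame is about {frames_per_second(ms):.1f} FPS")
```

Under that assumption, a 30 ms model runs at roughly 33 FPS and a 50 ms model at 20 FPS.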

[0005] Subsequent improvements mainly include adding a traditional feature pyramid structure, attaching a multi-stage cascaded regression structure, and introducing deep reinforcement learning to refine the detection boxes. These schemes effectively improve the accuracy of the algorithm, but the additional structural branches they introduce also greatly increase model inference latency, which hinders real-time deployment on edge computing platforms for autonomous driving. How to substantially improve the accuracy of existing methods without reducing their efficiency is therefore of great practical engineering value.

Examples

Embodiment 1

[0102] The method in Embodiment 1 may be applied to or implemented by a processor. The processor may be an integrated circuit chip with signal processing capabilities. In implementation, each step of the above method can be completed by an integrated hardware logic circuit in the processor or by instructions in the form of software. The processor may be a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component, and can implement or execute the methods, steps and logic block diagrams disclosed in Embodiment 1. A general-purpose processor may be a microprocessor or any conventional processor. The steps of the method disc...

Abstract

The invention discloses a monocular 3D object detection method based on a lightweight feature pyramid structure, comprising: collecting RGB images from a vehicle-mounted camera; inputting the RGB images into a pre-established and trained monocular 3D object detection network; and outputting object detection results. The monocular 3D object detection network comprises a feature extraction network, a detection head and a post-processing module. The feature extraction network downsamples the RGB image to extract high-level semantic features, generating 4x, 8x and 16x downsampled feature maps that are input to the detection head. The detection head generates a candidate keypoint category vector and a candidate keypoint pixel position index vector from the 4x downsampled feature map, generates a candidate keypoint 3D regression box encoding vector from the 4x, 8x and 16x downsampled feature maps, and outputs the candidate keypoint category vector and the 3D regression box encoding vector to the post-processing module. The post-processing module decodes the 3D regression box encoding vector and outputs the object detection result in combination with the candidate keypoint category vector.
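The data flow through the detection head can be illustrated with a minimal pure-Python sketch. It is not the patented implementation: the function names, the top-k selection of keypoints, the integer-division mapping of a 4x location onto the 8x and 16x maps, and the concatenation of per-scale channels are all assumptions made for illustration, since the abstract only states which feature maps feed which output.

```python
from typing import List, Tuple

STRIDES = (4, 8, 16)  # downsampling factors named in the abstract


def feature_map_sizes(height: int, width: int) -> List[Tuple[int, int]]:
    """Spatial size of each downsampled feature map (floor division)."""
    return [(height // s, width // s) for s in STRIDES]


def top_k_keypoints(heatmap: List[List[List[float]]], k: int):
    """Pick the k highest-scoring keypoints on the 4x heatmap.

    heatmap is indexed as [class][row][col].
    Returns three parallel lists: class ids, flat pixel indices, scores.
    """
    width = len(heatmap[0][0])
    scored = [
        (score, cls, row * width + col)
        for cls, plane in enumerate(heatmap)
        for row, line in enumerate(plane)
        for col, score in enumerate(line)
    ]
    scored.sort(key=lambda item: item[0], reverse=True)
    best = scored[:k]
    return ([c for _, c, _ in best],
            [i for _, _, i in best],
            [s for s, _, _ in best])


def gather_multiscale(index: int, width4: int, maps) -> List[float]:
    """Concatenate the channels found at one keypoint on every scale.

    maps[j][row][col] is the channel list at stride STRIDES[j].
    Mapping a 4x location to a coarser map by integer division is an
    assumption of this sketch, not something the source specifies.
    """
    row, col = divmod(index, width4)
    encoding: List[float] = []
    for stride, fmap in zip(STRIDES, maps):
        scale = stride // STRIDES[0]
        encoding.extend(fmap[row // scale][col // scale])
    return encoding


# Toy example: 2 classes on a 4x4 heatmap (the 4x map), plus 2x2 and 1x1 coarser maps.
heatmap = [[[0.0] * 4 for _ in range(4)] for _ in range(2)]
heatmap[0][1][2] = 0.9
heatmap[1][3][0] = 0.7

map4 = [[[float(r), float(c)] for c in range(4)] for r in range(4)]
map8 = [[[10.0 + 2 * r + c] for c in range(2)] for r in range(2)]
map16 = [[[100.0]]]

classes, indices, scores = top_k_keypoints(heatmap, k=2)
encodings = [gather_multiscale(i, 4, (map4, map8, map16)) for i in indices]
print(feature_map_sizes(384, 1280))
print(classes, indices, encodings)
```

Here `classes` and `indices` stand in for the candidate keypoint category vector and pixel position index vector, and each entry of `encodings` stands in for one 3D regression box encoding vector. Decoding that encoding into a 3D box is left out because the abstract does not describe the encoding.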

Description

Technical field

[0001] The invention relates to the technical field of autonomous driving, in particular to a monocular 3D object detection method based on a lightweight feature pyramid structure.

Background

[0002] In an autonomous driving system, 3D object detection is a very important task in the perception module. Downstream modules such as prediction, planning and motion control all rely on reliable detection of specific types of objects around the ego vehicle. Because high-beam-count lidar can accurately model the surrounding environment at the centimeter level, lidar-based 3D object detection algorithms have made great progress in recent years. However, inherent deficiencies such as poor capabilities severely limit the large-scale deployment of lidar and related algorithms in autonomous driving. Compared with lidar, vision sensors are not only low in cost, better than lidar in adapting to severe weather such as rain and...

Application Information

Patent Type & Authority: Patents (China)

IPC(8): G06K9/00, G06K9/46, G06K9/62, G06N3/04, G06N3/08

CPC: G06N3/08, G06V20/56, G06V10/44, G06N3/045, G06F18/241

Inventor: 李骏, 张新钰, 杨磊, 王力

Owner: TSINGHUA UNIV
