Three-dimensional object reconstruction algorithm based on deep learning


Pending Publication Date: 2021-09-14

UNIV OF ELECTRONICS SCI & TECH OF CHINA


Examples


Embodiment 1

[0036] Referring to Figure 1 and Figure 2, the invention discloses a three-dimensional object reconstruction algorithm based on deep learning, comprising the following steps:

[0037] Step 1: Input multiple two-dimensional images of an object captured from arbitrary angles.

[0038] Step 2: Establish a convolutional neural network model.

[0039] Step 3: Use the two-dimensional images from step 1 as training data and input them into the convolutional neural network established in step 2 for training.

[0040] Step 4: Input the two-dimensional image to be tested into the convolutional neural network model trained in step 3; the model outputs a three-dimensional reconstruction result.

[0041] The convolutional neural network model described in step 2 includes an encoder, a decoder, and a multi-view feature combination module. The encoder infers the three-dimensional spatial structure of the object by extracting two-dimensional image features. It is composed of ...
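The data flow through the three components named in [0041] can be sketched at the level of array shapes. This is an illustration only, not the patented network: `encode`, `decode`, `combine_views`, and `reconstruct` are hypothetical stand-ins, the random projection matrices replace trained weights, the 32 × 32 × 32 grid and feature length 128 are assumed sizes, and the combination step is plain averaging across views, used here as a substitute because the multi-view feature combination module is not specified in this excerpt.

```python
import numpy as np

GRID = 32   # assumed voxel grid resolution (not stated in this excerpt)
FEAT = 128  # assumed length of the per-image feature vector

rng = np.random.default_rng(0)
# Random fixed projections stand in for trained weights (hypothetical).
W_ENC = rng.standard_normal((FEAT, 3)) * 0.1
W_DEC = rng.standard_normal((GRID ** 3, FEAT)) * 0.1


def encode(image):
    """Stand-in encoder: map an (H, W, 3) image to a 1-D feature vector."""
    pooled = image.mean(axis=(0, 1))   # (3,) global colour average
    return np.tanh(W_ENC @ pooled)     # (FEAT,)


def decode(feature):
    """Stand-in decoder: map a feature vector to voxel occupancy probabilities."""
    logits = W_DEC @ feature                 # (GRID**3,)
    probs = 1.0 / (1.0 + np.exp(-logits))    # sigmoid, values in (0, 1)
    return probs.reshape(GRID, GRID, GRID)


def combine_views(per_view_probs):
    """Stand-in fusion: average the per-view grids (an assumption)."""
    return np.mean(per_view_probs, axis=0)


def reconstruct(images):
    """Encode and decode each view separately, then fuse the per-view grids."""
    per_view = np.stack([decode(encode(img)) for img in images])  # (V, G, G, G)
    return combine_views(per_view)                                # (G, G, G)


views = rng.random((3, 256, 256, 3))  # three synthetic 256 x 256 RGB views
occupancy = reconstruct(views)
print(occupancy.shape)
```

Each view is encoded and decoded independently, so the number of input views can vary without changing the encoder or decoder; only the fusion step sees all views at once.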

Embodiment 2

[0043] The scheme of Embodiment 1 is described in detail below, in conjunction with specific calculation formulas and examples:

[0044] The convolutional neural network model described in step 2 has three components: an encoder, a decoder, and a multi-view feature combination module. The encoder network designed by the present invention is based on the ResNet network, with SE-Block modules added to give the model a simple attention mechanism. Each convolutional layer uses the ReLU activation function, and BatchNorm is used for regularization. The SE-Block module improves the expressive power of the features at a small cost, since it only needs to assign different weights to different channels. The network structure of the encoder is shown in Figure 3.

[0045] In the specific embodiment of the present invention, an SE-Block module is embedded in the encoder network. The SE-Block module first performs global averag...
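The channel re-weighting idea can be illustrated with a minimal NumPy sketch of a squeeze-and-excitation block in its commonly published form: global average pooling, two fully connected layers, and sigmoid gating. The reduction ratio `r`, the random weights, and the function name `se_block` are assumptions for illustration; the patent's exact configuration is not visible in this excerpt.

```python
import numpy as np


def se_block(x, w1, b1, w2, b2):
    """Apply squeeze-and-excitation to a feature map x of shape (C, H, W).

    Returns the re-weighted feature map and the per-channel gates.
    """
    squeezed = x.mean(axis=(1, 2))                      # squeeze: (C,)
    hidden = np.maximum(0.0, w1 @ squeezed + b1)        # FC + ReLU: (C // r,)
    gates = 1.0 / (1.0 + np.exp(-(w2 @ hidden + b2)))   # FC + sigmoid: (C,)
    return x * gates[:, None, None], gates              # scale each channel


rng = np.random.default_rng(0)
C, H, W, r = 8, 4, 4, 2  # r is an assumed reduction ratio
x = rng.standard_normal((C, H, W))
w1 = rng.standard_normal((C // r, C)) * 0.5
b1 = np.zeros(C // r)
w2 = rng.standard_normal((C, C // r)) * 0.5
b2 = np.zeros(C)

y, gates = se_block(x, w1, b1, w2, b2)
print(y.shape, gates.round(2))
```

The block adds only two small weight matrices per stage, which is consistent with the "small cost" noted in [0044]: the spatial layout is untouched and each channel is simply multiplied by one scalar in (0, 1).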

Embodiment 3

[0065] The feasibility of the schemes in Embodiments 1 and 2 is verified below through concrete experiments:

[0066] 1) Experimental data set

[0067] The present invention uses ShapeNetCore, a subset of the full ShapeNet data set, for training and testing. ShapeNetCore contains 55 common object categories and approximately 51,300 3D models. The present invention selects the 13 categories in which the number of models exceeds 1,000, for a total of 43,783 three-dimensional models. The selected categories are: airplane (plane), chair, automobile (car), table, sofa (couch), stool (bench), cabinet, display (monitor), lamp, speaker, rifle, telephone, and vessel.

[0068] For each model, images with a resolution of 256 × 256 are acquired from 12 different angles and saved as a true voxel occupancy...


Abstract

The invention discloses a three-dimensional object reconstruction algorithm based on deep learning. The algorithm comprises the following steps: inputting a plurality of two-dimensional images of an object obtained from arbitrary angles and preprocessing them; building a convolutional neural network; inputting the two-dimensional images into the network as training data; and inputting a two-dimensional image to be tested into the trained convolutional neural network model, which outputs a three-dimensional reconstruction result. The convolutional neural network model comprises an encoder, a decoder, and a multi-view feature combination module. The input of the encoder is a multi-view two-dimensional image and its output is a two-dimensional feature vector, which must be converted into three-dimensional information. The three-dimensional information is input into the decoder to obtain the predicted three-dimensional voxel occupancy for a single image, and the final predicted voxel occupancy is then obtained through the multi-view feature combination module. In the test stage, accuracy is calculated from the 0-1 occupancy predicted by the hierarchical prediction strategy and the ground-truth occupancy.
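As an illustration of the test-stage comparison, the sketch below binarizes predicted occupancy probabilities and scores them against ground-truth voxels using intersection over union (IoU). IoU is a common metric for voxel reconstruction but is an assumption here, since the abstract does not name the metric. The single fixed threshold is likewise a simplification and does not reproduce the hierarchical prediction strategy. The function name `voxel_iou` and the threshold value 0.5 are hypothetical.

```python
import numpy as np


def voxel_iou(pred_probs, gt_occupancy, threshold=0.5):
    """Binarize predicted probabilities and return IoU against the ground truth."""
    pred = pred_probs >= threshold          # 0-1 occupancy prediction
    gt = gt_occupancy.astype(bool)
    union = np.logical_or(pred, gt).sum()
    if union == 0:
        return 1.0                          # both grids empty: perfect match
    return float(np.logical_and(pred, gt).sum() / union)


# Toy 4 x 4 x 4 grid: 8 occupied ground-truth voxels.
gt = np.zeros((4, 4, 4), dtype=np.uint8)
gt[1:3, 1:3, 1:3] = 1

# Prediction covers those 8 voxels plus 4 extra ones.
pred = np.full((4, 4, 4), 0.1)
pred[1:3, 1:3, 1:4] = 0.9

print(voxel_iou(pred, gt))  # 8 overlapping voxels / 12 in the union
```

In this toy case the prediction over-fills one slice, so the score drops to 8/12 even though every true voxel was found; IoU penalizes both missed and spurious voxels.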

Description

Technical field

[0001] The invention belongs to the field of computer vision and deep learning, and in particular relates to a three-dimensional object reconstruction algorithm based on deep learning.

Background

[0002] In recent years, with the emergence of public data sets of 3D objects, the complete and accurate reconstruction of 3D geometric structures from images has become a research hotspot in computer vision and industrial manufacturing. For example, the AR and VR applications emerging in the 5G era use 3D reconstruction technology to let users experience reconstruction results transmitted in real time; ... make full use of 3D reconstruction technology. In addition, people can obtain more information from 3D models than from 2D images, so 3D object reconstruction is becoming increasingly important.

[0003] On the other hand, with the development of computer hardware and artificial intelligence technology, the use of deep learning tools to rec...


Application Information


Patent Type & Authority: Applications (China)

IPC(8): G06T17/20, G06T19/00, G06N3/04, G06N3/08

CPC: G06T17/20, G06T19/00, G06N3/08, G06T2200/08, G06N3/047, G06N3/048

Inventor: 贾海涛, 刘欣月, 张诗涵, 李玉琳, 邹新雷, 任利, 许文波, 罗俊海

Owner: UNIV OF ELECTRONICS SCI & TECH OF CHINA
