Three-dimensional object reconstruction algorithm based on deep learning
Three-dimensional object reconstruction using deep learning technology, applied in the fields of computer vision and deep learning
Examples
Embodiment 1
[0036] Referring to Figure 1 and Figure 2, the invention discloses a three-dimensional object reconstruction algorithm based on deep learning, comprising the following steps:
[0037] Step 1: Input multiple two-dimensional images of an object captured from arbitrary angles;
[0038] Step 2: Establish a convolutional neural network model;
[0039] Step 3: Use the two-dimensional images from step 1 as training data and input them into the convolutional neural network established in step 2 for training;
[0040] Step 4: Input the two-dimensional image to be tested into the convolutional neural network model trained in step 3; the model outputs a three-dimensional reconstruction result.
[0041] The convolutional neural network model described in step 2 includes an encoder, a decoder, and a multi-view feature combination module. The encoder infers the three-dimensional spatial structure of the object by extracting two-dimensional image features. It is composed of ...
Embodiment 2
[0043] The scheme of Embodiment 1 is described in detail below with specific calculation formulas and examples:
[0044] The convolutional neural network model described in step 2 has three components: an encoder, a decoder, and a multi-view feature combination module. The encoder network designed by the present invention is based on the ResNet network, with an SE-Block added to give the model a simple attention mechanism. Each convolutional layer uses the ReLU activation function, and BatchNorm is used for regularization. The SE-Block module improves the expressive power of the features at small cost, because it only needs to assign a different weight to each channel. The network structure of the encoder is shown in Figure 3.
[0045] In the specific embodiment of the present invention, an SE-Block module is embedded in the encoder network, and the SE-Block module first performs global averag...
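The channel reweighting performed by an SE-Block can be sketched in a few lines of plain Python. This is a minimal illustration of the generic squeeze-and-excitation pattern (global average pooling, a small bottleneck, a sigmoid gate, then per-channel scaling), not the patent's actual encoder. The names `se_block` and `random_matrix`, the toy sizes, the reduction ratio of 4, the omission of bias terms, and the random weights are all assumptions made for illustration.

```python
import math
import random


def se_block(feature_map, w1, w2):
    """Reweight the channels of a feature map given as C channels of H x W values."""
    # Squeeze: global average pooling reduces each channel to one scalar.
    squeezed = [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in feature_map
    ]
    # Excitation, part 1: bottleneck fully connected layer followed by ReLU.
    hidden = [max(0.0, sum(w * s for w, s in zip(row, squeezed))) for row in w1]
    # Excitation, part 2: expand back to C values and squash each into (0, 1).
    weights = [
        1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
        for row in w2
    ]
    # Scale: multiply every value in a channel by that channel's weight.
    scaled = [
        [[weights[c] * v for v in row] for row in channel]
        for c, channel in enumerate(feature_map)
    ]
    return scaled, weights


def random_matrix(rows, cols, rng):
    """Placeholder weights (assumption); a trained network would learn these."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
channels, height, width, reduction = 8, 4, 4, 4  # assumed toy sizes
x = [[[rng.random() for _ in range(width)] for _ in range(height)]
     for _ in range(channels)]
w1 = random_matrix(channels // reduction, channels, rng)  # maps C -> C/r
w2 = random_matrix(channels, channels // reduction, rng)  # maps C/r -> C
y, channel_weights = se_block(x, w1, w2)
print(len(y), len(y[0]), len(y[0][0]))  # output keeps the input shape
```

The output has the same shape as the input; only the relative strength of each channel changes. The extra parameters are limited to the two small matrices, which is consistent with the low cost described above.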
Embodiment 3
[0065] The feasibility of the schemes in Embodiments 1 and 2 is verified below through concrete experiments:
[0066] 1) Experimental data set
[0067] The present invention uses ShapeNetCore, a subset of the full ShapeNet data set, for training and testing. ShapeNetCore contains 55 common object categories and approximately 51,300 3D models. The present invention selects the 13 categories that each contain more than 1000 models, for a total of 43,783 three-dimensional models. The selected categories are: airplane, chair, car, table, sofa, bench, cabinet, monitor, lamp, speaker, rifle, telephone, and vessel.
[0068] For each model, images with a resolution of 256 × 256 are acquired from 12 different angles and saved as a true voxel occupancy...