The invention discloses a deep-learning-based method and device for
three-dimensional semantic reconstruction of real scenes, and a storage medium;
relates to the technical field of remote sensing, surveying and mapping, and
geographic information; and solves the problem of inaccurate multi-scene
labeling in the prior art. The method comprises: obtaining an aerial image; carrying out semantic segmentation on the
aerial image, and determining a pixel probability
distribution map; performing structure-from-motion
recovery on the
aerial image, and determining a camera
pose of the
aerial image; performing depth
estimation on the aerial image, and determining a
depth map of the aerial image; and performing semantic fusion on the pixel probability distribution map, the camera
pose and the
depth map to determine a three-dimensional semantic model. High-precision segmentation is thereby achieved even when the scene contains many objects, heavy stacking and the like; in a large-scale scene, the performance of the depth
estimation network is not degraded, and stable,
accurate estimation can be carried out in various scenes; and compared with traditional three-dimensional reconstruction algorithms, the semantic three-dimensional
reconstruction algorithm constructed by the invention has the
advantage of a faster calculation speed.
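
The semantic fusion step can be illustrated with a minimal sketch. The abstract does not specify the fusion rule, so the sketch below assumes one common choice: each pixel is back-projected into world coordinates using its depth and the camera pose, and the per-class log-probabilities of all pixels falling into the same voxel are summed. The names `backproject` and `fuse_semantics`, the pinhole intrinsics tuple `K`, the camera-to-world pose `R`, `t`, and the `voxel_size` parameter are all hypothetical and are not taken from the invention.

```python
import math

def backproject(u, v, depth, K, R, t):
    """Lift pixel (u, v) with depth `depth` to a world-space point.

    Assumptions: K = (fx, fy, cx, cy) is a pinhole camera model;
    R (3x3 nested lists) and t (length 3) map camera to world coordinates.
    """
    fx, fy, cx, cy = K
    pc = ((u - cx) * depth / fx, (v - cy) * depth / fy, depth)
    return tuple(sum(R[i][j] * pc[j] for j in range(3)) + t[i] for i in range(3))

def fuse_semantics(views, voxel_size=1.0, eps=1e-6):
    """Fuse per-view pixel class probabilities into voxel labels.

    Each view is a dict with keys "probs" (H x W x C nested lists),
    "depth" (H x W), "K", "R" and "t". Returns {voxel_index: class_id}.
    """
    log_sums = {}
    for view in views:
        for v, row in enumerate(view["depth"]):
            for u, d in enumerate(row):
                if d <= 0:  # skip pixels without a valid depth estimate
                    continue
                x, y, z = backproject(u, v, d, view["K"], view["R"], view["t"])
                key = (math.floor(x / voxel_size),
                       math.floor(y / voxel_size),
                       math.floor(z / voxel_size))
                probs = view["probs"][v][u]
                acc = log_sums.setdefault(key, [0.0] * len(probs))
                for c, p in enumerate(probs):
                    # Summing log-probabilities multiplies the per-view evidence.
                    acc[c] += math.log(max(p, eps))
    # Label each voxel with the class that has the highest accumulated score.
    return {key: max(range(len(acc)), key=acc.__getitem__)
            for key, acc in log_sums.items()}
```

With this rule, a voxel observed in several aerial images takes the class that is most consistent across the views, so a confident prediction in one image can outweigh a weak, conflicting prediction in another.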