The invention relates to a land cover classification method based on the deep fusion of multi-modal remote sensing data. The method comprises the following steps: (1) constructing a remote sensing image semantic segmentation network based on multi-modal information fusion, wherein the network comprises an encoder for extracting ground object features, a deep feature fusion module, a spatial pyramid module and an up-sampling decoder; the deep feature fusion module comprises an ACF3 module and a CACF3 module, which simultaneously fuse three modalities of information: RGB, DSM and NDVI; the ACF3 module is a self-attention convolution fusion module based on a transformer and convolution, and the CACF3 module is a cross-modal convolution fusion module based on a transformer and convolution; (2) training the network constructed in step (1); (3) using the network model trained in step (2) to predict ground object categories in remote sensing images. Compared with conventional methods, this land cover classification method based on the deep fusion of multi-modal remote sensing data markedly improves accuracy on land cover classification tasks and has broad application prospects.
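The abstract does not specify the internals of the ACF3 or CACF3 modules, but the core idea of attention-weighted fusion of three modality feature maps (RGB, DSM, NDVI) can be illustrated with a minimal NumPy sketch. The weighting scheme below (one softmax-normalized score per modality from its mean activation) is a hypothetical simplification for illustration only, not the patented module design:

```python
import numpy as np

def softmax(x, axis):
    # numerically stable softmax along the given axis
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention_fuse(feats):
    """Fuse per-modality feature maps (each C x H x W) into one map
    using attention weights derived from each modality's mean response.
    This is an illustrative stand-in for a learned fusion module."""
    stack = np.stack(feats)                      # (M, C, H, W)
    scores = stack.mean(axis=(1, 2, 3))          # one scalar score per modality
    weights = softmax(scores, axis=0)            # (M,), sums to 1
    return np.tensordot(weights, stack, axes=1)  # weighted sum -> (C, H, W)

# toy encoder outputs for the RGB, DSM and NDVI branches (8 channels, 4x4)
rng = np.random.default_rng(0)
rgb, dsm, ndvi = (rng.standard_normal((8, 4, 4)) for _ in range(3))
fused = attention_fuse([rgb, dsm, ndvi])
print(fused.shape)  # (8, 4, 4)
```

In the patented network this fusion would be performed by learned transformer and convolution layers (ACF3 for self-attention fusion, CACF3 for cross-modal fusion) rather than fixed statistics; the sketch only shows the shape bookkeeping of combining three aligned modality feature maps into one.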