An RGBD Salient Object Detection Method Based on a Siamese Network

A Siamese-network-based salient object detection technology, applied in the fields of image processing and computer vision, which addresses the problems of difficult training, complex models, and poor mining of the commonality between RGB images and depth maps, thereby improving detection performance and reducing the amount of training data required.

Active Publication Date: 2021-03-02
SICHUAN UNIV

AI Technical Summary

Problems solved by technology

Chen et al. proposed "Progressively complementarity-aware fusion network for RGB-D salient object detection" in 2018. This method uses two parallel neural networks (with inconsistent structures and no parameter sharing) to extract features from the RGB and depth information separately and then fuses them. Although its detection performance exceeds that of traditional hand-crafted-feature methods, the two-way parallel structure increases the number of model parameters, makes the whole model more complex and harder to train, and is not conducive to mining the commonality of salient features between the RGB image and the depth map, so the detection performance remains limited in complex scenes.

Method used



Examples


Embodiment 1

[0042] An RGBD salient object detection method based on a Siamese network; the flow chart is shown in Figure 1. It specifically includes the following steps:

[0043] S1: Prepare the training pictures required for training. According to the RGBD saliency detection task of the present invention, the training pictures include the original RGB image, the corresponding depth image, and the corresponding expected saliency map. The original RGB image and the depth image (Depth) serve as the network input, and the expected saliency map serves as the expected network output, which is used to calculate the loss function and optimize the network.
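As an illustration of step S1, the following is a minimal sketch of how such training triplets (RGB image, depth image, expected saliency map) could be assembled, assuming a PyTorch Dataset. The directory layout ("rgb", "depth", "gt") and the 320×320 resizing are assumptions for illustration, not prescribed by the embodiment.

```python
# Minimal sketch of step S1 (assumed PyTorch Dataset; file layout is hypothetical).
import os
from PIL import Image
from torch.utils.data import Dataset
from torchvision import transforms

class RGBDSaliencyDataset(Dataset):
    """Pairs each RGB image with its depth map and expected saliency map."""
    def __init__(self, root, size=320):
        self.root = root
        self.names = sorted(os.listdir(os.path.join(root, "rgb")))
        self.to_tensor = transforms.Compose([
            transforms.Resize((size, size)),
            transforms.ToTensor(),
        ])

    def __len__(self):
        return len(self.names)

    def __getitem__(self, i):
        name = self.names[i]
        rgb = self.to_tensor(Image.open(os.path.join(self.root, "rgb", name)).convert("RGB"))
        depth = self.to_tensor(Image.open(os.path.join(self.root, "depth", name)).convert("L"))
        gt = self.to_tensor(Image.open(os.path.join(self.root, "gt", name)).convert("L"))
        # rgb and depth are the network inputs; gt supervises the output via the loss function.
        return rgb, depth, gt
```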

[0044] S2: Design the Siamese neural network structure and the decoder with a fusion function, including:

[0045] S2-1: Design the Siamese neural network part. The Siamese network is implemented as two parallel networks with identical structure and fully shared parameters; the backbone can be a VGG-16 structure, ResNet-50...
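A minimal sketch of the parameter-sharing idea in S2-1, assuming PyTorch and a VGG-16 backbone: reusing one and the same feature extractor for both the RGB input and the depth input is what makes the two parallel branches identical in structure and fully shared in parameters. Replicating the single-channel depth map to three channels is an assumption made so the depth input fits the same first convolution.

```python
# Minimal sketch of the shared-weight (Siamese) encoder of S2-1 (assumed PyTorch).
import torch
import torch.nn as nn
from torchvision.models import vgg16

class SiameseEncoder(nn.Module):
    def __init__(self):
        super().__init__()
        # A single VGG-16 feature extractor; calling the same module on the RGB input
        # and on the depth input is what makes the two branches share every parameter.
        self.backbone = vgg16().features

    def forward(self, rgb, depth):
        # Assumption: replicate the single-channel depth map to 3 channels so it can
        # pass through the same first convolution as the RGB image.
        if depth.shape[1] == 1:
            depth = depth.repeat(1, 3, 1, 1)
        return self.backbone(rgb), self.backbone(depth)

enc = SiameseEncoder()
f_rgb, f_depth = enc(torch.randn(1, 3, 320, 320), torch.randn(1, 1, 320, 320))
```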

Embodiment 2

[0056] In this embodiment, the Siamese neural network part is based on the common VGG-16 network structure, taking its Conv1_1-Pool5 part, which is divided into a main network and side channels and includes a total of 13 convolutional layers arranged in 6 levels. From top to bottom the levels are Conv1_1~1_2, Conv2_1~2_2, Conv3_1~3_3, Conv4_1~4_3, Conv5_1~5_3, and Pool5. The input resolution of the main network is 320×320, and its output resolution is 20×20. In addition, there are 6 side channels (side channel 1 to side channel 6), which are connected to the outputs of the 6 levels of the main network, namely Conv1_2, Conv2_2, Conv3_3, Conv4_3, Conv5_3, and Pool5; each side channel consists of 2 convolutional layers. From shallow to deep (top to bottom), the output resolutions of the side channels are 320×320 (side channel 1), 160×160 (side channel 2), 80×80 (side channel 3), 40×40 (side channel 4), 20×20 (side channel 5), and 20×20 (side channel 6). The network structure diagram is a...
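A minimal sketch of the Embodiment 2 backbone with its six side channels, assuming PyTorch and torchvision's VGG-16. The side-channel width of 64 is an illustrative assumption, and Pool5 is replaced by a stride-1 pooling layer (also an assumption) so that the sixth level keeps the 20×20 resolution stated above.

```python
# Minimal sketch of the Embodiment 2 encoder: VGG-16 Conv1_1-Pool5 plus 6 side channels.
import torch
import torch.nn as nn
from torchvision.models import vgg16

vgg = vgg16().features                                       # Conv1_1 ... Pool5
vgg[30] = nn.MaxPool2d(kernel_size=3, stride=1, padding=1)   # assumption: stride-1 Pool5 keeps 20x20

# Indices in torchvision's VGG-16 right after Conv1_2, Conv2_2, Conv3_3, Conv4_3, Conv5_3, Pool5.
level_ends = [3, 8, 15, 22, 29, 30]
level_channels = [64, 128, 256, 512, 512, 512]

def make_side_channel(in_ch, out_ch=64):
    # Each side channel is 2 convolutional layers at the resolution of its level.
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, 3, padding=1), nn.ReLU(inplace=True),
        nn.Conv2d(out_ch, out_ch, 3, padding=1), nn.ReLU(inplace=True),
    )

side_channels = nn.ModuleList(make_side_channel(c) for c in level_channels)

def hierarchical_features(x):
    feats, start = [], 0
    for end, side in zip(level_ends, side_channels):
        for layer in vgg[start:end + 1]:
            x = layer(x)
        feats.append(side(x))
        start = end + 1
    return feats  # resolutions 320, 160, 80, 40, 20, 20 for a 320x320 input

feats = hierarchical_features(torch.randn(1, 3, 320, 320))
```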



Abstract

The invention discloses a Siamese-network-based RGBD salient object detection method in the technical field of image processing and computer vision. The steps include: 1. Obtain the RGB image and the depth image of the picture to be detected; 2. Input the RGB image and the depth image into the "Siamese network-decoder" neural network and output the RGBD saliency detection result, where the "Siamese network-decoder" neural network, comprising the Siamese network and the decoder, is jointly pre-trained. Step 2 specifically includes: the RGB image and the depth image are input into the Siamese network, which outputs the RGB and depth hierarchical features from the Siamese network's side channels; the RGB and depth hierarchical features are input into the decoder, which outputs the RGBD saliency detection result. The present invention adopts a Siamese network combined with a decoder network structure with a fusion function, performing feature fusion on the hierarchical features and then decoding, so that RGB information and depth information complement each other, detection performance is improved, and refined RGBD detection results are obtained.
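To make the "fuse hierarchical features, then decode" idea of the abstract concrete, here is a minimal sketch assuming PyTorch; the concatenation-based fusion, the per-level channel width, and the top-down summation decoder are illustrative assumptions rather than the patented decoder itself.

```python
# Minimal sketch of a decoder with a fusion function: fuse the RGB and depth features
# of each level, then decode top-down into a saliency map (illustrative assumptions).
import torch
import torch.nn as nn
import torch.nn.functional as F

class FusionDecoder(nn.Module):
    def __init__(self, channels=(64, 64, 64, 64, 64, 64)):
        super().__init__()
        # One fusion convolution per level: concatenate the RGB and depth features, then merge.
        self.fuse = nn.ModuleList(nn.Conv2d(2 * c, c, 3, padding=1) for c in channels)
        self.out = nn.Conv2d(channels[0], 1, 1)

    def forward(self, rgb_feats, depth_feats):
        # Decode top-down: start from the deepest level and progressively add shallower,
        # higher-resolution fused features until the shallowest level is reached.
        x = None
        for fr, fd, fuse in zip(rgb_feats[::-1], depth_feats[::-1], list(self.fuse)[::-1]):
            merged = fuse(torch.cat([fr, fd], dim=1))
            if x is not None:
                merged = merged + F.interpolate(x, size=merged.shape[-2:],
                                                mode="bilinear", align_corners=False)
            x = merged
        return torch.sigmoid(self.out(x))  # saliency map at the resolution of the shallowest level
```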

Description

technical field
[0001] The invention relates to the technical fields of image processing and computer vision, and in particular to an RGBD salient object detection method based on a Siamese network.
Background technique
[0002] Salient object detection aims to automatically detect the regions or objects in an image or scene that human eyes focus on; the detection result is called a saliency map and can be used in various computer vision applications such as object detection and recognition, image compression, image retrieval, and content-based image editing. Although there are many salient object detection models and algorithms for RGB input (i.e., a single RGB color image), salient object detection methods for RGBD input (i.e., a single RGB color image together with its corresponding scene depth (Depth) map) remain relatively lacking. With the increasing popularity of depth cameras such as Microsoft Kinect, Intel RealSense, and mobile phone depth cameras, the...

Claims


Application Information

Patent Type & Authority: Patent (China)
IPC(8): G06K9/62; G06N3/04; G06N3/08
CPC: G06N3/084; G06N3/08; G06N3/045; G06F18/253
Inventor: 傅可人, 范登平, 赵启军
Owner: SICHUAN UNIV