The invention discloses an image processing method and device based on an embedded GPU and convolution calculation. The method optimizes the convolution calculation in the SSD algorithm as follows: the input image is transformed into matrix form using memory-optimized convolution expansion, with CUDA processing it in parallel to form an intermediate matrix; meanwhile, the convolution kernel matrix is expanded with its rows and columns aligned and is then partitioned into blocks, which reduces memory overhead at run time; the highly optimized matrix multiplication function of the cuBLAS library in CUDA is then used to accelerate the convolution calculation in parallel; finally, the resulting matrices are combined and output. The method reduces memory overhead, improves the performance of the algorithm, brings the advantages of GPU parallelism into play, shortens the matrix multiplication time, and improves computational efficiency.
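
The data flow described above (expand the image into an intermediate matrix, flatten and partition the kernel matrix, multiply block by block, then combine the outputs) can be illustrated with a minimal CPU sketch in plain Python. It is not the claimed implementation: the function names `im2col`, `matmul`, and `conv2d_gemm` and the `block_rows` parameter are hypothetical, the row and column alignment padding is omitted, a single input channel with stride 1 and no padding is assumed, and the plain `matmul` stands in for the cuBLAS matrix multiplication that would run on the GPU. As is usual in CNNs, the "convolution" here is computed without flipping the kernel.

```python
def im2col(image, kh, kw):
    """Unroll every kh x kw patch of a 2-D image into one column."""
    h, w = len(image), len(image[0])
    out_h, out_w = h - kh + 1, w - kw + 1
    # One row per kernel element, one column per output pixel.
    cols = [[0] * (out_h * out_w) for _ in range(kh * kw)]
    for i in range(out_h):
        for j in range(out_w):
            col = i * out_w + j
            for di in range(kh):
                for dj in range(kw):
                    cols[di * kw + dj][col] = image[i + di][j + dj]
    return cols, out_h, out_w


def matmul(a, b):
    """Plain matrix product; a stand-in for a GPU GEMM call (assumption)."""
    k, n = len(b), len(b[0])
    return [[sum(row[t] * b[t][c] for t in range(k)) for c in range(n)]
            for row in a]


def conv2d_gemm(image, kernels, block_rows=2):
    """Convolve one image with several kernels via blocked matrix products."""
    kh, kw = len(kernels[0]), len(kernels[0][0])
    cols, out_h, out_w = im2col(image, kh, kw)
    # Expand each kernel into one row of the kernel matrix.
    kmat = [[v for row in kern for v in row] for kern in kernels]
    flat = []
    # Partition the kernel matrix into row blocks and multiply each block
    # separately, so only one block needs to be resident at a time.
    for start in range(0, len(kmat), block_rows):
        flat.extend(matmul(kmat[start:start + block_rows], cols))
    # Combine: reshape each output row into an out_h x out_w feature map.
    return [[row[i * out_w:(i + 1) * out_w] for i in range(out_h)]
            for row in flat]


image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
kernels = [
    [[1, 0], [0, 1]],  # sums the main diagonal of each patch
    [[1, 1], [1, 1]],  # sums the whole patch
    [[0, 1], [0, 0]],  # picks the top-right element of each patch
]
feature_maps = conv2d_gemm(image, kernels)
```

With three kernels and `block_rows=2`, the kernel matrix is processed as one block of two rows and one block of one row, and the three resulting feature maps are concatenated in kernel order.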