The invention discloses a learning-based video coding and decoding framework, which comprises a space-time domain reconstruction memory, a space-time domain prediction network, an iterative analyzer, an iterative synthesizer, a binarizer, an entropy coder, and an entropy decoder. The space-time domain reconstruction memory is used for storing the reconstructed video content after coding and decoding. The space-time domain prediction network is used for exploiting the space-time domain correlation of the reconstructed video content, modeling the reconstructed video content through a convolutional neural network and a recurrent neural network, and outputting a predicted value for the current coding block, wherein a residual is formed by subtracting the predicted value from the original value. The iterative analyzer and the iterative synthesizer are used for coding and decoding the input residual step by step. The binarizer is used for quantizing the output of the iterative analyzer into a binary representation. The entropy coder is used for entropy-coding the quantized output to obtain an output code stream, and the entropy decoder is used for entropy-decoding the output code stream and feeding the result to the iterative synthesizer. In this framework, space-time domain prediction is realized through the learning-based VoxelCNN (namely, the space-time domain prediction network), and rate-distortion control of the video coding is realized through an iterative residual coding method.
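The prediction and residual stages described above can be sketched in a few lines. This is a minimal illustration, not the patent's actual method: the VoxelCNN is replaced by a hypothetical stand-in that simply copies the co-located block from the reconstruction memory, and all function names are invented for this example.

```python
import numpy as np

def predict_block(reconstruction_memory):
    # Hypothetical stand-in for the VoxelCNN prediction network:
    # "predict" the co-located block from previously reconstructed
    # content, purely for illustration.
    return reconstruction_memory.copy()

def form_residual(original_block, predicted_block):
    # Residual = original value minus predicted value.
    return original_block - predicted_block

def reconstruct(predicted_block, decoded_residual):
    # Decoder side: add the decoded residual back onto the prediction.
    return predicted_block + decoded_residual
```

If the residual were coded losslessly, the reconstruction would match the original block exactly; in practice the iterative analyzer/synthesizer introduces controlled quantization error.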
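The step-by-step residual coding with a binary representation can likewise be sketched. The following is one simple way such an iterative scheme could behave, assuming a sign-based binarizer and a step size that halves each iteration; the refinement rule and all names are illustrative assumptions, not the patent's actual analyzer and synthesizer networks. The key property it demonstrates is the rate-distortion control: running more iterations spends more bits and yields a smaller residual error.

```python
import numpy as np

def iterative_residual_code(residual, num_steps=4, scale=0.5):
    # Illustrative iterative residual coder: at each step the sign of the
    # remaining residual is binarized to {-1, +1} (the "binary
    # representation"), a scaled update is added to the reconstruction,
    # and the step size is halved so later steps refine earlier ones.
    remaining = residual.astype(float)
    reconstruction = np.zeros_like(remaining)
    bits = []
    step_scale = scale * np.max(np.abs(residual))
    for _ in range(num_steps):
        b = np.where(remaining >= 0, 1.0, -1.0)  # binarized code for this step
        bits.append(b)
        update = b * step_scale
        reconstruction += update
        remaining -= update
        step_scale /= 2.0  # finer refinement each iteration
    return bits, reconstruction
```

Increasing `num_steps` sends more binary codes through the entropy coder (higher rate) and shrinks the reconstruction error (lower distortion), which is the trade-off the iterative residual coding method exposes.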