Network live video feature extraction method in complex scene based on joint attention ResNeSt

A feature extraction technique for network live video, applied in image communication, selective content distribution, electrical components, etc. It addresses the difficulty of effectively learning spatio-temporal context information, which limits accuracy. Its intended effects are to save computing resources, improve the extraction of effective features, and produce more discriminative features.

Active Publication Date: 2021-04-13
BEIJING UNIV OF TECH

AI Technical Summary

Problems solved by technology

When the video scene is relatively uniform, the number of people is small, and object edges are clear, existing deep learning networks perform well. However, in complex live video scenes, where scene types are varied, the number of people is uncertain, and lighting conditions are constrained, directly applying such deep networks makes it difficult to learn spatio-temporal context information effectively, which limits accuracy.


Embodiment Construction

[0030] The following is a specific implementation of the above description; the protection scope of this patent is not limited to this implementation. The concrete workflow of the present invention is as follows:

[0031] The video data used in the present invention comes from multiple network video platforms, and key frames are extracted from the downloaded live videos. In the experiments, key frames were sampled at 5 fps, each video was represented by a single segment of 16 consecutive frames, and the frames were resized to 224×224 pixels during preprocessing. The video frame data is fed into the feature pyramid for down-sampling to obtain feature maps at different scales; the joint attention mechanism then computes the attention weight distribution over the multi-scale features; finally, the ResNeSt module is built by combining these with convolution and pooling operations, and the ResNeSt50 fe...
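The sampling and resizing step in [0031] can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function names `sample_clip_indices` and `resize_nearest`, the 30 fps source rate, the 640×360 source size, and the choice of nearest-neighbour interpolation are all assumptions, since the text only gives the 5 fps rate, the 16-frame clip length, and the 224×224 target size.

```python
def sample_clip_indices(native_fps, total_frames, target_fps=5.0, clip_len=16, start=0):
    """Return source-frame indices for one clip of clip_len frames sampled at target_fps."""
    if native_fps <= 0 or target_fps <= 0:
        raise ValueError("frame rates must be positive")
    # Number of source frames between two consecutive sampled frames.
    step = native_fps / target_fps
    indices = [start + int(i * step + 0.5) for i in range(clip_len)]
    if indices[-1] >= total_frames:
        raise ValueError("video too short for the requested clip")
    return indices


def resize_nearest(frame, out_h=224, out_w=224):
    """Nearest-neighbour resize of a frame stored as a list of pixel rows (assumed interpolation)."""
    in_h, in_w = len(frame), len(frame[0])
    rows = [min(in_h - 1, (y * in_h) // out_h) for y in range(out_h)]
    cols = [min(in_w - 1, (x * in_w) // out_w) for x in range(out_w)]
    return [[frame[r][c] for c in cols] for r in rows]


# Hypothetical 30 fps video with 300 frames: every 6th frame is kept.
indices = sample_clip_indices(native_fps=30, total_frames=300)

# Hypothetical 640x360 frame whose pixels record their own (row, column) position.
frame = [[(y, x) for x in range(640)] for y in range(360)]
resized = resize_nearest(frame)
```

In practice a video library would decode the selected frames and perform the resize; the index arithmetic above only shows how 16 frames at 5 fps map back onto the source video.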


Abstract

The invention relates to a feature extraction method for network live video in complex scenes, based on joint attention ResNeSt. First, key frames are extracted from a network live video to obtain the key frame data of the video. To exploit the multi-scale features of the video frames, a parallel path is designed following the multi-scale structure of a feature pyramid network. The parallel path is built bottom-up and exchanges information with the original main path through lateral and oblique connections, both of which are convolution operations. Because live broadcast frames mostly show a human subject mixed with a large amount of redundant information, spatial-channel joint attention is introduced so that the features of the main subject can be focused on. Finally, a ResNeSt feature extraction module is constructed by combining the parallel feature pyramid, fused with the joint attention, with a convolution layer and a pooling layer, and feature extraction for network live video in complex scenes is achieved by stacking multiple layers of this module.
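To make the idea of spatial-channel joint attention concrete, here is a simplified, parameter-free sketch. It is an assumption-laden illustration rather than the patented mechanism: this excerpt does not give the attention formulas, so the hypothetical `joint_attention` function below simply derives one weight per channel from global average pooling and one weight per spatial location from the mean across channels, then multiplies both into the feature map. A trained module would normally use learned convolution or fully connected layers in place of these fixed averages.

```python
import math


def sigmoid(v):
    """Logistic function mapping any real value into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-v))


def joint_attention(fmap):
    """Reweight a C x H x W feature map (nested lists) by channel and spatial attention."""
    n_ch, h, w = len(fmap), len(fmap[0]), len(fmap[0][0])
    # Channel attention: one weight per channel from its global spatial average.
    channel_w = [sigmoid(sum(sum(row) for row in ch) / (h * w)) for ch in fmap]
    # Spatial attention: one weight per location from the mean across channels.
    spatial_w = [
        [sigmoid(sum(fmap[c][y][x] for c in range(n_ch)) / n_ch) for x in range(w)]
        for y in range(h)
    ]
    # Joint attention: apply both weights to every element.
    return [
        [[fmap[c][y][x] * channel_w[c] * spatial_w[y][x] for x in range(w)] for y in range(h)]
        for c in range(n_ch)
    ]


# Hypothetical 2-channel, 2x2 feature map with a single strong activation.
fmap = [
    [[0.0, 0.0], [0.0, 0.0]],
    [[4.0, 0.0], [0.0, 0.0]],
]
out = joint_attention(fmap)
```

The strong activation keeps the largest share of its value because both its channel weight and its spatial weight are above 0.5, which is the "focus on the main subject" behaviour the abstract describes.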

Description

Technical field

[0001] The present invention takes network live video in complex scenes as its research object and extracts features from live video using joint attention and the ResNeSt network, thereby forming an efficient feature representation of live video. First, a parallel feature pyramid performs feature convolution on the key frames of the video; during the convolution process of the feature pyramid, a joint attention mechanism is introduced to obtain the low-level visual information and high-level semantic information of the video; finally, this is combined with the split-attention residual network (Residual Networks with Split-Attention, ResNeSt) to form an efficient feature representation for live webcast videos.

Background technique

[0002] With the advent of the Internet self-media era, more and more people have begun to share their lives on the Internet in the form of live videos, and the number of live videos on the Internet is increasing geometrically. Webcasting has a strong abil...


Application Information

IPC(8): H04N21/2187, H04N21/234, H04N21/44
CPC: H04N21/2187, H04N21/23418, H04N21/44008
Inventor: 张菁, 康俊鹏, 张广朋, 卓力
Owner: BEIJING UNIV OF TECH