Video recommendation method based on multi-modal video content and multi-task learning

A multi-task learning and video content technology, applied to video recommendation based on multi-modal video content and multi-task learning. It addresses problems such as reliance on metadata, cold start, and inaccurate video metadata, and reduces the scale of the parameters.

Inactive Publication Date: 2020-06-05

SOUTH CHINA UNIV OF TECH +1

9 Cites, 17 Cited by


Problems solved by technology

[0003] At present, short video recommendation technology faces two important challenges. (1) Most current recommendation algorithms make recommendations based on user preferences and user behavior and ignore the content of the items. They also suffer from a serious cold start problem, which leads to most videos being ignored. Even traditional content-based recommendation methods do not perform well, because they rely on metadata rather than the original video content.

However, the metadata of micro-videos is uploaded by users and may describe the videos inaccurately. How to make effective use of the multi-modal information of videos has therefore become an important challenge for video recommendation.

(2) A single-task recommendation model cannot meet the current demand for multiple tasks. In video recommendation, it is necessary to predict not only whether the user will watch a video, but also the user's rating of the video, whether the user will like it, whether the user will forward it, and so on.



Embodiment

[0042] Figure 1 shows the flow chart of the video recommendation method based on multi-modal video content and multi-task learning disclosed in the present invention. The method comprises the following steps:

[0043] T1, video multi-modal feature extraction:

[0044] a. Video frame extraction: frame pictures are captured with the OpenCV video reading class cv2.VideoCapture and saved in the path folder, with frame numbers starting from 0. Because short videos are brief and information-dense, every frame of the video is captured, without skipping frames.
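The frame-capture step can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name `extract_frames`, the `.jpg` file naming, and the `capture` and `save` parameters are assumptions introduced here so that the loop can be run without OpenCV installed. By default the sketch falls back to `cv2.VideoCapture` and `cv2.imwrite`.

```python
import os


def extract_frames(video_path, out_dir, capture=None, save=None):
    """Save every frame as <out_dir>/<index>.jpg, numbering from 0.

    `capture` and `save` are hypothetical injection points (not in the
    source text); when omitted, OpenCV's VideoCapture and imwrite are used.
    """
    if capture is None or save is None:
        import cv2  # imported lazily; only the default backend needs it
        capture = capture or cv2.VideoCapture(video_path)
        save = save or cv2.imwrite

    os.makedirs(out_dir, exist_ok=True)
    index = 0
    try:
        while True:
            ok, frame = capture.read()  # ok is False once the stream ends
            if not ok:
                break
            save(os.path.join(out_dir, f"{index}.jpg"), frame)
            index += 1  # no frame skipping: every decoded frame is written
    finally:
        capture.release()
    return index  # number of frames written
```

With OpenCV installed, `extract_frames("clip.mp4", "frames")` would write `frames/0.jpg`, `frames/1.jpg`, and so on; both paths are placeholders.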

[0045] b. Video static feature extraction: each frame of the video is resized to [299, 299] and then input into the pre-trained Inception-V3 network, as shown in Figure 1. The input is mapped to a 2048-dimensional feature vector, which serves as the static original feature vector of the video frame. In order to preserve the information of each frame of the vide...
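One way to obtain a 2048-dimensional vector per frame from Inception-V3 is sketched below. It assumes PyTorch and torchvision, with torchvision's ImageNet weights standing in for the pre-trained checkpoint, which the text does not name. The helper names `build_extractor` and `frame_features` are hypothetical, and the ImageNet normalization constants are an assumption tied to those weights.

```python
INPUT_SIZE = (299, 299)  # frame size stated in the text
FEATURE_DIM = 2048       # length of the pooled Inception-V3 feature vector


def build_extractor():
    """Return (model, preprocess) producing pooled Inception-V3 features.

    Assumption: torchvision ImageNet weights replace the unspecified
    pre-trained model used in the patent.
    """
    import torch
    from torchvision import models, transforms

    model = models.inception_v3(weights=models.Inception_V3_Weights.DEFAULT)
    model.fc = torch.nn.Identity()  # drop the classifier, keep the 2048-d output
    model.eval()                    # inference mode: no auxiliary output
    preprocess = transforms.Compose([
        transforms.Resize(INPUT_SIZE),
        transforms.ToTensor(),
        transforms.Normalize(mean=[0.485, 0.456, 0.406],
                             std=[0.229, 0.224, 0.225]),
    ])
    return model, preprocess


def frame_features(pil_frames, model, preprocess):
    """Map a list of PIL RGB frames to a (num_frames, 2048) tensor."""
    import torch

    batch = torch.stack([preprocess(frame) for frame in pil_frames])
    with torch.no_grad():
        return model(batch)
```

Replacing the final fully connected layer with an identity is one common way to expose the pooled features; the patent may instead read them from an intermediate layer.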


Abstract

The invention discloses a video recommendation method based on multi-modal video content and multi-task learning. The method comprises the following steps: extracting the visual, audio and text features of a short video with pre-trained models; fusing the multi-modal features of the video with an attention mechanism; learning a feature representation of the user's social relationships with the DeepWalk method; proposing a deep neural network model based on an attention mechanism to learn multi-domain feature representations; and embedding the features generated by the above steps into a shared layer of a multi-task model, with the prediction results generated by a multi-layer perceptron. Because the attention mechanism is combined with the user features when fusing the video's multi-modal features, the recommendation as a whole is richer and more personalized. In addition, given the multi-domain features and the importance of interaction features in recommendation learning, the attention-based deep neural network model enriches the learning of high-order features and provides users with more accurate personalized video recommendations.
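The user-conditioned fusion described in the abstract can be illustrated with a small pure-Python sketch. The scoring function is not given in this excerpt, so scaled dot-product attention is used here purely as an assumption. The function names are hypothetical, and all vectors are assumed to have already been projected to the same dimension.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_modalities(user_vec, modality_vecs):
    """Fuse modality vectors using attention weights conditioned on the user.

    user_vec: list of floats (user feature vector).
    modality_vecs: dict such as {"visual": [...], "audio": [...], "text": [...]},
        each vector having the same length as user_vec (an assumption).
    Returns (fused_vector, weights_by_modality).
    """
    dim = len(user_vec)
    names = list(modality_vecs)
    # Assumed scoring: scaled dot product between user and modality vectors.
    scores = [
        sum(u * x for u, x in zip(user_vec, modality_vecs[name])) / math.sqrt(dim)
        for name in names
    ]
    weights = softmax(scores)
    # Weighted sum of the modality vectors, one component at a time.
    fused = [
        sum(w * modality_vecs[name][i] for w, name in zip(weights, names))
        for i in range(dim)
    ]
    return fused, dict(zip(names, weights))
```

In this sketch, a modality whose vector aligns more closely with the user vector receives a larger weight, so the fused video representation differs from user to user.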

Description

Technical field

[0001] The invention relates to the technical field of network video and recommendation systems, in particular to a video recommendation method based on multi-modal video content and multi-task learning.

Background

[0002] With the rapid spread of smart mobile terminals and the development of multimedia technology, video has gradually become a carrier of information dissemination. In recent years, short videos have risen rapidly. Video has become one of people's main forms of entertainment, and users' interests have become broader. The rapid increase in the number of short videos has caused a serious information overload problem. How to find the videos that interest users in massive amounts of data has become a hot research topic. A good recommendation system can not only help consumers find interesting or even potentially interesting videos faster and more conveniently, but also help content providers increase profits an...


Application Information


IPC(8): H04N21/25, H04N21/466, G06F16/783, G06N3/04, G06N3/08

CPC: H04N21/251, H04N21/4666, H04N21/4668, G06F16/7844, G06F16/7834, G06F16/783, G06N3/084, G06N3/045

Inventor: 史景伦, 邓丽, 梁可弘, 傅钎栓, 林阳城

Owner SOUTH CHINA UNIV OF TECH
