Systems and Methods for Semantically Classifying and Extracting Shots in Video

A semantic video classification technology, applied to the semantic classification of video shots or sequences. It addresses three shortcomings of conventional systems: the inability to identify and classify the wide array of content present in most videos, the lack of a defined mechanism for handling multiple scene types across a video, and the inability to classify videos or portions of videos.

Active Publication Date: 2014-10-30

TIVO SOLUTIONS INC

3 Cites, 8 Cited by

AI Technical Summary

Benefits of technology

The patent describes a system for classifying videos based on their content. The system includes a processor that receives a video file and extracts a subset of its frames. The processor then runs a combination of software modules to analyze the frames and determine their content: an intensity classification module, an indoor/outdoor classification module, an outdoor classification module, a segmentation module, a material arrangement module, and a video file classification module. The system can also determine whether a frame is a dark frame, meaning that it was shot in low or no light. The video file is then associated with specific content categories based on the determined content. The system can be used for video indexing and retrieval.
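The dark-frame check mentioned above can be illustrated with a minimal sketch. It assumes frames are available as rows of 8-bit grayscale values and that a frame counts as dark when its mean intensity falls below a fixed threshold. The function names and the threshold of 40 are hypothetical choices for illustration; the summary does not state how the intensity classification module actually decides.

```python
from typing import Sequence

# A frame is modeled as rows of 8-bit grayscale pixel values (0-255).
Frame = Sequence[Sequence[int]]


def mean_intensity(frame: Frame) -> float:
    """Return the average pixel value over the whole frame."""
    total = sum(sum(row) for row in frame)
    count = sum(len(row) for row in frame)
    if count == 0:
        raise ValueError("frame has no pixels")
    return total / count


def is_dark_frame(frame: Frame, threshold: float = 40.0) -> bool:
    """Flag a frame as dark when its mean intensity is below the threshold.

    The default threshold is an illustrative assumption, not a value
    taken from the patent.
    """
    return mean_intensity(frame) < threshold


# Two tiny hypothetical frames: one shot at night, one in daylight.
night = [[5, 10, 8], [12, 7, 9]]
day = [[180, 200, 190], [170, 210, 220]]

print(is_dark_frame(night), is_dark_frame(day))  # prints: True False
```

A frame flagged this way could be dropped from the extracted subset before any further classification, which keeps poor candidates from influencing the final result.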

Problems solved by technology

Regardless of the specific approach, conventional image classification systems are ill-equipped to classify videos or portions of videos.

Additionally, the features used in single-image classification systems are often designed for narrow and particular purposes, and are unable to identify and classify the wide array of content present in most videos.

Further, even if conventional systems were able to classify images from a video, these systems include no defined mechanism to account for the presence of a multitude of scene types across a video or portion of video (i.e. identification or classification of a single image or frame in a video does not necessarily indicate that the entire shot within the video from which the frame was extracted corresponds to the identified image class).

In addition to those mentioned, classification of video, or shots within video, presents further challenges because of the variations and quality of images present in most videos.

During close-up shots, the camera is typically focused on the subject of interest, often leaving the background blurred and making whatever part of the scene is visible hard to classify.

Most videos also include shots in which either the camera or objects within the scene are moving, again causing blurring of the images within the shot.

Additionally, scene content in videos often varies immensely in appearance, resulting in difficulty in identification of such content.

Thus, because video often represents wide varieties of content and subjects, even within a particular content type, identification of that content is exceedingly difficult.

Further, the use of raw or basic features, which is sufficient for some conventional image classification systems, is insufficient for a video classification system because videos typically include a multiplicity of image types.

Additionally, the mere detection or identification of a color or type of material in a scene does not necessarily enable classification of the scene.

The Vicar system, however, has many drawbacks that produce inconsistent results.

Further, the key frames are partitioned based on a predetermined grid, such that resulting grid cells may (and often do) contain more than one category, thus leading to confusion of scene types.
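The grid problem described above can be made concrete with a small sketch. It assumes a hypothetical per-pixel label map for one key frame and splits it into a fixed grid, then reports which labels fall into each cell. The function name, the labels, and the 2 x 2 grid are illustrative assumptions and are not taken from the system being criticized.

```python
from typing import List, Set


def labels_per_cell(labels: List[List[str]], rows: int, cols: int) -> List[Set[str]]:
    """Split a per-pixel label map into rows x cols cells.

    Returns the set of distinct labels found in each cell, in
    row-major order. A set with more than one label marks a cell
    that mixes categories.
    """
    height, width = len(labels), len(labels[0])
    cells: List[Set[str]] = []
    for r in range(rows):
        for c in range(cols):
            y0, y1 = r * height // rows, (r + 1) * height // rows
            x0, x1 = c * width // cols, (c + 1) * width // cols
            cells.append({labels[y][x] for y in range(y0, y1) for x in range(x0, x1)})
    return cells


# Hypothetical 4 x 4 frame whose sky/grass boundary does not follow the grid.
frame_labels = [
    ["sky", "sky", "sky", "sky"],
    ["sky", "sky", "sky", "sky"],
    ["sky", "sky", "grass", "grass"],
    ["grass", "grass", "grass", "grass"],
]

cells = labels_per_cell(frame_labels, rows=2, cols=2)
mixed = [cell for cell in cells if len(cell) > 1]
print(len(mixed))  # prints: 1
```

Here the lower-left cell contains both sky and grass, so any single label assigned to that cell is wrong for part of it.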

Also, the color and texture features used in the system are relatively weak features, which are inadequate for classifying many categories of images.

Additionally, the inference that a key frame or frames adequately and accurately represents an entire sequence of frames does not take into account variations in shots, especially for long or extended shots in videos.

Embodiment Construction

[0079] For the purpose of promoting an understanding of the principles of the present disclosure, reference will now be made to the embodiments illustrated in the drawings and specific language will be used to describe the same. It will, nevertheless, be understood that no limitation of the scope of the disclosure is thereby intended; any alterations and further modifications of the described or illustrated embodiments, and any further applications of the principles of the disclosure as illustrated therein are contemplated as would normally occur to one skilled in the art to which the disclosure relates.

Overview

[0080] Aspects of the present disclosure generally relate to systems and methods for semantically classifying shots of video based on video content. Generally, embodiments of the present system analyze video files and associate predefined textual descriptors to the video files. The textual descriptors relate to predefined scene classes or categories describing content in the fi...

Abstract

The present disclosure relates to systems and methods for classifying videos based on video content. For a given video file including a plurality of frames, a subset of frames is extracted for processing. Frames that are too dark, blurry, or otherwise poor classification candidates are discarded from the subset. Generally, material classification scores that describe the type of material content likely included in each frame are calculated for the remaining frames in the subset. The material classification scores are used to generate material arrangement vectors that represent the spatial arrangement of material content in each frame. The material arrangement vectors are subsequently classified to generate a scene classification score vector for each frame. The scene classification results are averaged (or otherwise processed) across all frames in the subset to associate the video file with one or more predefined scene categories related to overall types of scene content of the video file.
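Two steps of the abstract lend themselves to a short sketch: building a material arrangement vector for one frame, and averaging per-frame scene scores to label the whole video file. The version below makes several assumptions that are not in the abstract: material scores are given per grid cell, the arrangement vector is formed by averaging each grid row and concatenating the results top to bottom, and a category is assigned when its averaged score reaches 0.5. The category names, function names, and threshold are hypothetical, and the classifier that maps arrangement vectors to scene scores is left out.

```python
from typing import List, Sequence

# Hypothetical scene categories, chosen only for illustration.
SCENE_CATEGORIES = ["indoor", "coast", "forest"]


def arrangement_vector(material_scores: Sequence[Sequence[Sequence[float]]]) -> List[float]:
    """Build a spatial descriptor from per-cell material scores.

    material_scores[r][c] holds one score per material for grid cell (r, c).
    Assumption: scores are averaged along each grid row and the row
    averages are concatenated top to bottom, so the vector records
    which materials appear at which height in the frame.
    """
    vector: List[float] = []
    for row in material_scores:
        n_materials = len(row[0])
        for m in range(n_materials):
            vector.append(sum(cell[m] for cell in row) / len(row))
    return vector


def average_scores(frame_scores: Sequence[Sequence[float]]) -> List[float]:
    """Average scene classification score vectors across frames."""
    if not frame_scores:
        raise ValueError("no frames left after filtering")
    n_frames = len(frame_scores)
    return [sum(column) / n_frames for column in zip(*frame_scores)]


def assign_categories(frame_scores: Sequence[Sequence[float]], threshold: float = 0.5) -> List[str]:
    """Return every category whose averaged score reaches the threshold."""
    mean = average_scores(frame_scores)
    return [name for name, score in zip(SCENE_CATEGORIES, mean) if score >= threshold]


# One frame as a 2 x 2 grid with scores for two materials (sky, water).
cell_scores = [
    [[0.9, 0.1], [0.7, 0.1]],  # top row: mostly sky
    [[0.2, 0.8], [0.0, 0.6]],  # bottom row: mostly water
]
print(arrangement_vector(cell_scores))

# Scene score vectors for three frames of the same video file.
scores = [
    [0.1, 0.8, 0.3],
    [0.2, 0.6, 0.7],
    [0.0, 0.7, 0.2],
]
print(assign_categories(scores))  # prints: ['coast']
```

Note that the second frame on its own scores "forest" above the threshold, yet the averaged result for the file is only "coast". This is the reason for aggregating across frames rather than trusting any single frame.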

Description

CROSS REFERENCE TO RELATED APPLICATION

[0001] This application claims the benefit under 35 U.S.C. §119(e) of U.S. Provisional Patent Application No. 61/029,042, filed Feb. 15, 2008, and entitled “Scene Classification on Video Data with a Material Modeling Step.” This application claims the benefit under 35 U.S.C. §121 of U.S. patent application Ser. No. 12/372,561, filed Feb. 17, 2009, and entitled “Systems and methods for semantically classifying shots in video.” Each application of which is incorporated herein by reference as if set forth herein in its entirety.

TECHNICAL FIELD

[0002] The present systems and methods relate generally to classification of video data, files, or streams, and more particularly to semantic classification of shots or sequences in videos based on video content for purposes of content-based video indexing and retrieval, as well as optimizing efficiency of further video analysis.

BACKGROUND

[0003] Image classification systems (i.e. systems in which the content of a...

Application Information

Patent Type & Authority: Applications (United States)

IPC(8): G06K9/62, G06K9/34

CPC: G06V20/35, G06V20/38, G06V20/41, G06V20/46, G06V10/993, G06V10/464

Inventors: DUNLOP, HEATHER; BERRY, MATTHEW

Owner: TIVO SOLUTIONS INC
