Summarization of audio and/or visual data

A technology for summarizing video and audio data, applied in the fields of electric digital data processing, special data processing applications, and digital data information retrieval. It can address problems such as identity databases that are expensive to create and maintain, slow access to those databases, and the inability to find a name or role for a detected person.

Inactive Publication Date: 2008-03-05
KONINKLIJKE PHILIPS ELECTRONICS NV
AI Technical Summary

Problems solved by technology

A recognition-based system may not be able to find a name or role for each face or voice pattern.
Creating and maintaining an identity database for general video (such as TV content and home movies) is a very expensive and difficult task.
Furthermore, such databases are necessarily large, which results in slow access during the identification phase.

Embodiment Construction

[0040] One embodiment of the invention is described for a video summarization system that locates segments in video content that represent the main (leading) actors and characters. Elements of this embodiment are schematically depicted in FIGS. 1 and 2. However, object detection is not limited to face detection: any type of object can be detected, such as speech, sound, a car, a phone, or a cartoon character, and the summarization can be based on these objects.

[0041] In a first phase I, the input phase, a set of video data is input 10. The set of video data may be a stream of video frames from a movie. A given frame 1 of the video stream may be analyzed by a face detector D. The face detector can locate an object 2 in the frame, which in this case is a face. The face detector provides the located faces to the facial feature extractor E for extraction of type features 3. The type features are illustrated here by a vector quantization histogram known in the prior art (see "Face Recognition Using...
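The vector quantization histogram mentioned above can be illustrated with a minimal sketch. It assumes that each located face has already been cut into small patches represented as numeric vectors and that a codebook of representative vectors is available; the function name `vq_histogram`, the toy codebook, and the patch values are hypothetical and are not taken from the patent or the cited prior art.

```python
from math import dist

def vq_histogram(patches, codebook):
    """Count how often each codeword is the nearest one to a patch,
    then normalize the counts so they sum to 1."""
    counts = [0] * len(codebook)
    for patch in patches:
        # Index of the codeword closest to this patch (Euclidean distance).
        nearest = min(range(len(codebook)),
                      key=lambda i: dist(patch, codebook[i]))
        counts[nearest] += 1
    total = sum(counts)
    if total == 0:
        return [0.0] * len(counts)
    return [count / total for count in counts]

# Hypothetical 2-D codebook and patches, for illustration only.
codebook = [(0.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
patches = [(0.1, 0.0), (0.9, 1.1), (0.8, 0.9), (0.1, 0.9)]
histogram = vq_histogram(patches, codebook)
```

The resulting histogram is a fixed-length vector regardless of how many patches a face contains, which is what makes it convenient to compare faces from different frames.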

Abstract

Summarization of audio and/or visual data based on clustering of object type features is disclosed. Summaries of video, audio and/or audiovisual data may be provided without any knowledge of the true identity of the objects that are present in the data. In one embodiment of the invention, video summaries of movies are provided. The summarization comprises the steps of inputting audio and/or visual data, locating an object in a frame of the data (such as the face of an actor), and extracting type features of the located object in the frame. The extraction of type features is done for a plurality of frames, and similar type features are grouped together in individual clusters, each cluster being linked to an identity of the object. After the video content has been processed, the largest clusters correspond to the most important persons in the video.
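The grouping step described in the abstract can be sketched as follows. The abstract does not name a clustering algorithm, so this sketch assumes a simple greedy scheme: each type-feature vector joins the nearest existing cluster if it lies within a distance threshold, and otherwise starts a new cluster. The function name `cluster_features`, the threshold value, and the sample vectors are all hypothetical.

```python
from math import dist

def cluster_features(features, threshold):
    """Greedily cluster (frame_index, vector) pairs and return the
    clusters sorted from largest to smallest."""
    clusters = []  # each cluster: {"centroid": [...], "frames": [...]}
    for frame_index, vector in features:
        best, best_distance = None, threshold
        for cluster in clusters:
            d = dist(vector, cluster["centroid"])
            if d <= best_distance:
                best, best_distance = cluster, d
        if best is None:
            # No cluster is close enough: treat this as a new identity.
            clusters.append({"centroid": list(vector),
                             "frames": [frame_index]})
        else:
            # Update the running mean of the cluster and record the frame.
            n = len(best["frames"])
            best["centroid"] = [(c * n + v) / (n + 1)
                                for c, v in zip(best["centroid"], vector)]
            best["frames"].append(frame_index)
    return sorted(clusters, key=lambda c: len(c["frames"]), reverse=True)

# Hypothetical type features for five frames (assumed values).
features = [
    (0, (0.25, 0.5, 0.25)),
    (1, (0.9, 0.05, 0.05)),
    (2, (0.3, 0.45, 0.25)),
    (3, (0.2, 0.55, 0.25)),
    (4, (0.85, 0.1, 0.05)),
]
ranked = cluster_features(features, threshold=0.2)
main_frames = ranked[0]["frames"]  # frames showing the most frequent object
```

No identity database is consulted at any point: the largest cluster is taken as the most important person simply because its features recur most often, which matches the idea that summaries can be built without knowing who the objects are.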

Description

Technical field

[0001] The present invention relates to summarization of audio and/or video data, and in particular to summarization of audio and/or video data based on grouping of type features of objects present in the audio and/or video data.

Background technique

[0002] Automatic summarization of audio and/or video data aims to represent audio and/or video data efficiently for easier browsing, searching and, more generally, content management. Automatically generated summaries can support users in searching and navigating large data collections, for example in order to make more efficient decisions when acquiring, moving, or deleting content.

[0003] For example, the automatic generation of video previews and video summaries requires locating the video segments that contain the main actors or characters. Current systems use facial and voice recognition technology to identify people appearing in a video.

[0004] Patent Publication No. US2003/0123712 discloses a m...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/30, G10L25/57, G10L25/78, G10L25/84
CPC: G06K9/00711, G06F17/30843, G06F17/30793, G06F17/30796, G06F16/7844, G06F16/739, G06F16/784, G06V20/47, G06V20/41, G06F18/00
Inventor: M. Barbieri, N. Dimitrova, L. Agnihotri
Owner: KONINKLIJKE PHILIPS ELECTRONICS NV