552 results about "Audio recognition" patented technology

Modular Audio Recognition Framework (MARF) is an open-source research platform and a collection of voice, sound, speech, text and natural language processing (NLP) algorithms written in Java and arranged into a modular and extensible framework that attempts to facilitate addition of new algorithms.

Audio identification system and method

Inactive | US7174293B2 | Facilitate interactive acceptance and processing | Improve accuracy | Speech recognition | Static storage | The Internet | Engineering

A method and system for direct audio capture and identification of the captured audio. A user may then be offered the opportunity to purchase recordings directly over the Internet or similar outlet. The system preferably includes one or more user-carried portable audio capture devices that employ a microphone, analog to digital converter, signal processor, and memory to store samples of ambient audio or audio features calculated from the audio. Users activate their capture devices when they hear a recording that they would like to identify or purchase. Later, the user may connect the capture device to a personal computer to transfer the audio samples or audio feature samples to an Internet site for identification. The Internet site preferably uses automatic pattern recognition techniques to identify the captured samples from a library of recordings offered for sale. The user can then verify that the sample is from the desired recording and place an order online. The pattern recognition process uses features of the audio itself and does not require the presence of artificial codes or watermarks. Audio to be identified can be from any source, including radio and television broadcasts or recordings that are played locally.

Owner:ICEBERG IND

Advertising using extracted context sensitive information and data of interest from voice/audio transmissions and recordings

Inactive | US20080109222A1 | Without irritating consumers | Good service | Speech recognition | Marketing | Pattern recognition | Context sensitivity

A method and apparatus that use voice/audio recognition and analysis technologies to deliver assigned context-sensitive information and data of interest (keywords, phrases, mood, etc.). Context-sensitive information and data of interest can be extracted from any voice/audio transmission or recording (or any transmission or recording that includes voice/audio) for advertising purposes. The invention includes the method for extracting this information using voice/audio recognition and analysis technologies, and it enables advertising based on the extracted context-sensitive information and data of interest.

Owner:LIU EDWARD

Multi-mode audio recognition and auxiliary data encoding and decoding

Active | US20140108020A1 | Improving communication over network | Optimize network | Speech analysis | Data capacity | Feature extraction

Audio signal processing enhances audio watermark embedding and detecting processes. Audio signal processes include audio classification and adapting watermark embedding and detecting based on classification. Advances in audio watermark design include adaptive watermark signal structure data protocols, perceptual models, and insertion methods. Perceptual and robustness evaluation is integrated into audio watermark embedding to optimize audio quality relative to the original signal, and to optimize robustness or data capacity. These methods are applied to audio segments in audio embedder and detector configurations to support real-time operation. Feature extraction and matching are also used to adapt audio watermark embedding and detecting.

Owner:DIGIMARC CORP

Multi-mode audio recognition and auxiliary data encoding and decoding

Active | US20140142958A1 | Improving communication over network | Optimize network | Speech analysis | Data capacity | Feature extraction

Audio signal processing enhances audio watermark embedding and detecting processes. Audio signal processes include audio classification and adapting watermark embedding and detecting based on classification. Advances in audio watermark design include adaptive watermark signal structure data protocols, perceptual models, and insertion methods. Perceptual and robustness evaluation is integrated into audio watermark embedding to optimize audio quality relative to the original signal, and to optimize robustness or data capacity. These methods are applied to audio segments in audio embedder and detector configurations to support real-time operation. Feature extraction and matching are also used to adapt audio watermark embedding and detecting.

Owner:DIGIMARC CORP

Rolling audio recognition

Active | US20110173208A1 | Fast construction | Digital data information retrieval | Digital data processing details | Computer hardware | Audio recognition

An audio fingerprint is generated by transforming an audio sample of a recording to a time-frequency domain and storing each time-frequency pair in a matrix array, detecting a plurality of local maxima for a predetermined number of time slices, selecting a predetermined number of largest-magnitude maxima from the plurality of local maxima detected by said detecting, and generating one or more hash values corresponding to the predetermined number of largest-magnitude maxima.

Owner:ROVI TECH CORP
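
The fingerprinting steps in the abstract above (transform to a time-frequency representation, detect local maxima over a fixed number of time slices, keep the largest-magnitude maxima, hash them) can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the naive DFT, the frame and window sizes, the SHA-1 key format, and every function name are assumptions made for the example.

```python
import cmath
import hashlib
import math


def spectrogram(samples, frame_len=64, hop=32):
    """Magnitude spectrogram via a naive DFT (slow, but fine for short inputs)."""
    frames = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = samples[start:start + frame_len]
        mags = []
        for k in range(frame_len // 2):
            acc = sum(x * cmath.exp(-2j * math.pi * k * n / frame_len)
                      for n, x in enumerate(frame))
            mags.append(abs(acc))
        frames.append(mags)
    return frames  # frames[t][f]: magnitude at time slice t, frequency bin f


def local_maxima(spec):
    """Return (magnitude, t, f) for bins larger than both frequency neighbours."""
    peaks = []
    for t, mags in enumerate(spec):
        for f in range(1, len(mags) - 1):
            if mags[f] > mags[f - 1] and mags[f] > mags[f + 1]:
                peaks.append((mags[f], t, f))
    return peaks


def fingerprint(samples, slices_per_hash=4, top_n=3):
    """One hash per group of time slices, built from the strongest local maxima."""
    spec = spectrogram(samples)
    hashes = []
    for t0 in range(0, len(spec) - slices_per_hash + 1, slices_per_hash):
        window = spec[t0:t0 + slices_per_hash]
        strongest = sorted(local_maxima(window), reverse=True)[:top_n]
        # Assumed key format: time/frequency positions relative to the window.
        positions = sorted((t, f) for _, t, f in strongest)
        key = ",".join(f"{t}:{f}" for t, f in positions)
        hashes.append(hashlib.sha1(key.encode()).hexdigest()[:10])
    return hashes
```

A real system would add a magnitude floor so that numerical noise never counts as a peak, and would use an FFT library instead of the naive transform.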

Automatic labeling and control of audio algorithms by audio recognition

Active | US20110075851A1 | Better-sounding audio | Faster and more creative work flow | Electrical apparatus | Speech analysis | Multimedia software | Application software

Disclosed is a method of controlling a multimedia software application using high-level metadata features and symbolic object labels derived from an audio source: a first pass of low-level signal analysis is performed, followed by a stage of statistical and perceptual processing, followed by a symbolic machine-learning or data-mining processing component. This multi-stage analysis system delivers high-level metadata features, sound object identifiers, stream labels or other symbolic metadata to the application scripts or programs, which use the data to configure processing chains, or map it to other media. Embodiments of the invention can be incorporated into multimedia content players, musical instruments, recording studio equipment, installed and live sound equipment, broadcast equipment, metadata-generation applications, software-as-a-service applications, search engines, and mobile devices.

Owner:IZOTOPE

System and methods for continuous audio matching

Active | US20120029670A1 | Easy to use | Metadata audio data retrieval | Special data processing applications | Continuous monitoring | Audio recognition

The present invention relates to the continuous monitoring of an audio signal and identification of audio items within an audio signal. The technology disclosed utilizes predictive caching of fingerprints to improve efficiency. Fingerprints are cached for tracking an audio signal with known alignment and for watching an audio signal without known alignment, based on already identified fingerprints extracted from the audio signal. Software running on a smart phone or other battery-powered device cooperates with software running on an audio identification server.

Owner:SOUNDHOUND AI IP LLC

Extended videolens media engine for audio recognition

Inactive | US20130006625A1 | Improve recognition accuracy | Speech recognition | Selective content distribution | Closed captioning | Network media

A system, method, and computer program product for automatically analyzing multimedia data audio content are disclosed. Embodiments receive multimedia data, detect portions having specified audio features, and output a corresponding subset of the multimedia data and generated metadata. Audio content features including voices, non-voice sounds, and closed captioning, from downloaded or streaming movies or video clips are identified as a human probably would do, but in essentially real time. Particular speakers and the most meaningful content sounds and words and corresponding time-stamps are recognized via database comparison, and may be presented in order of match probability. Embodiments responsively pre-fetch related data, recognize locations, and provide related advertisements. The content features may be also sent to search engines so that further related content may be identified. User feedback and verification may improve the embodiments over time.

Owner:SONY CORP

Media identification

Active | US20070286463A1 | Digital computer details | Character and pattern recognition | Audio recognition | Video recognition

A method obtains media on a device, provides identification of an object in the media via image / video recognition and audio recognition, and displays on the device identification information based on the identified media object.

Owner:SONY CORP

Video surveillance system and method with combined video and audio recognition

Inactive | US20060227237A1 | Improve detection accuracy | Shorten the time | Television system details | Color television details | Video monitoring | False alarm

A video surveillance system is made up of a video and audio compression engine, a storage device, and a video and audio recognition engine. The video recognition engine performs tasks such as face recognition and motion detection, whereas the audio recognition engine detects voice and other sound signatures indicating a potential alarm situation, e.g., panic voices such as screaming and yelling, or sounds such as gunshots and explosions. Combined recognition of audio and video signals gives the surveillance system a higher true-alarm rate and a lower false-alarm rate. Additionally, the audio recognition engine provides information for pointing video cameras in the direction of interest, allowing better capture of a scene of interest.

Owner:IBM CORP

Audio matching with semantic audio recognition and report generation

Active | US20140180674A1 | Electrophonic musical instruments | Speech recognition | Harmonic | Speech sound

System, apparatus and method for determining semantic information from audio, where incoming audio is sampled and processed to extract audio features, including temporal, spectral, harmonic and rhythmic features. The extracted audio features are compared to stored audio templates that include ranges and/or values for certain features and are tagged for specific ranges and/or values. The semantic information may be associated with audio signature data. Extracted audio features that are most similar to one or more templates from the comparison are identified according to the tagged information. The tags are used to determine the semantic audio data that includes genre, instrumentation, style, acoustical dynamics, and emotive descriptor for the audio signal.

Owner:THE NIELSEN CO (US) LLC
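
The template comparison described in the abstract above (extracted features checked against stored ranges, with tags supplying the semantic labels) might look like the sketch below. The template contents, feature names, scoring rule, and function names are all hypothetical; the abstract does not specify how similarity is computed.

```python
# Hypothetical templates: each maps feature names to (low, high) ranges
# and carries the semantic tags attached to those ranges.
TEMPLATES = [
    {"tags": {"genre": "dance", "emotive": "energetic"},
     "ranges": {"tempo_bpm": (118, 135), "spectral_centroid_hz": (2000, 4000)}},
    {"tags": {"genre": "ambient", "emotive": "calm"},
     "ranges": {"tempo_bpm": (50, 90), "spectral_centroid_hz": (300, 1500)}},
]


def match_score(features, template):
    """Fraction of the template's ranges that the extracted features fall inside."""
    ranges = template["ranges"]
    hits = sum(1 for name, (low, high) in ranges.items()
               if name in features and low <= features[name] <= high)
    return hits / len(ranges)


def semantic_tags(features, templates=TEMPLATES):
    """Tags of the most similar template, or an empty dict if nothing matches."""
    best = max(templates, key=lambda t: match_score(features, t))
    return best["tags"] if match_score(features, best) > 0 else {}
```

For example, `semantic_tags({"tempo_bpm": 124, "spectral_centroid_hz": 2500})` selects the first template under this assumed scoring rule.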

Audio signal de-identification

Active | US20060190263A1 | Data processing applications | Speech recognition | Timestamp | Spoken language

Techniques are disclosed for automatically de-identifying spoken audio signals. In particular, techniques are disclosed for automatically removing personally identifying information from spoken audio signals and replacing such information with non-personally identifying information. De-identification of a spoken audio signal may be performed by automatically generating a report based on the spoken audio signal. The report may include concept content (e.g., text) corresponding to one or more concepts represented by the spoken audio signal. The report may also include timestamps indicating temporal positions of speech in the spoken audio signal that corresponds to the concept content. Concept content that represents personally identifying information is identified. Audio corresponding to the personally identifying concept content is removed from the spoken audio signal. The removed audio may be replaced with non-personally identifying audio.

Owner:MULTIMODAL TECH INC
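
The timestamp-driven removal step described in the abstract above can be illustrated with a short sketch. It assumes a report has already been produced and that each entry is a dictionary with `text`, `start_s`, `end_s`, and `is_pii` keys; that structure, the function name, and the use of a constant sample value as the replacement audio are assumptions for illustration only.

```python
def deidentify(samples, sample_rate, report, replacement=0.0):
    """Overwrite the audio spans that the report marks as personally identifying.

    report: list of dicts with 'text', 'start_s', 'end_s', 'is_pii'
    (a hypothetical structure; the abstract only says the report holds
    concept content plus timestamps).
    """
    out = list(samples)
    for item in report:
        if not item["is_pii"]:
            continue
        start = max(0, round(item["start_s"] * sample_rate))
        end = min(len(out), round(item["end_s"] * sample_rate))
        for i in range(start, end):
            out[i] = replacement  # stand-in for non-identifying replacement audio
    return out
```

The input list is left untouched, so the original recording can be retained or discarded separately according to policy.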

Method for identifying local discharge signals of switchboard based on support vector machine model

Inactive | CN102426835A | Guaranteed reliability | Ensure safety | Speech recognition | Zero-crossing rate | Feature parameter

The invention discloses a method for identifying local (partial) discharge signals of a switchboard based on a support vector machine model. The method comprises a model training process and an audio identification process, with the following steps: preprocessing the audio signals; extracting effective audio according to short-time energy and zero-crossing rate; segmenting the effective audio and extracting characteristic parameters of each segment, such as Mel cepstrum coefficients, first-order difference Mel cepstrum coefficients, and high zero-crossing rate; training on a sample set with a support vector machine tool and establishing the corresponding support vector machine model; after preprocessing the audio signals to be identified and extracting and segmenting the effective audio, classifying the segment-feature-based test samples according to the support vector machine model; and post-processing the classification results to judge whether partial discharge signals exist. The method accurately identifies whether partial discharge signals are present in the switchboard, which helps prevent major electrical accidents, reduces economic losses caused by insulation accidents, and improves power distribution reliability.

Owner:SOUTH CHINA UNIV OF TECH
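
The "effective audio" extraction step named in the abstract above relies on two standard frame-level measures, short-time energy and zero-crossing rate. The sketch below shows one plausible way to compute them and select frames. The frame sizes, thresholds, and the keep rule (energetic, or rapidly oscillating) are assumptions, and the later Mel cepstrum and support vector machine stages are not shown.

```python
def split_frames(samples, frame_len=160, hop=80):
    """Overlapping fixed-length frames (sizes are illustrative)."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, hop)]


def short_time_energy(frame):
    """Mean squared amplitude of the frame."""
    return sum(x * x for x in frame) / len(frame)


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


def effective_frames(samples, energy_thresh=0.01, zcr_thresh=0.3):
    """Indices of frames kept as effective audio under the assumed rule."""
    kept = []
    for idx, frame in enumerate(split_frames(samples)):
        if (short_time_energy(frame) > energy_thresh
                or zero_crossing_rate(frame) > zcr_thresh):
            kept.append(idx)
    return kept
```

In practice the thresholds would be tuned on recordings from the target switchboard rather than fixed in advance.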

Identification of an object in media and of related media objects

Active | US7787697B2 | Digital computer details | Character and pattern recognition | Computer science | Audio recognition

A method obtains media on a device, provides identification of an object in the media via image / video recognition and audio recognition, and displays on the device identification information based on the identified media object.

Owner:SONY CORP

Closed-loop command and response system for automatic communications between interacting computer systems over an audio communications channel

Active | US7330538B2 | Automatic exchanges | Speech recognition | The Internet | Closed loop

A system and method for enabling two computer systems to communicate over an audio communications channel, such as a voice telephony connection. Such a system includes a software application that enables a user's computer to call, interrogate, download, and manage a voicemail account stored on a telephone company's computer, without human intervention. A voicemail retrieved from the telephone company's computer can be stored in a digital format on the user's computer. In such a format, the voicemail can be readily archived, or even distributed throughout a network, such as the Internet, in a digital form, such as an email attachment. Preferably a computationally efficient audio recognition algorithm is employed by the user's computer to respond to and navigate the automated audio menu of the telephone company's computer.

Owner:INTELLISIST

Audio Processing Techniques for Semantic Audio Recognition and Report Generation

Active | US20140180673A1 | Natural language translation | Electrophonic musical instruments | Audio signal flow | Speech sound

System, apparatus and method for determining semantic information from audio, where incoming audio is sampled and processed to extract audio features, including temporal, spectral, harmonic and rhythmic features. The extracted audio features are compared to stored audio templates that include ranges and / or values for certain features and are tagged for specific ranges and / or values. Extracted audio features that are most similar to one or more templates from the comparison are identified according to the tagged information. The tags are used to determine the semantic audio data that includes genre, instrumentation, style, acoustical dynamics, and emotive descriptor for the audio signal.

Owner:THE NIELSEN CO (US) LLC

Audio content fingerprinting based on two-dimensional constant Q-factor transform representation and robust audio identification for time-aligned applications

Active | US9299364B1 | Speech analysis | Special data processing applications | Frequency spectrum | Spectral bands

Content identification methods for consumer devices determine robust audio fingerprints that are resilient to audio distortions. One method generates signatures representing audio content based on a constant Q-factor transform (CQT). A 2D spectral representation of a 1D audio signal facilitates generation of region based signatures within frequency octaves and across the entire 2D signal representation. Also, points of interest are detected within the 2D audio signal representation and interest regions are determined around selected points of interest. Another method generates audio descriptors using an accumulating filter function on bands of the audio spectrum and generates audio transform coefficients. A response of each spectral band is computed and transform coefficients are determined by filtering, by accumulating derivatives with different lags, and computing second order derivatives. Additionally, time and frequency based onset detection determines audio descriptors at events and enhances descriptors with information related to an event.

Owner:ROKU INCORPORATED

Closed-loop command and response system for automatic communications between interacting computer systems over an audio communications channel

Active | US20070143106A1 | Automatic call-answering/message-recording/conversation-recording | Automatic exchanges | The Internet | Closed loop

A system and method for enabling two computer systems to communicate over an audio communications channel, such as a voice telephony connection. Such a system includes a software application that enables a user's computer to call, interrogate, download, and manage a voicemail account stored on a telephone company's computer, without human intervention. A voicemail retrieved from the telephone company's computer can be stored in a digital format on the user's computer. In such a format, the voicemail can be readily archived, or even distributed throughout a network, such as the Internet, in a digital form, such as an email attachment. Preferably a computationally efficient audio recognition algorithm is employed by the user's computer to respond to and navigate the automated audio menu of the telephone company's computer.

Owner:INTELLISIST

Audio recognition during voice sessions to provide enhanced user interface functionality

Inactive | US20110014952A1 | Input/output for user-computer interaction | Cathode-ray tube indicators | Mobile device | Speech sound

The user interface for a mobile communication device may be provided based on the current context of a voice session, as recognized by an automated audio recognition engine. In one implementation, the mobile device may transcribe, by an audio recognition engine in the mobile device, audio from a voice session conducted through the mobile device; detect, by the mobile device and based at least on the transcribed audio, changes in context during the voice session that relate to a change in functionality of the user interface of the mobile device; and update, by the mobile device, the user interface in response to the detected change in context.

Owner:SONY ERICSSON MOBILE COMM AB

Method of providing customized ring tone service

Inactive | US20070264978A1 | Minimizing and altogether eliminating musical portion | Minimizing or altogether eliminating the musical portion | Special service for subscribers | Current supply arrangements | Address book | Companion animal

A Ring Tone that combines a musical portion selected by a subscriber/purchaser with an audio identification portion, which is associated with the identity of a caller and customized by the subscriber/purchaser, is downloaded to a mobile terminal. For example, the audio identification portion can be a recording of the caller's name or a “pet” name or nickname associated with that caller, or any sound that the subscriber/purchaser chooses to identify the caller. Thus, when an incoming call from that caller is received by the subscriber's/purchaser's mobile terminal and is recognized from its caller ID as being one that is stored in the mobile terminal's address book with an associated Ring Tone, the Ring Tone that is played includes both a musical portion and an audio portion that audibly identifies the caller. The subscriber/purchaser can thus immediately and unambiguously identify the caller.

Owner:LUCENT TECH INC

Audio Recognition System

Inactive | US20100023328A1 | Electrophonic musical instruments | Digital data information retrieval | Service provision | Audio frequency

A system and method of identifying an audio track uses music identification software that produces a fingerprint or audio profile for an audio segment recorded with a portable communication device. The audio profile is transmitted from the portable communication device to a remote service provider over a communication network. The remote server receives the transmitted audio track profile and compares the profile to a stored database of audio tracks. If a matching audio track is identified by the remote server, metadata relating to the identified audio track is transmitted from the remote server to the portable communication device. The received audio track metadata is then displayed on the portable communication device.

Owner:VINCI BRANDS LLC

Voice denoising method based on audio recognition

Inactive | CN101404160A | Improve integrity | Effective destination | Speech analysis | Noise reduction | Spaceflight

The invention provides a speech noise reduction method based on audio recognition, which reduces noise at the receiving end for speech communication in complex noise environments, and belongs to the field of computer science and technology. Most existing noise reduction methods are suitable only for stationary noise environments and cannot remove noise in complex noise environments, especially those with frequent abrupt noise. The method introduces pattern recognition into communication speech noise reduction: it divides audio signals into speech and non-speech, automatically identifies the input signal by extracting speech features and designing a classifier model, and judges the audio type. If the audio type is noise, the audio signal is removed; if the audio type is speech, the audio signal is retained and processed further. The method meets real-time requirements while achieving a good noise reduction effect, is suitable for complex communication environments such as manned spaceflight speech communication, construction sites, and battlefields, and offers an approach to signal noise reduction.

Owner:UNIV OF SCI & TECH BEIJING

Methods and devices for obtaining and pushing information and information interaction system

Active | CN104023247A | Improve the efficiency of obtaining information | Efficient push | Broadcast components for monitoring/identification/recognition | Broadcast services for monitoring/identification/recognition | Interaction systems | Relevant information

The invention provides a method for obtaining information. The method comprises the following steps: a sound collection trigger event is detected; the sound of the current channel, played in real time in the environment, is collected to obtain audio data; the audio data, audio feature information, or audio fingerprints are sent to a server, so that the server obtains the audio fingerprints and determines, according to a real-time buffered channel audio fingerprint database, the matched channel mark corresponding to the channel audio fingerprints that match them; and preset information corresponding to the matched channel mark, obtained by the server from a preset information database and sent by the server, is received. With this method, audio identification can be carried out by the server simply by triggering sound collection on a user terminal, so that information related to the program playing on the current channel can be obtained and the efficiency of obtaining information is greatly improved. The invention also provides a device for obtaining information, a method for pushing information, a device for pushing information, and an information interaction system.

Owner:TENCENT TECH (SHENZHEN) CO LTD

Device and method for extracting audio/video content information

Inactive | CN101600118A | Easy to browse | Television system details | Pulse modulation television signal transmission | User input | Video processing

The invention provides an audio/video processing device and an audio/video processing method. The audio/video processing device comprises a receiving unit, a decoding unit, a user interface unit, an information extracting unit and an information storage unit, wherein the receiving unit receives signals and outputs transmission streams; the decoding unit decodes the output transmission streams; the user interface unit receives a determined content output by users; the information extracting unit extracts a prescribed content; and the information storage unit stores the prescribed content, wherein the determined content includes a determined video content or determined audio content, and the alternative one is determined by an audio/video contrasting relation table. The information extracting unit comprises an audio identification unit, a video identification unit and an information matching unit, wherein the audio identification unit identifies the determined audio content from audio streams from the decoding unit; the video identification unit identifies the determined video content from video streams from the decoding unit; the information matching unit determines whether an identification result of the audio identification unit matches an identification result of the video identification unit, and when the two results match, the information matching unit records the prescribed content which corresponds to the determined video content or the determined audio content in the information storage unit.

Owner:HITACHI LTD

Intelligent interaction system and method

Active | CN104867492A | Improve experience | Improve interaction efficiency | Speech recognition | Special data processing applications | Interaction systems | User input

The invention relates to an intelligent interaction system and method. The system includes an audio receiving module, a real-time processing module and an execution module. The audio receiving module receives audio information input by a user, the real-time processing module performs parallel online real-time processing on the audio information, and the execution module executes the corresponding operation according to the identification results transmitted by the real-time processing module. The parallel online real-time processing includes the following steps: classification processing and identification processing for each of the different types are performed on the audio information; if a credible classification type is obtained before the audio input ends, identification processing for all other classification types is terminated; and the identification results for the credible classification type are obtained and transmitted to the execution module. With this system and method, the user can use audio identification and voice interaction functions easily and quickly, which improves the user experience.
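
The early-termination step described above can be illustrated with a minimal Python sketch. It is not the patented implementation: the function names (classify, recognize, interact), the two audio types, the string-based "audio" and the fixed delays are all hypothetical stand-ins chosen for illustration.

```python
import asyncio

async def classify(audio: str) -> str:
    # Hypothetical classifier: returns a credible type before recognition ends.
    await asyncio.sleep(0.01)
    return "music" if audio.startswith("la") else "speech"

async def recognize(kind: str, audio: str) -> str:
    # Hypothetical per-type recognizer; the sleep stands in for real work.
    await asyncio.sleep(0.05)
    return f"{kind}:{audio}"

async def interact(audio: str):
    # Start one recognizer per type in parallel with classification.
    tasks = {
        kind: asyncio.create_task(recognize(kind, audio))
        for kind in ("speech", "music")
    }
    credible = await classify(audio)
    # Terminate recognition for every type other than the credible one.
    for kind, task in tasks.items():
        if kind != credible:
            task.cancel()
    result = await tasks[credible]
    stopped = sorted(kind for kind, task in tasks.items() if task.cancelled())
    return result, stopped

if __name__ == "__main__":
    print(asyncio.run(interact("lalala")))
```

In this sketch the value returned by interact plays the role of the identification result handed to the execution module, and the second element lists the recognizers that were stopped early.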

Owner:IFLYTEK (BEIJING) CO LTD (科大讯飞(北京)有限公司)

Video feature extraction method, device and computer device

Active | CN108833973A | Speech recognition | Selective content distribution | Acquired characteristic | Feature extraction

The present invention provides a video feature extraction method, a video feature extraction device and a computer device. The method includes the following steps: a target video is divided into video segments of a predetermined unit time length; at least two frames of images are obtained from each video segment; the frames are identified to obtain the feature information they contain, from which the image feature information of the video segment is derived; the text feature information of the video segment is obtained from the caption recognition result of each frame and the real-time speech recognition result of the segment; semantic analysis is performed to obtain the feature information of the video segment; and a mapping relationship between the feature information of the video segments and the target video is established. In this way the feature information of a video can be extracted automatically through image, video and audio recognition technology, and because extraction is refined to the level of unit-time-length segments within the video, the resulting feature information is more comprehensive.
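
As a rough illustration of the segmentation and mapping steps only, the following Python sketch divides a video into unit-length segments and builds a mapping from feature labels back to the video and segment. The recognition steps themselves are not modeled; the names segment_bounds and build_mapping, the video identifier and the feature labels are hypothetical.

```python
def segment_bounds(duration_s, unit_s):
    # Divide [0, duration_s) into consecutive segments of unit_s seconds;
    # the last segment may be shorter.
    bounds = []
    start = 0
    while start < duration_s:
        bounds.append((start, min(start + unit_s, duration_s)))
        start += unit_s
    return bounds

def build_mapping(video_id, bounds, features_per_segment):
    # features_per_segment[i] holds the labels already extracted for
    # segment i (image, caption and speech results are assumed given).
    mapping = {}
    for segment, features in zip(bounds, features_per_segment):
        for feature in sorted(features):
            mapping.setdefault(feature, []).append((video_id, segment))
    return mapping

if __name__ == "__main__":
    bounds = segment_bounds(25, 10)
    print(build_mapping("video-1", bounds, [{"goal"}, {"crowd"}, {"goal"}]))
```

Keying the mapping by feature is one possible design; it lets a lookup on a feature return every segment of the video in which that feature appears.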

Owner:TENCENT TECH (SHENZHEN) CO LTD

Video surveillance system and method with combined video and audio recognition

Inactive | US20080309761A1 | Improve detection accuracy | Improve efficiency | Color television details | Closed circuit television systems | Video monitoring | False alarm

A novel video surveillance system is made up of a video and audio compression engine, a storage device, and a video and audio recognition engine. The video recognition engine performs tasks such as face recognition and motion detection, whereas the audio recognition engine detects voice and other sound signatures indicating a potential alarm situation, e.g., panic voices such as screaming and yelling, or sounds such as gunshots and explosions. Combined recognition of audio and video signals gives the surveillance system a higher true-alarm rate and a lower false-alarm rate. Additionally, the audio recognition engine provides information for pointing video cameras in the direction of interest, allowing better capture of an interesting scene.

Owner:IBM CORP

Audio recognition method and device

Active | CN105657535A | Unaffected by noisy environments | Easy to operate | Speech analysis | Selective content distribution | Audio frequency | Audio recognition

The invention discloses an audio recognition method and an audio recognition device, and relates to the field of audio technologies. The method comprises the steps of intercepting an audio stream of a first time length from the source data of a video file, retrieving the corresponding audio information according to that audio stream, and showing the information to a user. The retrieval step comprises dividing the audio stream into at least two sub-audio streams according to a preset rule and sequentially retrieving the resulting sub-audio streams to obtain the audio information. With this method, the audio stream is extracted directly from the source data of the currently played video, so no additional recording operation is needed and a noisy environment has no influence. The operation is simple, the accuracy rate is high, the retrieval process does not interfere with the user's normal viewing of the video, and both retrieval efficiency and retrieval success rate can be improved.
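
The divide-then-retrieve step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the preset rule is taken to be fixed-length chunking, the library is a hypothetical dictionary keyed by exact sample tuples rather than a real fingerprint index, and retrieval is assumed to stop at the first hit, which the abstract does not specify.

```python
def split_stream(samples, chunk_len):
    # Assumed preset rule: consecutive chunks of chunk_len samples.
    return [samples[i:i + chunk_len] for i in range(0, len(samples), chunk_len)]

def retrieve(samples, chunk_len, library):
    # Look up each sub-stream in order and return the first match.
    # library is a hypothetical {tuple_of_samples: audio_info} index.
    for sub_stream in split_stream(samples, chunk_len):
        info = library.get(tuple(sub_stream))
        if info is not None:
            return info
    return None

if __name__ == "__main__":
    library = {(3, 4): "Song A"}
    print(retrieve([1, 2, 3, 4, 5, 6], 2, library))
```

Querying short sub-streams one after another means a match can be reported as soon as any one of them is recognized, instead of waiting on the whole intercepted stream.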

Owner:BEIJING SOGOU TECHNOLOGY DEVELOPMENT CO LTD

Extensible audio recognition method based on man-machine interaction

Inactive | CN101923857A | Reduced ability to recognize | Reduce overhead | Speech recognition | Finite-state machine | Speech identification

The invention belongs to the technical field of audio processing and relates to an extensible audio recognition system based on man-machine interaction and a method thereof. The system comprises an audio acquisition device, a voice recognition module, a sample loading unit, a finite-state machine, a classified feature-sample storage database and an instruction execution module. The method builds on the high recognition rate of speaker-dependent isolated-word speech recognition. Once the system has been fully trained for the user, voice segments that cannot be recognized are stored in the sample database through online learning after a user-assisted man-machine interaction, and the cost of recognition is reduced by storing and loading samples in separate modules. The core algorithm operates on voice signals, is not limited to the language of the speaker, and can support recognition of mixed languages (for example, Chinese and English). The method achieves lower false-recognition and non-recognition rates, and improves the reliability and adaptability of the system through dialogue interaction and online incremental training.

Owner:FUDAN UNIV

Sound source separating device, speech recognizing device, portable telephone, and sound source separating method, and program

Inactive | CN101238511A | Signal processing | Speech recognition | Sound sources | Engineering

A sound source separating device is provided that separates the sound source signal of a target sound source from a mixed sound containing signals from several sound sources, without being influenced by variations in the sensitivity of the microphone elements. A beam former section (3) of the sound source separating device (1) performs beam formation to attenuate sound source signals arriving from directions that are symmetrical with respect to the perpendicular of the line connecting two microphones (10, 11). It does so by spectrum-analyzing the output signals of the microphones (10, 11) and multiplying the spectrum-analyzed signals by complex-conjugate weighting factors. Power calculating sections (40, 41) calculate power spectrum information. Target sound spectrum extracting sections (50, 51) extract spectrum information on the target sound sources according to the difference between the power spectrum information from one beam former and that from the other.
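
The idea of paired beam formers with a power-spectrum difference can be illustrated for a single frequency bin in Python. This sketch rests on assumptions not stated in the abstract: a far-field source, ideal microphones, weights of the form (1, w) and (1, conj(w)), and a speed of sound of 343 m/s. All function names are hypothetical.

```python
import cmath
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed

def phase_delay(freq_hz, spacing_m, angle_rad):
    # Inter-microphone phase shift for a far-field source at angle_rad,
    # measured from the perpendicular of the line joining the microphones.
    return 2 * math.pi * freq_hz * spacing_m * math.sin(angle_rad) / SPEED_OF_SOUND

def mic_spectra(source, freq_hz, spacing_m, angle_rad):
    # One frequency bin as seen by the two microphones.
    phi = phase_delay(freq_hz, spacing_m, angle_rad)
    return source, source * cmath.exp(-1j * phi)

def beam_powers(x1, x2, freq_hz, spacing_m, null_angle_rad):
    # Two null beam formers whose second weights are complex conjugates.
    w = -cmath.exp(1j * phase_delay(freq_hz, spacing_m, null_angle_rad))
    y_a = x1 + w * x2              # null toward +null_angle_rad
    y_b = x1 + w.conjugate() * x2  # null toward -null_angle_rad
    return abs(y_a) ** 2, abs(y_b) ** 2

def target_powers(p_a, p_b):
    # Power difference, clipped at zero, attributed to each side.
    return max(p_b - p_a, 0.0), max(p_a - p_b, 0.0)

if __name__ == "__main__":
    x1, x2 = mic_spectra(1 + 0j, 1000.0, 0.05, math.radians(30))
    print(target_powers(*beam_powers(x1, x2, 1000.0, 0.05, math.radians(30))))
```

A source at +30 degrees is cancelled by the first beam former and passed by the second, so the sign of the power difference indicates which side of the perpendicular the source lies on.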

Owner:ASAHI KASEI KK
