33,663 results about "Speech analysis" patented technology

Method and system for enabling connectivity to a data system

A method and system that provides filtered data from a data system. In one embodiment the system includes an API (application programming interface) and associated software modules to enable third party applications to access an enterprise data system. Administrators are enabled to select specific user interface (UI) objects, such as screens, views, applets, columns and fields to voice or pass-through enable via a GUI that presents a tree depicting a hierarchy of the UI objects within a user interface of an application. An XSLT style sheet is then automatically generated to filter out data pertaining to UI objects that were not voice or pass-through enabled. In response to a request for data, unfiltered data are retrieved from the data system and a specified style sheet is applied to the unfiltered data to return filtered data pertaining to only those fields and columns that are voice or pass-through enabled.
Owner: ORACLE INT CORP
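
The filtering step described in this abstract can be sketched as follows. The abstract describes an automatically generated XSLT style sheet; this sketch swaps in plain `xml.etree.ElementTree` filtering so that it runs with the Python standard library alone. The element names (`record`, `field`), the `name` attribute, and the `ENABLED` set are all hypothetical and not taken from the patent.

```python
import xml.etree.ElementTree as ET

# Hypothetical set of field names an administrator marked as enabled.
ENABLED = {"name", "phone"}

def filter_fields(xml_text, enabled):
    """Return XML text keeping only <field> elements whose name is enabled."""
    root = ET.fromstring(xml_text)
    # Materialize the record list first so removals do not disturb traversal.
    for record in list(root.iter("record")):
        for field in list(record):
            if field.get("name") not in enabled:
                record.remove(field)
    return ET.tostring(root, encoding="unicode")

# Unfiltered data as it might come back from the data system (made-up sample).
unfiltered = (
    "<data><record>"
    '<field name="name">Ada</field>'
    '<field name="salary">100</field>'
    '<field name="phone">555</field>'
    "</record></data>"
)
filtered = filter_fields(unfiltered, ENABLED)
```

The same selection could be expressed as an XSLT template that copies only the enabled fields; the Python version is used here only to keep the example self-contained.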

Media recording device with remote graphic user interface

An apparatus for processing digital media signals, comprising a digital processor for controlling the apparatus; a graphic user interface, having a wireless remote control providing a command input to the processor; a network interface for transmitting digital information from the processor to a remote location over a communications network, the information identifying a digital media signal for desired reproduction based, at least in part, on an input received from the remote control; and an output, controlled by, and local to, the processor, for transferring the desired digital media signals for reproduction thereof.
Owner: BLANDING HOVENWEEP

Coding method and apparatus for multiple channels of audio information representing three-dimensional sound fields

In an encoder, multiple channels of audio information representing multidimensional sound fields are split into subband signals and the subband signals in one or more subbands are combined to form composite signals. The composite signals, the subband signals not combined into a composite signal and information describing the spectral levels of subband signals combined into composite signals are assembled into an encoded output signal. The spectral level information conveys either the amplitude or power of the combined subband signals or the apparent direction of the sound field represented by the combined subband signals. In digital implementations, adaptive bit allocation may be used to reduce the informational requirements of the encoded signal.
Owner: DOLBY LAB LICENSING CORP

Apparatus and method for constructing a multi-channel output signal or for generating a downmix signal

An apparatus constructs a multi-channel output signal from an input signal and parametric side information. The input signal includes a first input channel and a second input channel derived from an original multi-channel signal, and the parametric side information describes interrelations between channels of the original multi-channel signal. The apparatus uses base channels, which are different from each other, for synthesizing first and second output channels on one side of an assumed listener position. The base channels are different from each other because of a coherence measure. Coherence between the base channels (for example, the left and the left surround reconstructed channel) is reduced by calculating a base channel for one of those channels by a combination of the input channels, the combination being determined by the coherence measure. Thus, a high subjective quality of the reconstruction can be obtained because the original front/back coherence is approximated.
Owner: FRAUNHOFER GESELLSCHAFT ZUR FOERDERUNG DER ANGEWANDTEN FORSCHUNG EV +1

Dynamic audio ducking

Active | US8428758B2 | Gain control | Speech analysis | Ducking | Loudness
Various dynamic audio ducking techniques are provided that may be applied where multiple audio streams, such as a primary audio stream and a secondary audio stream, are being played back simultaneously. For example, a secondary audio stream may include a voice announcement of one or more pieces of information pertaining to the primary audio stream, such as the name of the track or the name of the artist. In one embodiment, the primary audio data and the voice feedback data are initially analyzed to determine a loudness value. Based on their respective loudness values, the primary audio stream may be ducked during the period of simultaneous playback such that a relative loudness difference is generally maintained with respect to the loudness of the primary and secondary audio streams. Accordingly, the amount of ducking applied may be customized for each piece of audio data depending on its loudness characteristics.
Owner: APPLE INC
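
A minimal sketch of the loudness-based ducking idea follows. It assumes RMS level in dB as a stand-in for the loudness value (the abstract does not say how loudness is measured), and the 10 dB target difference and all function names are hypothetical.

```python
import math

def loudness_db(samples):
    """RMS level in dB, used here as a simple stand-in for loudness."""
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return 20 * math.log10(max(rms, 1e-12))

def ducking_gain_db(primary, secondary, target_diff_db=10.0):
    """Gain (never above 0 dB) that leaves the secondary stream
    target_diff_db louder than the primary stream."""
    diff = loudness_db(secondary) - loudness_db(primary)
    return min(0.0, diff - target_diff_db)

def apply_gain(samples, gain_db):
    """Scale samples by a gain given in dB."""
    factor = 10 ** (gain_db / 20)
    return [s * factor for s in samples]

# Made-up streams: a loud music track and a quieter voice announcement.
primary = [0.5, -0.5] * 100
secondary = [0.25, -0.25] * 100
ducked = apply_gain(primary, ducking_gain_db(primary, secondary))
```

Because the gain is derived from both loudness values, a primary stream that is already quiet relative to the announcement receives no ducking at all, which matches the abstract's point that the amount of ducking depends on each piece of audio.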

Dynamic audio ducking

Active | US20100211199A1 | Gain control | Speech analysis | Ducking | Loudness
Various dynamic audio ducking techniques are provided that may be applied where multiple audio streams, such as a primary audio stream and a secondary audio stream, are being played back simultaneously. For example, a secondary audio stream may include a voice announcement of one or more pieces of information pertaining to the primary audio stream, such as the name of the track or the name of the artist. In one embodiment, the primary audio data and the voice feedback data are initially analyzed to determine a loudness value. Based on their respective loudness values, the primary audio stream may be ducked during the period of simultaneous playback such that a relative loudness difference is generally maintained with respect to the loudness of the primary and secondary audio streams. Accordingly, the amount of ducking applied may be customized for each piece of audio data depending on its loudness characteristics.
Owner: APPLE INC

System, method and article of manufacture for concept based information searching

A system, method and article of manufacture are provided for allowing concept based information searching. Textual information is collected utilizing a network. The textual information is parsed to create topic specific information packets, which are stored in an information cache. The system parses a user query and compares the parsed user query with the information packets in the information cache to locate matching information packets and displays the matching information packets to a user.
Owner: MICROSOFT TECH LICENSING LLC

System and method for speaker recognition on mobile devices

A speaker recognition system for authenticating a mobile device user includes an enrollment and learning software module, a voice biometric authentication software module, and a secure software application. Upon request by a user of the mobile device, the enrollment and learning software module displays text prompts to the user, receives speech utterances from the user, and produces a voice biometric print. The enrollment and learning software module determines when a voice biometric print has met at least a quality threshold before storing it on the mobile device. The secure software application prompts a user requiring authentication to repeat an utterance based at least on an attribute of a selected voice biometric print, receives a corresponding utterance, requests the voice biometric authentication software module to verify the identity of the user using the utterance, and, if the user is authenticated, imports the voice biometric print.
Owner: CIRRUS LOGIC INC

Method and apparatus for automatically recognizing input audio and/or video streams

A method and system for the automatic identification of audio, video, multimedia, and / or data recordings based on immutable characteristics of these works. The invention does not require the insertion of identifying codes or signals into the recording. This allows the system to be used to identify existing recordings that have not been through a coding process at the time that they were generated. Instead, each work to be recognized is “played” into the system where it is subjected to an automatic signal analysis process that locates salient features and computes a statistical representation of these properties. These features are then stored as patterns for later recognition of live input signal streams. A different set of features is derived for each audio or video work to be identified and stored. During real-time monitoring of a signal stream, a similar automatic signal analysis process is carried out, and many features are computed for comparison with the patterns stored in a large feature database. For each particular pattern stored in the database, only the relevant characteristics are compared with the real-time feature set. Preferably, during analysis and generation of reference patterns, data are extracted from all time intervals of a recording. This allows a work to be recognized from a single sample taken from any part of the recording.
Owner: ICEBERG IND

Methods and systems for authoring of mixed-initiative multi-modal interactions and related browsing mechanisms

Application authoring techniques, and information browsing mechanisms associated therewith, are provided which employ programming in association with mixed-initiative multi-modal interactions and natural language understanding for use in dialog systems. Also, a conversational browsing architecture is provided for use with these and other authoring techniques.
Owner: INT BUSINESS MASCH CORP

Touch gesture based interface for motor vehicle

An improved apparatus and method are provided for operating devices and systems in a motor vehicle, while at the same time reducing vehicle operator distractions. One or more touch sensitive pads are mounted on the steering wheel of the motor vehicle, and the vehicle operator touches the pads in a pre-specified synchronized pattern, to perform functions such as controlling operation of the radio or adjusting a window. At least some of the touch patterns used to generate different commands may be selected by the vehicle operator. Usefully, the system of touch pad sensors and the signals generated thereby are integrated with speech recognition and/or facial gesture recognition systems, so that commands may be generated by synchronized multi-mode inputs.
Owner: WAYMO LLC

Systems and methods for processing natural language queries

Methods and systems are provided for processing natural language queries. Such methods and systems may receive a natural language query from a user and generate corresponding semantic tokens. Information may be retrieved from a knowledge base using the semantic tokens. Methods and systems may leverage an interpretation module to process and analyze the retrieved information in order to determine an intention associated with the natural language query. Methods and systems may leverage an actuation module to provide results to the user, which may be based on the determined intention.
Owner: SAP AG

Apparatus and method for visually representing behavior of a user of an automated response system

A system for visually representing user behavior within an interactive voice response (IVR) system of a call processing center generates a complete sequence of events within the IVR system for plural calls to the call processing center, the plurality of calls being recorded from end to end. A call flow of the IVR system is modeled as a non-deterministic finite-state machine, such that a start state of the finite-state machine represents a first prompt of the IVR system, other states of the finite-state machine represent subsequent prompts at which a branching occurs in the call flow of the IVR system, exit conditions are represented as end states, and transitions of the finite-state machine represent transitions between call flow states triggered by data inputted by a user or by internal processing of the IVR system. The complete sequences of events for the plural calls are provided to the finite-state machine to produce a two-way matrix of several counters. The data from the two-way matrix is represented as a state-transition diagram.
Owner: CX360 INC +1
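
The two-way matrix of counters mentioned in this abstract can be sketched as a nested counter keyed by source and destination state. The state names and call sequences below are invented for illustration, and the function name is hypothetical.

```python
from collections import defaultdict

def transition_counts(call_sequences):
    """Count state-to-state transitions over complete call event sequences.

    Returns a nested mapping counts[src][dst] -> number of observed
    transitions, i.e. a sparse two-way matrix of counters.
    """
    counts = defaultdict(lambda: defaultdict(int))
    for sequence in call_sequences:
        # Pair each state with its successor within the same call.
        for src, dst in zip(sequence, sequence[1:]):
            counts[src][dst] += 1
    return counts

# Made-up end-to-end call recordings, each a sequence of IVR states.
calls = [
    ["welcome", "main_menu", "billing", "hangup"],
    ["welcome", "main_menu", "agent"],
    ["welcome", "main_menu", "billing", "agent"],
]
matrix = transition_counts(calls)
```

Each nonzero entry `matrix[src][dst]` corresponds to one labeled edge in a state-transition diagram, with the count usable as the edge weight.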

Compatible multi-channel coding/decoding

Active | US20050074127A1 | Suitable for processing | Efficient and artifact-reduced encoding | Speech analysis | Stereophonic systems | Side information | Computer science
In processing a multi-channel audio signal having at least three original channels, a first downmix channel and a second downmix channel are provided, which are derived from the original channels. For a selected original channel of the original channels, channel side information is calculated such that a downmix channel or a combined downmix channel including the first and the second downmix channels, when weighted using the channel side information, results in an approximation of the selected original channel. The channel side information and the first and second downmix channels form output data to be transmitted to a decoder, which, in the case of a low-level decoder, decodes only the first and second downmix channels or, in the case of a high-level decoder, provides a full multi-channel audio signal based on the downmix channels and the channel side information. Since the channel side information occupies only a small number of bits, and since the decoder does not use dematrixing, an efficient and high-quality multi-channel extension for stereo players and enhanced multi-channel players is obtained.
Owner: FRAUNHOFER GESELLSCHAFT ZUR FOERDERUNG DER ANGEWANDTEN FORSCHUNG EV +1
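
One way to picture the channel side information is as a single weight per channel that scales a downmix channel toward the selected original channel. The sketch below assumes a least-squares criterion for choosing that weight; the abstract does not state which criterion is used, and the function name and sample values are hypothetical.

```python
def side_info_gain(original, downmix):
    """Least-squares weight g minimizing sum((original - g * downmix) ** 2).

    Assumption: a single scalar weight per channel; the least-squares
    criterion is illustrative and not taken from the patent.
    """
    energy = sum(d * d for d in downmix)
    if energy == 0:
        return 0.0
    return sum(o * d for o, d in zip(original, downmix)) / energy

def reconstruct(downmix, gain):
    """Approximate the original channel by weighting the downmix channel."""
    return [gain * d for d in downmix]

# Made-up example in which the original is exactly half the downmix.
downmix = [1.0, -2.0, 4.0]
original = [0.5, -1.0, 2.0]
gain = side_info_gain(original, downmix)
approximation = reconstruct(downmix, gain)
```

Only the weight needs to be transmitted alongside the downmix channels, which illustrates why the side information can occupy few bits.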

Lighting control using speech recognition

A system and method are provided for the control of color-based lighting through voice control or speech recognition, as well as a syntax for use with such a system. In this approach, the spoken voice (in any language) can be used to control effects more naturally, without having to learn the myriad manipulations required by some complex controller interfaces. A simple control language based upon spoken words consisting of commands and values is constructed and used to provide a common base for lighting and system control.
Owner: PHILIPS LIGHTING NORTH AMERICA CORPORATION
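
A control language built from commands and values could look like the sketch below, which parses already-recognized text rather than audio. The vocabulary (`red`, `green`, `blue`, `brightness`) and the 0 to 100 value range are assumptions made for illustration; the abstract does not define the actual syntax.

```python
# Hypothetical command vocabulary; not taken from the patent.
COMMANDS = {"red", "green", "blue", "brightness"}

def parse_command(utterance):
    """Parse recognized text of the form '<command> <value>'.

    Returns (command, value) with value in 0..100, or None when the
    utterance does not fit the assumed grammar.
    """
    words = utterance.lower().split()
    if len(words) != 2:
        return None
    command, raw_value = words
    if command not in COMMANDS or not raw_value.isdigit():
        return None
    value = int(raw_value)
    if value > 100:
        return None
    return command, value

parsed = parse_command("Red 80")
```

Rejecting anything outside the small grammar keeps misrecognized speech from being turned into an unintended lighting change.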

Security enhancements of digital watermarks for multi-media content

Methods and apparatus for embedding digital watermarks into a digital host content are provided. A digital host content is received, e.g., at a receiver or other device incorporating a receiver. One or more watermark embedding technologies are selected. Multiple embedding opportunities are identified within the host content. A subset of the identified embedding opportunities is selected. A multiplicity of digital watermarks are then embedded into the host content in accordance with the selected subset of embedding opportunities utilizing the one or more selected watermark embedding technologies. The selecting of the subset of embedding opportunities may be adapted to provide a desired tradeoff between levels of robustness, security, and transparency of the watermark. A plurality of watermarking embedding technologies may be selected and used in the embedding step.
Owner: VERANCE

Fast-start streaming and buffering of streaming content for personal media player

A personal media broadcasting system enables video distribution over a computer network and allows a user to view and control media sources over a computer network from a remote location. A personal broadcaster receives an input from one or more types of media sources, digitizes and compresses the content, and streams the compressed media over a computer network to a media player running on any of a wide range of client devices for viewing the media. The system may allow the user to issue control commands (e.g., “channel up”) from the media player to the broadcaster, causing the source device to execute the commands. The broadcaster and the media player may employ several techniques for buffering, transmitting, and viewing the content to improve the user's experience.
Owner: SLING MEDIA LLC

Personal media broadcasting system with output buffer

A personal media broadcasting system enables video distribution over a computer network and allows a user to view and control media sources over a computer network from a remote location. A personal broadcaster receives an input from one or more types of media sources, digitizes and compresses the content, and streams the compressed media over a computer network to a media player running on any of a wide range of client devices for viewing the media. The system may allow the user to issue control commands (e.g., “channel up”) from the media player to the broadcaster, causing the source device to execute the commands. The broadcaster and the media player may employ several techniques for buffering, transmitting, and viewing the content to improve the user's experience.
Owner: SLING MEDIA LLC

Accessory authentication for electronic devices

Improved techniques to control utilization of accessory devices with electronic devices are disclosed. The improved techniques can use cryptographic approaches to authenticate electronic devices, namely, electronic devices that interconnect and communicate with one another. One aspect pertains to techniques for authenticating an electronic device, such as an accessory device. Another aspect pertains to provisioning software features (e.g., functions) by or for an electronic device (e.g., a host device). Different electronic devices can, for example, be provisioned differently depending on different degrees or levels of authentication, or depending on manufacturer or product basis. Still another aspect pertains to using an accessory (or adapter) to convert a peripheral device (e.g., USB device) into a host device (e.g., USB host). The improved techniques are particularly well suited for electronic devices, such as media devices, that can receive accessory devices. One example of a media device is a media player, such as a hand-held media player (e.g., music player), that can present (e.g., play) media items (or media assets).
Owner: APPLE INC

System and method for multi-channel recording

Embodiments of the present invention are directed generally to recording communication of a call utilizing a multi-channel recording technique. According to one exemplary embodiment, inbound communication from each party to a call (e.g., from each communication device that is party to a call) to a recording system is assigned to a separate channel, and communication on each channel is independently recorded. Further, during the call, a control channel is generated that correlates the multiple communication channels. The independently recorded communication channels and control channel may be used to analyze a recorded call from any desired perspective. For instance, communication from a given party may be analyzed in isolation. Further, the control channel enables the recorded multiple communication channels to be correlated such that the communication received (e.g., heard) by any selected party may be accurately re-created for analysis thereof.
Owner: SECURUS TECH
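
Re-creating what one party heard from independently recorded channels can be sketched as a time-aligned mix of every other party's channel. This sketch assumes the control channel reduces to a start offset (in samples) per communication channel, which is a simplification not stated in the abstract; the party names and sample values are made up.

```python
def heard_by(party, channels, offsets):
    """Mix every channel except the listener's own, aligned by offsets.

    channels: mapping party -> list of samples recorded on that channel.
    offsets:  mapping party -> start position of that channel in the call
              (assumed to come from the control channel).
    """
    others = {p: s for p, s in channels.items() if p != party}
    if not others:
        return []
    length = max(offsets[p] + len(s) for p, s in others.items())
    mix = [0.0] * length
    for p, samples in others.items():
        for i, sample in enumerate(samples):
            mix[offsets[p] + i] += sample
    return mix

# Made-up three-party call with independently recorded channels.
channels = {"caller": [1.0, 1.0], "callee": [0.5, 0.5, 0.5], "agent": [0.25]}
offsets = {"caller": 0, "callee": 1, "agent": 3}
caller_heard = heard_by("caller", channels, offsets)
```

Analyzing a single party in isolation needs no mixing at all: that party's channel is already stored on its own.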

Broadband frequency translation for high frequency regeneration

Inactive | US20030187663A1 | Reduce in quantity | Maintaining perceived quality | Speech analysis | Frequency spectrum | Audio signal flow
An audio signal is conveyed more efficiently by transmitting or recording a baseband of the signal with an estimated spectral envelope and a noise-blending parameter derived from a measure of the signal's noise-like quality. The signal is reconstructed by translating spectral components of the baseband signal to frequencies outside the baseband, adjusting phase of the regenerated components to maintain phase coherency, adjusting spectral shape according to the estimated spectral envelope, and adding noise according to the noise-blending parameter. Preferably, the transmitted or recorded signal also includes an estimated temporal envelope that is used to adjust the temporal shape of the reconstructed signal.
Owner: DOLBY LAB LICENSING CORP

Near-transparent or transparent multi-channel encoder/decoder scheme

A multi-channel encoder/decoder scheme preferably also generates a waveform-type residual signal. This residual signal is transmitted together with one or more multi-channel parameters to a decoder. In contrast to a purely parametric multi-channel decoder, the enhanced decoder generates a multi-channel output signal having an improved output quality because of the additional residual signal.
Owner: FRAUNHOFER GESELLSCHAFT ZUR FOERDERUNG DER ANGEWANDTEN FORSCHUNG EV

Multi-modal device capable of automated actions

A multi-modal multi-lingual mobile device facilitates intelligently automating an action. The device can automatically synchronize a user schedule based upon a user state, intention, preference and/or limitation. The device can employ sensors to automatically detect criteria by which to automatically implement an action. Moreover, the system can interrogate a user, thus converging upon a user intention and/or preference. An analyzer component can intelligently evaluate the compiled criteria in order to automatically perform an action. The multi-modal multi-lingual mobile device can automatically facilitate identification of an individual. Other actions that are automatically performed can include modifying personal information manager data, translating languages into a language comprehensible to a user, etc. Implementation of these actions can be based at least in part upon an environmental factor, a conversation, a location factor and a temporal factor.
Owner: MICROSOFT TECH LICENSING LLC

Portable devices and methods employing digital watermarking

Media objects are transformed into active, connected objects via identifiers embedded into them or their containers. In the context of a user's playback experience, a decoding process extracts the identifier from a media object and possibly additional context information and forwards it to a server. The server, in turn, maps the identifier to an action, such as returning metadata, re-directing the request to one or more other servers, requesting information from another server to identify the media object, etc. The server may return a higher fidelity version of content from which the identifier was extracted. In some applications, the higher fidelity version may be substituted for the original media object and rendered to provide higher quality output. The linking process applies to broadcast objects as well as objects transmitted over networks in streaming and compressed file formats.
Owner: DIGIMARC CORP

Decoding of information in audio signals

Systems and methods are provided for decoding a message symbol in an audio signal. This message symbol is represented by first and second code symbols displaced in time. Values representing the code signals are accumulated and the accumulated values are examined to detect the message symbol.
Owner: NIELSEN HLDG NV +1
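
The accumulate-then-examine step can be sketched as follows. It assumes each message symbol maps to a pair of code symbols observed in two time-displaced frames, and that a detector has already produced one value per code symbol per frame. The codebook, the frame values, and the choice of picking the largest accumulated value are all illustrative assumptions rather than the patent's method.

```python
# Hypothetical codebook: message symbol -> (first code symbol, second code symbol).
CODEBOOK = {"A": (0, 3), "B": (1, 2), "C": (2, 0)}

def decode_symbol(first_frame, second_frame, codebook=CODEBOOK):
    """Accumulate detection values for both code symbols of each candidate
    message symbol and return the best candidate with all accumulated values.

    first_frame[c] and second_frame[c] hold the detection value for code
    symbol c in the earlier and the later (time-displaced) frame.
    """
    accumulated = {
        message: first_frame[first] + second_frame[second]
        for message, (first, second) in codebook.items()
    }
    best = max(accumulated, key=accumulated.get)
    return best, accumulated

# Made-up detector outputs for four code symbols in each frame.
first_frame = [0.1, 0.9, 0.2, 0.1]
second_frame = [0.2, 0.1, 0.8, 0.3]
symbol, scores = decode_symbol(first_frame, second_frame)
```

Combining evidence from both frames before deciding means a single noisy frame is less likely to produce a wrong message symbol than deciding on each code symbol separately.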

Apparatus and method for constructing a multi-channel output signal or for generating a downmix signal

Active | US7394903B2 | Decoder-side computing workload can be reduced | High scale | Speech analysis | Special data processing applications | Side information | Engineering
An apparatus constructs a multi-channel output signal from an input signal and parametric side information. The input signal includes a first input channel and a second input channel derived from an original multi-channel signal, and the parametric side information describes interrelations between channels of the original multi-channel signal. The apparatus uses base channels, which are different from each other, for synthesizing first and second output channels on one side of an assumed listener position. The base channels are different from each other because of a coherence measure. Coherence between the base channels (for example, the left and the left surround reconstructed channel) is reduced by calculating a base channel for one of those channels by a combination of the input channels, the combination being determined by the coherence measure. Thus, a high subjective quality of the reconstruction can be obtained because the original front/back coherence is approximated.
Owner: FRAUNHOFER GESELLSCHAFT ZUR FOERDERUNG DER ANGEWANDTEN FORSCHUNG EV +1

Method and device for monitoring and analyzing signals

A method and system for monitoring and analyzing at least one signal are disclosed. An abstract of at least one reference signal is generated and stored in a reference database. An abstract of a query signal to be analyzed is then generated so that the abstract of the query signal can be compared to the abstracts stored in the reference database for a match. The method and system may optionally be used to record information about the query signals, the number of matches recorded, and other useful information about the query signals. Moreover, the method by which abstracts are generated can be programmable based upon selectable criteria. The system can also be programmed with error control software so as to avoid the re-occurrence of a query signal that matches more than one signal stored in the reference database.
Owner: WISTARIA TRADING INC
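
The abstract-and-compare flow can be illustrated with a deliberately simple signal abstract. The sketch assumes an abstract made of one bit per pair of adjacent blocks (whether block energy rises), compared by Hamming distance; the patent does not specify how abstracts are generated, and all names, block sizes, and signals here are hypothetical.

```python
def make_abstract(signal, block=4):
    """Reduce a signal to a tuple of bits: 1 where block energy rises."""
    energies = [
        sum(x * x for x in signal[i:i + block])
        for i in range(0, len(signal) - block + 1, block)
    ]
    return tuple(1 if later > earlier else 0
                 for earlier, later in zip(energies, energies[1:]))

def find_match(query, reference_db, max_distance=1):
    """Return the name of the closest stored abstract within max_distance
    (Hamming distance), or None if nothing is close enough."""
    query_abstract = make_abstract(query)
    best = None
    for name, reference in reference_db.items():
        if len(reference) != len(query_abstract):
            continue
        distance = sum(a != b for a, b in zip(query_abstract, reference))
        if distance <= max_distance and (best is None or distance < best[1]):
            best = (name, distance)
    return best[0] if best else None

# Made-up reference signals stored as abstracts in a reference database.
song_a = [0.1] * 4 + [0.5] * 4 + [0.2] * 4 + [0.9] * 4
song_b = [0.9] * 4 + [0.5] * 4 + [0.2] * 4 + [0.1] * 4
reference_db = {"song_a": make_abstract(song_a), "song_b": make_abstract(song_b)}

# A quieter copy of song_a still yields the same abstract.
query = [0.5 * x for x in song_a]
match = find_match(query, reference_db)
```

Because the abstract depends only on whether energy rises or falls between blocks, it is unaffected by a uniform change in level, which is the kind of robustness a compact signal abstract is meant to provide.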