324 results about How to "Improve speech recognition performance" patented technology

Apparatus and methods for developing conversational applications

Apparatus with accompanying subsystems and methods for developing conversational computer applications. As a user interface, the apparatus allows a user to initiate the conversation. The apparatus also answers simple and complex questions, understands complex requests, prompts the user for further information when a request is incomplete, and in general provides customer support with a human-like conversation while, at the same time, being capable of interacting with a company's proprietary database. As a development tool, the apparatus allows a software developer to implement a conversational system much faster than it takes with current commercial systems to implement basic dialog flows. The apparatus contains three major subsystems: a state transition inference engine, a heuristic answer engine, and a parser generator with semantic augmentations. A main process broker controls the flow and the interaction between the different subsystems. The state transition inference engine handles requests that require processing a transaction or retrieving exact information. The heuristic answer engine answers questions that do not require exact answers, only enough information to fulfill the user's request. The parser generator processes the user's natural language request; that is, it processes the syntactic structure of the request and builds a conceptual structure from it. After the parser generator processes the user's request, the main process broker feeds the conceptual structure to either the heuristic answer engine or the state transition inference engine. The interaction between the main process broker and the subsystems creates a conversational environment between the user and the apparatus, while the apparatus uses information from proprietary databases to provide or process information during the course of the conversation. The apparatus is equipped with a programming interface that allows implementers to declare and specify transaction-based requests and answers to a multiplicity of questions. The apparatus may be used with a speech recognition interface, in which case it improves the recognition results through the context implicitly created by the apparatus.
Owner:GYRUS LOGIC INC
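
The abstract above describes a main process broker that routes a parsed conceptual structure either to a state transition inference engine (transactions, exact look-ups) or to a heuristic answer engine (open-ended questions). The sketch below illustrates that routing decision; all class and attribute names are hypothetical and not taken from the patent.

```python
from dataclasses import dataclass, field

@dataclass
class ConceptualStructure:
    intent: str                       # e.g. "transfer_funds" or "ask_policy"
    slots: dict = field(default_factory=dict)  # extracted entities, possibly incomplete
    is_transactional: bool = False

class MainProcessBroker:
    """Illustrative broker that forwards parsed requests to one of two engines."""

    def __init__(self, inference_engine, answer_engine):
        self.inference_engine = inference_engine  # state transition inference engine
        self.answer_engine = answer_engine        # heuristic answer engine

    def handle(self, structure: ConceptualStructure) -> str:
        # Transactions and exact information retrieval go to the state
        # transition engine; everything else goes to the heuristic answerer.
        if structure.is_transactional:
            return self.inference_engine.process(structure)
        return self.answer_engine.answer(structure)
```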

Method for detecting voice section from time-space by using audio and video information and apparatus thereof

The present invention relates to a method for detecting a voice section in time-space by using audio and video information. According to an embodiment of the present invention, the method comprises the steps of: detecting a voice section in an audio signal input into a microphone array; verifying a speaker from the detected voice section; if the speaker is successfully verified, detecting the face of the speaker by using a video signal input from a camera and then estimating the direction of the speaker's face; and determining the detected voice section to be the voice section of the speaker if the estimated face direction corresponds to a previously stored reference direction.
Owner:KOREA UNIV IND & ACADEMIC CALLABORATION FOUND
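
A minimal sketch of the decision chain described in the abstract, assuming the voice activity detector, speaker verifier, and face-direction estimator are supplied as callables; the tolerance value is an illustrative assumption.

```python
def detect_speaker_voice_section(audio_frames, video_frames,
                                 vad, verify_speaker, estimate_face_direction,
                                 reference_direction_deg, tolerance_deg=15.0):
    """Return True if the detected voice section belongs to the verified speaker."""
    # Step 1: detect a voice section in the microphone-array signal.
    if not vad(audio_frames):
        return False
    # Step 2: verify the speaker from the detected voice section.
    if not verify_speaker(audio_frames):
        return False
    # Step 3: estimate the speaker's face direction from the camera signal.
    face_direction = estimate_face_direction(video_frames)
    if face_direction is None:
        return False
    # Step 4: accept only if the face direction matches the stored reference.
    return abs(face_direction - reference_direction_deg) <= tolerance_deg
```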

Natural language processing for a location-based services system

A method and system for providing natural language processing in a communication system is disclosed. A voice request is generated with a remote terminal and transmitted to a base station. A voice recognition application is then used to identify a plurality of words that are contained in the voice request. After the words are identified, a grammar associated with each word is also identified. Once the grammars have been identified, each word is categorized into a respective grammar category. A structured response to the voice request is then generated with a response generation application.
Owner:ACCENTURE GLOBAL SERVICES LTD
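
The abstract describes categorizing each recognized word into a grammar category and then building a structured response. The sketch below shows that idea on a toy lexicon; the categories, lexicon entries, and response fields are invented for illustration only.

```python
# Toy grammar lexicon mapping recognized words to grammar categories.
GRAMMAR_LEXICON = {
    "find": "action",
    "nearest": "qualifier",
    "restaurant": "place_type",
    "gas": "place_type",
    "station": "place_type",
}

def categorize_words(recognized_words):
    """Map each recognized word to its grammar category (None if unknown)."""
    return {word: GRAMMAR_LEXICON.get(word.lower()) for word in recognized_words}

def build_structured_response(categories):
    """Assemble a structured query from the categorized words."""
    place = " ".join(w for w, c in categories.items() if c == "place_type")
    return {"query_type": "location_search", "target": place or None}

# Example: a voice request decoded as "find nearest gas station".
words = ["find", "nearest", "gas", "station"]
print(build_structured_response(categorize_words(words)))
```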

Method of recognizing spoken language with recognition of language color

In accordance with the present invention, speech recognition is disclosed. It uses a microphone to receive audible sounds input by a user into a first computing device having a program with a database consisting of (i) digital representations of known audible sounds and associated alphanumeric representations of the known audible sounds and (ii) digital representations of known audible sounds corresponding to mispronunciations resulting from known classes of mispronounced words and phrases. The method is performed by receiving the audible sounds in the form of the electrical output of the microphone. A particular audible sound to be recognized is converted into a digital representation of the audible sound. The digital representation of the particular audible sound is then compared to the digital representations of the known audible sounds to determine which of those known audible sounds is most likely to be the particular audible sound. A speech recognition output consisting of the alphanumeric representation associated with that most likely audible sound is then produced. An error indication is then received from the user indicating that there is an error in recognition. The user also indicates the proper alphanumeric representation of the particular audible sound. This allows the system to determine whether the error is the result of a known type or instance of mispronunciation. In response to a determination that the error corresponds to a known type or instance of mispronunciation, the system presents an interactive training program from the computer to the user to enable the user to correct such mispronunciation.
Owner:LESSAC TECH INC
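
A rough sketch of the database comparison described above, using a nearest-neighbour match over feature vectors where some entries are tagged with a mispronunciation class. The feature vectors, distance metric, and example entries are illustrative assumptions, not the patent's actual sound representation.

```python
import math

# Each entry: (feature_vector, alphanumeric_text, mispronunciation_class or None).
SOUND_DB = [
    ([0.1, 0.8, 0.3], "ask", None),
    ([0.2, 0.7, 0.4], "ask", "metathesis: 'aks'"),
    ([0.9, 0.1, 0.5], "nuclear", None),
    ([0.8, 0.2, 0.6], "nuclear", "epenthesis: 'nucular'"),
]

def recognize(feature_vector):
    """Return the most likely text and, if applicable, its mispronunciation class."""
    _, text, mispronunciation = min(
        SOUND_DB, key=lambda entry: math.dist(entry[0], feature_vector)
    )
    return text, mispronunciation

text, mis = recognize([0.21, 0.69, 0.41])
if mis is not None:
    # A known mispronunciation was matched, so interactive training could be offered.
    print(f"Recognized '{text}' via known mispronunciation ({mis}); offer training.")
```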

Collaboration of multiple automatic speech recognition (ASR) systems

A system and method for collaborating multiple ASR (automatic speech recognition) systems. The system and method analyze voice data on various computers that have speech recognition residing on them. The speech recognition residing on the various computers may be different systems. The speech recognition systems detect voice data and recognize their respective masters. The master computer, as well as those computers which did not recognize their master, may analyze (evaluate) the voice data and then integrate the analyzed voice data into a single decoded output. In this manner, many different speakers, utilizing the system and method for collaborating multiple ASR systems, may have their voice data analyzed and integrated into a single decoded output, regardless of the ASR systems used.
Owner:IBM CORP
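
The patent integrates analyzed voice data from several ASR systems into a single decoded output. A minimal sketch of one possible integration rule, confidence-weighted voting over transcripts, is shown below; the scoring scheme is an assumption, since the abstract does not fix a particular combination method.

```python
from collections import defaultdict

def integrate_hypotheses(hypotheses):
    """hypotheses: list of (transcript, confidence) pairs, one per ASR system."""
    scores = defaultdict(float)
    for transcript, confidence in hypotheses:
        scores[transcript] += confidence
    # The transcript with the highest accumulated confidence wins.
    return max(scores.items(), key=lambda kv: kv[1])[0]

# Example: three ASR engines decode the same utterance.
print(integrate_hypotheses([
    ("turn on the lights", 0.82),
    ("turn on the light", 0.55),
    ("turn on the lights", 0.74),
]))
```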

Artificial cochlea speech processing method based on frequency modulation information and artificial cochlea speech processor

The invention provides an artificial cochlea speech processing method based on frequency modulation information and an artificial cochlea speech processor. The artificial cochlea speech processing method comprises the following steps: pre-emphasizing a speech signal; decomposing the speech signal with an analysis filter into a plurality of sub frequency bands; extracting the time-domain envelope information of each sub-frequency-band signal; adopting a Hilbert transform method to extract the frequency modulation information of the low-frequency part and multiplying it by the time-domain envelopes so as to acquire a synthetic time-domain envelope containing the frequency modulation information; using the acquired time-domain envelopes of the sub frequency bands to modulate a pulse sequence from a pulse generator; adding the modulated pulses of the sub frequency bands to acquire a finally synthesized stimulus signal; and sending the stimulus signal to an electrode to generate an electric pulse that stimulates the auditory nerve. The artificial cochlea speech processor is suitable for deaf patients whose native language is Chinese to recognize speech in a noisy environment and has noise robustness, thereby enabling the patients to perceive finer speech structure information, enhancing their speech recognition ability in noisy environments and benefiting tone recognition.
Owner:INST OF ACOUSTICS CHINESE ACAD OF SCI +1
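
A hedged sketch of the per-band envelope plus low-frequency FM extraction described above, using a Hilbert transform on one analysis band. The filter orders, band edges, FM low-pass cutoff, and the way the slow FM carrier is folded back into the envelope are illustrative assumptions, not the patent's exact parameters.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert

def band_envelope_with_fm(signal, fs, low_hz, high_hz, fm_cutoff_hz=400.0):
    """Return a synthetic time-domain envelope carrying slow FM information."""
    # Band-pass the speech signal into one analysis band.
    sos = butter(4, [low_hz, high_hz], btype="bandpass", fs=fs, output="sos")
    band = sosfiltfilt(sos, signal)

    # Hilbert transform gives the analytic signal: envelope and phase.
    analytic = hilbert(band)
    envelope = np.abs(analytic)                       # time-domain envelope
    phase = np.unwrap(np.angle(analytic))
    inst_freq = np.diff(phase) / (2 * np.pi) * fs     # instantaneous frequency (FM)
    inst_freq = np.append(inst_freq, inst_freq[-1])

    # Keep only slow FM variations and fold them back into the envelope.
    fm_sos = butter(2, fm_cutoff_hz, btype="lowpass", fs=fs, output="sos")
    slow_fm = sosfiltfilt(fm_sos, inst_freq)
    fm_carrier = np.cos(2 * np.pi * np.cumsum(slow_fm) / fs)
    return envelope * (1.0 + 0.5 * fm_carrier)        # synthetic FM-bearing envelope
```

In a full processor this would be run once per sub frequency band, and the resulting envelopes would modulate the per-band pulse sequences before summation.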

Method for recognizing speech/speaker using emotional change to govern unsupervised adaptation

To improve the performance and the recognition rate of a method for recognizing speech in a dialogue system or the like, it is suggested to derive emotion information data (EID) from the speech input (SI) that is descriptive of an emotional state of the speaker, or a change thereof, based upon which a process of recognition is chosen and/or designed.
Owner:SONY DEUT GMBH
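
A tiny sketch of using emotion information data (EID) to govern which recognition process is applied; the emotion classifier, the EID fields, and the decision threshold are assumptions made for illustration.

```python
def choose_recognition_process(speech_input, classify_emotion, models):
    """Pick a recognition model / adaptation policy from the speaker's emotional state."""
    eid = classify_emotion(speech_input)  # e.g. {"state": "angry", "change": 0.7}
    if eid["change"] > 0.5:
        # A strong emotional change suggests earlier unsupervised adaptation may
        # no longer fit, so fall back to the speaker-independent model.
        return models["speaker_independent"]
    return models.get(eid["state"], models["neutral"])
```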