270 results about "Speech generation" patented technology

Speech generation. Speech generation and recognition are used to communicate between humans and machines. Rather than using your hands and eyes, you use your mouth and ears. This is very convenient when your hands and eyes should be doing something else, such as driving a car, performing surgery, or (unfortunately) firing your weapons at the enemy.

Speech generation device with a projected display and optical inputs

In several embodiments, a speech generation device is disclosed. The speech generation device may generally include a projector configured to project images in the form of a projected display onto a projection surface, an optical input device configured to detect an input directed towards the projected display, and a speaker configured to generate an audio output. In addition, the speech generation device may include a processing unit communicatively coupled to the projector, the optical input device and the speaker. The processing unit may include a processor and a related computer-readable medium configured to store instructions executable by the processor, wherein the instructions stored on the computer-readable medium configure the speech generation device to generate text-to-speech output.
Owner:TOBII DYNAVOX AB

Method and apparatus for improved duration modeling of phonemes

A method and an apparatus for improved duration modeling of phonemes in a speech synthesis system are provided. According to one aspect, text is received into a processor of a speech synthesis system. The received text is processed using a sum-of-products phoneme duration model that is used in either the formant method or the concatenative method of speech generation. The phoneme duration model, which is used along with a phoneme pitch model, is produced by developing a non-exponential functional transformation form for use with a generalized additive model. The non-exponential functional transformation form comprises a root sinusoidal transformation that is controlled in response to a minimum phoneme duration and a maximum phoneme duration. The minimum and maximum phoneme durations are observed in training data. The received text is processed by specifying at least one of a number of contextual factors for the generalized additive model. An inverse of the non-exponential functional transformation is applied to duration observations, or training data. Coefficients are generated for use with the generalized additive model. The generalized additive model comprising the coefficients is applied to at least one phoneme of the received text resulting in the generation of at least one phoneme having a duration. An acoustic sequence is generated comprising speech signals that are representative of the received text.
Owner:APPLE INC

Voice dialing using text names

Methods and apparatus for implementing communication services such as voice dialing services are described. In one Centrex-based voice dialing embodiment, voice dialing service subscribers are given access to personal voice dialing records, including calling entries, via the Internet as well as via telephone connections. Each calling entry normally includes the name and, optionally, the nickname of a party to be called. It also includes one or more telephone numbers associated with each name. Different telephone number identifiers, e.g., locations, can be associated with different names. A user can create or update entries in a voice dialing directory using text conveyed over the Internet or speech supplied via a telephone connection. To facilitate updating and maintenance of voice dialing directories over the Internet, speaker-independent (SI) speech recognition models are used. When a calling entry is created via the Internet, the text of the name is processed to generate a corresponding speech recognition model therefrom. When an entry is created via speech obtained over the telephone, a speech recognition model is generated from the speech and a text name is generated using speech-to-text technology. To avoid having to hang up and initiate a new voice dialing call, the outcome of a voice dialing call is monitored and the subscriber is given the opportunity to initiate another call using voice dialing if the first call did not complete successfully, e.g., goes unanswered.
Owner:GOOGLE LLC
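
The calling-entry structure described in the abstract above (a name, an optional nickname, and one or more telephone numbers distinguished by identifiers such as locations) can be sketched as a small data structure. This is only an illustration, not the patented implementation: the class names `CallingEntry` and `VoiceDialingDirectory`, the methods `upsert` and `lookup`, and the sample numbers are all hypothetical, and speech recognition is assumed to have already produced the spoken name as text.

```python
from dataclasses import dataclass, field


@dataclass
class CallingEntry:
    """One directory entry (hypothetical layout, not taken from the patent)."""
    name: str
    nickname: str = ""
    # Maps a telephone number identifier (e.g. a location) to a number.
    numbers: dict = field(default_factory=dict)


class VoiceDialingDirectory:
    """Toy directory keyed by lower-cased name and nickname."""

    def __init__(self):
        self._entries = {}

    def upsert(self, entry):
        # Create or update the entry under both its name and its nickname.
        for key in (entry.name, entry.nickname):
            if key:
                self._entries[key.strip().lower()] = entry

    def lookup(self, spoken_name, location=None):
        # spoken_name is assumed to be the text output of a recognizer.
        entry = self._entries.get(spoken_name.strip().lower())
        if entry is None or not entry.numbers:
            return None
        if location is not None:
            return entry.numbers.get(location)
        # No identifier given: fall back to the first stored number.
        return next(iter(entry.numbers.values()))


directory = VoiceDialingDirectory()
directory.upsert(
    CallingEntry("Robert Smith", "Bob", {"home": "555-0100", "work": "555-0101"})
)
```

Calling `directory.lookup("bob", "work")` resolves the nickname and the location identifier to a single number, while an unknown name yields `None`.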

System and method of media file access and retrieval using speech recognition

An embedded device for playing media files is capable of generating a play list of media files based on input speech from a user. It includes an indexer generating a plurality of speech recognition grammars. According to one aspect of the invention, the indexer generates speech recognition grammars based on contents of a media file header of the media file. According to another aspect of the invention, the indexer generates speech recognition grammars based on categories in a file path for retrieving the media file to a user location. When a speech recognizer receives an input speech from a user while in a selection mode, a media file selector compares the input speech received while in the selection mode to the plurality of speech recognition grammars, thereby selecting the media file.
Owner:INTERTRUST TECH CORP
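
The indexing idea in the abstract above, where grammars are derived from the media file header and from categories in the file path, can be sketched roughly as follows. This is a simplified illustration under several assumptions: a "grammar" is reduced to a set of lower-cased phrases, the speech recognizer is out of scope (its output is passed in as text), and the function names `build_grammars`, `build_index`, and `select_files` as well as the sample library are hypothetical.

```python
from pathlib import PurePosixPath


def build_grammars(path, header):
    """Return the set of lower-cased phrases that should select this file."""
    phrases = set()
    # Header-derived phrases, e.g. title and artist fields.
    for value in header.values():
        if value:
            phrases.add(value.strip().lower())
    # Path-derived phrases: each directory name is treated as a category.
    for part in PurePosixPath(path).parent.parts:
        if part != "/":
            phrases.add(part.lower())
    return phrases


def build_index(library):
    """Map each phrase to the list of file paths it can select."""
    index = {}
    for path, header in library.items():
        for phrase in build_grammars(path, header):
            index.setdefault(phrase, []).append(path)
    return index


def select_files(index, recognized_text):
    """Return a play list for text that a recognizer has already produced."""
    return sorted(index.get(recognized_text.strip().lower(), []))


# Hypothetical library: file path -> header fields.
library = {
    "/music/jazz/track01.mp3": {"title": "Blue Morning", "artist": "Example Trio"},
    "/music/jazz/track02.mp3": {"title": "Late Set", "artist": "Example Trio"},
    "/music/rock/track03.mp3": {"title": "Loud Road", "artist": "Sample Band"},
}
index = build_index(library)
```

With this index, saying a path category such as "jazz" selects both files under that directory, while a header phrase such as "loud road" selects a single file.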

Speech generation device with a head mounted display unit

A speech generation device is disclosed. The speech generation device may include a head mounted display unit having a variety of different components that enhance the functionality of the speech generation device. The speech generation device may further include a computer-readable medium storing instructions that, when executed by a processor, instruct the speech generation device to perform desired functions.
Owner:DYNAVOX SYST

Television voice voting method, television voice voting system and television voice voting terminal

Inactive | CN103067754A | The convenient way to participate in TV voting activities | Realize the function of voice TV voting | Selective content distribution | Computer hardware | User participation
The invention discloses a television voice voting method, a television voice voting system and a television voice voting terminal. The television voice voting method comprises the following steps: a vote event is formed by the voting system and sent to a digital television terminal, wherein the vote event comprises a vote event identification (ID), vote content information and validity period information; the vote event is received by the digital television terminal, and a vote event within its validity period is displayed; voice input is received, the voice is recognized, and voice voting information is generated and sent to the voting system; the voting result is updated by the voting system according to the voice voting information, and the voting result is sent to the digital television terminal and displayed. Because the digital television terminal receives the vote event sent by the voting system, recognizes the voice voting information of a user, sends the vote chosen by the user to the voting system, and receives and displays the updated voting result sent back by the system, the function of voice television voting is achieved. The way in which users participate in a television voting activity becomes more convenient, faster and more interesting.
Owner:SHENZHEN COSHIP ELECTRONICS CO LTD
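
The message flow in the abstract above (a vote event carrying an ID, vote content and a validity period; a recognized voice vote; an updated result) can be sketched as below. This is a minimal illustration under stated assumptions, not the patented system: the names `VoteEvent` and `cast_vote` are hypothetical, the voting system and terminal are collapsed into one process, and voice recognition is assumed to have already turned the spoken vote into text.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class VoteEvent:
    """Vote event with an ID, content (options) and a validity period."""
    event_id: str
    options: list
    valid_from: datetime
    valid_until: datetime
    tally: dict = field(default_factory=dict)

    def is_valid(self, now):
        # The terminal only shows, and accepts votes for, events in this window.
        return self.valid_from <= now <= self.valid_until


def cast_vote(event, recognized_text, now):
    """Update and return the tally.

    Votes outside the validity period, or for options that are not part
    of the event, are ignored.
    """
    choice = recognized_text.strip().lower()
    options = {option.lower(): option for option in event.options}
    if event.is_valid(now) and choice in options:
        name = options[choice]
        event.tally[name] = event.tally.get(name, 0) + 1
    return dict(event.tally)


event = VoteEvent(
    event_id="e1",
    options=["Alice", "Bob"],
    valid_from=datetime(2024, 1, 1),
    valid_until=datetime(2024, 1, 2),
)
```

The returned dictionary plays the role of the updated voting result that the system sends back to the terminal for display.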

System for treating disabilities such as dyslexia by enhancing holistic speech perception

The present invention relates to systems and methods for enhancing the holistic and temporal speech perception processes of a learning-impaired subject. A subject listens to a sound stimulus which induces the perception of verbal transformations. The subject records the verbal transformations, which are then used to create further sound stimuli in the form of semantic-like phrases and an imaginary story. Exposure to the sound stimuli enhances holistic speech perception of the subject with cross-modal benefits to speech production, reading and writing. The present invention has application to a wide range of impairments including Specific Language Impairment, language learning disabilities, dyslexia, autism, dementia and Alzheimer's.
Owner:EPOCH INNOVATIONS

Voice generation method and device based on generative adversarial network

The invention discloses a voice generation method based on a generative adversarial network. According to the method, randomly generated noise data following a Gaussian distribution is converted into a simulation sample by a generative model. Because the simulation sample has no language content, when the generative model and a discrimination model are updated in alternation, the generative capacity that the generative model must learn and the discrimination capacity that the discrimination model must learn both increase, and accordingly the generative capacity of the generative model and the discrimination capacity of the discrimination model are improved. When a contrast value between a training sample and the simulation sample is smaller than or equal to a preset threshold value, the generative model is considered to have the capacity to generate realistic data. A voice database generated by the generative model is sufficiently realistic, and the recognition rate can be increased when the generative model is applied to identity recognition. Correspondingly, a voice generation device, voice generation equipment and a computer-readable storage medium based on the generative adversarial network have the same advantages.
Owner:SPEAKIN TECH CO LTD

Voice authentication system

A voice authentication system includes: a standard template storage part 17 in which a standard template that is generated from a registered voice of an authorized user and characterized by a voice characteristic of the registered voice is stored in advance in association with a personal ID of the authorized user; an identifier input part 15 that allows a user who intends to be authenticated to input a personal ID; a voice input part 11 that allows the user to input a voice; a standard template / registered voice selection part 16 that selects a standard template and a registered voice corresponding to the inputted identifier; a determination part 14 that refers to the selected standard template and determines, by referring to a predetermined determination reference, whether or not the inputted voice is a voice of the authorized user himself or herself and whether or not presentation-use information is to be outputted; a presentation-use information extraction part 19 that extracts information regarding the registered voice of the authorized user corresponding to the inputted identifier; and a presentation-use information output part 18 that presents the presentation-use information to the user in the case where it is determined by the determination part that the presentation-use information is to be outputted to the user.
Owner:FUJITSU LTD
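
The determination step in the abstract above can be sketched as a comparison of the inputted voice against the stored template, followed by two decisions: whether the user is authenticated, and whether presentation-use information should be shown. Everything specific here is an assumption made for illustration: the abstract does not say how voices are compared or what the determination reference is, so this sketch uses cosine similarity over toy feature vectors and two hypothetical thresholds (`accept` and `present`); the function name `authenticate` is likewise hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def authenticate(templates, personal_id, voice_features, accept=0.9, present=0.6):
    """Decide authentication and whether to present registered-voice info.

    Assumed rule (not from the source): accept when the score reaches
    `accept`; otherwise present information about the registered voice
    when the score still reaches `present`.
    """
    template = templates.get(personal_id)
    if template is None:
        return {"authenticated": False, "present_info": False}
    score = cosine_similarity(template, voice_features)
    authenticated = score >= accept
    return {
        "authenticated": authenticated,
        "present_info": (not authenticated) and score >= present,
    }


# Hypothetical template store: personal ID -> feature vector.
templates = {"u1": [1.0, 0.0]}
```

A near miss, such as `authenticate(templates, "u1", [1.0, 1.0])`, is rejected but flagged for presentation-use information, which mirrors the idea of helping a legitimate user reproduce the registered voice.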

Context-aware augmented communication

Systems and methods of providing electronic features for creating context-aware vocabulary suggestions for an electronic device include providing a graphical user interface design area having a plurality of display elements. An electronic device user may be provided with automated context-aware analysis of information from multiple sources, including GPS, compass, speaker identification (i.e., voice recognition), facial identification, speech content determination, user specifications, speech output monitoring, and software navigation monitoring, to provide a selectable display of suggested vocabulary, previously stored words and phrases, or a keyboard as input options to create messages for text display and/or speech generation. The user may, optionally, manually specify a context.
Owner:DYNAVOX SYST