304 results about "Speech rate" patented technology

Method for automatic evaluation based on generalized fluent spoken language fluency

Active | CN101740024A | Troubleshoot automated assessment issues | Fast scoring | Speech recognition | Data dredging | Spoken language
The invention relates to a method for automatically evaluating spoken language fluency based on generalized fluency, comprising the following steps: acquiring speech data from speakers of different ages and spoken language levels using a speech input device; adopting an evaluation model based on generalized fluency characteristics and trained by machine learning; configuring a speech recognition system with parameters matched to the scripts of different subjects and the genders of the speakers in the speech data; quantifying speech speed coherence, content understanding, advanced skills and reconstruction standard characteristics in the speech data, so that fluency characteristics are extracted comprehensively from the perspective of expert assessment; and applying regression fitting analysis and a decision tree method from data mining to detect abnormal fluency, grade fluency and diagnose it. The resulting machine fluency score approaches the level of expert graders, with a correlation index exceeding that of 2 to 3 of 5 typical experts. The method is also fast and can be embedded in an automatic spoken language evaluation system as a module for evaluating the fluency index of pronunciation quality.
Owner:IFLYTEK CO LTD

Device and method for acquiring speech recognition multi-information text

The invention provides a device and a method for acquiring a speech recognition multi-information text. After speech audio is converted into plain text by speech recognition, the pronunciation speed, pronunciation strength and pronunciation intonation of each individual character in the audio are integrated into the initially generated plain text in a certain form of expression to generate multi-information text. The device and the method can be widely used on information release platforms such as micro blogs, short messages, signature files and the like.
Owner:SHANGHAI GUOKE ELECTRONICS

System and method for predicting prosodic parameters

A method for generating a prosody model that predicts prosodic parameters is disclosed. Upon receiving text annotated with acoustic features, the method comprises generating first classification and regression trees (CARTs) that predict durations and F0 from text by generating initial boundary labels by considering pauses, generating initial accent labels by applying a simple rule on text-derived features only, adding the predicted accent and boundary labels to feature vectors, and using the feature vectors to generate the first CARTs. The first CARTs are used to predict accent and boundary labels. Next, the first CARTs are used to generate second CARTs that predict durations and F0 from text and acoustic features by using lengthened accented syllables and phrase-final syllables, refining accent and boundary models simultaneously, comparing actual and predicted duration of a whole prosodic phrase to normalize speaking rate, and generating the second CARTs that predict the normalized speaking rate.
Owner:CERENCE OPERATING CO

System and method for audio hot spotting

Audio hot spotting is accomplished by specifying query criterion to include a non-lexical audio cue. The non-lexical audio cue can be, e.g., speech rate, laughter, applause, vocal effort, speaker change or any combination thereof. The query criterion is retrieved from an audio portion of a file. A segment of the file containing the query criterion can be provided to a user. The duration of the provided segment can be specified by the user along with the files to be searched. A list of detections of the query criterion within the file can also be provided to the user. Searches can be refined by the query criterion additionally including a lexical audio-cue. A keyword index of topic terms contained in the file can also be provided to the user.
Owner:MITRE SPORTS INT LTD
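
A query on non-lexical cues such as those described above can be pictured as a filter over pre-analysed audio segments. The following is only a minimal sketch under assumptions: the `Segment` record, the words-per-second measure and the `hot_spots` function are hypothetical names introduced for illustration and are not taken from the patent.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """One pre-analysed stretch of audio (hypothetical record layout)."""
    start: float                      # seconds
    end: float                        # seconds
    word_count: int                   # words recognised in the segment
    cues: frozenset = frozenset()     # detected cues, e.g. {"laughter"}


def words_per_second(seg: Segment) -> float:
    """Crude speech-rate measure; returns 0.0 for empty segments."""
    duration = seg.end - seg.start
    return seg.word_count / duration if duration > 0 else 0.0


def hot_spots(segments, min_rate=None, required_cues=()):
    """Return segments that satisfy every given non-lexical criterion."""
    required = set(required_cues)
    hits = []
    for seg in segments:
        if min_rate is not None and words_per_second(seg) < min_rate:
            continue                  # speech too slow for this query
        if not required <= set(seg.cues):
            continue                  # a required cue is missing
        hits.append(seg)
    return hits
```

Combining criteria, for example `hot_spots(segments, min_rate=5.0, required_cues=["laughter"])`, narrows the result list in the same spirit as the "any combination thereof" wording of the abstract.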

Method of automatically classifying speaking rate and speech recognition system using the same

Provided are a method of automatically classifying a speaking rate and a speech recognition system using the method. The speech recognition system using automatic speaking rate classification includes a speech recognizer configured to extract word lattice information by performing speech recognition on an input speech signal, a speaking rate estimator configured to estimate word-specific speaking rates using the word lattice information, a speaking rate normalizer configured to normalize a word-specific speaking rate into a normal speaking rate when the word-specific speaking rate deviates from a preset range, and a rescoring section configured to rescore the speech signal whose speaking rate has been normalized.
Owner:ELECTRONICS & TELECOMM RES INST
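
The estimate-then-normalize idea in the abstract above can be illustrated with a short sketch. It assumes that word timings and syllable counts are already available (the patent derives its estimates from word lattice information). The `Word` record, the syllables-per-second unit and the 3.0 to 6.0 range are illustrative assumptions, not values from the patent.

```python
from dataclasses import dataclass


@dataclass
class Word:
    """A recognised word with timing (hypothetical record layout)."""
    text: str
    start: float      # seconds
    end: float        # seconds
    syllables: int    # assumed to come from a pronunciation lexicon


def speaking_rate(word: Word) -> float:
    """Word-specific speaking rate in syllables per second."""
    duration = word.end - word.start
    if duration <= 0:
        raise ValueError("word must have positive duration")
    return word.syllables / duration


def stretch_factor(rate: float, low: float = 3.0, high: float = 6.0) -> float:
    """Time-stretch factor that moves an out-of-range rate to the nearest
    bound of the preset range; 1.0 means the word is left unchanged."""
    if rate > high:
        return rate / high    # > 1: lengthen fast speech
    if rate < low:
        return rate / low     # < 1: shorten slow speech
    return 1.0
```

Stretching a word's duration by the returned factor divides its rate by the same factor, so a word spoken at 10 syllables per second would be brought back to the assumed upper bound of 6 before rescoring.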

Methods and devices for treating non-stuttering speech-language disorders using delayed auditory feedback

Methods, devices and systems treat non-stuttering speech and / or language related disorders by administering a delayed auditory feedback signal having a delay of under about 200 ms via a portable device. The DAF treatment may be delivered on a chronic basis. For certain disorders, such as Parkinson's disease, the delay is set to be under about 100 ms, and may be set to be even shorter such as about 50 ms or less. Certain methods treat cluttering (an abnormally fast speech rate) by exposing the individual to a DAF signal having a sufficient delay that automatically causes the individual to slow his or her speech rate.
Owner:EAST CAROLINA UNIVERSITY
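
Delayed auditory feedback amounts to playing the speaker's own voice back after a short fixed delay. The sketch below shows one simple way to model this as a delay line over a list of samples. The function name, the 50 ms default and the list-based interface are assumptions for illustration; a real device would process audio in real time.

```python
from collections import deque


def delayed_feedback(samples, sample_rate, delay_ms=50.0):
    """Return the input delayed by delay_ms, padded with leading silence."""
    delay_samples = int(round(sample_rate * delay_ms / 1000.0))
    buffer = deque([0.0] * delay_samples)   # pre-filled with silence
    output = []
    for sample in samples:
        buffer.append(sample)               # newest sample enters the line
        output.append(buffer.popleft())     # oldest sample is played back
    return output
```

The output has the same length as the input, so the last `delay_samples` input samples are still inside the buffer when the function returns.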

Translation method and translation system based on intelligent hardware

The invention discloses a translation method based on intelligent hardware, and the method comprises the following steps: S1, obtaining audio, image, video or text information, and converting the audio, image and video into text content; S2, translating the obtained text information or text content into second language text through an online or offline translation engine; S3, carrying out automatic identification of knowledge base knowledge points on the text information, or on keywords or semantics of the text content, before and after translation, and intelligently prejudging a use scene; S4, automatically or manually selecting the tone of the phonetic bank and adjusting the speed and tone according to the pre-judged use scene; S5, broadcasting the translation result by voice. Information is transmitted by using a wireless transmission technology, translation is completed by combining applications of new technologies such as a voice transfer technology, an image recognition character technology, a translation engine and the like, meanwhile, storage, playback and sharing functions are provided for a user, and scene extension of the user and continuous optimization of a product are also realized.
Owner:广州市讯飞樽鸿信息技术有限公司

Method for realizing sound speed-variation without tone variation and system for realizing speed variation and tone variation

The invention discloses a system for realizing sound speed variation and tone variation, which comprises an input cache module, a tone variation processing module, a speed-variation no-tone-variation processing module and a data output module, wherein the input cache module is used for reading the sound signal data to be processed into the cache; the tone variation processing module is used for carrying out the tone variation processing on the sound signal to change the sound tone; the speed-variation no-tone-variation processing module is used for carrying out the speed-variation no-tone-variation processing on the sound signal, thereby changing the sound speed without changing the tone; and the data output module is used for outputting the speed-variation tone-variation signal. The speed-variation no-tone-variation processing module comprises a segmentation data module and a connection data module, wherein the speed-variation no-tone-variation processing module extracts a string of signal subfamilies (namely small sections of sound) from the original speech signal according to the coefficient of variation in speed by using a window function; and the connection data module connects the signal subfamilies according to the time sequence, thereby obtaining the speed-variation no-tone-variation signal. The invention realizes the speed-variation no-tone-variation function and the speed-variation tone-variation function of the audio frequency by using very low algorithm complexity, and does not introduce noise, thereby enhancing the quality of the processed sound.
Owner:刘盛举 +1
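
The segment-and-reconnect scheme in the abstract above resembles classic overlap-add time stretching: windowed frames are read from the input at one hop size and written to the output at another. The following is a minimal sketch of plain overlap-add with a Hann window, offered as a stand-in and not as the patented algorithm. The function names, the 256-sample frame and the 50% output overlap are assumptions, and plain overlap-add can produce audible artifacts on pitched material.

```python
import math


def hann(n):
    """Periodic Hann window of length n."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / n) for i in range(n)]


def ola_time_stretch(signal, speed, frame=256):
    """Change duration without resampling; speed > 1 shortens the output."""
    if speed <= 0:
        raise ValueError("speed must be positive")
    hop_out = frame // 2                      # output hop: 50% overlap
    hop_in = max(1, round(hop_out * speed))   # input hop sets the speed
    window = hann(frame)
    n_frames = max(1, (len(signal) - frame) // hop_in + 1)
    out_len = (n_frames - 1) * hop_out + frame
    out = [0.0] * out_len
    norm = [0.0] * out_len                    # summed window weights
    for k in range(n_frames):
        src, dst = k * hop_in, k * hop_out
        for i in range(frame):
            if src + i >= len(signal):
                break                         # ran past the end of the input
            out[dst + i] += signal[src + i] * window[i]
            norm[dst + i] += window[i]
    # Divide by the summed weights so overlapping frames keep unit gain.
    return [o / w if w > 1e-8 else 0.0 for o, w in zip(out, norm)]
```

Because frames are copied rather than resampled, the pitch of each frame is preserved while the overall duration changes. Methods such as WSOLA additionally align frames before adding them to reduce artifacts, which this sketch does not attempt.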

Method and device for providing voice service

The application discloses a method and device for providing a voice service. The method for providing the voice service in a specific implementation mode includes the steps of acquiring a voice input signal; analyzing a time domain waveform of the voice input signal to determine current speech rate information of the voice input signal; comparing the current speech rate information with an obtained standard speech rate information set of a user that sends out the voice input signal, and determining first demand information from a preset demand information set according to the comparison result, the standard speech rate information set including at least one piece of standard speech rate information, and the preset demand information set including demand information corresponding to each piece of standard speech rate information in the standard speech rate information set; and generating a voice response signal according to the first demand information and second demand information obtained by analyzing the voice input signal. The embodiment can improve the matching degree between the voice service and the user's potential demand, and achieve a more flexible and accurate voice service.
Owner:BAIDU ONLINE NETWORK TECH (BEIJING) CO LTD
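
The comparison step in the abstract above can be sketched as a lookup from a user's standard speech rates to preset demand labels. The abstract does not say how the comparison is made, so the nearest-match rule below is an assumption, as are the function name, the syllables-per-second unit and the example labels.

```python
def first_demand(current_rate, demand_by_standard_rate):
    """Return the demand mapped to the standard rate closest to the
    current rate (nearest-match rule assumed for illustration)."""
    if not demand_by_standard_rate:
        raise ValueError("standard speech rate set must not be empty")
    nearest = min(demand_by_standard_rate,
                  key=lambda rate: abs(rate - current_rate))
    return demand_by_standard_rate[nearest]


# Hypothetical per-user profile: standard rate -> demand information.
profile = {3.0: "relaxed", 5.0: "normal", 7.0: "urgent"}
demand = first_demand(6.8, profile)   # closest standard rate is 7.0
```

In the described method, this first demand would then be combined with the second demand obtained from the content of the utterance before a response is generated.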