74 results about "Phoneme recognition" patented technology

Phoneme recognition is carried out using the acoustic model. The acoustic model is created using machine learning algorithms. The machine learning is divided into two phases: training and testing.

Method for Automated Training of a Plurality of Artificial Neural Networks

The invention provides a method for automated training of a plurality of artificial neural networks for phoneme recognition using training data. The training data comprises speech signals subdivided into frames, each frame associated with a phoneme label that indicates the phoneme associated with the frame. A sequence of frames from the training data is provided, wherein the number of frames in the sequence is at least equal to the number of artificial neural networks. Each of the artificial neural networks is assigned a different subsequence of the provided sequence, wherein each subsequence comprises a predetermined number of frames. A common phoneme label for the sequence of frames is determined based on the phoneme labels of one or more frames of one or more subsequences of the provided sequence. Each artificial neural network is then trained using the common phoneme label.
Owner:CERENCE OPERATING CO
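
The data preparation described in this abstract might be sketched as follows. This is only an illustration under stated assumptions: the abstract does not say how subsequences are laid out or how the common label is chosen, so the non-overlapping consecutive chunks and the majority vote below are assumptions, and the names `assign_subsequences` and `common_label` are hypothetical.

```python
from collections import Counter

def assign_subsequences(frames, num_networks, sub_len):
    """Give network i its own chunk of sub_len consecutive frames.

    Assumption: chunks are consecutive and non-overlapping; the abstract only
    requires that each network gets a different subsequence of fixed length.
    """
    if len(frames) < num_networks * sub_len:
        raise ValueError("not enough frames for the requested layout")
    return [frames[i * sub_len:(i + 1) * sub_len] for i in range(num_networks)]

def common_label(subsequences):
    """Pick one label for the whole sequence by majority vote (an assumption)."""
    counts = Counter(label for sub in subsequences for _, label in sub)
    return counts.most_common(1)[0][0]

# Each frame is a (feature, phoneme_label) pair; features are placeholders here.
labels = ["a", "a", "a", "b", "b", "a"]
frames = [(f"feat{i}", lab) for i, lab in enumerate(labels)]

subs = assign_subsequences(frames, num_networks=3, sub_len=2)
label = common_label(subs)

# Every network would be trained on its own subsequence with the shared label.
training_examples = [(sub, label) for sub in subs]
```

Here three networks each receive two frames, and all three share the label chosen by the vote, so a network whose own frames are mostly labelled "b" is still trained toward the common target.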

Network teaching method and system with voice assessment function

The invention provides a network teaching method and system with a voice assessment function. In the voice assessment method, phoneme states of the speech replace the Gaussian mixture model conventionally trained on Mel-frequency cepstral coefficients (MFCCs), and posterior probabilities and zeroth-order Baum-Welch statistics are calculated from this feature. Phoneme-based speech features are extracted with a multi-language phoneme recognizer. Features extracted across multiple languages are complementary in capturing non-native pronunciation information, and features based on phoneme duration are effective for automatic native-accent assessment. Finally, the method provides a fusion system that reaches Spearman correlation coefficients of 0.5706 and 0.6089 on the development set and the test set, respectively. The correlation coefficients indicate that the method is accurate and effective for oral speech assessment.
Owner:SHENZHEN EAGLESOUL EDUCATION SERVICE CO LTD
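
For readers unfamiliar with the reported metric, a Spearman correlation coefficient can be computed as the Pearson correlation of the ranks of two score lists. The sketch below is a generic, standard-library illustration, not the patent's evaluation code; the human and machine scores are made-up example values.

```python
from math import sqrt

def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        rank = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = rank
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)  # undefined (division by zero) if a list is constant

human = [1, 2, 3, 4, 5]               # illustrative expert ratings
machine = [2.1, 1.8, 4.0, 3.5, 4.9]   # illustrative system scores
rho = spearman(human, machine)
```

A value near 1 means the system orders speakers almost exactly as the human raters do; the 0.57 to 0.61 range reported in the abstract indicates a moderate positive agreement in ranking.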

DNN (Deep Neural Network)-HMM (Hidden Markov Model)-based civil aviation radiotelephony communication acoustic model construction method

The invention relates to a DNN (Deep Neural Network)-HMM (Hidden Markov Model)-based civil aviation radiotelephony communication acoustic model construction method. The method includes the following steps: a Chinese radiotelephony communication corpus is set up; civil aviation radiotelephony communication speech signals are pre-processed; Fbank features are extracted from the speech signals and adopted as the civil aviation radiotelephony communication speech features; linear discriminant analysis, feature-space maximum likelihood regression transformation, and speaker adaptive training transformation are applied to the speech features; and the processed speech features are used to build a DNN-HMM-based radiotelephony communication acoustic model. With the method of the invention, the Fbank and MFCC features of radiotelephony communication speech are extracted to train a DNN, so that a DNN-HMM acoustic model suited to radiotelephony communication speech recognition is obtained. Because a dictionary and a language model are also combined, the feature-enhanced DNN-HMM model can reduce the phoneme recognition error rate of radiotelephony communication speech to 5.62% on the constructed data.
Owner:CIVIL AVIATION UNIV OF CHINA
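
The abstract reports a phoneme recognition error rate but does not say how it was computed. A common definition, assumed here, is the total edit distance (substitutions, insertions, deletions) between reference and recognized phoneme sequences divided by the number of reference phonemes. The function names and the example sequences are illustrative.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two phoneme sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            curr.append(min(
                prev[j] + 1,             # deletion of a reference phoneme
                curr[j - 1] + 1,         # insertion of an extra phoneme
                prev[j - 1] + (r != h),  # substitution, or match at cost 0
            ))
        prev = curr
    return prev[-1]

def phoneme_error_rate(refs, hyps):
    """Total edit errors divided by total reference phonemes."""
    errors = sum(edit_distance(r, h) for r, h in zip(refs, hyps))
    total = sum(len(r) for r in refs)
    return errors / total

refs = [["n", "i", "h", "ao"], ["a", "b"]]   # made-up reference transcripts
hyps = [["n", "i", "ao"], ["a", "c", "b"]]   # made-up recognizer output
per = phoneme_error_rate(refs, hyps)
```

In this toy case one phoneme is deleted in the first utterance and one is inserted in the second, giving two errors over six reference phonemes.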

Lexical acquisition apparatus, multi dialogue behavior system, and lexical acquisition program

A lexical acquisition apparatus includes: a phoneme recognition section 2 for preparing a phoneme sequence candidate from an inputted speech; a word matching section 3 for preparing a plurality of word sequences based on the phoneme sequence candidate; a discrimination section 4 for selecting, from among a plurality of word sequences, a word sequence having a high likelihood in a recognition result; an acquisition section 5 for acquiring a new word based on the word sequence selected by the discrimination section 4; a teaching word list 4A used to teach a name; and a probability model 4B of the teaching word and an unknown word, wherein the discrimination section 4 calculates, for each word sequence, a first evaluation value showing how much words in the word sequence correspond to teaching words in the list 4A and a second evaluation value showing a probability at which the words in the word sequence are adjacent to one another and selects a word sequence for which a sum of the first evaluation value and the second evaluation value is maximum, and wherein the acquisition section 5 acquires, as a new word, a word in the word sequence selected by the discrimination section that is not involved in the calculation of the first evaluation value.
Owner:HONDA MOTOR CO LTD +1
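
The selection rule in this abstract (sum two evaluation values, keep the best word sequence, acquire the words not counted in the first value) might be sketched as below. The concrete scoring is an assumption: the first value simply counts teaching-word matches, the second sums bigram log-probabilities with unknown words mapped to a hypothetical `<unk>` token, and all names and numbers are illustrative.

```python
UNK = "<unk>"  # hypothetical placeholder for any word outside the teaching list

def first_value(words, teaching_words):
    """How well the words match the teaching word list (assumed: match count)."""
    return float(sum(1 for w in words if w in teaching_words))

def second_value(words, teaching_words, bigram_logprob, floor=-5.0):
    """Adjacency score: sum of bigram log-probabilities over neighbouring words."""
    tokens = [w if w in teaching_words else UNK for w in words]
    return sum(bigram_logprob.get(pair, floor) for pair in zip(tokens, tokens[1:]))

def select_and_acquire(candidates, teaching_words, bigram_logprob):
    """Pick the sequence with the largest summed score; return it and its new words."""
    def total(ws):
        return (first_value(ws, teaching_words)
                + second_value(ws, teaching_words, bigram_logprob))
    best = max(candidates, key=total)
    # New words are those that did not contribute to the first evaluation value.
    new_words = [w for w in best if w not in teaching_words]
    return best, new_words

teaching = {"this", "is"}
bigrams = {("this", "is"): -0.2, ("is", UNK): -0.7, (UNK, UNK): -3.0}
candidates = [["this", "is", "peto"], ["the", "sis", "peto"]]

best, new_words = select_and_acquire(candidates, teaching, bigrams)
```

The first candidate scores 2 for its teaching-word matches plus -0.9 for adjacency, while the second scores 0 plus -6.0, so "peto" is acquired as the new word.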

Voice adaptive completion system based on multi-modal knowledge graph

The invention discloses a voice adaptive completion system based on a multi-modal knowledge graph. The system comprises a data receiver, a data analyzer and a data inference device. The data receiver preprocesses received audio and video data and outputs them to the data analyzer. The data analyzer analyzes the voice and the image to extract waveform time-sequence features and lip-track features, and obtains a phoneme sequence through multi-modal joint representation. The data inference device carries out domain session modeling and candidate text prediction from the historical text, performs text inference in combination with the phoneme sequence to obtain semantically meaningful statements, and synthesizes the completed voice according to the waveform features. Through a phoneme reasoning model, phoneme recognition is carried out when the voice modality is lost; domain session modeling is carried out on the historical text generated from the existing voice, according to the semantic relationships between entities in the multi-modal knowledge graph, so that semantically meaningful text is generated by reasoning; and the voice is synthesized in combination with the waveform characteristics of the user's voice to form the completed audio.
Owner:SHANGHAI JIAO TONG UNIV

Interactive language learning system and method thereof

The invention discloses an interactive language learning system and a method thereof. The interactive language learning system comprises a voice reference module, a characteristic extracting module, a phoneme associating module, a voice learning module, a phoneme correction module, a correction suggestion module, a phoneme evaluating module, a voice feedback module and a corpus. The voice learning module collects voice data of designated material read aloud by a learner; the phoneme correction module synthesizes feedback voice that has the rhythm of the reference voice and the learner's tone, so that the rhythm-corrected voice can guide the learner to imitate the rhythm of the reference voice; the correction suggestion module, the phoneme evaluating module and the voice feedback module save their results in a data collection module; and the corpus transmits random spoken language information to the learner, with the learner's progress fed back to a database via the correction suggestion module. Besides the usual pronunciation evaluation, the interactive language learning system and method also provide an error detection function based on phoneme associating and phoneme recognition. In combination with the standard voice improvement suggestions and the phoneme correction voice in the corpus, the learner can be helped in a timely manner, and most unintentional errors of learners who have a certain foundation can be corrected.
Owner:合肥凌极西雅电子科技有限公司

Spoken language pronunciation evaluation method based on deep neural network posterior probability algorithm

Inactive · CN108364634A · Accurate voice evaluation results · Speech recognition · Evaluation result · Phoneme recognition
The present invention discloses a spoken language pronunciation evaluation method based on a deep neural network posterior probability algorithm. The method comprises the following steps: selecting a certain number of audio clips from the voice, wherein the number of words in each audio clip is within a certain range; calculating, for each word in each audio clip, the average likelihood, the average EGOP and the average duration probability of the word's phonemes; and feeding these three values into a neural network as input items, which outputs the word scores. The method starts from the acoustic model: LSTM modeling is employed to improve the phoneme recognition rate, the FA likelihood is compared with the likelihoods of all similar phonemes, the GOP method is extended to an EGOP method, and an artificial neural network scoring model performs the scoring so as to obtain an accurate voice evaluation result.
Owner:苏州声通信息科技有限公司
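
The per-word input features named in this abstract might be assembled roughly as follows. This is a loose sketch: the abstract does not define EGOP, so the score below is a plain GOP-style comparison of the forced-alignment log-likelihood against the best competing phoneme, normalized by duration, and is an assumption rather than the patent's formula. Field names and values are hypothetical, and the neural network scorer itself is omitted.

```python
def gop_style_score(fa_loglik, competitor_logliks, num_frames):
    """Assumed GOP-style score: aligned phoneme vs. best competitor, per frame."""
    return (fa_loglik - max(competitor_logliks)) / num_frames

def word_features(phonemes):
    """Average the three per-phoneme quantities over one word."""
    n = len(phonemes)
    avg_likelihood = sum(p["fa_loglik"] / p["frames"] for p in phonemes) / n
    avg_gop = sum(
        gop_style_score(p["fa_loglik"], p["competitors"], p["frames"])
        for p in phonemes
    ) / n
    avg_duration_prob = sum(p["dur_prob"] for p in phonemes) / n
    # These three values would be the input vector for the scoring network.
    return [avg_likelihood, avg_gop, avg_duration_prob]

# Made-up alignment output for a two-phoneme word.
word = [
    {"fa_loglik": -10.0, "competitors": [-12.0, -9.0], "frames": 5, "dur_prob": 0.8},
    {"fa_loglik": -6.0, "competitors": [-8.0, -7.0], "frames": 4, "dur_prob": 0.6},
]
features = word_features(word)
```

A negative per-phoneme score, as for the first phoneme here, means a competing phoneme fit the audio better than the expected one, which is the kind of evidence a scoring model could use to lower the word's score.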

Language model training method and system, mobile terminal and storage medium

The invention provides a language model training method and system, a mobile terminal and a storage medium. The method comprises the steps of: obtaining a training text and a training vocabulary, classifying the training text to obtain a plurality of language modules, and constructing a language dictionary corresponding to the language modules according to the training vocabulary; performing model training on the module language model in each language module according to the language dictionary, and training on the training text to obtain a text language model; performing phoneme recognition on the voice to be recognized to obtain a phoneme string, and matching the phoneme string with the module language model to obtain a phoneme matching result; and performing probability calculation on the phoneme matching result through the text language model, and outputting the sentence corresponding to the maximum probability value. By classifying the training texts and constructing the language dictionary, the method improves the training efficiency and accuracy of the language model, and the language model can be effectively expanded on the basis of the training design of the module language model and the training texts.
Owner:XIAMEN KUAISHANGTONG TECH CORP LTD