91 results about "Language speech" patented technology

Speech is the verbal expression of language and includes articulation (the way sounds and words are formed). Language is the entire system of giving and getting information in a meaningful way. It's understanding and being understood through communication — verbal, nonverbal, and written.

Cross-language end-to-end speech recognition method for low resource Tujia language

The invention discloses a cross-language end-to-end speech recognition method for the low-resource Tujia language. The method comprises the following steps: preprocessing the Tujia language data; constructing a cross-language Tujia language corpus; establishing a unified coding dictionary of Chinese international phonetic alphabets and national international phonetic alphabets; establishing a cross-language end-to-end Tujia speech recognition model; and using a connectionist temporal classification (CTC) model and decoding with the coding dictionary to obtain the recognition result. A recognition model with better generalization is constructed by taking advantage of abundant major-language data and applying the idea of transfer learning, so as to improve the accuracy of Tujia language speech recognition.
Owner:BEIJING TECHNOLOGY AND BUSINESS UNIVERSITY
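
The decoding step in the abstract above can be illustrated with greedy CTC decoding, shown here as a minimal sketch and not as the patented method. It assumes per-frame scores over a small label set, a blank label at index 0, and a hypothetical `id_to_symbol` mapping standing in for the coding dictionary.

```python
from itertools import groupby

BLANK = 0  # assumed index of the CTC blank label


def ctc_greedy_decode(frame_scores, id_to_symbol, blank=BLANK):
    """Greedy CTC decoding: best label per frame, merge repeats, drop blanks.

    frame_scores: list of per-frame score lists (one score per label id).
    id_to_symbol: hypothetical coding dictionary mapping label id -> symbol.
    """
    # Pick the highest-scoring label id in every frame.
    best_ids = [max(range(len(f)), key=f.__getitem__) for f in frame_scores]
    # Collapse runs of identical consecutive labels into one label.
    collapsed = [label for label, _ in groupby(best_ids)]
    # Remove blanks and map the remaining ids through the dictionary.
    return [id_to_symbol[i] for i in collapsed if i != blank]
```

A blank between two identical labels keeps them as two separate symbols, which is why repeats are merged before blanks are removed.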

Speech recognition device and speech recognition method

The speech recognition apparatus (1) is equipped with the garbage acoustic model storage unit (110) storing the garbage acoustic model which learned the collection of the unnecessary words; the feature value calculation unit (101) which calculates the feature parameter necessary for recognition by acoustically analyzing the unidentified input speech including the non-language speech per frame which is a unit for speech analysis; the garbage acoustic score calculation unit (111) which calculates the garbage acoustic score by comparing the feature parameter and the garbage acoustic model; the garbage acoustic score correction unit (113) which corrects the garbage acoustic score calculated by the garbage acoustic score calculation unit (111) so as to raise it in the frame where the non-language speech is inputted; and the recognition result output unit (105) which outputs, as the recognition result of the unidentified input speech, the word string with the highest cumulative score of the language score, the word acoustic score, and the garbage acoustic score which is corrected by the garbage acoustic score correcting means.
Owner:PANASONIC INTELLECTUAL PROPERTY CORP OF AMERICA
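
The score correction described in the abstract above can be sketched as follows. This is an illustration under assumptions only: the abstract does not say how non-language frames are detected or by how much the score is raised, so the per-frame flags and the `boost` value here are hypothetical.

```python
def correct_garbage_scores(garbage_scores, nonlanguage_flags, boost=1.0):
    """Raise the garbage acoustic score in frames flagged as non-language speech.

    garbage_scores: per-frame garbage acoustic scores (e.g. log-likelihoods).
    nonlanguage_flags: per-frame booleans, True where non-language speech
        (assumed to be detected elsewhere) is present.
    boost: hypothetical amount added to the score in flagged frames.
    """
    return [
        score + boost if flagged else score
        for score, flagged in zip(garbage_scores, nonlanguage_flags)
    ]
```

Frames that are not flagged keep their original score, so the correction only changes the cumulative score of hypotheses that pass through non-language frames.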

Multi-language speech recognition method based on language type and speech content collaborative classification

Active | CN110895932A | Solving Adaptive Problems | Realize collaborative identification | Speech recognition | Acoustic model | Posteriori probability
The invention discloses a multi-language speech recognition method based on language type and speech content collaborative classification. The method comprises the following steps: 1) establishing and training a language type and speech content collaborative classification acoustic model; wherein the acoustic model is fused with a language feature vector containing language related information, and model adaptive optimization can be performed on a phoneme classification layer of a specific language by utilizing the language feature vector in a multi-language recognition process; 2) inputting a speech feature sequence to be recognized into the trained language type and speech content collaborative classification acoustic model, and outputting phoneme posteriori probability distribution corresponding to the feature sequence; the decoder generating a plurality of candidate word sequences and acoustic model scores corresponding to the candidate word sequences in combination with the sequence phoneme posteriori probability distribution of the features; and 3) combining the acoustic model scores and the language model scores of the candidate word sequences to serve as an overall score, and taking the candidate word sequence with the highest overall score as a recognition result of the voice content of the specific language.
Owner:INST OF ACOUSTICS CHINESE ACAD OF SCI +1
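
Step 3) of the abstract above, combining acoustic and language model scores and keeping the best candidate, can be sketched as below. The weighted sum in the log domain and the `lm_weight` parameter are assumptions for illustration; the abstract does not state how the two scores are combined.

```python
def pick_best_candidate(candidates, lm_weight=1.0):
    """Return the candidate word sequence with the highest overall score.

    candidates: list of (word_sequence, acoustic_score, lm_score) tuples,
        with both scores assumed to be log-domain values.
    lm_weight: hypothetical scaling factor applied to the language model score.
    """
    def overall(candidate):
        _, acoustic_score, lm_score = candidate
        return acoustic_score + lm_weight * lm_score

    # max() keeps the first candidate in case of a tie.
    return max(candidates, key=overall)[0]
```

For example, a candidate with a slightly worse acoustic score can still win if its language model score is much better.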

Method and system for modeling a common-language speech recognition, by a computer, under the influence of a plurality of dialects

The present invention relates to a method for modeling a common-language speech recognition, by a computer, under the influence of multiple dialects and concerns a technical field of speech recognition by a computer. In this method, a triphone standard common-language model is first generated based on training data of standard common language, and first and second monophone dialectal-accented common-language models are generated based on development data of dialectal-accented common languages of the first kind and second kind, respectively. Then a temporary merged model is obtained by merging the first dialectal-accented common-language model into the standard common-language model according to a first confusion matrix, which is obtained by recognizing the development data of the first dialectal-accented common language using the standard common-language model. Finally, a recognition model is obtained by merging the second dialectal-accented common-language model into the temporary merged model according to a second confusion matrix, which is generated by recognizing the development data of the second dialectal-accented common language using the temporary merged model. This method effectively enhances the operating efficiency and noticeably raises the recognition rate for the dialectal-accented common language. The recognition rate for the standard common language is also raised.
Owner:SONY COMPUTER ENTERTAINMENT INC +1
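
The confusion matrices mentioned in the abstract above come from recognizing development data with an existing model. A minimal sketch of that bookkeeping follows; it assumes the recognition output has already been aligned into (reference unit, recognized unit) pairs, and it does not attempt the model merging itself.

```python
from collections import Counter, defaultdict


def confusion_matrix(aligned_pairs):
    """Estimate P(recognized unit | reference unit) from aligned pairs.

    aligned_pairs: iterable of (reference_unit, recognized_unit) tuples,
        assumed to come from recognizing dialectal development data.
    Returns a nested dict: matrix[reference][recognized] -> probability.
    """
    counts = defaultdict(Counter)
    for reference, recognized in aligned_pairs:
        counts[reference][recognized] += 1

    matrix = {}
    for reference, row in counts.items():
        total = sum(row.values())
        # Normalize each row so its probabilities sum to 1.
        matrix[reference] = {unit: n / total for unit, n in row.items()}
    return matrix
```

Rows with large off-diagonal entries point at the units most affected by the dialectal accent.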

System and method for direct speech translation system

Pending | US20200226327A1 | Simplifies speech recognition | Simplifies translation | Natural language translation | Mathematical models | Encoder decoder | Speech translation
A system for translating speech from at least two source languages into another target language provides direct speech to target language translation. The target text is converted to speech in the target language through a TTS system. The system simplifies speech recognition and translation process by providing direct translation, includes mechanisms described herein that facilitate mixed language source speech translation, and punctuating output text streams in the target language. It also in some embodiments allows translation of speech into the target language to reflect the voice of the speaker of the source speech based on characteristics of the source language speech and speaker's voice and to produce subtitled data in the target language corresponding to the source speech. The system uses models having been trained using (i) encoder-decoder architectures with attention mechanisms and training data using TTS and (ii) parallel text training data in more than two different languages.
Owner:APPL TECH APPTEK

Speech real-time translation method and system based on mobile terminal and double-ear wireless headset

The invention provides a speech real-time translation method based on a mobile terminal and a double-ear wireless headset. The method comprises the steps that the mobile terminal is connected with a first wireless earphone and/or a second wireless earphone through a Bluetooth communication mode; the first wireless earphone is used as a source end/receiving end of a first language speech signal, the second wireless earphone is used as a source end/receiving end of a second language speech signal, and the mobile terminal receives the first language speech signal transmitted by the first wireless earphone and translates the first language speech signal into the second language speech signal; and the mobile terminal transmits the second language speech signal obtained through translation to the first wireless earphone and/or the second wireless earphone, so that the first language speech signal and the second language speech signal are mutually switched between the first wireless earphone and the second wireless earphone. Through the method, two types of speech signals can be mutually switched between the two earphones of the double-ear wireless headset, application forms of the double-ear wireless headset are richer, and the usage experience of a user is improved.
Owner:GOERTEK INC

Anthropomorphic oral translation method and system with man-machine communication function

The invention provides an anthropomorphic oral translation method with a man-machine communication function. The method comprises the following steps: conducting intelligent speech recognition of source language speech to obtain source language text; processing the source language text and the communication scene, and conducting anthropomorphic man-machine communication; and conducting machine translation to obtain a translation result. The invention further provides an anthropomorphic oral translation system with the man-machine communication function. With this system, man-machine communication with the user is conducted when the translation task requires it, the information needed to noticeably improve the user's translation experience in complex application scenarios is obtained accurately, and the accuracy of translation semantics is improved.
Owner:BEIJING ZIDONG COGNITIVE TECH CO LTD

Simultaneous interpretation method and device, and computer equipment

The invention provides a simultaneous interpretation method and device, and computer equipment. The simultaneous interpretation method comprises the steps: acquiring a source language voice signal to be translated; carrying out speech recognition on the source language speech signal to generate a source language vocabulary sequence and a source language pinyin sequence; inputting the source language vocabulary sequence and the source language pinyin sequence into corresponding encoders respectively, and obtaining a vocabulary vector sequence and a pinyin vector sequence corresponding to the source language voice signal; inputting the vocabulary vector sequence and the pinyin vector sequence into a decoder; and generating a target language sequence corresponding to the source language voice signal. Because the source language pinyin sequence is generally free of errors, determining the target language sequence in combination with the pinyin sequence allows some errors in the source language vocabulary sequence to be corrected, which improves both the simultaneous interpretation efficiency and the tolerance to voice recognition errors.
Owner:TENCENT TECH (SHENZHEN) CO LTD

Earphone capable of improving English listening comprehension of user

Inactive | CN103179481A | Improve listening skills | In line with the laws of language learning | Earpiece/earphone attachments | Transducer circuits | Output device | Electrophonic hearing
The invention discloses an earphone capable of improving the English listening comprehension of a user, and relates to a study aid. The earphone is characterized in that speech receivers are arranged on an earphone body of the earphone, speech output devices are arranged in earmuffs, the speech receivers and the speech output devices are connected with a Chinese-English simultaneous translator by conducting wires, source-language speech in an environment is received by the speech receivers, is transmitted to the Chinese-English simultaneous translator via the conducting wires, and is converted into target-language speech by the Chinese-English simultaneous translator, and the speech output devices are used for playing the target-language speech which is converted from the source-language speech by the Chinese-English simultaneous translator. The earphone has the advantages that the Chinese speech in the ambient environment can be converted into corresponding English speech in real time, and a natural and real English language environment can be created for the English learner, so that English learning becomes visual, specific and vivid, language learning laws are met, and the listening competence of the English learner is greatly improved.
Owner:DEZHOU UNIV

System for realizing real-time speech mutual translation

The invention discloses a system for realizing real-time speech mutual translation. The system comprises an intercom device and a mobile communication device; the intercom device acquires a speech signal, converts the speech signal into speech data, and sends the speech data to the mobile communication device; the mobile communication device or a cloud server converts the speech data into statements and characters after performing speech recognition on the speech data, translates the statements and characters into a text of a target language, performs speech synthesis processing and generates playable speech data for playing in the mobile communication device; the mobile communication device obtains the speech signal, converts the speech signal into speech data, converts the speech data into the statements and characters after performing speech recognition on the speech data, translates the statements and characters into the text of the target language, performs the speech synthesis processing and generates playable speech data; and the playable speech data are sent to the intercom device for playing. By adoption of the system provided by the invention, a language speech can be translated into a speech signal in another language in real time, which facilitates two-way communication between people in different languages.
Owner:北京分音塔科技有限公司

Multilingual speech recognition model training method and device thereof, equipment and storage medium

The invention discloses a multilingual speech recognition model training method, and relates to the field of artificial intelligence, and the method comprises the steps: carrying out the training of a speech recognition model through a first language, and obtaining an initial speech recognition model; building an adaptive network function, and embedding the adaptive network function into a hidden layer of the initial speech recognition model to obtain an initial multilingual speech recognition model; performing model training on the initial multilingual speech recognition model through the speech data of the second language to obtain a training result; and iteratively updating the initial multilingual speech recognition model until the training result falls into a preset standard training result range, and outputting the multilingual speech recognition model. In addition, the invention also relates to a blockchain technology, and the voice data of the first language and the voice data of the second language can be stored in the blockchain. According to the invention, the adaptive network function is embedded into the hidden layer of the initial speech recognition model, so that the training efficiency of the multi-language speech recognition model can be improved.
Owner:PING AN TECH (SHENZHEN) CO LTD

Speech translation apparatus, method and computer readable medium for receiving a spoken language and translating to an equivalent target language

Speech translation apparatus includes first generation unit generating first text representing speech recognition result, and first prosody information, second generation unit generating first para-language information, first association unit associating each first portion of first text with corresponding first portion of first para-language information, translation unit translating first text into second texts, second association unit associating each second portion of first para-language information with corresponding second portion of each second text, third generation unit generating second prosody-information items, fourth generation unit generating second para-language-information items, computation unit computing degree-of-similarity between each first para-language information and corresponding one of second para-language-information items to obtain degrees of similarity, selection unit selecting, from second prosody-information items, maximum-degree-of-similarity prosody information corresponding to maximum degree, fifth generation unit generating prosody pattern of one of second texts which corresponds to maximum-degree-of-similarity prosody information, and output unit outputting one of second texts which corresponds to maximum-degree-of-similarity prosody information.
Owner:KK TOSHIBA
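
The selection step in the abstract above, choosing the translation whose para-language information is most similar to that of the source, can be sketched as below. The abstract does not define the similarity measure or the data representation, so the numeric feature vectors and the cosine similarity used here are assumptions for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length numeric vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def select_most_similar(source_vector, candidates):
    """Return the candidate text whose vector is most similar to the source.

    source_vector: hypothetical numeric encoding of the first para-language information.
    candidates: list of (second_text, vector) pairs, one per candidate translation.
    """
    best_text, _ = max(
        candidates, key=lambda c: cosine_similarity(source_vector, c[1])
    )
    return best_text
```

Any other degree-of-similarity function could be substituted for `cosine_similarity` without changing the selection logic.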