95 results about "Vocal pitch" patented technology

Method for tone/intonation recognition using auditory attention cues

Active | US20120116756A1 | Speech recognition | Spoken language | Spoken language processing
In a spoken language processing method for tone / intonation recognition, an auditory spectrum may be determined for an input window of sound and one or more multi-scale features may be extracted from the auditory spectrum. Each multi-scale feature can be extracted using a separate two-dimensional spectro-temporal receptive filter. One or more feature maps corresponding to the one or more multi-scale features can be generated and an auditory gist vector can be extracted from each of the one or more feature maps. A cumulative gist vector may be obtained through augmentation of each auditory gist vector extracted from the one or more feature maps. One or more tonal characteristics corresponding to the input window of sound can be determined by mapping the cumulative gist vector to one or more tonal characteristics using a machine learning algorithm.
Owner:SONY COMPUTER ENTERTAINMENT INC

Method of converting text note to voice broadcast in mobile phone

The present invention discloses a method for converting a text message into voice playback on a cell phone. Based on the characters in the existing messages, a dynamic Pinyin base is initialized after the cell phone starts up and is adjusted whenever a new message is received. When the user presses a voice playing key while reading a message, the characters in the text are converted into Pinyin and tones; each Pinyin datum is then retrieved from the dynamic Pinyin base to form a Pinyin data flow, which is sent to a voice processing chip. The voice processing chip converts the data flow into an analog voice signal, which is sent to the speaker of the cell phone for playback. The invention does not require a large storage space on the cell phone, and the content of the message can be converted into voice for playback, which is convenient for the cell phone user.
Owner:ZTE CORP

Voice synthesis method and device, computer readable medium and electronic equipment

The invention relates to a voice synthesis method and device, a computer readable medium and electronic equipment. The method comprises the steps that voice feature information of a multilingual text and language feature vectors of all languages in the multilingual text are acquired, and the voice feature information comprises phonemes, tones, segmented words and rhythm boundaries; and voice synthesis is performed according to the voice feature information and the language feature vector to obtain first audio information corresponding to the multilingual text. Therefore, the accuracy and understandability of the first audio data are improved, and a user can quickly understand the text content corresponding to the first audio data. In addition, pause can be carried out at the natural rhythm boundary during speech synthesis, so that the naturalness and fluency of the first audio information can be improved. Besides, the voice synthesis method can realize smooth conversion of different languages, supports voice synthesis of texts of various languages, and does not limit specific languages, namely, has wide applicability.
Owner:BEIJING BYTEDANCE NETWORK TECH CO LTD

Speech Processing and Learning

This invention relates to the field of tonal language speech signal processing. We describe a computer system for characterizing samples of a tonal language. These are analyzed to identify one or more vocal tract characterizing parameters of the user and synthesized speech data is generated by modifying a variation of fundamental frequency with time using a set of standard tones. The synthesized speech data represents the user speaking the tonal language with the modified fundamental frequency. Graphical feedback to guide the user can also be provided.
Owner:YU KAI

Segmental tonal modeling for tonal languages

A phone set for use in speech processing such as speech recognition or text-to-speech conversion is used to model or form syllables of a tonal language having a plurality of different tones. Each syllable includes an initial part that can be glide dependent and a final part. The final part includes a plurality of phones. Each phone carries partial tonal information such that the phones taken together implicitly and jointly represent the different tones.
Owner:MICROSOFT TECH LICENSING LLC

Chinese mandarin character pronunciation conversion method based on self-attention mechanism

The embodiment of the invention provides a Chinese mandarin character pronunciation conversion method based on self-attention mechanism. The Chinese mandarin character pronunciation conversion method can be used for direct prediction from Chinese sentences to pronunciation after tone change. According to the Chinese mandarin character pronunciation conversion method, multi-task learning and relative position coding are combined with a self-attention model, a self-attention mechanism is used for capturing the dependency relationship of characters in input sentences, and extra part-of-speech and three pinyin attributes are introduced into multi-task learning to serve as sub-tasks; a tone transfer relationship is modeled by using CRF, and position information of a sequence is effectively modeled by relative position coding; finally, pronunciation can be obtained through a main task prediction result, and can also be a result of joint judgment of three pinyin attribute subtasks. According to the method, the performance of Chinese mandarin character pronunciation conversion is improved to a great extent.
Owner:INST OF ACOUSTICS CHINESE ACAD OF SCI

Systems and Methods for Comprehensive Chinese Speech Scoring and Diagnosis

Active | US20210050001A1 | Provide feedback | More accurate machine-generated phonetic annotations for polyphonic words | Mathematical models | Natural language data processing | Spoken language | Speech sound
Systems and methods for scoring spoken Chinese are provided. In an exemplary method, a user reads a written transcript and the user's voice is recorded. Characters of the transcript are then represented as pinyins with tone markings. The voice recording is sectioned into individual phonemes that are aligned with the phonemes of the pinyins. For each character of the transcript, a tone is determined for the phonemes in the voice recording corresponding to that character. That tone is scored as correct or incorrect by comparison to the tone marking associated with the pinyins for that character. The pronunciation of each phoneme of the voice recording is also scored relative to the corresponding phonemes of the pinyins of the characters of the transcript. Further scores for words and sentences can be developed from the tone and pronunciation scores and provided to the user with feedback.
Owner:PONDDY EDUCATION INC
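
The tone-scoring step described in this abstract can be illustrated with a minimal Python sketch. It assumes that alignment and tone detection have already produced one detected tone number per character; the function name `score_tones` and the data layout are hypothetical and are not taken from the patent.

```python
from typing import Dict, List, Tuple


def score_tones(expected: List[Tuple[str, int]],
                detected: List[int]) -> Dict[str, object]:
    """Compare detected tones with the tone markings of the transcript.

    expected: (pinyin syllable, tone number) for each character of the transcript.
    detected: tone number recognised in the audio aligned to each character.
    """
    if len(expected) != len(detected):
        raise ValueError("transcript and recording must be aligned one-to-one")
    per_char = [
        {"pinyin": syllable, "expected": tone, "detected": heard,
         "correct": tone == heard}
        for (syllable, tone), heard in zip(expected, detected)
    ]
    n_correct = sum(item["correct"] for item in per_char)
    # Sentence-level tone score: fraction of characters with the correct tone.
    tone_score = n_correct / len(per_char) if per_char else 0.0
    return {"per_character": per_char, "tone_score": tone_score}


# "zhong1 wen2" read with the second syllable in the fourth tone.
result = score_tones([("zhong", 1), ("wen", 2)], [1, 4])
print(result["tone_score"])  # → 0.5
```

The per-character records could then feed word-level and sentence-level feedback of the kind the abstract mentions.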

Dialogue type voice recognition method and system, electronic equipment and storage medium

Pending | CN111508498A | Accurate identification | Solve the problem of not being able to cut accurately | Speech recognition | High level techniques | Noise | Engineering
The invention relates to the technical field of voice recognition, and provides a dialogue type voice recognition method and system, electronic equipment and a storage medium. The dialogue type voice recognition method comprises the steps of obtaining a dual-channel audio of dialogue type voice, and performing compression reduction and channel separation on the dual-channel audio to obtain a single-channel original audio; performing framing processing on the original audio to obtain a plurality of audio frames, and performing cutting processing on the original audio according to the energy of each audio frame to obtain a plurality of effective audio segments; extracting Mel cepstrum features and tone features of the effective audio segments and speaker features of channels where the effective audio segments are located, and inputting the Mel cepstrum features, tone features and speaker features into a speech recognition model to obtain recognition results of the effective audio segments; and generating a voice recognition result of the original audio according to the recognition result of each effective audio segment. According to the invention, accurate cutting of the double-channel dialogue type voice can be realized, and the dialogue type voice can be accurately recognized under the condition of shielding surrounding noise.
Owner:CTRIP COMP TECH SHANGHAI
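
The framing and energy-based cutting step in this abstract can be sketched roughly as follows. This is an illustration only: non-overlapping frames, mean-square energy, and a single fixed threshold are assumptions, and the function names are hypothetical rather than taken from the patent.

```python
from typing import List, Tuple


def frame_energies(samples: List[float], frame_len: int) -> List[float]:
    """Split samples into non-overlapping frames and return each frame's mean energy."""
    energies = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        energies.append(sum(x * x for x in frame) / frame_len)
    return energies


def effective_segments(samples: List[float], frame_len: int,
                       threshold: float) -> List[Tuple[int, int]]:
    """Return (start, end) sample ranges made of consecutive frames above the threshold."""
    segments = []
    seg_start = None
    for i, energy in enumerate(frame_energies(samples, frame_len)):
        if energy > threshold and seg_start is None:
            seg_start = i * frame_len                     # a high-energy run begins
        elif energy <= threshold and seg_start is not None:
            segments.append((seg_start, i * frame_len))   # the run ends
            seg_start = None
    if seg_start is not None:                             # audio ended inside a run
        n_frames = len(samples) // frame_len
        segments.append((seg_start, n_frames * frame_len))
    return segments


audio = [0.0] * 4 + [0.5, -0.5, 0.5, -0.5] + [0.0] * 4 + [0.8, -0.8]
print(effective_segments(audio, frame_len=2, threshold=0.01))  # → [(4, 8), (12, 14)]
```

Each returned range would then be passed on for feature extraction and recognition.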

Method for verifying time domain fine structure novel code of artificial cochlea tone language

Active | CN109036569A | Reflect acceptance | Medical simulation | Electrotherapy | Cochlea | Fine structure
The invention discloses a method for verifying a time domain fine structure novel code of an artificial cochlea tone language. The method mainly comprises the following steps of 1), selecting an experiment biosome; 2), establishing a nervus thalamicus response time-space quantified mode I under the condition of original voice induction; 3), establishing a nervus thalamicus response time-space mode II on the condition of novel coding tone voice stimulation induction; 4), establishing a nervus thalamicus response time-space mode III on the condition of novel coding voice electric stimulation; and 5), through adjusting an electric stimulation mode parameter which corresponds with a voice code, making the nervus thalamicus response time-space mode III on the condition of novel coding voice electric stimulation approach the nervus thalamicus response time-space quantified mode I under the condition of original voice induction. The method can really and objectively reflect the receiving degree of a biosome hearing channel to a certain sound coding strategy and can be used as an auxiliary evaluating method for an artificial cochlea novel voice code.
Owner:CHONGQING UNIV

Computer Chinese character 'tone-correcting two stroke pinyin' quick input method

Inactive | CN101059724A | Large coding design capacity | Self-made word function | Input/output processes for data processing | Deep level | Sound recognition
The invention relates to a quick Chinese input method for computers, intended to meet national requirements for quick input skills and the like, and to give character input functions such as dialect correction and initial-sound recognition. The method: 1. takes into account the key frequency in the main area of the keyboard, where the middle row is used more than the top row, the top row more than the bottom row, and the middle more than the edges; 2. takes into account the flexibility of the hands, the right hand being more flexible than the left; 3. takes into account the usage frequency of initials, finals, and tones when designing the codes; 4. adds a classification recognition function to punctuation marks; 5. provides direct input with a low memory burden, large coding capacity, good word-composing effect, a low repeated-code rate, high input speed, and wide applicability.
Owner:步玉程 +1

Vehicle, playing equipment thereof and automatic multimedia playing control method

The invention provides a vehicle, playing equipment thereof and an automatic multimedia playing control method. The playing equipment comprises a playing device, a microphone and a processor, wherein the playing device is used for playing multimedia; the microphone is used for receiving a speech signal; the processor is respectively electrically connected with the playing device and the microphone, and used for identifying a scene intention corresponding to the speech signal received by the microphone according to a scene intention list, and controlling a multimedia playing state of the playing device according to the identified scene intention; and the scene intention list is a list of corresponding relations between the speech signals and the scene intentions. According to the mode of the equipment, the current user intention can be automatically identified; and when in special scenes where a user needs to make a call and the like, suspension, stopping or tone adjustment of the multimedia in playing can be automatically controlled, and the user does not need to operate manually.
Owner:SHANGHAI PATEO INTERNET TECH SERVICE CO LTD

Chinese tone recognition method based on time frequency crest line-Hough transformation

The invention provides a Chinese tone recognition method based on time frequency crest line-Hough transformation. Chinese tone recognition is converted into classification of the change trend of a line segment in a time frequency distribution diagram so that a new Chinese tone recognition method and technique can be acquired. The method includes the steps that firstly, final voice signals carrying Chinese tones are expressed through the SPWVD time frequency distribution diagram and tone information is shown through a group of similarly-parallel time frequency crest lines in the time frequency diagram; secondly, due to the fact that the main time frequency crest line is a region with larger energy in the diagram, the change trend of different tones is reflected, and in order to reduce the calculated amount, treatment such as binaryzation, thresholding and refining is conducted on the time frequency distribution diagram, and a center line segment of the main time frequency crest line reflecting the change trend of the tones is acquired; thirdly, Hough transformation is conducted on the time frequency distribution diagram containing the center line of the main crest line, so that the intercept and included angle parameters of the center line of the main crest line are acquired; finally, the tone type is judged according to the intercept and the included angle of the line segment and the coordinate values of a start point and an end point of the line segment.
Owner:JIANGNAN UNIV

Speech recognition method and device and terminal equipment

The invention is applicable to the technical field of terminal equipment and provides a speech recognition method, a speech recognition device and terminal equipment. The method comprises the following steps: inputting target speech data into a pre-constructed acoustic model based on a neural network to obtain a target pinyin sequence; and inputting the target pinyin sequence into a pre-constructed language model based on a neural network to obtain a target text sequence. The speech recognition process is thus split into two parts: one maps audio data to a pinyin sequence, and the other maps the pinyin sequence to a character sequence. This greatly reduces the dependence on data volume. Because there are only more than 1,400 toned pinyin syllables and more than 7,000 common Chinese characters, the recognition accuracy from the pinyin sequence to the character sequence is greatly improved, and the accuracy requirement of commercial-level voice recognition is met.
Owner:TCL CORPORATION

Computer Chinese phonetic double-click rapid input method

The invention relates to a quick Chinese input method for computers that uses consonant-vowel double-click coding with tone participation. It was developed mainly to meet the requirements of the Ministry of Labour and Social Security and the Ministry of Information Industry for the professional skills of Chinese computer shorthand typists: 140 Chinese characters per minute at the primary level, 180 at the intermediate level, and 220 at the senior level. The invention overcomes the defect of traditional phonetic input, in which lowercase phonetic letters simply correspond to the uppercase English letters on the keyboard, and adopts an original design aimed throughout at raising Chinese character input speed toward its limit. Its main principles and features are: 1. fully considering the English key frequency of the main region of the keyboard, where the middle row is used more than the top row, the top row more than the bottom row, and the middle more than the edges; 2. fully considering the flexibility of the hands, the right hand being more flexible than the left; 3. fully considering the usage frequencies of initial consonants and vowels and assigning them, from high to low, according to key frequency and the flexibility of the left and right hands; 4. fully considering the rules of daily spoken and written language and giving punctuation marks a part-of-speech classification recognition function. It also features visual input, strong regularity, a low memory burden, large coding capacity, good word-composing effect, a low repeated-code rate, and high input speed.
Owner:孙莹莹

Chinese tone dichotic listening testing system and testing method

The invention relates to a Chinese tone dichotic listening testing system. The system comprises a testee information managing module, a testing material selecting module, a testing parameter configuring module, a testee screening and training module, a dichotic listening testing module and a testing result storing module; the testee information managing module is used for inputting, inquiring, modifying and deleting basic information of a testee; the testing material selecting module is used for supplying and selecting a Chinese tone material; the testing parameter configuring module is used for configuring values of the signal-to-noise ratio and the response time which are adopted in testing; the testee screening and training module is used for screening the testee meeting the requirement and completing acquainting and training on testing content, testing processes and testing methods before testing; the dichotic listening testing module is used for simultaneously supplying testing signal pairs composed of same syllables and different tones to the left ear and the right ear of the testee respectively according to the dichotic listening normal form; the testing result storing module is used for storing the testing process and a testing result of the testee in a file.
Owner:INST OF ACOUSTICS CHINESE ACAD OF SCI
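
The test signal pairs that this abstract describes (the same syllable with different tones presented to the left and right ear) can be enumerated with a short Python sketch. The choice of the four Mandarin tones and the `syllable + tone number` labelling are assumptions made for illustration; the abstract does not specify the material format.

```python
from itertools import permutations
from typing import List, Sequence, Tuple


def dichotic_pairs(syllable: str,
                   tones: Sequence[int] = (1, 2, 3, 4)) -> List[Tuple[str, str]]:
    """Return (left ear, right ear) stimuli: same syllable, two different tones."""
    return [(f"{syllable}{left}", f"{syllable}{right}")
            for left, right in permutations(tones, 2)]


pairs = dichotic_pairs("ma")
print(len(pairs))  # → 12
print(pairs[0])    # → ('ma1', 'ma2')
```

Using ordered pairs means each tone combination is presented in both ear assignments.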

Digital key phonetic transcription input method

The input method includes a selector key, a confirmation key, and phonetic input keys consisting of the 26 letters assigned to digit keys 2 through 9. Initials and finals selected with the phonetic input keys are shown in the spelling area of the display screen, and a group of matching single Chinese characters is shown in the single-character area. Tone selection keys eliminate coincident characters, that is, candidates sharing the same code. The selector key and confirmation key choose a character. An associated-word number selector or the phonetic input keys select the preferred phrase. Because tone selection keys are introduced, the method reduces both the number of key presses and the number of coincident characters. The associated-word number selector makes it possible to choose a preferred phrase quickly.
Owner:YIRUAN SCI & TECH NANJING
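
How a tone key can narrow the candidates produced by digit-key spelling can be shown with a small Python sketch. The letter layout below is the common telephone keypad assignment, which is assumed here because the abstract does not give the layout; the tiny `LEXICON` and the function names are hypothetical.

```python
from typing import List, Optional

# Assumed layout: the usual telephone keypad letter assignment on keys 2 to 9.
KEYPAD = {"2": "abc", "3": "def", "4": "ghi", "5": "jkl",
          "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz"}
LETTER_TO_DIGIT = {ch: d for d, letters in KEYPAD.items() for ch in letters}

# Hypothetical toy lexicon: (pinyin, tone) -> characters.
LEXICON = {
    ("ma", 1): ["妈"],
    ("ma", 3): ["马"],
    ("ma", 4): ["骂"],
    ("na", 4): ["那"],
}


def digits_for(pinyin: str) -> str:
    """Digit-key sequence that spells the given pinyin syllable."""
    return "".join(LETTER_TO_DIGIT[ch] for ch in pinyin)


def candidates(digits: str, tone: Optional[int] = None) -> List[str]:
    """Characters whose pinyin matches the digit sequence, optionally filtered by tone."""
    result = []
    for (pinyin, t), chars in LEXICON.items():
        if digits_for(pinyin) == digits and (tone is None or t == tone):
            result.extend(chars)
    return result


print(candidates("62"))          # → ['妈', '马', '骂', '那']
print(candidates("62", tone=3))  # → ['马']
```

Both "ma" and "na" map to the digits 62, so the tone key is what shortens the list the user has to scroll through.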

Statement error correction method and device after speech recognition, equipment and storage medium

The embodiment of the invention discloses a statement error correction method and device after speech recognition, equipment and a storage medium. According to the technical scheme provided by the embodiment of the invention, the method comprises the steps of: recognizing the first occurrence probability of each character in the to-be-corrected text through the language model, determining the recognized error word in the to-be-corrected text according to the first occurrence probability, determining the model candidate word by utilizing the language model, determining the homophone candidate word according to the pinyin and tone of the recognized error word, further determining a first sequence and a second sequence between a model candidate word and a homophone candidate word, determining a candidate sequence between the model candidate word and the homophone candidate word according to the first sequence and the second sequence, determining an error correction candidate word according to the candidate sequence, replacing a recognition error word in a to-be-corrected text with the error correction candidate word, and directly docking and modifying the voice recognition result in a non-intrusive manner, so that the training cost of voice recognition network learning is effectively reduced.
Owner:PCI TECH GRP CO LTD +2

Spoken language evaluation method and device

Pending | CN112331180A | Improve the accuracy of judgment | Reduce the impact of large differences in judgment effects | Speech recognition | Evaluation result | Spoken language
The invention provides a spoken language evaluation method and device. The spoken language evaluation method comprises the steps: obtaining a to-be-evaluated audio and an evaluation text corresponding to the to-be-evaluated audio; determining an attribute characteristic value of each phoneme in the evaluation text and a posterior probability corresponding to each phoneme based on the to-be-evaluated audio and the evaluation text; extracting a pronunciation characteristic value corresponding to the evaluation text based on the evaluation text and the posterior probability corresponding to each phoneme; generating a characteristic vector corresponding to each phoneme according to the attribute feature value and the pronunciation feature value of each phoneme; and inputting the characteristic vector corresponding to each phoneme into a spoken language evaluation model to obtain an evaluation result output by the spoken language evaluation model. According to the spoken language evaluation method provided by the invention, the pronunciation characteristic value corresponding to each phoneme is introduced, and the potential error of the current pronunciation can be accurately explored. Multi-dimensional characteristic information is provided for a spoken language evaluation model, and the judgment accuracy of initial consonants, final consonants and tones is improved.
Owner:BEIJING YUANLI WEILAI SCI & TECH CO LTD

Spoken language pronunciation evaluation method and system for minority language, and storage medium

The invention provides a spoken language pronunciation evaluation method and system for a minority language and a storage medium. The method comprises the steps of obtaining a target text, a pronunciation dictionary and a reading audio made by a user according to the target text; generating a phoneme decoding result and a phoneme alignment result by using a speech recognition model; performing sound beat analysis on the target text based on the language pronunciation characteristics to obtain a sound beat analysis result; performing pitch analysis on target voice data to obtain a pitch analysis result; obtaining an accuracy score, an intonation score and a tone score of a read audio, taking the intonation score as a second pronunciation evaluation result, and taking the tone score as a third pronunciation evaluation result; and fusing the accuracy score, the intonation score and the tone score to obtain a total score of sentence pronunciation. According to the method, voice is calculated and analyzed from multiple different dimensions such as accuracy, integrity, fluency, sentence segmentation, tone and intonation according to the pronunciation characteristics of the minority language to obtain an evaluation result.
Owner:早道(大连)教育科技有限公司

Voice interaction matching method, computer equipment and computer readable storage medium

The invention provides a voice interaction matching method, computer equipment and a computer readable storage medium, and the method comprises the following steps: obtaining a voice instruction, and determining text information corresponding to the voice instruction; determining pinyin information and tone information corresponding to the text information; and determining word and sentence information corresponding to the voice instruction in a preset word bank according to the pinyin information and the tone information. Through the technical scheme of the invention, the pinyin information and the tone information corresponding to the text information are determined, and the screened vocabularies can be further matched, so that the word and sentence information with higher matching degree with the voice instruction is obtained, and the voice interaction matching accuracy is improved.
Owner:YONYOU NETWORK TECH
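
The two-stage matching this abstract outlines (screen the word bank by pinyin, then refine by tone) might look like the following Python sketch. The character table `CHAR_PINYIN`, the word bank, and the scoring rule of counting matching tones are all assumptions for illustration and are not details from the patent.

```python
from typing import List, Optional, Tuple

# Hypothetical character -> (pinyin, tone) table; a real system would use a full dictionary.
CHAR_PINYIN = {
    "买": ("mai", 3), "卖": ("mai", 4),
    "票": ("piao", 4), "漂": ("piao", 1),
}
# Hypothetical preset word bank.
WORD_BANK = ["买票", "卖票"]


def to_pinyin(text: str) -> List[Tuple[str, int]]:
    """Look up (pinyin, tone) for every character of the text."""
    return [CHAR_PINYIN[ch] for ch in text]


def match(text: str) -> Optional[str]:
    """Pick the word-bank entry that best matches the recognised text."""
    target = to_pinyin(text)
    syllables = [p for p, _ in target]
    # Stage 1: keep entries whose toneless pinyin equals that of the recognised text.
    screened = [w for w in WORD_BANK
                if [p for p, _ in to_pinyin(w)] == syllables]

    # Stage 2: among those, prefer the entry sharing the most tones.
    def tone_hits(word: str) -> int:
        return sum(t == u for (_, t), (_, u) in zip(to_pinyin(word), target))

    return max(screened, key=tone_hits, default=None)


print(match("卖漂"))  # → 卖票
```

Here the recognised text contains a homophone error, yet the tone of the first syllable is enough to separate the two word-bank entries that share the same toneless pinyin.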

Chinese character input method and special-purpose keyboard thereof

Active | CN101034319A | Uniquely coded | Guaranteed continued development | Input/output processes for data processing | Software design | Glyph
The invention relates to a Chinese character input method and its dedicated keyboard. The method comprises Chinese Pinyin input, shape input, and tone and stroke input. For tone input, the tone is selected according to the structure of the Chinese character and entered with one key; for stroke input, the strokes of the character are entered with one key after the tone. Compared with existing Chinese character input methods, this method codes each character uniquely in terms of both glyph and pronunciation, and its extensibility, sorting and retrieval, and compatibility are better. The method is easy to learn, remember, and promote, and is convenient for inquiry, retrieval, teaching, and research. Software designed with this input method, combined with voice recognition technology, can achieve human-machine interaction in Chinese and can even direct a robot in Chinese. A special keyboard is laid out according to the usage frequency of the letters and the connections among the input letters, which helps the operator touch-type quickly.
Owner:唐国栋

Audio amplification electronic device with independent pitch and bass response adjustment

Techniques used to selectively amplify audio signals are described herein in connection with audio amplification electronic devices, such as hearing aids, including over-the-ear hearing aids. A device and its operation are described to facilitate setting low and high tone / volume controls separately, using at least two selection mechanisms. In one aspect, a first selection mechanism includes a pitch frequency control rocker switch and the second selection mechanism includes a bass frequency control rocker switch disposed separately. In one aspect, the bass frequency control rocker switch causes a processor to bias the frequency response of the sound amplifier for frequencies below 1 kHz. In another aspect, the pitch frequency control rocker switch causes a processor to bias the frequency response of the hearing for frequencies above 1 kHz. In another aspect, the selection mechanism involves the separate attenuation of treble and bass adjustments in response to a user selection of a rocker switch setting for each adjustment.
Owner:恩里克·盖斯图特

Voice broadcasting method and electronic device

The embodiment of the invention discloses a voice broadcasting method and an electronic device. The voice broadcasting method comprises the following steps: acquiring voice information of a user; recognizing the voice information of the user, thus obtaining pinyin and pinyin tones corresponding to the user voice information and text content of the information; carrying out semantic analysis on the text content, and generating voice broadcasting information according to the semantic analysis result, pinyin and pinyin tones; and broadcasting the voice broadcasting information. With the embodiment of the invention, polyphones can be accurately broadcasted.
Owner:VIVO MOBILE COMM CO LTD

Speech processing method and system for increasing Chinese tone recognition rate based on frequency shift processing

Active | CN105167883A | Improve the recognition rate of Chinese tones | Reduce power consumption | Ear treatment | Speech recognition | Identification rate | N channel
The invention discloses a speech processing method and system for increasing the Chinese tone recognition rate based on frequency shift processing. The method comprises the following steps that M usable electrodes and H movable electrodes are determined, and therefore M-H electrodes exist in a fixed electrode sequence; changes of the fundamental frequency are detected according to the reference fundamental frequency and are expressed in percentage; when H is 2, the fixed electrode sequence is shifted if the change of the fundamental frequency exceeds 20% of the reference fundamental frequency; when H is 4, the fixed electrode sequence is shifted if the change of the fundamental frequency exceeds 15% of the reference fundamental frequency; when H is 6, the fixed electrode sequence is shifted if the change of the fundamental frequency exceeds 10% of the reference fundamental frequency; after the fixed electrode sequence is determined, N channels with the maximum energy are selected to be subjected to stimulation. According to the speech processing method and system for increasing the Chinese tone recognition rate based on frequency shift processing, the integral position of stimulating electrodes is changed according to the changes of the fundamental frequency of input sound signals so that the information of frequency changes in the sound signals can be transmitted, and finally the result of increasing the Chinese tone recognition rate is achieved.
Owner:ZHEJIANG NUROTRON BIOTECH
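
The shift rule stated in this abstract (20% for H = 2, 15% for H = 4, 10% for H = 6) and the selection of the N highest-energy channels can be written down as a short Python sketch. Treating the change as an absolute relative deviation from the reference fundamental frequency is an assumption, as are the function names; the direction and size of the actual electrode shift are not modelled.

```python
from typing import List

# Thresholds from the abstract: movable electrodes H -> relative F0 change that triggers a shift.
SHIFT_THRESHOLDS = {2: 0.20, 4: 0.15, 6: 0.10}


def should_shift(f0: float, reference_f0: float, movable_electrodes: int) -> bool:
    """True if the F0 change relative to the reference exceeds the threshold for H."""
    if reference_f0 <= 0:
        raise ValueError("reference_f0 must be positive")
    if movable_electrodes not in SHIFT_THRESHOLDS:
        raise ValueError("H must be 2, 4 or 6")
    change = abs(f0 - reference_f0) / reference_f0   # assumed: magnitude of the change
    return change > SHIFT_THRESHOLDS[movable_electrodes]


def select_channels(energies: List[float], n: int) -> List[int]:
    """Indices of the n channels with the largest energy, in ascending index order."""
    ranked = sorted(range(len(energies)), key=lambda i: energies[i], reverse=True)
    return sorted(ranked[:n])


print(should_shift(250.0, 200.0, movable_electrodes=2))  # → True  (25% change)
print(should_shift(225.0, 200.0, movable_electrodes=2))  # → False (12.5% change)
print(should_shift(225.0, 200.0, movable_electrodes=6))  # → True
print(select_channels([0.1, 0.9, 0.4, 0.7], n=2))        # → [1, 3]
```

With more movable electrodes the threshold is lower, so smaller fundamental-frequency movements already trigger a shift of the fixed electrode sequence.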

Voice tone recognition method and system based on random forest

The invention discloses a voice tone recognition method and system based on a random forest, and the method comprises the steps: obtaining a to-be-recognized voice signal, and carrying out the preprocessing of the to-be-recognized voice signal; extracting and selecting characteristic parameters of the preprocessed voice signal to be recognized; and inputting the extracted feature parameters into a pre-trained random forest model, and outputting a tone recognition result of the to-be-recognized voice signal. The random forest has the advantages of being easy to implement, high in operation speed, high in anti-noise capacity and the like, can be well applied to tone recognition, and can reduce the operation complexity of the tone classifier to the minimum while ensuring the recognition accuracy.
Owner:SHANDONG UNIV

Tone contour transformation of speech

Inactive | CN1920945A | Speech recognition | Speech synthesis | Syllable | Tone contour
Tonal transformation of speech is provided. A tone applicable to a syllable of received speech is determined. A tonal contour applicable to said tone for a dialect of a listener is determined, and the syllable of received speech is altered to have said determined tonal contour. The altered speech may then be delivered to the listener.
Owner:AVAYA INC

Chinese character input method, voice synthesis method, mandarin Chinese learning method, Chinese character input system and keyboard

Pending | CN110716654A | Realize Phonics | Master the pronunciation accurately | Speech synthesis | Input/output processes for data processing | Chinese characters | Speech synthesis
The invention relates to the technical field of electric data processing, and specifically relates to a Chinese character input method, a voice synthesis method, a mandarin Chinese learning method, a Chinese character input system and a keyboard. The Chinese characters are mapped by three-bit codes. The three codes are respectively used for mapping initial consonants, intermediate consonants and final consonants of Chinese characters. The vowel further comprises a children's sound. Spelling and reading of Chinese can be accurately realized. All the initial consonants, vowels and intermediate consonants are represented by only one code. In a simply spelling process, misunderstanding possibly caused by the existence of double finals is reduced, and preferably, input and word segmentation of Chinese tones can be realized while spelling is performed, so that the method can be conveniently applied to other fields such as Chinese learning and speech synthesis, Chinese learners can accurately master pronunciation, tones and word tones of Chinese, and the pronunciation, tones and word tones of Chinese can be accurately displayed.
Owner:韦松波