35 results about "Speech reconstruction" patented technology

Speech Affect Editing Systems

This invention generally relates to systems, methods and computer program code for editing or modifying speech affect. A speech affect processing system enables a user to edit the affect content of a speech signal. The system comprises: an input to receive speech analysis data from a speech processing system, said speech analysis data comprising a set of parameters representing said speech signal; a user input to receive user input data defining one or more affect-related operations to be performed on said speech signal; an affect modification system, coupled to said user input and to said speech processing system, to modify said parameters in accordance with said one or more affect-related operations, and further comprising a speech reconstruction system to reconstruct an affect-modified speech signal from said modified parameters; and an output coupled to said affect modification system to output said affect-modified speech signal.
Owner:SOBOL SHIKLER TAL

Partial speech reconstruction

A system enhances the quality of a digital speech signal that may include noise. The system identifies vocal expressions that correspond to the digital speech signal. A signal-to-noise ratio of the digital speech signal is measured before a portion of the digital speech signal is synthesized. The selected portion of the digital speech signal may have a signal-to-noise ratio below a predetermined level and the synthesis of the digital speech signal may be based on speaker identification.
Owner:NUANCE COMM INC
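
The frame selection step described in this abstract can be sketched roughly as follows. This is a minimal illustration and not the patented method: the function names, the fixed frame length, and the externally supplied noise power estimate are all assumptions, and the synthesis and speaker identification stages are left out.

```python
import math

def frame_snr_db(frame, noise_power):
    """Estimate the SNR of one frame in dB from an assumed noise power."""
    power = sum(x * x for x in frame) / len(frame)
    # Subtract the noise estimate; clamp so the logarithm stays defined.
    signal_power = max(power - noise_power, 1e-12)
    return 10.0 * math.log10(signal_power / noise_power)

def frames_to_synthesize(samples, frame_len, noise_power, threshold_db):
    """Return indices of non-overlapping frames whose SNR is below threshold_db."""
    selected = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        if frame_snr_db(frame, noise_power) < threshold_db:
            selected.append(start // frame_len)
    return selected

# Hypothetical input: one loud frame followed by one frame at the noise floor.
samples = [1.0, -1.0, 1.0, -1.0, 0.1, -0.1, 0.1, -0.1]
low_snr_frames = frames_to_synthesize(samples, frame_len=4,
                                      noise_power=0.01, threshold_db=5.0)
```

Only the frames returned here would be handed to a synthesizer; frames above the threshold would pass through unchanged.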

Electronic larynx speech reconstructing method and system thereof

The invention provides an electronic larynx speech reconstructing method and a system thereof. The method comprises the following steps: firstly, extracting model parameters from collected speech to form a parameter library; secondly, capturing a face image of the speaker and transmitting it to an image analysis and processing module to obtain the voicing start moment, the voicing stop moment and the vowel category being voiced; thirdly, synthesizing a voice source waveform through a voice source synthesizing module; and finally, outputting the voice source waveform through an electronic larynx vibration output module. The voice source synthesizing module first sets the model parameters of a glottal voice source to synthesize the glottal voice source waveform, then simulates the transmission of sound in the vocal tract using a waveguide model, selecting the shape parameters of the vocal tract according to the vowel category, so as to synthesize the electronic larynx voice source waveform. The speech reconstructed by the method and the system is closer to the speaker's own voice.
Owner:XI AN JIAOTONG UNIV

Speech reconstruction-based instantaneous noise suppressing method

Active; CN104599677A; Transient Noise Suppression; Transient Noise Cancellation; Speech analysis; Noise detection; Distribution characteristic
A speech reconstruction-based instantaneous noise suppressing method relates to the technical field of audio processing and addresses the problem of instantaneous noise suppression. The method eliminates the influence of instantaneous noise in two stages, detection and suppression. Firstly, steady-state noise in the signal is removed by traditional methods, and instantaneous noise is then detected based on the different distribution characteristics of white voice noise signals and instantaneous noise signals. Secondly, once instantaneous noise is detected, a speech reconstruction-based algorithm suppresses it: frames containing the instantaneous noise are discarded, and the waveform is reconstructed from the uncorrupted adjacent signals before and after them to replace the original signal. The instantaneous noise can therefore be completely eliminated without obvious speech distortion. The method is applicable to processing speech signals containing instantaneous noise.
Owner:SHANGHAI ADVANCED RES INST CHINESE ACADEMY OF SCI +1
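
The "discard and reconstruct from neighbours" idea in this abstract can be illustrated with a short sketch. The abstract does not specify the reconstruction algorithm, so the linear crossfade used here is a simple stand-in, and the function name, the equal-length frame layout, and the precomputed detection flags are all assumptions.

```python
def repair_transient_frames(frames, is_transient):
    """Replace flagged frames with a crossfade of the nearest clean neighbours.

    frames: list of equal-length sample lists (assumed layout).
    is_transient: one boolean per frame, assumed to come from a detector.
    """
    n = len(frames)
    out = [list(f) for f in frames]
    for i in range(n):
        if not is_transient[i]:
            continue
        # Nearest clean frame before and after the corrupted one, if any.
        prev = next((frames[j] for j in range(i - 1, -1, -1)
                     if not is_transient[j]), None)
        nxt = next((frames[j] for j in range(i + 1, n)
                    if not is_transient[j]), None)
        if prev is None and nxt is None:
            continue  # no clean frame to borrow from; leave the frame as is
        if prev is None:
            prev = nxt
        if nxt is None:
            nxt = prev
        m = len(frames[i])
        for k in range(m):
            w = (k + 1) / (m + 1)  # weight shifts from prev towards nxt
            out[i][k] = (1 - w) * prev[k] + w * nxt[k]
    return out

# Hypothetical example: the middle frame holds a click and is flagged.
frames = [[1.0, 1.0, 1.0], [9.0, -9.0, 9.0], [3.0, 3.0, 3.0]]
repaired = repair_transient_frames(frames, [False, True, False])
```

A practical system would more likely use pitch-synchronous or overlap-add reconstruction; the crossfade only shows where the replacement samples come from.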

Speech enhancement system and method based on MFrSRRPCA algorithm

Active; CN109215671A; Reduces the possibility of false eliminations; Valid reservation; Speech analysis; Time domain; Time–frequency analysis
The invention discloses a speech enhancement system and method based on a multi-subband short-time fractional Fourier spectrum random rearrangement robust principal component analysis (MFrSRRPCA) algorithm. The implementation steps are as follows. A time-frequency analysis module generates time-frequency information of the noisy speech. A time-frequency subband division module divides the time-frequency amplitude spectrum of the noisy speech into a plurality of noisy subbands. Each time-frequency amplitude spectrum enhancement module randomly shuffles the order of the spectrum elements of each frame in the corresponding noisy subband, and generates the corresponding enhanced subband using a robust principal component analysis algorithm according to the noise intensity estimate in that subband. A time-frequency subband recombination module combines all the enhanced subbands into an enhanced time-frequency amplitude spectrum. A time-domain speech reconstruction module reconstructs the enhanced time-frequency amplitude spectrum into enhanced speech. The invention can improve the sound quality and intelligibility of noisy speech, and can be used for speech enhancement and noise reduction in speech receiving systems.
Owner:XIDIAN UNIV

Remote tracing and debugging of automatic speech recognition servers by speech reconstruction from cepstra and pitch information

Methods and systems are provided for remote tuning and debugging of an automatic speech recognition system. Trace files are generated on-site from input speech by efficient, lossless compression of MFCC data, which is merged with compressed pitch and voicing information and stored as trace files. The trace files are transferred to a remote site where human-intelligible speech is reconstructed and analyzed. Based on the analysis, parameters of the automatic speech recognition system are remotely adjusted.
Owner:NUANCE COMM INC

Method and apparatus for high resolution speech reconstruction

A method and apparatus identify a clean speech signal from a noisy speech signal. The noisy speech signal is converted into frequency values in the frequency domain. The parameters of at least one posterior probability of at least one component of a clean signal value are then determined based on the frequency values. This determination is made without applying a frequency-based filter to the frequency values. The parameters of the posterior probability distribution are then used to estimate a set of frequency values for the clean speech signal. A clean speech signal is then constructed from the estimated set of frequency values.
Owner:MICROSOFT TECH LICENSING LLC

Voice processing method, device and equipment and storage medium

The invention relates to a voice processing method, device and equipment and a storage medium. The method comprises the steps of: obtaining a to-be-processed first voice and a to-be-processed second voice; calling an encoder in a voice processing model, obtained by optimization training based on at least one utterance of a target speaker, to encode the obtained voices, thereby obtaining a first feature representing text information independent of speaker identity and a second feature representing tone information of the target speaker; and performing decoding and voice reconstruction based on the first feature and the second feature to obtain a target voice after tone conversion. Because the voice processing model is end-to-end, it does not need a large number of target speaker utterances; the target speaker's tone can be modeled from only a small number of utterances, which reduces the computing resources and time consumed by model training.
Owner:BEIJING DAJIA INTERNET INFORMATION TECH CO LTD

Cross-modal generation method based on voice and face images

The invention relates to a cross-modal generation method based on voice and face images. The method comprises two parts: reconstructing a face from voice, and personalized voice synthesis from a face image. For the first part, a voice-to-face reconstruction model based on a residual prior is provided, which generates a person's face from an input segment of unknown voice. For the second part, a face image personalized voice synthesis model based on a residual prior is provided, which synthesizes a person's voice from a given face image and a segment of text. The voice-to-face reconstruction model can generate a face image very similar to the original face and is highly robust. The number of faces it can generate is not fixed: given the voice of any speaker, a face similar to that speaker can be reconstructed. Likewise, the residual prior face image personalized speech synthesis model can synthesize a person's speech from any face image. In addition, the proposed residual prior knowledge method can accelerate convergence of the model and achieve a better effect.
Owner:TIANJIN UNIV

Voice processing system and method and intelligent fume hood system based on active noise reduction

Active; CN112139191A; Improve practicality; Accurate noise analysis; Dirt cleaning; Speech reconstruction; Noise
The invention discloses a voice processing system and method and an intelligent fume hood system based on active noise reduction. The system comprises a voice collection module, an exhaust fan parameter obtaining module, a noise reduction module and a voice reconstruction module. The voice collection module collects voice and converts it into a digital signal. The exhaust fan parameter obtaining module obtains the rotating speed of the exhaust fan. The noise reduction module obtains the noise signal of the exhaust fan and converts it into a noise reduction signal: it receives the output data of the exhaust fan parameter obtaining module and calculates the fundamental frequency and sound pressure in combination with the number of blades of the exhaust fan, the impeller diameter and the power, thereby obtaining the noise signal. The voice reconstruction module superposes the noise reduction signal and the output signal of the voice collection module to obtain a reconstructed signal. Because the noise signal of the exhaust fan is obtained by directly measuring its rotating speed and combining it with the number of blades, the noise analysis is accurate, and both the noise reduction accuracy and the noise reduction effect are improved in exhaust environments that use an exhaust fan.
Owner:AVIC HUADONG OPTOELECTRONICS (SHANGHAI) CO LTD
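
The calculation outlined in this abstract can be sketched as below. The blade-passing frequency formula (revolutions per second times blade count) is standard fan acoustics, but the abstract does not give its exact model, so this is only an illustration: the function names are hypothetical, the tone amplitude is an assumed input rather than a value derived from impeller diameter and power, and a single pure tone stands in for the real fan noise.

```python
import math

def blade_pass_frequency(rpm, blade_count):
    """Blade-passing frequency in Hz: revolutions per second times blade count."""
    return rpm / 60.0 * blade_count

def fan_tone(freq_hz, amplitude, sample_rate, n_samples):
    """Model the fan noise as a single sine tone (a simplifying assumption)."""
    return [amplitude * math.sin(2.0 * math.pi * freq_hz * t / sample_rate)
            for t in range(n_samples)]

def reconstruct(mic_signal, noise_estimate):
    """Superpose the inverted noise estimate on the microphone signal."""
    return [m - e for m, e in zip(mic_signal, noise_estimate)]

# Hypothetical numbers: a 6-blade fan at 1200 rpm, sampled at 8 kHz.
freq = blade_pass_frequency(1200, 6)
tone = fan_tone(freq, amplitude=0.5, sample_rate=8000, n_samples=64)
voice = [0.1] * 64                        # stand-in for the speech signal
mic = [v + t for v, t in zip(voice, tone)]
cleaned = reconstruct(mic, tone)
```

The cancellation is exact here only because the estimate and the added tone are identical; with a real fan, phase and amplitude mismatch would leave a residual.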