65 results about "Time delay neural network" patented technology

Time delay neural network (TDNN) is a multilayer artificial neural network architecture whose purpose is to 1) classify patterns with shift-invariance, and 2) model context at each layer of the network.
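
As a rough illustration of the definition above, one TDNN layer can be sketched as the same weights applied to a sliding window of neighboring frames. The function name `tdnn_layer`, the weight layout, the default context `(-1, 0, 1)` and the tanh activation are illustrative assumptions, not taken from any patent listed here.

```python
import math
from typing import List, Sequence

Frame = List[float]


def tdnn_layer(frames: List[Frame],
               weights: List[List[List[float]]],  # assumed layout: [offset][out_unit][in_dim]
               bias: List[float],
               context: Sequence[int] = (-1, 0, 1)) -> List[Frame]:
    """Apply one time-delay layer: every output frame sees the input frames
    at the given context offsets, and the weights are shared across time."""
    lo, hi = min(context), max(context)
    outputs = []
    # Only time steps whose full context window lies inside the input are computed.
    for t in range(-lo, len(frames) - hi):
        out = []
        for h, b in enumerate(bias):
            total = b
            for k, offset in enumerate(context):
                frame = frames[t + offset]
                total += sum(w * x for w, x in zip(weights[k][h], frame))
            out.append(math.tanh(total))  # activation choice is an assumption
        outputs.append(out)
    return outputs
```

Because the weights are shared over time, shifting the input by one frame shifts the output by one frame, which is the shift-invariance property the definition refers to. Stacking such layers widens the temporal context seen by each higher layer.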

System and method for audio/video speaker detection

Active | US7343289B2 | Improved TDNN structure | Video feature | Speech recognition | Speech synthesis | Loudspeaker | Network processing
A system and method for detecting speech utilizing audio and video inputs. In one aspect, the invention collects audio data generated from a microphone device. In another aspect, the invention collects video data and processes the data to determine a mouth location for a given speaker. The audio and video are inputted into a time-delay neural network that processes the data to determine which target is speaking. The neural network processing is based upon a correlation to detected mouth movement from the video data and audio sounds detected by the microphone.
Owner:MICROSOFT TECH LICENSING LLC

A dynamic heterogeneous network traffic prediction method based on a deep space-time neural network

The invention belongs to the technical field of wireless communication, and particularly relates to a dynamic heterogeneous network traffic prediction method based on a deep space-time neural network. It addresses the small coverage area, low prediction precision and short prediction horizon of existing mobile data traffic prediction methods. Considering user mobility and the space-time correlation of traffic data, the invention studies a mathematical model description for wide-coverage, long-term mobile data traffic prediction in the dynamic heterogeneous network. On this basis, a space-time correlated convolutional long short-term memory network model is studied to predict the long-term trend of the mobile traffic in the dynamic heterogeneous network; a space-time correlated three-dimensional convolutional neural network model is studied to capture micro-fluctuations of the mobile traffic sequence in the dynamic heterogeneous network; and the long-term trend prediction model and the short-term change model of the mobile traffic are fused, thereby realizing wide-coverage, high-precision long-term mobile traffic prediction in the dynamic heterogeneous network.
Owner:HUBEI UNIV OF TECH

Neural-network self-correcting control method of permanent magnet synchronous motor speed loop

The invention discloses a neural-network self-correcting control method of a permanent magnet synchronous motor speed loop. The method is characterized by: taking a current loop and a motor as generalized objects; firstly, collecting information, such as a rotating speed, a current and the like; using an adaptive linear time-delay neural network to carry out off-line parameter identification to the motor; then, taking a weight obtained through off-line learning as an initial value of on-line learning; finally, carrying out on-line parameter identification to the system, calculating a load torque of the motor according to the identified parameter; designing a neural-network self-correcting control law according to the obtained parameter value and a load disturbance value, adjusting the network weight on line according to an error between a controlled object and an identification model, and then setting the parameter of the neural-network self-correcting controller on line so as to realize online adjustment of the controller parameter. Uncertainty of the system and influence brought by the external disturbance can be eliminated. Dynamic performance and an anti-disturbance ability of a servo system can be improved.
Owner:SOUTHEAST UNIV +1

Speaker recognition method based on Gaussian mixture model embedded with time delay neural network

The invention discloses a speaker recognition method based on a Gaussian mixture model (GMM) embedded with a time delay neural network (TDNN). In the speaker recognition method, the advantages of the TDNN and the GMM are fully considered, the TDNN is embedded into the GMM, and solves a residual of input and output vectors of the TDNN by fully utilizing the time sequence of an input characteristic vector through the conversion of a time delay network, and the residual modifies the training of the GMM through an expectation maximization method; besides, a likelihood probability is acquired by a modified GMM model parameter and the residual, and a TDNN parameter is modified by an inertial backward inversion method so as to ensure that parameters of the GMM and the TDNN are alternately updated. An experiment shows that: a recognition rate of the method is improved to a certain extent compared with that of a baseline GMM under various signal to noise ratios.
Owner:戴红霞 +2

Priority message forwarding method applied to allowed time delay network

Active | CN102932275A | Solving Differentiated QoS Requirements | Reduce communication overhead | Data switching networks | Quality of service | Non real time
The invention relates to a priority message forwarding method applied to an allowed time delay network, which comprises the following steps: carrying out priority classification on messages at a source node and when a node is contacted with a forwarding node, and sending the messages to the forwarding node according to priorities; by the forwarding node in the network, carrying out cache region configuration and message forwarding on the received messages and messages to be sent of the node; and then sending the message with the corresponding priority when the node is contacted with a subsequent forwarding node, wherein an information sink takes charge of receiving the data messages and carrying out corresponding storage according to the priorities. According to the invention, different sending cache regions are set at the source node according to different priority requirements of services and at the message forwarding nodes, forwarding cache regions facing the messages with different priorities are set and the message forwarding is carried out, so that the problem of distinguishing QoS (Quality of Service) requirements of non-real-time services in the allowed time delay network is solved.
Owner:CHINA ACADEMY OF SPACE TECHNOLOGY

Speaker verification method and device

The invention provides a speaker verification method and a speaker verification device. The speaker verification method comprises the following steps: acquiring second voice; converting first voice, which is acquired in advance, and the second voice into a corresponding first spectrogram and second spectrogram; conducting feature extraction on the first spectrogram and the second spectrogram by means of a convolutional neural network so as to acquire corresponding first features and second features; conducting feature extraction on the first features and the second features by means of a time delay neural network so as to acquire corresponding third features and fourth features; and verifying a speaker in accordance with the third features and the fourth features. By combining the convolutional neural network and the time delay neural network, features are extracted from the first voice and the second voice, and the third features and the fourth features that are finally extracted are compared, so that speaker verification is implemented. The speaker verification method and device are simple in computation and strong in robustness, and an excellent recognition effect can be achieved.
Owner:TSINGHUA UNIV

Method for training acoustic model based on CTC (Connectionist Temporal Classification)

The invention provides a method for training an acoustic model based on CTC (Connectionist Temporal Classification). The method comprises the steps of: 1, training an initial GMM (Gaussian Mixture Model) and using the GMM to force-align the text annotation of the training data in time, so as to obtain the time region corresponding to each phoneme; 2, inserting a blank symbol associated with the phoneme behind each phoneme, wherein each phoneme has a unique blank symbol; 3, using a finite state machine to construct the search path diagram for the CTC forward-backward calculation over the phoneme annotation sequence with the blank symbols added; 4, restricting the appearance time range of each phoneme according to the time alignment result, pruning the search path diagram, and cutting off the paths in which a phoneme position exceeds the time restrictions, so as to obtain the final search path diagram required for calculating the network error in CTC; and 5, performing acoustic model training by combining a time-delay neural network (TDNN) structure with the CTC method to obtain a final TDNN-CTC acoustic model.
Owner:INST OF ACOUSTICS CHINESE ACAD OF SCI +1
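
Steps 2 and 4 of the abstract above can be sketched as follows. This is a minimal sketch under stated assumptions: the blank naming scheme `<blank:p>`, the `tolerance` parameter, and the rule that a phoneme's blank shares that phoneme's time window are illustrative choices, not details confirmed by the patent.

```python
from typing import List, Tuple


def add_phoneme_blanks(phonemes: List[str]) -> List[str]:
    """Step 2: insert a phoneme-specific blank symbol behind each phoneme."""
    states = []
    for p in phonemes:
        states.append(p)
        states.append(f"<blank:{p}>")  # hypothetical naming of the unique blank
    return states


def allowed_states(alignment: List[Tuple[int, int]],
                   num_frames: int,
                   tolerance: int = 0) -> List[List[int]]:
    """Step 4: for each frame, list the state indices that survive pruning.

    alignment[i] is the (start, end) frame range of phoneme i taken from the
    forced alignment; states 2*i and 2*i + 1 are that phoneme and its blank.
    """
    per_frame = []
    for t in range(num_frames):
        keep = []
        for i, (start, end) in enumerate(alignment):
            if start - tolerance <= t <= end + tolerance:
                keep.extend([2 * i, 2 * i + 1])
        per_frame.append(keep)
    return per_frame
```

A forward-backward pass would then only visit the (frame, state) pairs kept here, which is what shrinks the search path diagram.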

Tool abrasion state identification method based on convolutional neural network and long-short-time memory neural network combined model

Active | CN110153802A | No expert experience required | Save the cost of selecting features | Measurement/indication equipments | Numerical control | Nerve network
The invention discloses a tool abrasion state identification method based on a combined model of a convolutional neural network and a long short-term memory neural network. A force measuring instrument and an acceleration sensor are arranged on a workbench clamp of a numerical control machine tool and on a workpiece, and three-direction force signals and vibration acceleration signals are collected. The collected data are pre-processed: normalization and unified segmentation are conducted on the same row of data, and the one-dimensional data are converted into two-dimensional data to serve as input. The convolutional neural network in the combined model is used for extracting abstract features, the long short-term memory neural network in the combined model is used for finding the relevance between the data, and finally the tool abrasion state is output. The two networks are arranged in series, so that the internal relation between the two kinds of signals can be established: more abstract features are extracted through convolution, and the time-sequence features are determined by the long short-term memory network. A deeper relation between the data and the model is thereby achieved, and the method is applicable to various machine tools.
Owner:SOUTHWEST JIAOTONG UNIV

Speech enhancement method and device

The invention provides a speech enhancement method and device. The speech enhancement method includes the steps of: receiving to-be-enhanced speech data; separating at least one speech stream from the to-be-enhanced speech data based on a long short-term memory neural network; recognizing a target speech stream corresponding to the predetermined speech from the at least one speech stream based on a time delay neural network; enhancing the target speech stream; and outputting the enhanced target speech stream. The speech enhancement method and device separate the to-be-enhanced speech data with the long short-term memory neural network, then recognize the target speech stream from the separated result with the time delay neural network, and then enhance only the target speech stream, so that the target speech is clear, thereby achieving noise reduction and effectively improving the user experience.
Owner:SAMSUNG ELECTRONICS CHINA R&D CENT +1

Automatic speech recognition method based on random depth delay neural network model

The invention belongs to the field of automatic speech recognition technology and relates to an automatic speech recognition method based on a random depth delay neural network model. The method comprises: preparing training data; extracting acoustic features from the training speech audio data; training a traditional GMM-HMM model and carrying out forced alignment on the training speech audio data by using the trained GMM-HMM model to obtain corresponding frame-level training labels; training a random-depth-based time-delay neural network model under supervision by using the training speech audio data and the corresponding frame-level training labels, and acquiring an acoustic model by combining a hidden Markov model; carrying out training by using the corresponding text annotation data or texts of other data sets to obtain a trained language model; and constructing an automatic speech recognition decoder by using the trained language model and acoustic model. The modeling ability of the model is thereby strengthened and the problems of over-fitting and vanishing gradients during training are addressed, so that the accuracy of the speech recognition is improved.
Owner:SOUTH CHINA UNIV OF TECH

Coal mine water burst predicting method based on long-short-time memory neural network

The invention discloses a coal mine water burst predicting method based on a long-short-time memory neural network. The coal mine water burst predicting method introduces the long-short-time memory neural network into coal mine water burst prediction. Firstly, a feature selection method based on a Wrapper evaluation strategy is adopted to preprocess data, extract feature data and eliminate the influence of redundancy features on a follow-up prediction algorithm; by adopting an MSRA initializing method, a weight matrix is initialized into Gaussian distribution with the mean value of 0 and the variance of 2 / (input number), so that the prediction method has more reasonable initializing weight, and the convergence rate of the method is improved; an LSTM method is adopted to learn the change law of dynamic water burst data and the influence of the law on water burst, the method is prevented from overfitting by using a Dropout technology in the learning process. With increase of iteration number, the weight matrix of the prediction method is constantly updated, and accordingly the precision, stability and robustness of the prediction method are improved.
Owner:XI'AN UNIVERSITY OF ARCHITECTURE AND TECHNOLOGY
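
The MSRA initialization mentioned in the abstract above (Gaussian with mean 0 and variance 2 / (input number)) can be sketched in a few lines. The function name `msra_init` and the row-per-output-unit matrix layout are assumptions made for illustration.

```python
import math
import random
from typing import List, Optional


def msra_init(n_in: int, n_out: int,
              rng: Optional[random.Random] = None) -> List[List[float]]:
    """Draw an n_out x n_in weight matrix from N(0, 2 / n_in)."""
    rng = rng or random.Random()
    std = math.sqrt(2.0 / n_in)  # variance 2 / (input number), as in the abstract
    return [[rng.gauss(0.0, std) for _ in range(n_in)] for _ in range(n_out)]
```

Passing a seeded `random.Random` instance makes the draw reproducible, which helps when comparing convergence between runs.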

Speech recognition method and device, computer device and storage medium

The application relates to a speech recognition method and device, a computer device and a storage medium. The above method is characterized by obtaining a target network layer from the network layers of a time delay neural network of band-down sampling; adding a second neural network to the target network layer, using the output data of the target network layer as the input data of the second neural network, wherein the second neural network comprises at least one layer of network; obtaining the to-be-recognized speech data, and inputting the to-be-recognized speech data to the time delay neural network of band-down sampling, and recognizing the to-be-recognized speech via the time delay neural network of band-down sampling and the second neural network to obtain a corresponding speech recognition result. By using the time delay neural network of band-down sampling and the second neural network to recognize the speech data jointly, the better speech recognition result can be obtained.
Owner:VOICEAI TECH CO LTD +1

Systems and methods to support medical therapy decisions

Systems and methods for supporting medical therapy decisions are disclosed that utilize predictive models and electronic medical records (EMR) data to provide predictions of health conditions over varying time horizons. Embodiments also determine a 0-100 health risk index value that represents the “risk” for a patient to acquire a health condition based on a combination of real-time and predicted EMR data. The systems and methods receive EMR data and use the predictive models to predict one or more data values from the EMR data as diagnostic criteria. In some embodiments, the health condition trying to be avoided is Sepsis and the health risk index is a Sepsis Risk Index (SRI). In some embodiments, the predictive models are neural network models such as time delay neural networks.
Owner:APTIMA

Underwater sound source positioning method

The invention relates to an underwater sound source positioning method. The method includes the following steps that: sound source signals received by a hydrophone array are converted into digital sound signals; Fourier transformation is performed on the digital sound signals; a data covariance matrix is calculated on each frequency within a signal bandwidth, and feature vectors which can represent signal orientation information are extracted through feature value decomposition; in a training stage, a time delay neural network is used for learning training samples, so that the mapping relation model of the feature vectors and sound source orientations is obtained; and in a testing stage, the feature vector of a test sample is inputted to the trained model, so that distance and depth estimation value of a sound source can be obtained. According to the method of the invention, the deep neural network is utilized, so that robust and efficient underwater sound source positioning can be realized.
Owner:INST OF ACOUSTICS CHINESE ACAD OF SCI +1

Language identification method and identification system

The invention provides a language identification method and identification system, and can improve performance of the language identification system. The method comprises the steps of: converting each frame of voice signal into a pronunciation attribute characteristic; training a time delay neural network by utilizing the pronunciation attribute characteristics, wherein the pronunciation attribute characteristics are input into the time delay neural network, the time delay neural network carries out learning and classification on the input pronunciation attribute characteristics to obtain distribution of each language in a pronunciation attribute characteristic space, i.e., a language model; and when carrying out language identification, inputting a pronunciation attribute characteristic of to-be-identified voice into the trained time delay neural network, and obtaining an output result of the time delay neural network, which is a similarity between the to-be-identified voice and each language model, wherein the language model with the highest similarity is a language category of the to-be-identified voice. The invention relates to the technical field of voice identification.
Owner:SHENZHEN OCTOPUS TECH

Wind speed sequence forecasting method based on Kalman filtering

The invention belongs to the field of time sequence forecasting analysis and in particular relates to a wind speed sequence forecasting method based on Kalman filtering. A high-order AR (auto-regressive) model of a wind speed time sequence is constructed by virtue of a time sequence analysis method, so that a state equation and a measurement equation of Kalman filtering are constructed; a training sample pair of the wind speed time sequence is constructed by the time sequence analysis method; a wind speed sequence is forecasted and analyzed by adopting a time delay neural network; a forecasting result of the time delay neural network is used as a measurement value of Kalman filtering; diagonal covariance matrixes of the state equation and the measurement equation of the Kalman filtering method are determined according to the AR model and a forecasting error of the time delay neural network, so that the wind speed sequence can be forecasted and analyzed by the Kalman filtering method. The wind speed sequence forecasting method can be applied to an on-line wind speed forecasting analysis system of a wind power plant.
Owner:TIANJIN POLYTECHNIC UNIV
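
The fusion described in the abstract above, where the AR model supplies the state equation and the TDNN forecast is treated as the measurement, can be sketched with a scalar Kalman step. This is a deliberate simplification: the abstract uses a high-order AR model with diagonal covariance matrices, whereas this sketch assumes an AR(1) state with scalar noise variances, and all parameter names are hypothetical.

```python
from typing import Tuple


def kalman_step(x_prev: float, p_prev: float, z: float,
                a: float, q: float, r: float) -> Tuple[float, float]:
    """One predict/update cycle of a scalar Kalman filter.

    x_prev, p_prev: previous wind speed estimate and its error variance
    z: measurement, here assumed to be the TDNN forecast for this step
    a: AR(1) coefficient (state transition)
    q: process noise variance (from the AR model residual)
    r: measurement noise variance (from the TDNN forecasting error)
    """
    # Predict with the AR state equation.
    x_pred = a * x_prev
    p_pred = a * p_prev * a + q
    # Update with the TDNN forecast as the measurement.
    gain = p_pred / (p_pred + r)
    x_new = x_pred + gain * (z - x_pred)
    p_new = (1.0 - gain) * p_pred
    return x_new, p_new
```

A large `r` (an unreliable TDNN forecast) pulls the result toward the AR prediction, while a small `r` pulls it toward the TDNN forecast.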

Voiceprint recognition method based on TDNN (time delay neural network)

The invention discloses a voiceprint recognition method based on a TDNN (time delay neural network), and solves the problems that a voiceprint recognition algorithm is complicated and data are complex. The voiceprint recognition method is technically characterized by extremely strong feature extraction capacity of a neural network. The TDNN is used for extracting the feature vector of a voice segment of a speaker, a pooling layer and a softmax layer are used for acquiring the posterior probability of the voice segment of the speaker, a loss function is used for training to obtain a cross entropy, the softmax layer is removed after training, the feature vector for finally training a PLDA (probabilistic linear discriminant analysis) model is acquired, transcription of training data is omitted, and simple calculation and good recognition effects are achieved.
Owner:NANJING SILICON INTELLIGENCE TECH CO LTD

Improved time delay neural network acoustic model

The invention belongs to the technical field of speech recognition, and relates to an improved time delay neural network acoustic model comprising the steps: a basic TDNN network is established; an attention module is added between two adjacent hidden layers so as to obtain the improved TDNN network; and the improved TDNN network is trained and the final acoustic model is obtained. The attention module is composed of an affine transformation and a weighting function. The output of the previous hidden layer is used as input to extract the feature weight value of the input, and the extracted weight value is applied to weight the original input feature so as to obtain the weighted feature. Factors including the modeling capability of the model, the context information extraction capacity and the model size are considered, and multilayer weighting of the features of the neural network hidden layers is performed to explicitly model the relative importance of the interlayer features, so as to enhance the performance of the TDNN acoustic model and the overall performance of the speech recognition system.
Owner:SOUTH CHINA UNIV OF TECH
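
The attention module described in the abstract above (an affine transformation followed by a weighting function, whose output rescales the original features) can be sketched as below. The abstract does not say which weighting function is used; the sigmoid here, along with the function and parameter names, is an assumption for illustration.

```python
import math
from typing import List


def attention_reweight(hidden: List[float],
                       weight: List[List[float]],
                       bias: List[float]) -> List[float]:
    """Rescale one hidden-layer output vector by learned per-feature weights."""
    # Affine transformation of the previous hidden layer's output.
    scores = [sum(w * x for w, x in zip(row, hidden)) + b
              for row, b in zip(weight, bias)]
    # Weighting function (assumed sigmoid) maps each score into (0, 1).
    gates = [1.0 / (1.0 + math.exp(-s)) for s in scores]
    # Apply the extracted weights to the original input features.
    return [g * x for g, x in zip(gates, hidden)]
```

The result has the same dimension as the input, so the module can sit between two adjacent hidden layers without changing the layer sizes.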

Gender identification method and system, mobile terminal and storage medium

Active | CN110931023A | Improve accuracy | Prevent the phenomenon of low recognition accuracy | Speech analysis | Engineering | Imaging Feature
The invention is suitable for the technical field of data processing, and provides a gender identification method and system, a mobile terminal and a storage medium, and the method comprises the steps: obtaining sample data, and carrying out the classification of the sample data, so as to obtain boy data and girl data; generating a training set according to the boy data and the girl data, and constructing a time delay neural network; obtaining acoustic features of the training set, and inputting the acoustic features into a time delay neural network for model training to obtain a gender recognition model; collecting voice data of the user, and inputting the voice data into the gender recognition model for analysis to obtain gender information of the user. According to the invention, acoustic characteristics in collected voice data are analyzed; the gender of the man and the woman is identified, the phenomenon of low identification accuracy caused by adopting image feature identification is prevented, and the accuracy of the gender identification model for identifying the man and the woman of the user is improved through the design of taking the acoustic features as the input of the network to perform model training on the time delay neural network.
Owner:XIAMEN KUAISHANGTONG TECH CORP LTD

Nonlinear neural network model for modeling wide band RF (Radio Frequency) power amplifier

The invention discloses a nonlinear neural network model for modeling a wide band RF (Radio Frequency) power amplifier. The model comprises an input layer, a hidden layer and an output layer, wherein the input data of the input layer comprises advance items x(n+1), |x(n+1)|^3, ..., |x(n+1)|^(2Q+1); aligning items x(n), |x(n)|, |x(n)|^3, ..., |x(n)|^(2Q+1); and delay items x(n-1), ..., x(n-M_1), |x(n-1)|, ..., |x(n-M_2)|, ..., |x(n-1)|^(2Q+1), ..., |x(n-M_(Q+2))|^(2Q+1), wherein the x(n+1) is base band complex data of an input end of the RF power amplifier at current time, and the output of the output layer is y(n). The nonlinear neural network has the advantages that a generalized memory effect (memory effects at both the delay time and the advance time are considered) is taken into account, based on the very strong memory effect and strong static nonlinearity of the modeled RF power amplifier; meanwhile, the input signal of the input layer comprises not only the base band signal but also the modulus of the base band complex signal and higher powers of the modulus, and the output signal of the output layer is a complex signal, so the modeling precision is higher and can be improved by 5 dB in comparison with a real-valued time delay neural network model.
Owner:NANYANG NORMAL UNIV

Power amplifier behavioral modeling system and method based on neural network

The invention discloses a power amplifier behavioral modeling system based on a neural network and a method thereof. The modeling system comprises an input layer, a hidden layer and an output layer; modeling is performed based on the neural network, with signal processing performed in each layer, and the method further comprises two processes of system training and system operation. According to the invention, the activation function in a traditional behavior-level modeling system based on a real-valued time-delay neural network is changed from a hyperbolic tangent function to a leaky linear unit function. While behavioral modeling of the power amplifier is realized, the hardware implementation complexity of modeling is reduced and the modeling convergence rate is improved, and the method has wide application and development prospects in communication systems.
Owner:SOUTHEAST UNIV
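
The activation swap described in the abstract above can be illustrated by placing the two functions side by side. The negative-side slope of 0.01 and the function names are assumptions; the abstract gives no numeric value.

```python
import math


def tanh_unit(x: float) -> float:
    """Hyperbolic tangent, the activation of the traditional model."""
    return math.tanh(x)


def leaky_unit(x: float, slope: float = 0.01) -> float:
    """Leaky linear unit: identity for x >= 0, a small linear slope otherwise.

    The slope value is an illustrative assumption.
    """
    return x if x >= 0.0 else slope * x
```

The leaky unit needs only a comparison and at most one multiplication, whereas tanh requires evaluating exponentials or a lookup table, which is one plausible reading of the reduced hardware complexity the abstract claims.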

Underwater target classification method

The invention provides an underwater target classification method. The underwater target classification method comprises the following steps: converting a signal received by a sonar array into a digital signal; preprocessing the digital signal, calculating a cross correlation coefficient between each sonar and other sonars, summing the cross correlation coefficients, and taking the sonar signal with the maximum cross correlation coefficient sum as a reference signal; calculating the time delay of each sonar relative to the reference signal; and self-adapting the weight of each channel by using the cross correlation coefficient of the channels and the correlation between the front frame and the rear frame, and finally obtaining an enhanced signal; filtering the signal after framing, summing the signal energy in each filter, and taking the logarithm as the characteristic of the frame signal; and taking the features as the input of a time delay neural network, outputting the features as the probability of each target type corresponding to the frame of signal, and training a multi-target classifier based on the rule. According to the method, the powerful nonlinear representation capability of the deep neural network is utilized, and the characteristics of the target are effectively utilized to distinguish the target.
Owner:INST OF ACOUSTICS CHINESE ACAD OF SCI +1
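
Two of the preprocessing steps in the abstract above, choosing the reference channel by the largest sum of cross correlation coefficients and taking the log energy per filter as the frame feature, can be sketched as follows. The helper names, the use of the zero-lag Pearson coefficient, and the small floor added before the logarithm are assumptions for illustration.

```python
import math
from typing import List


def correlation(a: List[float], b: List[float]) -> float:
    """Zero-lag correlation coefficient between two equal-length signals."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0.0 or var_b == 0.0:
        return 0.0  # a constant channel carries no correlation information
    return cov / math.sqrt(var_a * var_b)


def pick_reference(channels: List[List[float]]) -> int:
    """Index of the channel whose summed correlation with the others is largest."""
    sums = [sum(correlation(c, other)
                for j, other in enumerate(channels) if j != i)
            for i, c in enumerate(channels)]
    return max(range(len(channels)), key=sums.__getitem__)


def log_band_energies(band_signals: List[List[float]]) -> List[float]:
    """Log of the summed energy in each filter output for one frame."""
    return [math.log(sum(s * s for s in band) + 1e-12)  # floor avoids log(0)
            for band in band_signals]
```

The resulting per-frame feature vector is what would be fed to the time delay neural network classifier.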

PPG monitoring system based on multi-layer time-delay neural network for removing motion artifacts

The invention belongs to the technical field of medical equipment, and in particular relates to a PPG monitoring system based on a multi-layer time-delay neural network for removing motion artifacts. The system comprises a PPG probe, a transmission control motherboard and a PC host computer. The PPG probe and the transmission control motherboard form a wearable structure. The PPG probe comprises a PPG sensor and an IMU sensor; the PPG sensor and an LDO are fixed on the front of the PCB, and the IMU sensor and an FPC line connector are arranged on the back. The interrupt prompt lines of the PPG sensor and the IMU sensor are each connected to a GPIO. The transmission control motherboard is connected with the PPG probe through the FPC line, and the transmission control motherboard is connected with the PC host computer. The transmission control motherboard comprises a main control board and a wireless data transmission module. The main control board includes a motion artifact removing module of the multi-layer time delay network. The system can achieve real-time, on-line, accurate removal of PPG motion artifacts under intense exercise.
Owner:FUDAN UNIV

Tibetan Weizang dialect spoken language recognition method based on deep time delay neural network

The invention relates to the technical field of deep learning, signal processing, speech recognition, feature extraction, pronunciation and the like, aims at improving the overall effect of a Tibetan Weizang dialect spoken language recognition model in allusion to spoken language application scenes of Tibetan Weizang dialects, and provides a Tibetan Weizang dialect spoken language recognition method based on a deep time delay neural network. An audio data set mixed by three Tibetan dialects is adopted, an original audio data set is expanded through speed disturbance, noise adding and reverberation methods, the expanded data set is utilized to train the deep time delay neural network based on a chained chain model of an open-source speech recognition toolbox kaldi, the deep time delay neural network serves as a Tibetan acoustic model, and the acoustic model is trained for the second time by using the part of the Weizang dialect in the audio data so as to obtain a deep time delay neural network acoustic model for the Weizang dialect. The method is mainly applied to Tibetan Weizang dialect spoken language recognition occasions.
Owner:TIANJIN UNIV

Fixed-time adaptive neural network unmanned aerial vehicle track angle control method

Active | CN110362110A | Overcoming algebraic ring problems | Addressing the complexity explosion | Autonomous decision making process | Position/course control in three dimensions | Differentiator | Neural network controller
The invention relates to a fixed-time adaptive neural network unmanned aerial vehicle track angle control method comprising the steps: establishing an unmanned aerial vehicle longitudinal system track angle dynamic mathematical model and an actuator model with an unknown nonlinear dead zone; determining an ideal output value and an output limit; designing a fixed-time adaptive neural network controller, an adaptive parameter updating law and a fixed-time differentiator so that the output of the system is enabled to track the reference output trajectory within a fixed time while ensuring the boundedness of all the state variables; and performing stability analysis on the control system and determining the parameters of the controller according to the results of stability analysis. The method fully considers the restriction factors such as dead zone, system uncertainty and the output limit existing in the actual system and is applicable to a more general nonlinear system such as a non-strict feedback system and can be better applied to the actual system to ensure the ideal track on the track angle tracking of the unmanned aerial vehicle within the fixed time.
Owner:NORTHWESTERN POLYTECHNICAL UNIV

Voiceprint recognition method under channel attention propagation and aggregation

The invention relates to a voiceprint recognition method under channel attention propagation and aggregation, belonging to the field of signal processing. The method comprises the following steps: S1, carrying out second-order wavelet scattering transformation on an original speech discrete signal; S2, performing voiceprint mapping coding of multi-scale features; and S3, evaluating the similarity of the voiceprint codes. According to the method, the multi-scale short-time voice features are obtained through wavelet scattering transformation, and the multi-scale features are mapped by adopting the time delay neural network based on channel attention propagation and aggregation to obtain the voiceprint codes, so the accuracy and robustness of voiceprint recognition are improved. The method takes the processing of long-time and short-time voices into consideration, provides a new technical means for voiceprint recognition containing short-time voice data, and can also be migrated to other voice processing fields to serve as one of voiceprint code acquisition methods.
Owner:CHONGQING UNIV OF POSTS & TELECOMM
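
Step S3 above compares two voiceprint codes. The abstract does not name the similarity measure, so the following is only a sketch that assumes cosine similarity, a common choice for speaker embeddings. The function names and the 0.7 threshold are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length voiceprint codes."""
    if len(a) != len(b):
        raise ValueError("voiceprint codes must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an all-zero code carries no direction to compare
    return dot / (norm_a * norm_b)


def same_speaker(a, b, threshold=0.7):
    """Accept the pair when the similarity reaches an assumed threshold."""
    return cosine_similarity(a, b) >= threshold


if __name__ == "__main__":
    enrolled = [0.2, 0.9, -0.1]
    probe = [0.25, 0.85, -0.05]
    print(cosine_similarity(enrolled, probe), same_speaker(enrolled, probe))
```

In practice the threshold would be tuned on held-out trials rather than fixed in advance.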

Speech recognition method and device applied to field of power dispatching

The embodiment of the invention provides a speech recognition method and device applied to the field of power dispatching. The method comprises the steps of: inputting a power normalized cepstrum coefficient feature of the speech to be recognized into a convolutional neural network in a preset neural network model to acquire a new feature; splicing the new feature, the power normalized cepstrum coefficient feature and a speaker feature to acquire a mixed feature; inputting the mixed feature into a plurality of time delay neural network sets and a plurality of bidirectional long short-term memory recurrent neural network sets, arranged alternately in the preset neural network model, to acquire a posterior probability of a word sequence set for the features of the speech to be recognized; and decoding the speech to be recognized according to the posterior probability in combination with a language model to acquire a recognized word sequence. For the field of power dispatching, a joint multi-network training method for the speech recognition acoustic model is provided on the basis of these three networks, so that the speech to be recognized can be recognized by the trained model, reducing the dispatcher's workload and the time the dispatcher spends on repeated work.
Owner:CENT CHINA BRANCH OF STATE GRID CORP OF CHINA +1
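
The splicing step in the abstract above can be sketched as a per-frame concatenation. This is a sketch under stated assumptions, not the patent's implementation: it assumes the convolutional output and the power normalized cepstrum coefficient feature have one vector per frame, and that the speaker feature is a single utterance-level vector repeated on every frame. The function name is hypothetical.

```python
def splice_features(cnn_frames, pncc_frames, speaker_vector):
    """Concatenate, frame by frame, the CNN output, the PNCC feature
    and a (repeated) speaker vector into one mixed feature per frame."""
    if len(cnn_frames) != len(pncc_frames):
        raise ValueError("CNN and PNCC sequences must have the same number of frames")
    return [
        list(cnn) + list(pncc) + list(speaker_vector)
        for cnn, pncc in zip(cnn_frames, pncc_frames)
    ]


if __name__ == "__main__":
    cnn = [[0.1, 0.2], [0.3, 0.4]]   # 2 frames, 2 CNN dimensions
    pncc = [[1.0], [2.0]]            # 2 frames, 1 PNCC dimension
    speaker = [9.0, 8.0]             # one vector for the whole utterance
    for frame in splice_features(cnn, pncc, speaker):
        print(frame)
```

Each mixed frame has the summed dimensionality of its three parts, which is what the downstream network layers would then consume.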

Fast language recognition method based on time delay neural network

The invention discloses a fast language recognition method based on a time delay neural network. The method comprises the steps of: 1, inputting a voice signal and processing it to obtain a voice signal frame sequence of fixed length; 2, extracting underlying acoustic features from the voice signal frame sequence frame by frame; 3, inputting the underlying acoustic features into a Real TDNN residual block structure for calculation to obtain M*64 abstract features; 4, carrying out Attention calculation; 5, carrying out global average pooling on the Attention features along the time frame dimension to obtain an Embedded vector; 6, carrying out two-layer DNN extraction on the Embedded vector to obtain a language vector; and 7, inputting the language vectors into an ArcFaceStatic loss function, and inputting the underlying acoustic features into the trained neural network to obtain the probabilities of all recognizable languages. The method is highly robust on short utterances, so the language can be recognized quickly and accurately.
Owner:因诺微科技(天津)有限公司
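
As a rough illustration of two of the steps listed above, the sketch below shows a single time delay layer (each output frame is computed from a fixed window of neighbouring input frames) followed by global average pooling over the time dimension. It is a simplified sketch, not the Real TDNN residual block or the Attention calculation from the abstract; the ReLU activation, the context offsets and all names are assumptions.

```python
def tdnn_layer(frames, weights, bias, context=(-1, 0, 1)):
    """Apply one time delay layer with a ReLU activation.

    frames:  list of T feature vectors, each of dimension D
    weights: one row per output unit, each of length len(context) * D
    bias:    one value per output unit
    Only frames whose full context window exists produce an output.
    """
    start = max(0, -min(context))
    stop = len(frames) - max(0, max(context))
    outputs = []
    for t in range(start, stop):
        # Splice the neighbouring frames into one long input vector.
        spliced = [value for offset in context for value in frames[t + offset]]
        outputs.append([
            max(0.0, sum(w * x for w, x in zip(row, spliced)) + b)
            for row, b in zip(weights, bias)
        ])
    return outputs


def global_average_pool(frames):
    """Average each feature dimension over all time frames."""
    if not frames:
        raise ValueError("cannot pool an empty sequence")
    return [sum(column) / len(frames) for column in zip(*frames)]


if __name__ == "__main__":
    features = [[1.0], [2.0], [3.0], [4.0]]   # T = 4 frames, D = 1
    hidden = tdnn_layer(features, weights=[[1.0, 1.0, 1.0]], bias=[0.0])
    print(hidden)                       # one output per fully covered frame
    print(global_average_pool(hidden))  # fixed-length utterance vector
```

Pooling over time is what turns a variable-length utterance into a fixed-length vector, which is the role the Embedded vector plays in step 5.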

Voice recognition method based on transfer learning in field of civil aviation air-ground calls

The invention discloses a voice recognition method based on transfer learning in the field of civil aviation air-ground calls. The method comprises the steps of: collecting a general data set and a transfer data set and performing data processing; initializing a neural network and adopting a time delay neural network-hidden Markov model as the acoustic training model; performing voice recognition training with the general data set to obtain a general Chinese voice recognition acoustic model; training on the transfer data set starting from the general Chinese voice recognition model and adjusting parameters to obtain a Chinese voice recognition acoustic model for the civil aviation air-ground call field; and expanding the text corpora of the civil aviation field and generating a language model. Because the method is based on transfer learning, data from outside the field can be used effectively, and the recognition effect is greatly improved compared with a common acoustic model. The method addresses the shortage of Chinese corpus in the field of civil aviation air-ground calls and improves the accuracy of voice recognition for such calls.
Owner:NANJING UNIV OF AERONAUTICS & ASTRONAUTICS