
Method and device for generating psychoacoustic model

A psychoacoustic model generation method and computing-module device, applied in speech analysis, instruments, and related fields. It addresses the high hardware implementation cost, difficult implementation, and high power consumption of existing models, and achieves easier hardware implementation, improved quantization efficiency, and reduced complexity.

Status: Inactive | Publication Date: 2011-08-31
HUAWEI TECH CO LTD +1

AI Technical Summary

Problems solved by technology

[0007] To address the high algorithmic complexity, difficult implementation, high hardware implementation cost, and high power consumption of existing psychoacoustic models, and to improve quantization efficiency, embodiments of the present invention provide a method and device for generating a psychoacoustic model.



Examples


Embodiment 1

[0049] Referring to Figure 1, this embodiment provides a method for generating a psychoacoustic model. The method proceeds as follows:

[0050] 101: Perform time-frequency analysis on the input time-domain audio signal frame using the Modified Discrete Cosine Transform (MDCT) to obtain MDCT frequency-domain parameters;

[0051] 102: Calculate the spectral flatness measure function, the spectrum local maximum dominant component extended envelope, and the spectrum local minimum dominant component average envelope from the MDCT frequency-domain parameters, then calculate the local masking threshold from these three quantities;

[0052] 103: Generate and output a global masking threshold according to the local masking threshold.
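Step 101 above can be illustrated with a minimal sketch of a direct MDCT over one frame. The function name `mdct`, the sine window, and the O(N^2) direct summation are assumptions made for illustration only; the source does not state which window or fast algorithm the embodiment uses.

```python
import math

def mdct(frame):
    """Direct O(N^2) MDCT of a 2N-sample frame; returns N coefficients.

    Illustrative sketch only: the sine window is an assumption, not
    something the source specifies.
    """
    two_n = len(frame)
    if two_n == 0 or two_n % 2:
        raise ValueError("frame length must be a positive even number")
    n_half = two_n // 2
    # Apply a sine window to the time-domain frame (assumed window).
    windowed = [
        x * math.sin(math.pi * (n + 0.5) / two_n)
        for n, x in enumerate(frame)
    ]
    coeffs = []
    for k in range(n_half):
        acc = 0.0
        for n, x in enumerate(windowed):
            # Standard MDCT kernel with phase offset N/2.
            acc += x * math.cos(
                math.pi / n_half * (n + 0.5 + n_half / 2) * (k + 0.5)
            )
        coeffs.append(acc)
    return coeffs
```

A frame of 2N time-domain samples yields N MDCT coefficients, which would then feed the calculations of step 102. A production encoder would normally use an FFT-based MDCT rather than this direct form.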

[0053] The method provided in this embodiment, by using the spectral flatness measure fu...

Embodiment 2

[0055] To address the problems that the algorithms of existing psychoacoustic models are too complicated and that their audio analysis performance cannot meet the needs of audio processing, this embodiment provides a method for generating a psychoacoustic model based on the modified discrete cosine transform (MDCT) and the spectral flatness measure (SFM) function. The psychoacoustic model takes into account both tonal masking and non-tonal masking characteristics, so that coding efficiency can be improved.
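As a rough illustration of how a spectral flatness measure separates tonal from noise-like spectra, the sketch below uses the common textbook definition: the ratio of the geometric mean to the arithmetic mean of the power spectrum. The function name `spectral_flatness`, the `eps` floor, and the whole-spectrum (rather than per-band) evaluation are assumptions; the source does not give the exact formula used in the embodiment.

```python
import math

def spectral_flatness(coeffs, eps=1e-12):
    """Geometric mean / arithmetic mean of the power spectrum.

    Values near 1 indicate a flat, noise-like spectrum; values near 0
    indicate a peaky, tonal spectrum. Textbook definition, assumed here
    for illustration.
    """
    if not coeffs:
        raise ValueError("coeffs must not be empty")
    # eps keeps log() finite for zero-valued coefficients (assumed floor).
    power = [c * c + eps for c in coeffs]
    log_mean = sum(math.log(p) for p in power) / len(power)
    geometric_mean = math.exp(log_mean)
    arithmetic_mean = sum(power) / len(power)
    return geometric_mean / arithmetic_mean

flat = spectral_flatness([1.0, 1.0, 1.0, 1.0])    # close to 1
tonal = spectral_flatness([0.0, 0.0, 10.0, 0.0])  # close to 0
```

Applied to MDCT coefficients, a measure of this kind gives a single number that could steer the choice between tonal and non-tonal masking behaviour.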

[0056] The input-output relationship of the psychoacoustic model can be as shown in Figure 2. The input signal is the time-domain audio signal frame X_in to be processed or coded. The audio signal can be a speech signal, an audio signal, or a mixed signal of various sound signals that can be heard by the human ear, and the frequency bandwidth of the signal includes all frequency ranges th...

Embodiment 3

[0127] Referring to Figure 22, this embodiment provides a device for generating a psychoacoustic model, the device comprising:

[0128] The time-domain analysis module 2201, configured to perform time-frequency analysis on the input time-domain audio signal frame using the Modified Discrete Cosine Transform (MDCT) to obtain MDCT frequency-domain parameters;

[0129] The first calculation module 2202, configured to calculate the spectral flatness measure function from the MDCT frequency-domain parameters obtained by the time-domain analysis module 2201;

[0130] The second calculation module 2203, configured to calculate the spectrum local maximum dominant component extended envelope from the MDCT frequency-domain parameters obtained by the time-domain analysis module 2201;

[0131] The third calculation module 2204, configured to calculate the spectrum local minimum dominant component average envelope from the MDCT frequency-domain parameters o...



Abstract

The invention discloses a method and device for generating a psychoacoustic model, belonging to the technical field of audio processing. The method comprises the following steps: performing time-frequency analysis on an input time-domain audio signal frame using the modified discrete cosine transform (MDCT) to obtain MDCT frequency-domain parameters; computing a spectral flatness measure function, a spectrum local maximum dominant component extended envelope, and a spectrum local minimum dominant component average envelope from the MDCT frequency-domain parameters, and computing a local masking threshold from these three quantities; and generating and outputting a global masking threshold from the local masking threshold. Because the local masking threshold is computed using the spectral flatness measure function, the tonal and non-tonal masking characteristics of the audio signal are distinguished from each other, so that quantization bits are allocated more reasonably and quantization efficiency is effectively improved.

Description

Technical field

[0001] The invention relates to the technical field of audio processing, and in particular to a method and device for generating a psychoacoustic model.

Background

[0002] High-quality, high-efficiency audio coding algorithms play an important role in transmitting or storing wideband high-fidelity audio signals at as low a coding rate, or with as little data, as possible. To achieve a higher compression coding gain or compression ratio, an audio coding algorithm must use perceptual coding, and the basis of perceptual coding for audio signals is a psychoacoustic model. A psychoacoustic model is a mathematical model of human auditory perception, abstracted from the study of the human auditory system. It reflects the human auditory system's ability to perceive and mask audio and noise.

[0003] The MPEG (Moving Pictures Experts Group, Dynamic Picture Experts ...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G10L19/02, G10L19/00, G10L19/005
Inventor: 马鸿飞, 郭泽华, 夏雨, 许丽净
Owner: HUAWEI TECH CO LTD