Method and device for generating psychoacoustic model

A psychoacoustic model generation technology built from computing modules, applied in fields such as speech analysis and instruments. It addresses the high hardware implementation cost, implementation difficulty, and high power consumption of existing models, and achieves easier hardware implementation, improved quantization efficiency, and reduced complexity.

Inactive Publication Date: 2011-08-31

HUAWEI TECH CO LTD +1

5 Cites, 7 Cited by

Problems solved by technology

[0007] To solve the problems of existing psychoacoustic models, such as high algorithm complexity, difficult implementation, high hardware implementation cost, and high power consumption, and to improve quantization efficiency, the embodiments of the present invention provide a method and device for generating a psychoacoustic model.

Examples

Embodiment 1

[0049] Referring to Figure 1, this embodiment provides a method for generating a psychoacoustic model. The process is as follows:

[0050] 101: Perform time-frequency analysis on the input time-domain audio signal frame using the Modified Discrete Cosine Transform (MDCT) to obtain MDCT frequency-domain parameters;

[0051] 102: From the MDCT frequency-domain parameters, calculate the spectral flatness measure function, the extended envelope of the spectrum's local maximum dominant components, and the average envelope of the spectrum's local minimum dominant components, then calculate the local masking threshold from these three quantities;

[0052] 103: Generate and output a global masking threshold according to the local masking threshold.
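Step 101 can be illustrated with a minimal sketch of the MDCT of a single frame. This is not taken from the patent text: it uses the textbook MDCT definition, and the function name `mdct`, the direct O(N^2) matrix evaluation, and the default sine window are all assumptions chosen for readability. The envelope and masking threshold calculations of steps 102 and 103 are not reproduced here because their details are not visible in this excerpt.

```python
import numpy as np


def mdct(frame, window=None):
    """Direct MDCT of one frame of 2N samples, returning N coefficients.

    Textbook definition (an assumption, not the patent's exact formulation):
        X[k] = sum_n w[n] * x[n] * cos(pi/N * (n + 0.5 + N/2) * (k + 0.5))
    """
    frame = np.asarray(frame, dtype=float)
    if frame.ndim != 1 or frame.shape[0] % 2 != 0:
        raise ValueError("frame must be 1-D with an even number of samples")
    two_n = frame.shape[0]
    half = two_n // 2                      # N output coefficients
    n = np.arange(two_n)
    k = np.arange(half)
    if window is None:
        # Sine window: an assumed, commonly used analysis window.
        window = np.sin(np.pi * (n + 0.5) / two_n)
    # Rows index the frequency bin k, columns index the time sample n.
    basis = np.cos(np.pi / half * np.outer(k + 0.5, n + 0.5 + half / 2))
    return basis @ (np.asarray(window, dtype=float) * frame)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    coeffs = mdct(rng.standard_normal(64))  # 64 samples in, 32 coefficients out
    print(coeffs.shape)
```

A production encoder would normally compute the same transform through an FFT-based fast algorithm and process overlapping frames; the direct form above is only meant to make the input-output relationship of step 101 concrete.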

[0053] The method provided in this embodiment, by using the spectral flatness measure fu...

Embodiment 2

[0055] To solve the problems that the algorithms of existing psychoacoustic models are too complicated and that their audio analysis performance cannot meet the needs of audio processing, this embodiment provides a method for generating a psychoacoustic model based on the modified discrete cosine transform (MDCT) and the spectral flatness measure (SFM) function. The psychoacoustic model accounts for both tone masking and non-tone masking characteristics, so coding efficiency can be improved.
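As a rough illustration of how a spectral flatness measure separates tone-like from noise-like spectra, the sketch below uses the conventional definition of SFM: the ratio of the geometric mean to the arithmetic mean of the power spectrum. The patent's own SFM formula is not visible in this excerpt, so this definition, the function name `spectral_flatness`, and the `eps` guard are assumptions.

```python
import numpy as np


def spectral_flatness(coeffs, eps=1e-12):
    """Conventional SFM: geometric mean / arithmetic mean of the power spectrum.

    Values near 1 indicate a flat, noise-like spectrum; values near 0
    indicate a peaky, tone-like spectrum. `eps` (an assumed guard value)
    avoids log(0) for empty bins.
    """
    power = np.asarray(coeffs, dtype=float) ** 2 + eps
    geometric_mean = np.exp(np.mean(np.log(power)))
    arithmetic_mean = np.mean(power)
    return geometric_mean / arithmetic_mean


if __name__ == "__main__":
    flat = np.ones(64)       # equal energy in every bin
    tonal = np.zeros(64)
    tonal[10] = 1.0          # all energy in a single bin
    print(spectral_flatness(flat), spectral_flatness(tonal))
```

Applied to MDCT coefficients, a measure of this kind gives a single number per frame (or per band) that a model can use to weight tone masking against non-tone masking.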

[0056] The input-output relationship of the psychoacoustic model can be as shown in Figure 2. The input signal is the time-domain audio signal frame X to be processed or coded. The audio signal can be a speech signal, an audio signal, or a mixture of various sound signals that can be heard by the human ear, and the frequency bandwidth of the signal includes all frequency ranges th...

Embodiment 3

[0127] Referring to Figure 22, this embodiment provides a device for generating a psychoacoustic model, the device comprising:

[0128] The time-domain analysis module 2201 is used to perform time-frequency analysis on the input time-domain audio signal frame using the Modified Discrete Cosine Transform (MDCT) to obtain MDCT frequency-domain parameters;

[0129] The first calculation module 2202 is used to calculate the spectral flatness measure function according to the MDCT frequency domain parameters obtained by the time domain analysis module 2201;

[0130] The second calculation module 2203 is used to calculate the local maximum dominant component extended envelope of the spectrum according to the MDCT frequency domain parameters obtained by the time domain analysis module 2201;

[0131] The third calculation module 2204 is used to calculate the average envelope of the local minimum dominant component of the frequency spectrum according to the MDCT frequency domain parameters o...

Abstract

The invention discloses a method and device for generating a psychoacoustic model, belonging to the technical field of audio processing. The method comprises the following steps: performing time-frequency analysis on an input time-domain audio signal frame using the modified discrete cosine transform (MDCT) to obtain MDCT frequency-domain parameters; calculating, from the MDCT frequency-domain parameters, a spectral flatness measure function, an extended envelope of the spectrum's local maximum dominant components, and an average envelope of the spectrum's local minimum dominant components, and calculating a local masking threshold from these three quantities; and generating and outputting a global masking threshold according to the local masking threshold. Because the local masking threshold is calculated with the spectral flatness measure function, the tone masking and non-tone masking characteristics of the audio signal are distinguished from each other, so quantization bits are allocated more reasonably and quantization efficiency is effectively improved.

Description

Technical field

[0001] The invention relates to the technical field of audio processing, in particular to a method and device for generating a psychoacoustic model.

Background technique

[0002] In order to transmit or store broadband high-fidelity audio signals with as low a coding rate as possible or as little data as possible, high-quality and high-efficiency audio coding algorithms play an important role. In order to achieve a higher compression coding gain or compression ratio, the audio coding algorithm must use a perceptual coding algorithm, and the basis of the perceptual coding algorithm for audio signals is a psychoacoustic model. The psychoacoustic model is a mathematical model that reflects the characteristics of human auditory perception abstracted on the basis of the study of the human auditory system. It reflects the human auditory system's ability to perceive and mask audio and noise.

[0003] The MPEG (Moving Pictures Experts Group, Dynamic Picture Experts ...

Application Information

Patent Type & Authority: Applications (China)

IPC(8): G10L19/02, G10L19/00, G10L19/005

Inventor: 马鸿飞, 郭泽华, 夏雨, 许丽净

Owner: HUAWEI TECH CO LTD
