
Singing detection method based on multi-scale time-frequency graph parallel input convolutional neural network

A technology combining convolutional neural networks and time-frequency graphs, applied in the fields of neural learning methods, biological neural network models, neural architectures, etc. It addresses the problems that detection performance needs improvement, computing resource consumption is large, and training time is long, thereby improving detection accuracy and overall performance.

Inactive Publication Date: 2021-11-09
JINLING INST OF TECH

AI Technical Summary

Problems solved by technology

The prior invention implicitly extracts singing features at different levels through a deep residual network, and uses the adaptive attention of the embedded squeeze-and-excitation modules to judge the importance of these features. On the JMD data set, with network depths of 14, 18, 34, 50, 101, 512, and 200, the average detection accuracy is 88.19, so the effect still needs to be improved.
In addition, the network-stacking approach of that invention consumes a lot of computing resources and increases training time.



Examples


Embodiment Construction

[0069] The present invention will be further explained below in conjunction with the accompanying drawings and specific embodiments. It should be understood that the following specific embodiments are only used to illustrate the present invention and are not intended to limit its scope. It should be noted that the words "front", "rear", "left", "right", "upper" and "lower" used in the following description refer to the directions in the drawings, and the words "inner" and "outer" refer to a direction toward or away from the geometric center of a particular part, respectively.

[0070] As a specific embodiment, the present invention provides a singing voice detection method based on parallel input of multi-scale time-frequency graphs to a convolutional neural network. The specific steps are as follows:

[0071] Step 1: Perform a short-time Fourier transform on a single music file using different window lengths w_i, i ∈ [1..n], to get time-...
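The multi-window transform in Step 1 can be illustrated with a minimal NumPy sketch. The source text is truncated here, so every detail beyond "STFT with several window lengths w_i" is an assumption made for illustration: the Hann window, the shared hop length, the shared FFT size (shorter windows are zero-padded so all matrices have the same shape), and the function names `stft_magnitude` and `multi_scale_spectrograms` are hypothetical and not taken from the patent.

```python
import numpy as np

def stft_magnitude(signal, win_length, hop_length, n_fft):
    """Magnitude STFT using a Hann window of win_length, zero-padded to n_fft.

    Frames are centered on multiples of hop_length, so the number of frames
    does not depend on win_length (an assumption of this sketch).
    """
    if win_length > n_fft:
        raise ValueError("win_length must not exceed n_fft")
    window = np.hanning(win_length)
    pad = n_fft // 2
    padded = np.pad(signal, pad)          # zero-pad both ends
    n_frames = 1 + len(signal) // hop_length
    spec = np.empty((n_fft // 2 + 1, n_frames))
    for t in range(n_frames):
        start = pad + t * hop_length - win_length // 2
        frame = padded[start:start + win_length] * window
        # rfft zero-pads the windowed frame up to n_fft samples
        spec[:, t] = np.abs(np.fft.rfft(frame, n=n_fft))
    return spec

def multi_scale_spectrograms(signal, win_lengths, hop_length=256, n_fft=2048):
    """One time-frequency matrix per window length w_i, all of equal shape."""
    return [stft_magnitude(signal, w, hop_length, n_fft) for w in win_lengths]

# Usage: a 1-second 440 Hz tone at an assumed 16 kHz sample rate.
sample_rate = 16000
time_axis = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 440.0 * time_axis)
specs = multi_scale_spectrograms(tone, win_lengths=[512, 1024, 2048])
for w, s in zip([512, 1024, 2048], specs):
    print(w, s.shape)
```

Short windows give finer time resolution and long windows give finer frequency resolution; keeping the hop length and FFT size fixed is one simple way to make the resulting matrices stackable.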


Abstract

The invention discloses a singing detection method based on a multi-scale time-frequency graph parallel input convolutional neural network. Generally, in a singing detection algorithm based on a convolutional neural network, the network input layer is a single two-dimensional time-frequency graph matrix. The method comprises the following steps: first, according to the multi-scale characteristic of a music signal, a plurality of two-dimensional time-frequency graph matrices with different scales are generated by adjusting the window length of the short-time Fourier transform; then the plurality of time-frequency graphs are sent to a convolutional neural network in a parallel multi-channel mode, so that the receptive field of each neuron can observe multi-scale information of the music signal simultaneously, thereby enhancing the neurons' ability to extract and discriminate time-frequency graph features and improving the overall performance of singing detection.
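The parallel multi-channel input described in the abstract can be sketched as follows: the time-frequency matrices are stacked along a channel axis, and each convolution kernel spans all channels, so every output value is computed from all scales at once. This is only an illustrative sketch in plain NumPy; the layer sizes, the ReLU activation, the random stand-in data, and the function name `conv2d_multichannel` are assumptions, not details from the patent.

```python
import numpy as np

def conv2d_multichannel(x, kernels, bias):
    """Valid 2-D cross-correlation followed by ReLU.

    x:       (C, H, W)       stacked time-frequency matrices, one per scale
    kernels: (K, C, kh, kw)  each of the K kernels spans all C channels
    bias:    (K,)
    """
    c, h, w = x.shape
    k, kc, kh, kw = kernels.shape
    if kc != c:
        raise ValueError("kernel channel count must match input channels")
    out = np.empty((k, h - kh + 1, w - kw + 1))
    for i in range(out.shape[1]):
        for j in range(out.shape[2]):
            # The receptive field covers the same region in every scale.
            patch = x[:, i:i + kh, j:j + kw]
            out[:, i, j] = np.tensordot(kernels, patch, axes=3) + bias
    return np.maximum(out, 0.0)

rng = np.random.default_rng(0)
# Random stand-ins for three equally shaped time-frequency matrices.
scales = [rng.random((8, 10)) for _ in range(3)]
x = np.stack(scales, axis=0)                  # shape (3, 8, 10)
kernels = rng.standard_normal((4, 3, 3, 3))   # 4 kernels over 3 channels
bias = np.zeros(4)
features = conv2d_multichannel(x, kernels, bias)
print(features.shape)  # (4, 6, 8)
```

In a deep learning framework, the same idea corresponds to setting the first convolution layer's input channel count to the number of window lengths.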

Description

Technical field

[0001] The invention relates to the technical field of music artificial intelligence, in particular to a singing voice detection method based on multi-scale time-frequency graphs input in parallel to a convolutional neural network.

Background technique

[0002] The background of singing voice detection is described in the applicant's earlier work: a singing voice detection method based on a squeeze-and-excitation residual network (application number: CN202010164594.5) and a singing voice detection method based on a dot-product self-attention convolutional neural network (patent number: ZL202110192300.4). Singing Voice Detection (SVD) is the process of judging whether each small segment of audio in digital music contains singing voice; its detection granularity is generally between 50 and 200 milliseconds. Singing detection is an important basic task in the field of Music Information Retrieval (MIR). Many other research directions, such as singer re...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06K9/00, G06K9/62, G06N3/04, G06N3/08
CPC: G06N3/08, G10L25/30, G06N3/045, G06F18/214
Inventor: 桂文明
Owner: JINLING INST OF TECH