Singing detection method based on multi-scale time-frequency graph parallel input convolutional neural network

The technology applies convolutional neural networks and time-frequency graphs in the fields of neural learning methods, biological neural network models, and neural architectures. It addresses the problems that detection performance still needs improvement, that computing resource consumption is large, and that training time is long, and it thereby improves detection accuracy and overall performance.

Inactive Publication Date: 2021-11-09

JINLING INST OF TECH



Problems solved by technology

The applicant's earlier invention implicitly extracts singing features at different levels through a deep residual network and uses the adaptive attention of embedded squeeze-and-excitation modules to judge the importance of these features. On the JMD dataset, with network depths of 14, 18, 34, 50, 101, 512, and 200, the average detection accuracy is 88.19, so the effect still needs to be improved.

In addition, the network-stacking approach of that invention consumes a large amount of computing resources and increases training time.


Embodiment Construction

[0069] The present invention will be further explained below in conjunction with the accompanying drawings and specific embodiments. It should be understood that the following specific embodiments are only used to illustrate the present invention and are not intended to limit its scope. It should be noted that the words "front", "rear", "left", "right", "upper" and "lower" used in the following description refer to directions in the drawings, and the words "inner" and "outer" refer to directions toward or away from the geometric center of a particular part, respectively.

[0070] As a specific embodiment, the present invention provides a singing voice detection method based on multi-scale time-frequency graphs input in parallel to a convolutional neural network. The specific steps are as follows:

[0071] Step 1: Perform a short-time Fourier transform on a single music file with different window lengths w_i, i ∈ [1..n], to get time-...
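Step 1 can be sketched in Python with NumPy. This is a minimal illustration under stated assumptions, not the patent's implementation: the function names, the Hann window, the shared hop size, and the zero-padding of every window to a common FFT length (so that all graphs share one shape) are assumptions, because the source text is truncated before it specifies these details.

```python
import numpy as np

def stft_magnitude(signal, win_len, hop, n_fft):
    """Magnitude STFT using a Hann window of length win_len (assumed window type)."""
    window = np.hanning(win_len)
    # Zero-pad the tail so every window length yields the same number of frames.
    padded = np.pad(signal, (0, n_fft))
    n_frames = 1 + len(signal) // hop
    frames = np.stack([padded[t * hop: t * hop + win_len] * window
                       for t in range(n_frames)])
    # rfft zero-pads each frame to n_fft, giving n_fft // 2 + 1 frequency bins.
    return np.abs(np.fft.rfft(frames, n=n_fft, axis=1)).T  # shape: (freq, time)

def multi_scale_spectrograms(signal, win_lens, hop):
    """One time-frequency graph per window length w_i, all with identical shape."""
    n_fft = max(win_lens)  # assumption: common FFT size equals the longest window
    return [stft_magnitude(signal, w, hop, n_fft) for w in win_lens]

# Hypothetical input: one second of a 440 Hz tone sampled at 16 kHz.
sr = 16000
tone = np.sin(2 * np.pi * 440 * np.arange(sr) / sr)
graphs = multi_scale_spectrograms(tone, win_lens=[512, 1024, 2048], hop=256)
for w, g in zip([512, 1024, 2048], graphs):
    print(w, g.shape)
```

Short windows give finer time resolution and long windows give finer frequency resolution, which is the multi-scale trade-off that the different w_i are meant to cover.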


Abstract

The invention discloses a singing detection method based on a multi-scale time-frequency graph parallel input convolutional neural network. In a typical singing detection algorithm based on a convolutional neural network, the network input layer receives a single two-dimensional time-frequency graph matrix. The method comprises the following steps: first, according to the multi-scale characteristic of a music signal, a plurality of two-dimensional time-frequency graph matrices with different scales are generated by adjusting the window length of the short-time Fourier transform; then the plurality of time-frequency graphs are sent to a convolutional neural network in a parallel multi-channel mode, so that a neuron's receptive field can simultaneously observe multi-scale information of the music signal. This enhances the neurons' ability to extract and discriminate time-frequency graph features and improves the overall performance of singing detection.
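The parallel multi-channel input described in the abstract can be illustrated with a small NumPy sketch. It assumes that the n time-frequency graphs already share one shape and are stacked along a channel axis; the function name, the 3x3 kernel size, and the random data are hypothetical and stand in for a real first convolutional layer.

```python
import numpy as np

def conv2d_multichannel(x, kernel):
    """Valid 2-D cross-correlation of x (C, F, T) with one kernel (C, kF, kT)."""
    c, f, t = x.shape
    kc, kf, kt = kernel.shape
    assert c == kc, "kernel must span every input channel"
    out = np.zeros((f - kf + 1, t - kt + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            # One output neuron sees the same time-frequency patch at every scale.
            out[i, j] = np.sum(x[:, i:i + kf, j:j + kt] * kernel)
    return out

rng = np.random.default_rng(0)
# Hypothetical graphs: three scales, each with 40 frequency bins and 30 frames.
scales = [rng.random((40, 30)) for _ in range(3)]
x = np.stack(scales)                      # parallel input, shape (3, 40, 30)
kernel = rng.standard_normal((3, 3, 3))   # one filter covering all 3 channels
y = conv2d_multichannel(x, kernel)
print(x.shape, y.shape)
```

Because the kernel spans all channels, each output value combines evidence from every window length at the same time-frequency location, which is the sense in which the receptive field observes multi-scale information simultaneously.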

Description

Technical field

[0001] The invention relates to the technical field of music artificial intelligence, and in particular to a singing voice detection method based on multi-scale time-frequency graphs input in parallel to a convolutional neural network.

Background technique

[0002] The background of singing voice detection has been described in the applicant's earlier filings: a singing voice detection method based on a squeeze-and-excitation residual network (application number: CN202010164594.5) and a singing voice detection method based on a dot-product self-attention convolutional neural network (patent number: ZL202110192300.4). Singing Voice Detection (SVD) is the process of judging whether each short segment of audio in digital music contains singing voice; its detection precision is generally between 50 and 200 milliseconds. Singing detection is an important basic task in the field of Music Information Retrieval (MIR). Many other research directions, such as singer re...
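The frame-level nature of the task can be made concrete with a short sketch that turns annotated singing segments into one binary label per frame. The annotation format (start and end times in seconds), the 100 ms frame length (chosen from within the 50 to 200 ms range mentioned above), and the rule that a frame counts as singing when its center falls inside a segment are all assumptions made for illustration.

```python
import numpy as np

def frame_labels(segments, duration, frame_sec=0.1):
    """Return one 0/1 label per frame: 1 if the frame center lies in a singing segment."""
    n_frames = int(round(duration / frame_sec))
    centers = (np.arange(n_frames) + 0.5) * frame_sec
    labels = np.zeros(n_frames, dtype=int)
    for start, end in segments:
        labels[(centers >= start) & (centers < end)] = 1
    return labels

# Hypothetical annotation: singing from 0.5 s to 1.2 s and from 2.0 s to 2.4 s.
labels = frame_labels([(0.5, 1.2), (2.0, 2.4)], duration=3.0)
print(labels)

# Frame-level accuracy of a hypothetical prediction that is always "no singing".
prediction = np.zeros_like(labels)
accuracy = float(np.mean(prediction == labels))
print(accuracy)
```

A detector's output is compared against such per-frame labels, so accuracy figures for singing detection are fractions of correctly classified frames.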


Application Information


Patent Type & Authority: Applications (China)

IPC(8): G06K9/00, G06K9/62, G06N3/04, G06N3/08

CPC: G06N3/08, G10L25/30, G06N3/045, G06F18/214

Inventor: 桂文明

Owner: JINLING INST OF TECH
