Efficient method for detecting vocal starting position in song

A vocal starting position detection method, applied in speech analysis, instruments, etc., that addresses problems such as the weak effect of multi-feature combination, blurred vocal features, and long training time.

Active Publication Date: 2019-03-01

UNIV OF ELECTRONICS SCI & TECH OF CHINA

6 Cites, 3 Cited by


AI Technical Summary

Problems solved by technology

According to the preceding analysis, the instruments in a song interfere with the human voice, so many common vocal features become blurred or even invalid. Combining multiple features brings little improvement, not enough to offset the computational cost of introducing the extra features. As for classifiers, the differences in performance among them are not very pronounced.

In addition, the ANN method, which gives relatively good results, still has drawbacks such as long training time and the large number of samples it requires.

In short, in the absence of an effective feature representation for the instrument-voice mixture, the accuracy of human voice detection is currently below 90%, which makes it difficult for the estimate of the vocal starting point to be accurate enough to meet practical requirements.



Embodiment Construction

[0071] The present invention will be described in further detail below in conjunction with the accompanying drawings and embodiments.

[0072] This embodiment provides an efficient method for detecting the starting position of vocals in a song. Its flow chart is shown in Figure 1 and comprises two stages, training and recognition. In this embodiment, 120 songs are used in the simulation experiment: the first 100 songs are training audio and the last 20 songs are detection audio. Each training audio is preprocessed as follows: 1) cut the audio and keep only the front part, the retained interval being 10 seconds after the start of the audio to the starting position of the human voice; 2) mark the moment of the starting position of the human voice.

[0073] The concrete steps of the method for detecting the vocal starting position in a song in this embodiment are as follows:
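The data preparation described in [0072] can be illustrated with a minimal sketch. It is not the patent's implementation: the function names `split_songs` and `label_frames`, the song identifiers, and the idea of turning the marked onset moment into per-frame labels (0 for instrument-only, 1 for vocal or mixed) are assumptions made for illustration.

```python
from typing import List, Sequence, Tuple


def split_songs(song_ids: Sequence[str], n_train: int = 100) -> Tuple[List[str], List[str]]:
    """Split an ordered song list into training audio (first n_train)
    and detection audio (the rest), as in the 100/20 split of the embodiment."""
    return list(song_ids[:n_train]), list(song_ids[n_train:])


def label_frames(frame_start_times: Sequence[float], onset_sec: float) -> List[int]:
    """Hypothetical labelling rule: a frame starting at or after the marked
    vocal onset is labelled 1 (vocal or mixed), otherwise 0 (instrument only)."""
    return [1 if t >= onset_sec else 0 for t in frame_start_times]


# Hypothetical identifiers for the 120 songs of the simulation experiment.
songs = [f"song_{i:03d}" for i in range(1, 121)]
train_songs, detect_songs = split_songs(songs)

# Example: frames starting every 0.5 s, vocal onset marked at 1.0 s.
example_labels = label_frames([0.0, 0.5, 1.0, 1.5], onset_sec=1.0)
```

Here `train_songs` holds the first 100 identifiers and `detect_songs` the last 20. How the actual method derives training targets from the marked onset is not stated in the visible text, so the labelling rule should be read as one plausible choice only.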

[0074] ·Training stage:

[0075] S1. Read the training...


Abstract

The invention belongs to the technical field of digital audio processing and relates to the problem of human voice detection, specifically to a method for estimating the vocal starting position in a song. Before feature extraction, the method applies instrument sound suppression for orchestral and percussion instruments. In feature extraction, the audio is framed with a long window and high overlap, and audio features suited to instrument sound suppression processing are designed, so that the audio features of the initial sound production stage are effectively captured. By learning from singing-start segments, the song is divided into two classes, instrument sound and vocal sound (or mixed instrument and vocal sound), and the vocal starting position is accurately estimated; the method has good fault tolerance to vocal/instrument misjudgments. The algorithm is simple and fast, and can be widely applied to broadcast automation at radio stations and to digital media management.
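Two ideas from the abstract, framing with a long window and high overlap, and tolerating occasional vocal/instrument misjudgments when locating the onset, can be sketched as follows. This is an illustrative sketch under stated assumptions, not the patented algorithm: the window and hop sizes, the majority-style rule in `estimate_onset_frame`, and all function names are hypothetical.

```python
from typing import List, Optional, Sequence


def frame_signal(samples: Sequence[float], win_len: int, hop_len: int) -> List[List[float]]:
    """Cut a signal into overlapping frames. A hop much smaller than the
    window gives the 'long window, high overlap' framing."""
    if win_len <= 0 or hop_len <= 0:
        raise ValueError("win_len and hop_len must be positive")
    frames = []
    start = 0
    while start + win_len <= len(samples):
        frames.append(list(samples[start:start + win_len]))
        start += hop_len
    return frames


def estimate_onset_frame(decisions: Sequence[int], run_len: int = 5,
                         min_vocal: int = 4) -> Optional[int]:
    """Assumed fault-tolerant rule: return the first frame that is judged
    vocal (1) and is followed by enough vocal frames that at least
    min_vocal of the run_len frames starting there are vocal.
    Isolated false alarms are skipped. Returns None if no such frame exists."""
    for i in range(len(decisions) - run_len + 1):
        if decisions[i] == 1 and sum(decisions[i:i + run_len]) >= min_vocal:
            return i
    return None


def frame_to_seconds(frame_index: int, hop_len: int, sample_rate: int) -> float:
    """Convert a frame index to the start time of that frame in seconds."""
    return frame_index * hop_len / sample_rate


# Toy signal: 10 samples, window of 4, hop of 1 (75% overlap).
toy_frames = frame_signal(list(range(10)), win_len=4, hop_len=1)

# Per-frame decisions with one false alarm (index 2) and one miss (index 8).
decisions = [0, 0, 1, 0, 0, 0, 1, 1, 0, 1, 1, 1]
onset_index = estimate_onset_frame(decisions)
```

In this toy run the isolated vocal decision at index 2 is ignored and the onset is placed at index 6, even though frame 8 is misjudged as instrument. A time stamp would then follow from `frame_to_seconds` with whatever hop length and sample rate the real system uses.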

Description

Technical field

[0001] The invention belongs to the technical field of digital audio processing and relates to the problem of human voice detection, in particular to a method for estimating the starting position of the human voice in a song, which can be applied to marking the real-time position of the human voice in broadcast audio.

Background technique

[0002] A song usually consists of pure accompaniment and singing. The pure accompaniment is produced purely by accompaniment instruments (orchestral and percussion instruments) without human voice, while the singing part is the superposition of human voice and accompaniment music. In current digital media data management, it is often necessary to mark the starting position (starting point) of the human voice in a song. The starting point information of the human voice has many uses. For example, in the live program of the radio station, the starting position of the human voice can help the host control the speaking time, s...


Application Information


IPC(8): G10L19/02, G10L25/45, G10L25/69

CPC: G10L19/02, G10L25/45, G10L25/69

Inventor: 甘涛, 甘云强, 何艳敏

Owner: UNIV OF ELECTRONICS SCI & TECH OF CHINA
