Efficient method for detecting vocal starting position in song

A vocal starting position detection technology, applied in speech analysis, instruments, and related fields, that addresses problems such as the limited benefit of multi-feature combination, blurred vocal features, and long training time.

Active Publication Date: 2019-03-01
UNIV OF ELECTRONICS SCI & TECH OF CHINA

AI Technical Summary

Problems solved by technology

According to the preceding analysis, the musical instruments in a song interfere with the human voice, so many common vocal features become blurred or even invalid. Combining multiple features yields too little improvement to justify the added computational cost. In terms of classifiers, the difference in effect between the respective classifiers is not very obvio...



Examples


Embodiment Construction

[0071] The present invention will be described in further detail below in conjunction with the accompanying drawings and embodiments.

[0072] This embodiment provides an efficient method for detecting the starting position of vocals in a song. Its flow chart is shown in Figure 1, and it comprises two stages: training and recognition. The simulation experiment in this embodiment uses 120 songs: the first 100 songs are training audio and the last 20 songs are detection audio. Each training audio is preprocessed as follows: 1) the audio is cut so that only the front part is kept, with the retained interval running from the start of the audio to 10 seconds after the starting position of the human voice; 2) the moment of the starting position of the human voice is marked.
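
The preprocessing described in this paragraph can be sketched in a few lines of Python. This is only an illustration, not the patent's implementation: the function names `split_dataset` and `trim_training_audio`, the song identifiers, and the sample rate are hypothetical, and the sketch assumes the retained interval ends 10 seconds after the marked vocal start.

```python
def split_dataset(songs, n_train=100):
    """Split an ordered song list: the first n_train are training audio,
    the remainder are detection audio."""
    return songs[:n_train], songs[n_train:]


def trim_training_audio(samples, sample_rate, vocal_start_s, tail_s=10.0):
    """Keep only the front part of a training audio.

    Assumption: the retained interval runs from the start of the audio
    to tail_s seconds after the marked vocal starting position.
    """
    if vocal_start_s < 0:
        raise ValueError("vocal_start_s must be non-negative")
    end = int(round((vocal_start_s + tail_s) * sample_rate))
    # Clip to the audio length in case the song is shorter than the interval.
    return samples[:min(end, len(samples))]


# Hypothetical identifiers for the 120 songs used in the experiment.
songs = [f"song_{i:03d}" for i in range(1, 121)]
train_songs, detect_songs = split_dataset(songs)

# Toy signal: 100 samples at 4 samples per second, vocal marked at 2.5 s.
toy = list(range(100))
front = trim_training_audio(toy, sample_rate=4, vocal_start_s=2.5)
```

The marked vocal start (`vocal_start_s`) stays attached to each trimmed clip, since it is the label the training stage learns from.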

[0073] The concrete steps of the method for detecting the vocal starting position in a song in this embodiment are as follows:

[0074] ·Training stage:

[0075] S1. Read the training...


Abstract

The invention belongs to the technical field of digital audio processing and relates to human voice detection, specifically to a method for estimating the vocal starting position in a song. Before feature extraction, the method applies instrument sound suppression targeted at orchestral and percussion instruments. During feature extraction, the audio is framed with a long window and high overlap, and audio features suited to the instrument-suppressed signal are designed, so that the audio features of the initial stage of vocal production are captured effectively. By learning from segments around the singing starting point, the song is divided into two classes, instrument sound and vocal sound (or mixed instrument and vocal sound), and the vocal starting position is estimated accurately, with good tolerance to vocal/instrument misjudgments. The algorithm is simple and fast, and can be widely applied to broadcast automation at radio stations and to digital media management.
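
Two ideas from the abstract, long-window framing with high overlap and tolerance to individual vocal/instrument misjudgments, can be illustrated with a minimal Python sketch. Everything specific here is an assumption: the source does not state the window length, the hop size, or the decision rule, so `frame_signal`, `estimate_vocal_start`, `run_len`, and `min_vocal` are hypothetical, and the voting rule is a generic stand-in rather than the patent's method.

```python
def frame_signal(samples, frame_len, hop_len):
    """Split a signal into overlapping frames.

    A hop much smaller than the frame length gives high overlap.
    A trailing partial frame is dropped.
    """
    if not 0 < hop_len <= frame_len:
        raise ValueError("require 0 < hop_len <= frame_len")
    last_start = len(samples) - frame_len
    return [samples[s:s + frame_len] for s in range(0, last_start + 1, hop_len)]


def estimate_vocal_start(labels, hop_s, run_len=5, min_vocal=4):
    """Estimate the vocal start time from per-frame labels (1 = vocal).

    Hypothetical rule: return the time of the first vocal frame that
    opens a window of run_len frames containing at least min_vocal
    vocal labels, so isolated misclassified frames are ignored.
    Returns None when no such frame exists.
    """
    for i in range(len(labels) - run_len + 1):
        if labels[i] == 1 and sum(labels[i:i + run_len]) >= min_vocal:
            return i * hop_s
    return None


# Toy signal: frame length 4 with hop 1, i.e. 75% overlap.
frames = frame_signal(list(range(10)), frame_len=4, hop_len=1)

# Frame 2 is an isolated false "vocal"; frame 8 is a false "instrument".
labels = [0, 0, 1, 0, 0, 0, 1, 1, 0, 1, 1, 1]
start_s = estimate_vocal_start(labels, hop_s=0.25)
```

In this toy run the isolated vocal label at frame 2 is rejected and the estimate falls on frame 6, despite the misjudged frame 8 inside the voting window.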

Description

Technical field

[0001] The invention belongs to the technical field of digital audio processing and relates to the problem of human voice detection, in particular to a method for estimating the starting position of the human voice in a song, which can be applied to marking the real-time position of the human voice in broadcast audio.

Background technique

[0002] A song usually consists of pure accompaniment and singing. The pure accompaniment is produced purely by accompaniment instruments (orchestral and percussion instruments) without human voice, while the singing part is the superposition of human voice and accompaniment music. In current digital media data management, it is often necessary to mark the starting position (starting point) of the human voice in a song. The starting point information of the human voice has many uses. For example, in the live program of the radio station, the starting position of the human voice can help the host control the speaking time, s...


Application Information

IPC(8): G10L19/02, G10L25/45, G10L25/69
CPC: G10L19/02, G10L25/45, G10L25/69
Inventors: 甘涛, 甘云强, 何艳敏
Owner: UNIV OF ELECTRONICS SCI & TECH OF CHINA