Self-supervised pitch estimation

A technique for estimating the pitch of input audio, applied in the field of self-supervised pitch estimation

Pending Publication Date: 2022-05-27
GOOGLE LLC

AI Technical Summary

Problems solved by technology

Various supervised machine learning techniques can be applied to train the encoder to perform pitch prediction

Embodiment Construction

[0022] Example methods, devices, and systems are described herein. It should be understood that the words "example" and "exemplary" are used herein to mean "serving as an example, instance, or illustration." Any embodiment or feature described herein as an "example" or "exemplary" is not necessarily to be construed as preferred or advantageous over other embodiments or features. Other embodiments may be utilized, and other changes may be made, without departing from the scope of the subject matter presented here.

[0023] Example embodiments are provided herein that include self-supervision techniques for training an encoder (or encoders) to predict pitch values from audio samples. To generate training samples according to the example technique, a particular audio sample (e.g., a piece of audio from a collection of audio training data) is used to generate two training samples, one or both of which have been pitch-shifted by a known amount (e.g., an amount randomly selected fro...
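The pair-generation step described in [0023] can be sketched as follows. This is a minimal illustration, not the patent's implementation: it assumes the audio has already been converted to a single log-frequency spectrum frame (so a pitch shift corresponds to moving along the bin axis), and the names `make_shifted_pair`, `slice_len`, and `max_shift` are hypothetical.

```python
import random


def make_shifted_pair(frame, slice_len, max_shift, rng):
    """Cut two windows from one log-frequency frame at random bin offsets.

    Assumption: `frame` is a list of magnitudes on a log-frequency axis,
    so starting a window k bins higher lowers the apparent pitch by k bins.
    Returns (x1, x2, shift), where `shift` is the known pitch of x1 minus
    the known pitch of x2, measured in bins.
    """
    if len(frame) < slice_len + max_shift:
        raise ValueError("frame too short for the requested slice and shift")
    k1 = rng.randint(0, max_shift)  # inclusive on both ends
    k2 = rng.randint(0, max_shift)
    x1 = frame[k1:k1 + slice_len]
    x2 = frame[k2:k2 + slice_len]
    # A peak at bin p lands at p - k1 in x1 and at p - k2 in x2.
    return x1, x2, k2 - k1


# Usage: a synthetic frame with one spectral peak at bin 12.
frame = [0.0] * 32
frame[12] = 1.0
x1, x2, shift = make_shifted_pair(frame, slice_len=24, max_shift=8,
                                  rng=random.Random(0))
```

Because both windows come from the same frame, only the difference in pitch between them is known, which is why no absolute pitch label is needed at this stage.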

Abstract

Example embodiments relate to techniques for training an artificial neural network or other machine learning encoder to accurately predict the pitch of input audio samples in a pitch space that is semitone-scaled or otherwise logarithmically scaled. An example method may include generating two training samples from a sample of audio training data by applying two different pitch shifts to the sample of audio training data. This may be accomplished by converting samples of the audio data to the frequency domain and then shifting the converted data. The known shifts are then compared with the pitches predicted by applying the two training samples to the encoder. The encoder is then updated based on the comparison such that the relative pitch output by the encoder is improved in accuracy. The relative pitch values generated by the trained encoder may then be calibrated using one or more audio samples tagged with absolute pitch values.
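The comparison and calibration steps in the abstract can be illustrated with a small sketch. Several things here are assumptions rather than details from the text: the use of a Huber loss for the comparison, a linear least-squares fit for the calibration, predictions expressed in the same units as the known shift, and the function names `relative_pitch_loss` and `fit_calibration`.

```python
def relative_pitch_loss(pred_a, pred_b, known_shift, delta=1.0):
    """Huber loss on the gap between predicted and known pitch difference.

    Assumption: pred_a, pred_b, and known_shift share the same units.
    """
    err = (pred_a - pred_b) - known_shift
    mag = abs(err)
    if mag <= delta:
        return 0.5 * err * err          # quadratic near zero
    return delta * (mag - 0.5 * delta)  # linear for large errors


def fit_calibration(relative, absolute):
    """Least-squares line mapping relative pitch to absolute pitch.

    Returns (slope, intercept) such that absolute ~ slope * relative + intercept.
    """
    n = len(relative)
    if n < 2 or len(absolute) != n:
        raise ValueError("need at least two matched calibration samples")
    mean_x = sum(relative) / n
    mean_y = sum(absolute) / n
    sxx = sum((x - mean_x) ** 2 for x in relative)
    if sxx == 0:
        raise ValueError("calibration samples must differ in relative pitch")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(relative, absolute))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Usage: three labeled samples (hypothetical values) fix the scale and offset.
slope, intercept = fit_calibration([0.0, 1.0, 2.0], [60.0, 72.0, 84.0])
```

In a full training loop, the loss would be minimized over many sample pairs by updating the encoder's parameters; the calibration fit is run once afterwards on the small labeled set.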

Description

[0001] CROSS-REFERENCE TO RELATED APPLICATIONS

[0002] This application is a continuation of US Patent Application No. 62/923,491, filed October 19, 2019, the contents of which are incorporated herein by reference in their entirety.

Background

[0003] Detecting the pitch of sounds exhibited in audio signals is beneficial in various applications. For example, pitch detection can be used to facilitate automatic and/or assisted translation of recorded music to sheet music. In another example, continuous detection of the pitch of an audio signal may be useful when training a person to sing, play an instrument, identify pitch, or perform some other task.

[0004] Various methods exist for detecting pitch. Many of these methods involve the use of heuristics to detect fundamental frequencies in the audio signal (e.g., by identifying peaks in the audio signal's spectrum). However, such methods may perform poorly in the presence of noise or in the presence of multiple pitc...
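For contrast, the kind of heuristic mentioned in the background (identifying peaks in the signal's spectrum) might look like the sketch below. It illustrates the baseline approach only, not the method of this application; the function names are hypothetical, and a naive discrete Fourier transform is used so the example needs only the standard library.

```python
import cmath
import math


def dft_magnitudes(samples):
    """Magnitudes of the non-negative-frequency DFT bins (naive O(n^2))."""
    n = len(samples)
    return [
        abs(sum(s * cmath.exp(-2j * math.pi * k * t / n)
                for t, s in enumerate(samples)))
        for k in range(n // 2 + 1)
    ]


def peak_pitch_hz(samples, sample_rate):
    """Frequency of the strongest spectral peak, ignoring the DC bin.

    Caveat: the strongest peak is not always the fundamental, which is
    one reason such heuristics can fail on noisy or polyphonic audio.
    """
    mags = dft_magnitudes(samples)
    k = max(range(1, len(mags)), key=mags.__getitem__)
    return k * sample_rate / len(samples)


# Usage: a clean 100 Hz sine sampled at 800 Hz for 200 samples.
tone = [math.sin(2 * math.pi * 100 * t / 800) for t in range(200)]
estimate = peak_pitch_hz(tone, 800)
```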

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G10L25/60, G10L25/30, G10L25/90
CPC: G10L25/60, G10L25/90, G10L25/30, G10L2025/906, G10L25/12, G10L2019/0011, G10L15/063, G10L21/013
Inventors: M. Tagliasacchi, M. Velimirović, M. Sharifi, D. Roblek, C. Frank, B. Gfeller
Owner: GOOGLE LLC