Voice tag judgment method and system, storage medium and electronic equipment

A voice tag judgment method and system, applied in speech analysis, speech recognition, audio data retrieval, etc. It addresses the problems of an overly homogeneous vocabulary composition and the resulting label differences, and offers improved integration, a simple processing scheme, and strong operability.

Active Publication Date: 2022-01-18
北京数美时代科技有限公司

AI Technical Summary

Problems solved by technology

However, in the prior art, the composition of the vocabulary is too homogeneous, which leads to discrepancies in the labels assigned.

Examples

Embodiment 1

[0056] Example 1: Sample A, an existing sample of high-frequency vocabulary from some relevant scenarios, is obtained from ASR transcription results, that is, the output of ASR on online data, as shown in Table 1:

[0057] Table 1 Sample A

[0058] There are a few mages on the single field on the field only mmm mmm so real 98k me i just go Little brother, teach me how to do this, I still have to listen to how to sing, I am not very good at it A werewolf card, I am offline, and a gold water card, I have been online all the time, that’s how we chatted. Three wolves

[0059] Through word frequency statistics on Sample A, the following high-frequency words in game-related scenes can be obtained. Traditional word segmentation methods have difficulty segmenting such words from text information alone, whereas ASR actually combines some acoustic features when transcribing. As shown in Table 2, the labeling process is carried out through Table ...
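The word frequency statistics mentioned in [0059] can be illustrated with a minimal sketch. It assumes the ASR transcripts are already split into whitespace-separated tokens, which the source does not state; the function name `high_frequency_words`, the `min_count` threshold, and the sample transcripts are hypothetical and are not taken from the patent.

```python
from collections import Counter


def high_frequency_words(transcripts, min_count=2):
    """Count tokens across ASR transcripts and keep the frequent ones.

    Assumption: each transcript is already tokenized by whitespace.
    Returns (word, count) pairs with count >= min_count, most frequent first.
    """
    counts = Counter()
    for line in transcripts:
        counts.update(line.lower().split())
    return [(word, n) for word, n in counts.most_common() if n >= min_count]


# Hypothetical transcripts, loosely modeled on game-related speech.
sample_a = [
    "werewolf card I am offline",
    "gold water card I have been online",
    "three wolves and one werewolf",
]
top_words = high_frequency_words(sample_a)
```

Words that clear the threshold here would be candidates for the high-frequency vocabulary set; a real pipeline would likely also filter out common function words.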

Abstract

The invention relates to the field of audio recognition, and in particular to a voice tag judgment method and system, a storage medium and electronic equipment. The method comprises the steps of: acquiring open source vocabularies to form an open source vocabulary set; performing word segmentation on the text in the related scene to obtain a word segmentation set; obtaining an audio file, and processing the audio file to obtain a high-frequency vocabulary set; obtaining a preset list, and processing the preset list to obtain a related vocabulary set; taking the union of the open source vocabulary set, the word segmentation set, the high-frequency vocabulary set and the related vocabulary set to obtain a vocabulary list; and tagging the voice content according to the vocabulary list. The method is highly operable and suitable for the cold start stage; it can effectively improve ASR recognition accuracy in the content risk control field as well as downstream NLP classification and tagging results, and it can be quickly applied to related fields.
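The union and tagging steps in the abstract can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: the four input sets are taken as already built, the transcribed text is assumed to be whitespace-tokenized, and the names `build_vocabulary`, `tag_text`, and the `tag_of` word-to-tag mapping are hypothetical.

```python
def build_vocabulary(open_source, segmented, high_freq, related):
    """Union the four vocabulary sets named in the abstract into one vocabulary."""
    return set(open_source) | set(segmented) | set(high_freq) | set(related)


def tag_text(text, vocabulary, tag_of):
    """Return the sorted tags of vocabulary words found in transcribed text.

    Assumptions: `text` is whitespace-tokenized ASR output, and `tag_of`
    is a hypothetical mapping from a vocabulary word to its tag.
    """
    tokens = text.lower().split()
    return sorted({tag_of[t] for t in tokens if t in vocabulary and t in tag_of})


# Hypothetical example sets standing in for the four sources.
vocab = build_vocabulary(
    open_source={"hello", "game"},
    segmented={"werewolf"},
    high_freq={"98k"},
    related={"wolves"},
)
tag_of = {"werewolf": "game", "98k": "game"}
tags = tag_text("I picked up a 98k", vocab, tag_of)
```

Using plain sets makes the union step deduplicate words that appear in more than one source, so a term contributed by both the open source set and the preset list is stored once.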

Description

Technical field

[0001] The present invention relates to the field of audio recognition, in particular to a voice tag judgment method and system, a storage medium and electronic equipment.

Background technique

[0002] In recent years, with the rapid development of the Internet and the rise of short video and live broadcasting, multimedia data has grown explosively, and voice content plays an increasingly important role in people's life, communication, and entertainment. The considerable content risks lurking within this huge volume of voice content have also attracted growing attention from the government and the public.

[0003] At this stage, content review of voice content mainly adopts an ASR+NLP solution: the audio content is transcribed into text through ASR, and the corresponding risk label is then assigned to the text using NLP and a list. The vocabulary, as the basis of both ASR and NLP, plays a crucial role, not only directly related to...

Application Information

IPC(8): G10L15/26, G10L15/08, G10L25/51, G06F40/216, G06F16/683, G06F16/65
CPC: G10L15/26, G10L15/08, G10L25/51, G06F40/216, G06F16/685, G06F16/65
Inventor: 邵历, 齐路, 唐会军, 梁堃
Owner 北京数美时代科技有限公司