Voice retrieval device and voice retrieval method

A sound and sound signal technology, applied in the field of sound retrieval devices, can solve problems such as poor retrieval accuracy

Active Publication Date: 2019-03-08

CASIO COMPUTER CO LTD

5 Cites, 0 Cited by


AI Technical Summary

Problems solved by technology

[0006] The technique disclosed in Non-Patent Document 1 has the problem that search accuracy deteriorates when the speech rate of the voice being searched differs from that of the person who input the query.



Examples


Embodiment 1

[0027] As shown in Figure 1, the voice search device 100 of Embodiment 1 physically includes a ROM (Read Only Memory) 1, a RAM (Random Access Memory) 2, an external storage device 3, an input device 4, an output device 5, a CPU (Central Processing Unit) 6, and a bus 7.

[0028] The ROM 1 stores a voice search program. The RAM 2 is used as a work area for the CPU 6.

[0029] The external storage device 3 is constituted by, for example, a hard disk, and stores as data the audio signal to be searched, a monophone model, a triphone model, and the phoneme time lengths described later.

[0030] The input device 4 is composed of, for example, a keyboard and a voice recognition device. The input device 4 supplies the search word input by the user to the CPU 6 as text data. The output device 5 includes, for example, a screen such as a liquid crystal display, a speaker, and the like. The output device 5 displays text data output by the CPU 6...

Embodiment 2

[0102] Embodiment 1 described the case where the speech rate is assumed to be fixed and only one piece of speech rate information is set, so only one speech rate can be handled. In actual speech, however, the same word is not always pronounced at the same speed. For example, the word "カテゴリ" ("category") may be uttered at an average speed, or it may be uttered slowly for emphasis. To cope with this, Embodiment 2 derives a plurality of utterance time lengths by using a plurality of pieces of speech rate information. Embodiment 2 describes a case in which three kinds of speech rate information (change rates of the duration length) are used: 0.7 (fast), 1.0 (normal), and 1.4 (slow).
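The derivation of several utterance time lengths from the three change rates in [0102] can be sketched as follows. This is a minimal illustration, not the patent's implementation: the per-phoneme frame counts and the function name `utterance_lengths` are hypothetical, and the sketch scales the summed duration, which is equivalent to scaling every phoneme uniformly before rounding.

```python
# Hypothetical average phoneme durations in frames (illustrative values,
# not taken from the patent).
PHONEME_FRAMES = {"k": 6, "a": 10, "t": 5, "e": 9, "g": 6, "o": 10, "r": 4, "i": 8}

# Change rates of the duration length stated in [0102].
SPEECH_RATES = {"fast": 0.7, "normal": 1.0, "slow": 1.4}


def utterance_lengths(phonemes, durations=PHONEME_FRAMES, rates=SPEECH_RATES):
    """Return one utterance time length (in frames) per speech rate."""
    base = sum(durations[p] for p in phonemes)
    return {name: round(base * rate) for name, rate in rates.items()}


# Phoneme string assumed here for the query "カテゴリ".
query = ["k", "a", "t", "e", "g", "o", "r", "i"]
lengths = utterance_lengths(query)
print(lengths)
```

Each resulting length would then define its own set of candidate segments in the audio signal, so a slow or fast utterance of the same word can still be matched.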

[0103] The voice search device of Embodiment 2 has the same physical configuration as the voice search device 100 of Embodiment 1, as shown in Figure 1. In addition, regarding the functional structure and Figure 2 The stru...

Modified example 1

[0131] Embodiments 1 and 2 described the case where the voice search device 100 uniformly multiplies the duration of each state of a phoneme by the change rate. However, the present invention is not limited to this. For example, the rate of change may be varied for each state of a phoneme.

[0132] A case where the rate of change is varied for each state of a phoneme will be described using Figure 12. Let α1 be the rate of change for duration T1 of state 1 of the phoneme, α2 be the rate of change for duration T2 of state 2, and α3 be the rate of change for duration T3 of state 3.

[0133] In this modified example, when the length of duration is extended, the rate of change in state 1 is set to 1.3, the rate of change in state 2 is set to 1.6, and the rate of change in state 3 is set to 1.3 for vowels. Regarding consonants, the rate of change in state 1 was set to 1.1, the rate of change in state 2 was set to 1.2, and the rate of c...
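As a rough sketch of the per-state scaling in [0132] and [0133], the snippet below multiplies each state duration T1 to T3 by its own rate α1 to α3. Only the vowel rates are used, because the consonant rate for state 3 is cut off in the text above. The state durations and the helper name `stretch_states` are assumptions made for illustration.

```python
# Per-state change rates for vowels when the duration is extended ([0133]).
VOWEL_RATES = (1.3, 1.6, 1.3)


def stretch_states(state_durations, rates):
    """Multiply each state's duration T_i by its own change rate alpha_i."""
    if len(state_durations) != len(rates):
        raise ValueError("need exactly one change rate per state")
    return [t * a for t, a in zip(state_durations, rates)]


# Hypothetical durations T1, T2, T3 (in frames) of one vowel's three states.
vowel_states = [10.0, 10.0, 10.0]
stretched = stretch_states(vowel_states, VOWEL_RATES)
total = sum(stretched)
print(stretched, total)
```

With these rates the middle state is lengthened more than the first and last states.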


Abstract

A voice retrieval apparatus executes processes of: obtaining, from a time length memory, a continuous time length for each phoneme contained in a phoneme string of a retrieval string; obtaining user-specified information on an utterance rate; changing the continuous time length for each obtained phoneme in accordance with the obtained information; deriving, based on the changed continuous time length, an utterance time length of voices corresponding to the retrieval string; specifying a plurality of likelihood obtainment segments of the derived utterance time length in a time length of a retrieval sound signal; obtaining a likelihood showing a plausibility that the specified likelihood obtainment segment is a segment where the voices are uttered; and identifying, based on the obtained likelihood, an estimation segment where, within the retrieval sound signal, utterance of the voices is estimated, the estimation segment being identified for each specified likelihood obtainment segment.
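The segment-specification and likelihood steps of the abstract can be sketched as below. This is a simplified illustration under stated assumptions: the abstract does not say how candidate segments are spaced or how the likelihood is computed, so the `shift` parameter, the function names, and the toy scoring function are all hypothetical stand-ins.

```python
def candidate_segments(signal_frames, utterance_frames, shift=1):
    """List (start, end) frame ranges that have the derived utterance length."""
    last_start = signal_frames - utterance_frames
    return [(s, s + utterance_frames) for s in range(0, last_start + 1, shift)]


def rank_segments(segments, likelihood, top_n=3):
    """Score every segment and return the top_n by likelihood, best first."""
    scored = [(likelihood(start, end), start, end) for start, end in segments]
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored[:top_n]


def toy_likelihood(start, end):
    """Toy stand-in for an acoustic score; highest when start is frame 20."""
    return -abs(start - 20)


segments = candidate_segments(signal_frames=100, utterance_frames=58, shift=5)
best = rank_segments(segments, toy_likelihood, top_n=1)
print(best)
```

In a real system the scoring function would compare the audio in each segment against the phoneme string of the query; here it only demonstrates the control flow.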

Description

[0001] This application claims priority based on Japanese Patent Application No. 2014-259418 filed on December 22, 2014, and the contents of the basic application are incorporated in this application as a reference.

Technical field

[0002] The invention relates to a voice retrieval device and a voice retrieval method.

Background technique

[0003] With the expansion and popularization of multimedia content such as audio and video, high-precision multimedia retrieval technology is required. Among them, a technique of voice retrieval is being studied, which specifies the position where a voice corresponding to a search term (query) set as a search target is emitted from a voice signal.

[0004] In voice retrieval, there is no established retrieval method that has sufficient performance compared with character retrieval using image recognition. Therefore, techniques for realizing voice retrieval with sufficient performance have been intensively studied.

[0005] For example,...


Application Information


Patent Type & Authority: Patents (China)

IPC(8): G06F16/63, G10L25/54

CPC: G06F16/60, G06F16/367, G06F16/683, G10L2015/025

Inventor: 富田宽基

Owner: CASIO COMPUTER CO LTD
