Voice retrieval apparatus, and voice retrieval method
A technology relating to sound and sound signals, applied in the field of voice search devices, that addresses problems such as poor search accuracy.
Examples
Embodiment 1
[0027] As shown in Figure 1, the voice search device 100 of Embodiment 1 physically includes a ROM (Read Only Memory) 1, a RAM (Random Access Memory) 2, an external storage device 3, an input device 4, an output device 5, a CPU (Central Processing Unit) 6, and a bus 7.
[0028] The ROM 1 stores a voice search program. The RAM 2 is used as a work area of the CPU 6.
[0029] The external storage device 3 is constituted by, for example, a hard disk, and stores as data the audio signal to be searched, a monophone model, a triphone model, and phoneme time lengths, all of which are described later.
[0030] The input device 4 is composed of, for example, a keyboard and a voice recognition device. The input device 4 supplies the search word input by the user to the CPU 6 as text data. The output device 5 includes, for example, a screen such as a liquid crystal display, a speaker, and the like. The output device 5 displays text data output by the CPU 6 on a screen,...
Embodiment 2
[0102] Embodiment 1 described the case where the speech rate is assumed to be fixed and only one piece of speech rate information is set, so only a single speech rate can be handled. In actual speech, however, the same word is not always pronounced at the same speed. For example, the word "カテゴリ" ("category") may be uttered at an average speed, or it may be uttered slowly for emphasis. To cope with this, Embodiment 2 derives a plurality of utterance time lengths by using a plurality of pieces of speech rate information. Embodiment 2 describes a case in which three kinds of speech rate information (change rates of the duration length) are used: 0.7 (fast), 1.0 (normal), and 1.4 (slow).
[0103] The voice search device of Embodiment 2 has the same physical configuration as the voice search device 100 of Embodiment 1, as shown in Figure 1. In addition, regarding the functional structure and Figure 2, the stru...
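As a rough illustration of deriving several utterance time lengths from several change rates, the following Python sketch may help. It assumes that the base utterance time length is the sum of per-phoneme average durations; that assumption, the function name `utterance_lengths`, and the sample durations are hypothetical and are not taken from this excerpt. Only the three change rates (0.7, 1.0, 1.4) come from the text.

```python
# Change rates of the duration length named in the text.
SPEECH_RATES = {"fast": 0.7, "normal": 1.0, "slow": 1.4}


def utterance_lengths(phoneme_durations, rates=SPEECH_RATES):
    """Return one candidate utterance time length per speech rate.

    Assumption: the base length is the sum of the average durations of
    the phonemes in the search word (hypothetical simplification).
    """
    base = sum(phoneme_durations)
    return {label: base * rate for label, rate in rates.items()}


# Hypothetical average durations (in frames) for a four-phoneme word.
durations = [10.0, 8.0, 12.0, 10.0]
lengths = utterance_lengths(durations)
for label, length in lengths.items():
    print(f"{label}: {length:.1f} frames")
```

Each resulting length could then be used as a separate candidate duration when scanning the audio signal, so that a slowly or quickly spoken instance of the word still falls within one of the candidates.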
Modified example 1
[0131] The case where the voice search device 100 of Embodiments 1 and 2 uniformly multiplies the duration of each state of a phoneme by the change rate has been described. However, the present invention is not limited thereto. For example, the change rate may be varied for each state of a phoneme, as described below.
[0132] A case where the change rate is varied for each state of a phoneme will be described with reference to Figure 12. Let α1 be the change rate for the duration T1 of state 1 of the phoneme, α2 be the change rate for the duration T2 of state 2, and α3 be the change rate for the duration T3 of state 3.
[0133] In this modified example, when the duration is lengthened, the change rate for vowels is set to 1.3 in state 1, 1.6 in state 2, and 1.3 in state 3. Regarding consonants, the change rate in state 1 was set to 1.1, the change rate in state 2 was set to 1.2, and the rate of c...
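The per-state scaling described here can be sketched in Python as follows. The function name `scale_state_durations` and the sample durations T1 to T3 are hypothetical. Only the vowel change rates (1.3, 1.6, 1.3) are taken from the text; the consonant rates are omitted because the value for state 3 is cut off in this excerpt.

```python
# Change rates (alpha1, alpha2, alpha3) for vowels when lengthening,
# as given in the text.
VOWEL_STATE_RATES = (1.3, 1.6, 1.3)


def scale_state_durations(state_durations, state_rates):
    """Multiply each state's duration T_i by its own change rate alpha_i."""
    if len(state_durations) != len(state_rates):
        raise ValueError("need exactly one change rate per state")
    return [t * a for t, a in zip(state_durations, state_rates)]


# Hypothetical durations T1, T2, T3 (in frames) of one vowel's three states.
vowel_states = [3.0, 4.0, 3.0]
scaled = scale_state_durations(vowel_states, VOWEL_STATE_RATES)
print([round(t, 2) for t in scaled], "total:", round(sum(scaled), 2))
```

With these rates the middle state is stretched more than the first and last states, in contrast to the uniform scaling of Embodiments 1 and 2, where all three states would be multiplied by the same rate.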