Alignment method and apparatus for parallel spoken language materials
A technology for spoken language corpora, applied in the field of information processing, that addresses the unsatisfactory results obtained when alignment methods designed for written language are applied to spoken language
Status: Inactive. Publication Date: 2009-06-24
KK TOSHIBA
AI Technical Summary
Problems solved by technology
[00011] Spoken language differs from written language, which has complete structure. In speech machine translation, therefore, even an alignment method that aligns structurally complete written language well does not achieve satisfactory results when applied to spoken language.
Abstract
The invention provides a method and a device for aligning parallel spoken language material, and a speech machine translation method and system that respectively adopt this alignment method and device. The alignment method comprises the following steps: obtaining, from the parallel spoken language material, a word alignment set based on a statistical method and a dictionary; performing phrase alignment on the parallel spoken language material using the word alignment set based on the statistical method and the dictionary, so as to obtain a phrase alignment set; and performing word alignment within the aligned phrases of the parallel spoken language material, so as to obtain a word alignment set based on the phrase alignment. The invention uses the high-accuracy word alignment set, obtained from the parallel spoken language material in a corpus by means of the statistical method and the dictionary, to perform phrase alignment and then further word alignment on that material. The resulting phrase alignment set and word alignment set are applied in speech machine translation, thereby reducing the ambiguity of spoken language alignment through utilizing the word completeness.
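The three steps in the abstract can be illustrated with a minimal sketch. It is not the patented procedure, whose details are not given in this excerpt. All function names, the toy sentence pair, and the toy dictionary are hypothetical. Step 1 assumes that "high accuracy" means keeping only statistical links confirmed by the dictionary; step 2 borrows the consistency criterion commonly used for phrase extraction in phrase-based statistical machine translation; step 3 uses a simple assumed heuristic that links the leftover words inside an aligned phrase pair when exactly one word on each side is still unaligned.

```python
from typing import Dict, List, Set, Tuple

Link = Tuple[int, int]            # (source word index, target word index)
Span = Tuple[int, int, int, int]  # (src_start, src_end, tgt_start, tgt_end), inclusive


def anchor_links(src: List[str], tgt: List[str],
                 statistical_links: Set[Link],
                 dictionary: Dict[str, Set[str]]) -> Set[Link]:
    """Step 1 (assumed): keep statistical links that the dictionary confirms."""
    return {(i, j) for (i, j) in statistical_links
            if tgt[j] in dictionary.get(src[i], set())}


def phrase_pairs(src_len: int, links: Set[Link], max_len: int = 4) -> List[Span]:
    """Step 2 (assumed): phrase pairs consistent with the anchor links."""
    pairs: List[Span] = []
    for s1 in range(src_len):
        for s2 in range(s1, min(s1 + max_len, src_len)):
            tgt_pos = [j for (i, j) in links if s1 <= i <= s2]
            if not tgt_pos:
                continue  # no anchor inside this source span
            t1, t2 = min(tgt_pos), max(tgt_pos)
            if t2 - t1 >= max_len:
                continue  # target span too long
            # Consistent: every link touching the target span
            # must start inside the source span.
            if all(s1 <= i <= s2 for (i, j) in links if t1 <= j <= t2):
                pairs.append((s1, s2, t1, t2))
    return pairs


def align_within_phrases(links: Set[Link], pairs: List[Span]) -> Set[Link]:
    """Step 3 (assumed): link leftover words inside each phrase pair."""
    result = set(links)
    for s1, s2, t1, t2 in pairs:
        aligned_src = {i for (i, _) in result}
        aligned_tgt = {j for (_, j) in result}
        free_src = [i for i in range(s1, s2 + 1) if i not in aligned_src]
        free_tgt = [j for j in range(t1, t2 + 1) if j not in aligned_tgt]
        if len(free_src) == 1 and len(free_tgt) == 1:
            result.add((free_src[0], free_tgt[0]))
    return result


# Toy sentence pair (hypothetical data).
src = ["want", "two", "tickets"]
tgt = ["quiero", "dos", "boletos"]
statistical_links = {(0, 0), (1, 1), (2, 2), (1, 2)}  # (1, 2) is noise
dictionary = {"want": {"quiero"}, "tickets": {"boletos"}}  # "two" is missing

anchors = anchor_links(src, tgt, statistical_links, dictionary)
pairs = phrase_pairs(len(src), anchors)
final = align_within_phrases(anchors, pairs)
print(sorted(final))  # [(0, 0), (1, 1), (2, 2)]
```

In this toy run the dictionary filter discards both links for "two", including the correct one, and the phrase pair covering the whole sentence lets step 3 recover the link between "two" and "dos". This illustrates how phrase-level context can resolve a word the dictionary does not cover.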
Description
Technical field
[0002] The present invention relates to information processing technology and, in particular, to phrase alignment and word alignment of parallel spoken language corpus.
Background technology
[0004] Machine translation technology is mainly divided into rule-based translation and corpus-based translation.
[0005] In corpus-based machine translation, the main translation resources come from the corpus. That is, the parallel bilingual corpus serves as the training basis for machine translation. The process is to first perform word alignment and syntactic analysis on the parallel bilingual corpus to form aligned sentence pairs that have undergone syntactic analysis; the translation engine then treats such sentence pairs as frame structures. When the user enters the sentence to be translated, the translation engine match...
Application Information
Patent Type & Authority: Applications (China)
IPC: IPC(8): G06F17/28
CPC: G06F17/2827, G06F40/45
Inventor: 任登君, 吴华, 王海峰
Owner: KK TOSHIBA