Statistical machine translation method based on predicate argument structure (PAS)

A statistical machine translation technology based on predicate argument structure, applied in the fields of instruments, computing, and special data processing applications. It addresses the lack of a good solution for global reordering in current machine translation models.

Active Publication Date: 2013-04-03
INST OF AUTOMATION CHINESE ACAD OF SCI
Cites: 3 | Cited by: 14

AI Technical Summary

Problems solved by technology

For global reordering, that is, reordering that takes the overall structure of the sentence into account, current machine translation models do not have a very good solution.


Examples

Specific embodiment

[0034] 1. Perform automatic word segmentation, automatic word alignment, syntactic analysis, and bilingual joint semantic role labeling on the bilingual sentence pairs in the bilingual corpus. The specific implementation is as follows:

[0035] The source language sentence and the target language sentence in each bilingual sentence pair are segmented to obtain the word segmentation results for the source side and the target side. A side whose language does not contain Chinese needs no word segmentation; a side that contains Chinese must have its Chinese text segmented into words. In the embodiment of the present invention, the lexical analysis tool Urheen is used to segment Chinese automatically. Urheen can be downloaded for free at the following URL:

[0036] http://www.openpr.org.cn/index.php/NLP-Toolkit-for-Natural-Language-Processing/
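The segmentation decision in paragraph [0035] can be illustrated with a minimal Python sketch. It assumes a toy forward maximum-matching segmenter as a stand-in for Urheen (whose interface is not described in this excerpt); the names `contains_chinese`, `max_match_segment`, `tokenize`, and the example vocabulary are all hypothetical.

```python
import re

# Matches any character in the CJK Unified Ideographs block.
_CJK = re.compile(r"[\u4e00-\u9fff]")


def contains_chinese(text):
    """Return True if the text contains at least one Chinese character."""
    return _CJK.search(text) is not None


def max_match_segment(text, vocab, max_len=4):
    """Toy forward maximum-matching segmenter (a stand-in, NOT Urheen)."""
    tokens, i = [], 0
    while i < len(text):
        # Try the longest candidate first; fall back to a single character.
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in vocab:
                tokens.append(piece)
                i += size
                break
    return tokens


def tokenize(sentence, vocab):
    """Segment only when the sentence contains Chinese; otherwise split on spaces."""
    if contains_chinese(sentence):
        return max_match_segment(sentence.replace(" ", ""), vocab)
    return sentence.split()


# Hypothetical vocabulary used only for this illustration.
vocab = {"喜欢", "机器", "翻译", "机器翻译"}
print(tokenize("我喜欢机器翻译", vocab))
print(tokenize("I like machine translation", vocab))
```

A real system would replace `max_match_segment` with a call to the chosen lexical analysis tool; only the branch on whether Chinese is present reflects the text above.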

[0037] After obtaining the word segmentation r...

Abstract

The invention relates to a statistical machine translation method based on the predicate argument structure (PAS). The method comprises the following steps: performing word segmentation, automatic word alignment, syntactic analysis, and bilingual joint semantic role labeling on the bilingual sentence pairs in a bilingual corpus; extracting PAS conversion rules from the bilingual sentence pairs according to the results of the bilingual joint semantic role labeling, so as to model the relationship between the PASs of the two languages; matching multiple semantic role labeling results of the sentence to be translated against the PAS conversion rules and translating accordingly; and constructing a translation hypergraph from the results of the rule-based matching and translation to generate the final translation.
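To make the idea of a PAS conversion rule more concrete, here is a minimal Python sketch. The rule format is an assumption made for illustration: each rule maps a source-side sequence of role labels to the order in which those roles appear on the target side. The patent's actual rule format is not given in this excerpt, and the function names (`extract_rule`, `collect_rules`, `apply_rule`) and the sample data are hypothetical.

```python
from collections import Counter


def extract_rule(src_roles, tgt_roles):
    """Return (source role sequence, target order) for one aligned PAS pair.

    Assumes each role label occurs once and appears on both sides;
    pairs that violate this are skipped by returning None.
    """
    if len(set(src_roles)) != len(src_roles) or sorted(src_roles) != sorted(tgt_roles):
        return None
    order = tuple(src_roles.index(role) for role in tgt_roles)
    return tuple(src_roles), order


def collect_rules(pas_pairs):
    """Count how often each assumed-format rule is observed in the corpus."""
    counts = Counter()
    for src_roles, tgt_roles in pas_pairs:
        rule = extract_rule(src_roles, tgt_roles)
        if rule is not None:
            counts[rule] += 1
    return counts


def apply_rule(rule_table, src_roles, translated):
    """Reorder per-role translations with the most frequent matching rule."""
    key = tuple(src_roles)
    candidates = {order: n for (roles, order), n in rule_table.items() if roles == key}
    if not candidates:
        return None  # no rule matches this PAS
    best = max(candidates, key=candidates.get)
    return [translated[i] for i in best]


# Hypothetical training pairs: role order on the source side vs. target side.
pairs = [
    (["A0", "AM-TMP", "Pred", "A1"], ["A0", "Pred", "A1", "AM-TMP"]),
    (["A0", "AM-TMP", "Pred", "A1"], ["A0", "Pred", "A1", "AM-TMP"]),
    (["A0", "AM-TMP", "Pred", "A1"], ["AM-TMP", "A0", "Pred", "A1"]),
]
rules = collect_rules(pairs)
print(apply_rule(rules, ["A0", "AM-TMP", "Pred", "A1"],
                 ["he", "yesterday", "bought", "a book"]))
```

Because the rule operates on whole arguments rather than on adjacent words, one matched rule can move a constituent across the entire sentence. The sketch leaves out the hypergraph construction and the handling of multiple competing semantic role labeling results that the abstract mentions.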

Description

Technical field

[0001] The present invention relates to the technical field of natural language processing, and provides a novel statistical machine translation method based on the predicate argument structure (PAS).

Background art

[0002] Current statistical machine translation methods mainly learn translation rules automatically from a bilingual corpus and then use these rules to translate test sentences. Statistical machine translation models have evolved from word-based to phrase-based to syntax-based translation models, and translation quality has improved considerably along the way. However, current translation models consider at most the hierarchical structure of the sentence and do not model the semantic knowledge in the sentence.

[0003] At the same time, reordering has always been an important and difficult subject in machine translation research. The current translation model is a good model for lo...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/28, G06F17/27
Inventors: 宗成庆, 翟飞飞, 张家俊, 周玉
Owner: INST OF AUTOMATION CHINESE ACAD OF SCI