Word and expression alignment method and device

A word alignment method and device, applied in the field of machine translation, that address the poor handling of phrases by existing alignment tools, improve efficiency, and build the phrase list more completely.

Inactive Publication Date: 2015-02-25
BEIJING INTERNATIONAL STUDIES UNIVERSITY
2 Cites, 15 Cited by

AI Technical Summary

Problems solved by technology

Experimental observation shows that GIZA++ is not particularly effective at processing phrases such as "be able to", "in addition to", and "plenty of".

Examples

Embodiment 1

[0026] A word alignment method, the method comprising:

[0027] Referring to Figure 2, segment the sentences to be aligned into individual words or phrases and group them; then query and match within the word groups to align mutual translation pairs.

[0028] Further, before word grouping, a phrase dictionary is constructed, in which the phrases are phrases in the linguistic sense. Wherever possible, the longest phrase in the phrase dictionary is used to match the character strings in the sentence.

[0029] More specifically, word grouping matches a string in the sentence against the longest possible word. Matching can run forward or in reverse: forward matching processes the sentence from left to right, and reverse matching processes it from right to left. The invention adopts forward matching for English grouping, and the combination of reverse matching and probabilit...
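The forward matching described in [0029] can be illustrated with a minimal sketch. It assumes the sentence is already split into whitespace tokens and that the phrase dictionary is a plain set of strings; the function name `forward_max_match` and the `max_len` window size are hypothetical and do not come from the patent text. The example phrases are the ones named in the summary above.

```python
def forward_max_match(tokens, phrase_dict, max_len=4):
    """Group tokens left to right, preferring the longest dictionary phrase.

    tokens      -- list of words in sentence order
    phrase_dict -- set of multi-word phrases (assumed structure)
    max_len     -- longest phrase length to try, in words (assumed parameter)
    """
    groups = []
    i = 0
    while i < len(tokens):
        # Try the longest window first and shrink it until a phrase matches.
        for size in range(min(max_len, len(tokens) - i), 1, -1):
            candidate = " ".join(tokens[i:i + size])
            if candidate in phrase_dict:
                groups.append(candidate)
                i += size
                break
        else:
            # No multi-word phrase starts here, so keep the single word.
            groups.append(tokens[i])
            i += 1
    return groups


phrases = {"be able to", "in addition to", "plenty of"}
sentence = "we will be able to find plenty of examples".split()
print(forward_max_match(sentence, phrases))
```

Trying the longest window first is what makes the match "longest possible": a shorter phrase is only accepted once every longer candidate at the same position has failed.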

Embodiment 2

[0091] Referring to Figure 4, a word alignment device comprises:

[0092] A segmentation unit, used to segment the sentences to be aligned into individual words or phrases for word grouping;

[0093] A comparison unit, used to query and match within the word groups and align mutual translation pairs.

[0094] Further, the segmentation unit is used to construct a phrase dictionary before word grouping, the phrases being phrases in the linguistic sense, and to match the character strings in the sentence against the longest phrase in the phrase dictionary wherever possible.

[0095] Further, during word grouping the segmentation unit also queries the translation explanation corresponding to each word or phrase; each word or phrase together with its translation explanation constitutes a basic dictionary.
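As a rough illustration of the basic dictionary in [0095], the sketch below looks up each grouped word or phrase in a bilingual lexicon and stores the result. The function name `build_basic_dictionary`, the dict-of-lists lexicon layout, and the Chinese glosses are all assumptions made for the example; the patent text shown here does not specify a data structure or a language pair.

```python
def build_basic_dictionary(groups, lexicon):
    """Map each word or phrase of a grouped sentence to its translation explanations.

    groups  -- words/phrases produced by word grouping
    lexicon -- hypothetical bilingual lexicon: unit -> list of glosses
    """
    basic = {}
    for unit in groups:
        # Units missing from the lexicon get an empty set rather than being dropped.
        basic[unit] = set(lexicon.get(unit, ()))
    return basic


# Toy lexicon with made-up entries, for illustration only.
toy_lexicon = {
    "be able to": ["能够", "可以"],
    "plenty of": ["大量", "许多"],
}
grouped = ["we", "be able to", "find", "plenty of", "examples"]
basic_dictionary = build_basic_dictionary(grouped, toy_lexicon)
print(basic_dictionary["be able to"])
```

Storing the explanations as a set keeps the later membership check cheap, which is a design choice of this sketch rather than something the source states.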

[0096] Further, the segmentation unit is used for correcting the sentence segmentat...

Embodiment 3

[0112] A machine translation system comprising a word alignment device, which is used to:

[0113] Segment the sentences to be aligned into individual words or phrases and perform word grouping; query and match within the word groups and align mutual translation pairs;

[0114] Build a phrase dictionary before word grouping, the phrases being phrases in the linguistic sense, and match the character strings in the sentence against the longest phrase in the phrase dictionary wherever possible;

[0115] During word grouping, query the translation explanation corresponding to each word or phrase at the same time; each word or phrase and its corresponding translation explanation form the basic dictionary;

[0116] According to the constructed basic dictionary, check whether a word or phrase in one language appears in the translation explanation corresponding to a word or phrase in a sentence in the other language, and if so, fi...
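The dictionary check in [0116] might look like the following sketch. The source text is cut off after "if so", so what happens to a match is an assumption here: the sketch simply records the pair as a candidate mutual translation pair. The function name `align_by_dictionary`, the set-valued dictionary layout, and the sample Chinese units are hypothetical.

```python
def align_by_dictionary(src_units, tgt_units, basic_dict):
    """Pair source units with target units found in their translation explanations.

    src_units  -- grouped words/phrases of the sentence in one language
    tgt_units  -- grouped words/phrases of the sentence in the other language
    basic_dict -- source unit -> set of translation explanations (assumed layout)
    """
    pairs = []
    for src in src_units:
        explanations = basic_dict.get(src, set())
        for tgt in tgt_units:
            # Assumed handling of a match: record it as a candidate pair.
            if tgt in explanations:
                pairs.append((src, tgt))
    return pairs


# Toy data, invented for illustration only.
toy_dict = {
    "we": {"我们"},
    "be able to": {"能够", "可以"},
    "plenty of": {"大量", "许多"},
}
english = ["we", "be able to", "find", "plenty of", "examples"]
chinese = ["我们", "能够", "找到", "大量", "例子"]
print(align_by_dictionary(english, chinese, toy_dict))
```

Units with no dictionary entry, such as "find" in the toy data, are left unaligned by this sketch; how the patent treats them is not visible in the excerpt.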

Abstract

The invention discloses a word and expression alignment method and device and relates to the technical field of machine translation. The method and device achieve the technical goal of aligning words and expressions. According to the technical scheme, the method comprises segmenting the sentences of two languages that need to be aligned into individual words or expressions, grouping the words and expressions, querying and matching within the groups, and aligning mutual translation pairs. The method and device are used to build the phrase list accurately and completely during machine translation.

Description

Technical field

[0001] The present invention relates to the technical field of machine translation, in particular to a word alignment method and device.

Background technique

[0002] Word alignment is a fundamental problem in natural language processing, and many applications based on bilingual corpora, such as statistical machine translation (SMT), example-based machine translation (EBMT), word sense disambiguation (WSD), and dictionary compilation, need word-level alignment. Generally speaking, alignment can be performed at different levels: chapters, paragraphs, sentences, phrases, and words. Alignment of chapters, paragraphs, and sentences is mainly used to collate the corpus, while alignment of phrases and words finds the mutual translation pairs between word and word, word and phrase, and phrase and phrase in mutually translated texts. A large part of today's phrase-based statis...

Application Information

IPC(8): G06F17/28, G06F17/27, G06F17/30
Inventor: 魏子杭
Owner: BEIJING INTERNATIONAL STUDIES UNIVERSITY