
Method and device for recognizing ambiguous words with combinatorial ambiguities

A technique for recognizing ambiguous words with combination ambiguity, applied in the field of natural language processing. It addresses the difficulty and high cost of manually summarizing the ambiguities in a corpus and the unpredictability of manual methods, and it adapts itself effectively to the corpus at hand.

Status: Inactive
Publication date: 2014-01-15
FUJITSU LTD
Cites: 3; cited by: 13

AI Technical Summary

Problems solved by technology

However, existing solutions have many disadvantages.
For example, it is difficult to summarize the potential ambiguities in a corpus by hand; the manual method takes a great deal of time and labor, so its cost is very high.
Moreover, combination ambiguity is domain-dependent: different ambiguities occur in different domains, and they are difficult to predict manually.
In addition, manually labeling a corpus is also costly. If the word segmentation system is applied in a new field, a new corpus has to be labeled again.



Examples


Detailed Description of the Embodiments

[0030] Exemplary embodiments of the present invention will be described below with reference to the accompanying drawings. In the interest of clarity and conciseness, not all features of an actual implementation are described in this specification. However, it should be understood that many implementation-specific decisions must be made during the development of any such actual implementation in order to achieve the developer's specific goals, and that these decisions may vary from one implementation to another.

[0031] It should also be noted that, to avoid obscuring the present invention with unnecessary detail, the drawings show only the device structure closely related to the solution according to the present invention; other details that have little to do with the present invention are omitted.

[0032] The following describes, with reference to Figure 1, a method for identifying ambiguous words with combination ambiguity according to an embodiment of ...



Abstract

The invention discloses a method and device for recognizing ambiguous words with combination ambiguity. The method comprises: performing coarse-grained segmentation on first-language sentences using a core word list; after the coarse-grained segmentation, using the core word list to detect, in the segmentation results, candidate ambiguous words that can be split into several words of smaller granularity; performing fine-grained segmentation on the first-language sentences by splitting the candidate ambiguous words; extracting, from the second-language sentences corresponding to the first-language sentences, the translations of the candidate ambiguous words and the translations of the smaller-granularity words obtained by splitting them; and judging whether the extracted translations appear among the translations of the candidate ambiguous words and of the smaller-granularity words obtained through a first-language and second-language dictionary, in order to determine whether the candidate ambiguous words are true ambiguous words.
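The steps in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation, and several parts are assumptions made only for illustration: forward maximum matching stands in for the coarse-grained segmenter, the `extracted` mapping stands in for whatever translation-extraction step the patent uses on the second-language sentence, and the names `coarse_segment`, `split_word`, `fine_segment`, and `supported_reading` are hypothetical. The sample word list and dictionary are toy data.

```python
from typing import Dict, List, Optional, Set


def coarse_segment(sentence: str, core_words: Set[str], max_len: int = 4) -> List[str]:
    """Forward maximum matching (assumed strategy): prefer the longest listed word."""
    words, i = [], 0
    while i < len(sentence):
        for j in range(min(len(sentence), i + max_len), i, -1):
            # Fall back to a single character when nothing in the list matches.
            if sentence[i:j] in core_words or j == i + 1:
                words.append(sentence[i:j])
                i = j
                break
    return words


def split_word(word: str, core_words: Set[str]) -> Optional[List[str]]:
    """Return one split of `word` into two or more listed words, or None."""
    for k in range(1, len(word)):
        head, tail = word[:k], word[k:]
        if head not in core_words:
            continue
        if tail in core_words:
            return [head, tail]
        rest = split_word(tail, core_words)
        if rest:
            return [head] + rest
    return None


def fine_segment(coarse: List[str], core_words: Set[str]) -> List[str]:
    """Split every candidate ambiguous word into its smaller-granularity words."""
    fine: List[str] = []
    for word in coarse:
        fine.extend(split_word(word, core_words) or [word])
    return fine


def supported_reading(candidate: str, parts: List[str],
                      extracted: Dict[str, str],
                      bilingual_dict: Dict[str, Set[str]]) -> Optional[str]:
    """Check extracted translations against the dictionary for one sentence pair.

    `extracted` maps a first-language word to the translation found for it in
    the second-language sentence (hypothetical input, e.g. from word alignment).
    """
    def confirmed(word: str) -> bool:
        return extracted.get(word) in bilingual_dict.get(word, set())

    if confirmed(candidate):
        return "whole"
    if all(confirmed(part) for part in parts):
        return "split"
    return None


# Toy data: "将来" (future) can also be read as "将" (will) + "来" (come).
core_words = {"他", "将", "来", "将来", "北京"}
bilingual_dict = {"将来": {"future"}, "将": {"will"}, "来": {"come"}}

coarse = coarse_segment("他将来北京", core_words)    # ['他', '将来', '北京']
candidates = [w for w in coarse if split_word(w, core_words)]
fine = fine_segment(coarse, core_words)              # ['他', '将', '来', '北京']
```

One plausible decision rule, again an assumption rather than something the abstract states, is to treat a candidate as truly ambiguous when `supported_reading` returns `"whole"` for some sentence pairs in the parallel corpus and `"split"` for others.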

Description

Technical field

[0001] The present application generally relates to the field of natural language processing, and in particular to a method and device for identifying ambiguous words with combination ambiguity.

Background art

[0002] In natural language processing, word segmentation is one of the basic topics. Most natural language processing builds on the results of word segmentation, so the quality of word segmentation directly affects the accuracy of subsequent work. Because of the characteristics of natural language itself, word segmentation ambiguity is encountered when segmenting natural language. Taking Chinese as an example, Chinese word segmentation ambiguity mainly includes the following two types: intersection ambiguity and combination ambiguity. Generally, assuming that A, X and B are each word strings, if the word string AXB composed of them satisfies the condition that AX and XB are words at the same ti...
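The visible part of the intersection-ambiguity condition (a string AXB in which both AX and XB are words) can be checked mechanically. The following is a small Python sketch under stated assumptions: A, X and B are taken to be non-empty, words are at most `max_len` characters long, and the function name `find_intersection_ambiguities` and the sample word list are hypothetical. The source text is truncated here, so the sketch covers only the condition that is actually shown.

```python
from typing import List, Set, Tuple


def find_intersection_ambiguities(text: str, vocab: Set[str],
                                  max_len: int = 4) -> List[Tuple[str, str, str]]:
    """Return (A, X, B) triples such that AX and XB are both in `vocab`."""
    hits = []
    n = len(text)
    for i in range(n):
        # AX = text[i:j] needs at least two characters (A and X non-empty).
        for j in range(i + 2, min(n, i + max_len) + 1):
            if text[i:j] not in vocab:
                continue
            # X = text[x:j] is the overlapping part, A = text[i:x].
            for x in range(i + 1, j):
                # XB = text[x:k] must extend past j so that B is non-empty.
                for k in range(j + 1, min(n, x + max_len) + 1):
                    if text[x:k] in vocab:
                        hits.append((text[i:x], text[x:j], text[j:k]))
    return hits


# Toy example: "结合" (combine) and "合成" (synthesize) overlap on "合".
triples = find_intersection_ambiguities("结合成", {"结合", "合成"})
```

Here `triples` holds one entry, with A = "结", X = "合" and B = "成".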


Application Information

IPC(8): G06F17/27
Inventors: 郑仲光, 孟遥, 于浩
Owner: FUJITSU LTD