The invention relates to a method and device for word segmentation, phonetic transcription, and ligature writing based on an SC grammar, and belongs to the technical field of computer translation in computer science. First, based on the word segmentation ambiguity rules of the SC grammar, an ambiguity segmentation rule library is built from adjacency constraint conditions in natural language, and illegal segmentations are eliminated so that word segmentation precision is improved. Second, the method draws on a word segmentation ligature writing rule library of the SC grammar and a ligature writing corpus statistical library; the statistical library is used to perform ligature writing for ligature writing knowledge that cannot be expressed as rules. Finally, based on a dictionary library of the SC grammar, the dictionary is used to segment words by maximum matching, the word segmentation ambiguity rules are invoked for spans where ambiguity occurs so that a correct segmentation result is obtained, and the context of each word is analyzed so that correct part-of-speech tagging and phonetic transcription are obtained. Compared with the prior art, word segmentation accuracy is improved, and the word segmentation ambiguity rule library, a combined ambiguity word library, the ligature writing rule library, the dictionary library, and the ligature writing corpus statistical library are easy to expand and maintain.
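
The final step (dictionary maximum matching, with ambiguity rules consulted only where ambiguity occurs) might be sketched as follows. This is a minimal illustration and not the claimed implementation: the toy dictionary, the `FORBIDDEN_PAIRS` adjacency rule set, and all function names are assumptions made for this example, and detecting ambiguity by comparing forward and backward maximum matching is a common textbook technique substituted here because the abstract does not say how ambiguous spans are detected.

```python
# Hypothetical toy data; the abstract does not specify the contents
# of the dictionary library or the ambiguity rule library.
DICTIONARY = {"研究", "研究生", "生命", "起源"}
# Assumed adjacency constraint: these two words may not be neighbours.
FORBIDDEN_PAIRS = {("研究生", "命")}


def forward_max_match(text, dictionary):
    """Scan left to right, taking the longest dictionary word each time."""
    max_len = max((len(w) for w in dictionary), default=1)
    tokens, i = [], 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            # A single character is always accepted as a fallback.
            if size == 1 or piece in dictionary:
                tokens.append(piece)
                i += size
                break
    return tokens


def backward_max_match(text, dictionary):
    """Scan right to left, taking the longest dictionary word each time."""
    max_len = max((len(w) for w in dictionary), default=1)
    tokens, j = [], len(text)
    while j > 0:
        for size in range(min(max_len, j), 0, -1):
            piece = text[j - size:j]
            if size == 1 or piece in dictionary:
                tokens.append(piece)
                j -= size
                break
    tokens.reverse()
    return tokens


def violates(tokens, forbidden_pairs):
    """True if any two neighbouring tokens form a forbidden pair."""
    return any(pair in forbidden_pairs for pair in zip(tokens, tokens[1:]))


def segment(text, dictionary, forbidden_pairs):
    """Use the rules only when the two matching directions disagree."""
    fwd = forward_max_match(text, dictionary)
    bwd = backward_max_match(text, dictionary)
    if fwd == bwd:
        return fwd  # no ambiguity detected
    legal = [c for c in (fwd, bwd) if not violates(c, forbidden_pairs)]
    return legal[0] if legal else fwd  # fall back to forward matching


result = segment("研究生命起源", DICTIONARY, FORBIDDEN_PAIRS)
print(result)
```

In this toy case forward matching yields 研究生 / 命 / 起源 while backward matching yields 研究 / 生命 / 起源; the assumed adjacency rule rejects the first candidate, so the second is returned. Keeping the rules in a plain set separate from the matching code mirrors the abstract's point that the libraries can be extended without touching the segmentation logic.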