
Method for dynamically generating mass language assets in multiple language industry standard formats

An industry-standard, dynamic generation technology applied in special data processing applications, instruments, and electrical digital data processing. It addresses problems such as the one-way language asset storage architecture, the inability to link corresponding content across languages, and the huge labor costs involved, and it improves the safety of the stored language assets.

Active Publication Date: 2014-04-16
上海佑译信息科技有限公司
3 Cites, 2 Cited by

AI Technical Summary

Problems solved by technology

[0008] Disadvantages of the existing technology: 1) The existing language asset storage architecture is two-dimensional and one-way, so the correspondence between the source language and each target language cannot be linked. Multi-language (multi-dimensional), multi-directional language pairs for the same content cannot be obtained automatically, which wastes a great deal of resources. Obtaining them anyway would inevitably incur huge labor costs.


Examples


Specific Embodiment

[0037] The method for dynamically generating massive language assets in multilingual industry standard formats includes the following steps:

[0038] 1) Develop a parser that reads the content of corpora and term bases stored in XML-based standard formats such as TMX and TBX, and imports it into a specified database;

[0039] 2) During import, database tables that hold the same content in different language pairs are automatically matched and merged, generating a multilingual database in which one source text is matched with sentences in multiple target languages;

[0040] 3) In use, the results retrieved for the language pair specified by the user are automatically returned in the form of translation memory and presented to the end user in a specific format for reuse;

[0041] 4) When the multilingual database is extended or updated, the related content in the other languages is automatically updated, so as to...
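The import, matching, and lookup steps above can be illustrated with a minimal Python sketch. It is not the patented implementation: the `segment` table layout, the helper names `parse_tmx`, `import_units`, and `lookup`, the exact-match rule on the source text, and the abbreviated sample TMX documents are all assumptions made for illustration. Only the standard-library modules `sqlite3` and `xml.etree.ElementTree` are used.

```python
import sqlite3
import xml.etree.ElementTree as ET

# ElementTree exposes the xml:lang attribute under the XML namespace URI.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Abbreviated sample TMX documents (hypothetical data; real TMX headers
# carry more attributes than shown here).
EN_ZH = """<tmx version="1.4"><header srclang="en"/><body>
<tu><tuv xml:lang="en"><seg>Hello</seg></tuv>
    <tuv xml:lang="zh"><seg>你好</seg></tuv></tu>
</body></tmx>"""

EN_FR = """<tmx version="1.4"><header srclang="en"/><body>
<tu><tuv xml:lang="en"><seg>Hello</seg></tuv>
    <tuv xml:lang="fr"><seg>Bonjour</seg></tuv></tu>
</body></tmx>"""


def parse_tmx(tmx_text):
    """Step 1: yield one {language: segment text} dict per <tu> element."""
    root = ET.fromstring(tmx_text)
    for tu in root.iter("tu"):
        unit = {}
        for tuv in tu.findall("tuv"):
            lang = tuv.get(XML_LANG)
            seg = tuv.find("seg")
            if lang and seg is not None:
                unit[lang.lower()] = "".join(seg.itertext()).strip()
        if unit:
            yield unit


def open_db():
    """Create an in-memory database with one row per (unit, language)."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE segment (unit_id INTEGER, lang TEXT, text TEXT, "
        "PRIMARY KEY (unit_id, lang))"
    )
    return db


def import_units(db, units, source_lang):
    """Step 2: merge units that share the same source text (exact match)."""
    for unit in units:
        src = unit.get(source_lang)
        if src is None:
            continue  # nothing to match on, skip this unit
        row = db.execute(
            "SELECT unit_id FROM segment WHERE lang = ? AND text = ?",
            (source_lang, src),
        ).fetchone()
        if row:
            unit_id = row[0]  # source text already known: reuse its unit
        else:
            unit_id = db.execute(
                "SELECT COALESCE(MAX(unit_id), 0) + 1 FROM segment"
            ).fetchone()[0]
        for lang, text in unit.items():
            db.execute(
                "INSERT OR REPLACE INTO segment VALUES (?, ?, ?)",
                (unit_id, lang, text),
            )
    db.commit()


def lookup(db, src_lang, tgt_lang):
    """Step 3: return (source, target) pairs for any requested direction."""
    return db.execute(
        "SELECT a.text, b.text FROM segment a "
        "JOIN segment b ON a.unit_id = b.unit_id "
        "WHERE a.lang = ? AND b.lang = ? ORDER BY a.unit_id",
        (src_lang, tgt_lang),
    ).fetchall()
```

After importing both sample files with `"en"` as the source language, `lookup(db, "zh", "fr")` returns a Chinese to French pair even though neither input file contained that language pair, which is the multi-directional reuse the steps describe.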


Abstract

The invention provides a method for dynamically generating massive language assets in multilingual industry standard formats. The method comprises the following steps: a purpose-built parser reads the content of corpora and term bases stored in XML-based standard formats such as TMX and TBX and imports it into a specified database; during import, database tables holding the same content in different language pairs are automatically matched and merged, generating a multilingual database in which one source text is matched with sentences in multiple target languages; in use, the results retrieved for the language pair specified by the user are automatically returned in the form of translation memory and presented to the end user in a specific format for reuse; when the multilingual database is extended or updated, the related content in the other languages is automatically updated, ensuring that the user continues to obtain up-to-date translation memory content after the language assets change. Language assets stored in database form are reused directly, the data is not easily damaged or lost, and the safety of the assets is improved.
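The update and feedback behavior summarized in the abstract can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the invention's actual design: the `MultilingualStore` class, its `upsert` and `export_tmx` methods, and the in-memory list of units are hypothetical, and the generated TMX header is abbreviated.

```python
import xml.etree.ElementTree as ET

# Namespace-qualified name that ElementTree serializes as xml:lang.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


class MultilingualStore:
    """Hypothetical in-memory store: one dict of {language: text} per unit."""

    def __init__(self):
        self.units = []

    def upsert(self, source_lang, source_text, lang, text):
        """Add or update one translation of an existing or new source text."""
        for unit in self.units:
            if unit.get(source_lang) == source_text:
                unit[lang] = text  # update is visible to every language pair
                return unit
        unit = {source_lang: source_text, lang: text}
        self.units.append(unit)
        return unit

    def export_tmx(self, src_lang, tgt_lang):
        """Build a TMX string on demand for the requested language pair."""
        tmx = ET.Element("tmx", version="1.4")
        ET.SubElement(tmx, "header", srclang=src_lang)  # abbreviated header
        body = ET.SubElement(tmx, "body")
        for unit in self.units:
            if src_lang in unit and tgt_lang in unit:
                tu = ET.SubElement(body, "tu")
                for lang in (src_lang, tgt_lang):
                    tuv = ET.SubElement(tu, "tuv", {XML_LANG: lang})
                    ET.SubElement(tuv, "seg").text = unit[lang]
        return ET.tostring(tmx, encoding="unicode")
```

Because the TMX output is generated at request time rather than stored per language pair, a later `upsert` that changes the French text is reflected in the next `export_tmx("zh", "fr")` call without any separate synchronization step.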

Description

Technical field

[0001] The invention relates to a method for dynamically generating massive language assets in multilingual industry standard formats. It is used in the development and application of the TM module in CAT software or multilingual translation systems, and belongs to the technical field of multilingual machine translation.

Background art

[0002] TM (Translation Memory) is one of the technologies widely used in the field of computer-aided translation (CAT). TM technology can significantly improve translation efficiency and ensure content consistency. Because many different CAT tools have been developed around TM technology, the storage formats for TM content vary greatly. To facilitate the exchange of TM data between translation agencies and CAT tools, an open standard called TMX (Translation Memory eXchange) has been successfully applied in the localization and translation industry.

[0003] In the process of software...


Application Information

IPC(8): G06F17/28, G06F17/30
Inventor: 杜金林, 朱懿, 杜勇
Owner: 上海佑译信息科技有限公司