Construction method of constructional engineering multi-mode bilingual parallel corpus

This parallel-corpus and construction engineering technology is applied to building multimodal bilingual parallel corpora for construction engineering. It addresses low translation accuracy, irregular corpus format and content, and the lack of proofreading, with the stated effects of improving teaching quality and generating good returns.

Active Publication Date: 2019-07-23
SHANDONG JIANZHU UNIV

AI Technical Summary

Problems solved by technology

However, several problems remain. Specialized architectural corpora are extremely rare at home and abroad, and a multimodal architectural corpus is unprecedented. Corpus format and content are irregular. Corpus sources are not authoritative enough: some corpora collect texts from the Internet indiscriminately, which produces heavy noise and low purity, so they cannot really be used in CAT software. Current parallel corpora are mostly aligned at the paragraph level, but in translation the most valuable reference unit is the sentence, followed by language fragments, phrases, and terms; as a result, overall translation accuracy is low.
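To illustrate the gap between paragraph-level and sentence-level alignment, here is a minimal Python sketch that breaks one paragraph-aligned pair into sentence pairs. The function names, the punctuation-based splitting rules, and the one-to-one positional pairing are all assumptions made for this example; the source does not say how sentences are split or matched.

```python
import re

# Illustrative splitting rules (assumptions, not taken from the source):
# English sentences end with . ! ? followed by whitespace,
# Chinese sentences end with the full-width marks 。！？
_EN_SPLIT = re.compile(r"(?<=[.!?])\s+")
_ZH_SPLIT = re.compile(r"(?<=[。！？])")


def split_sentences(text, pattern):
    """Split text with the given pattern and drop empty pieces."""
    return [piece.strip() for piece in pattern.split(text) if piece.strip()]


def paragraph_to_sentence_pairs(en_paragraph, zh_paragraph):
    """Pair sentences by position; return None when the counts differ."""
    en = split_sentences(en_paragraph, _EN_SPLIT)
    zh = split_sentences(zh_paragraph, _ZH_SPLIT)
    if len(en) != len(zh):
        # Unequal counts need manual review or a real alignment algorithm.
        return None
    return list(zip(en, zh))
```

Punctuation-based splitting mishandles abbreviations and decimal numbers, and positional pairing fails whenever a translator merges or splits sentences, so a pair that returns `None` would still need manual proofreading.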

Embodiment

[0030] The method for constructing a multimodal bilingual parallel corpus of construction engineering in this embodiment includes the following steps:

[0031] (1) Corpus screening: The original corpus is obtained through network download, scanning and recognition, manual entry, and web crawling. Its main sources are English-Chinese bilingual works on architecture officially published by national publishing houses, official government reports, official certification materials, and audio, video, drawings, and pictures from official conferences in the construction industry;

[0032] (2) Corpus extraction and proofreading: use modern image technology to collect multimodal construction engineering information (pictures, charts, drawings, videos, audio, text, etc.), then mine and structure it; add, delete, modify, and check entries in the original corpus, clean and prune the original corpus data, save it after proofreading, and make the b...
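Steps (1) and (2) can be sketched in Python as a source filter followed by a cleaning pass. This is only an illustration: the record layout (`source`, `en`, `zh` keys), the source labels, and the specific cleaning rules (Unicode normalization, control-character removal, whitespace collapsing, exact-duplicate removal) are assumptions for this example, since the text does not specify how screening or cleaning is implemented.

```python
import re
import unicodedata

# Hypothetical labels loosely following the source categories in step (1).
AUTHORITATIVE_SOURCES = {
    "published_bilingual_work",
    "government_report",
    "official_certification",
    "industry_conference",
}


def screen(records):
    """Keep only records whose 'source' label is in the allowed set."""
    return [r for r in records if r.get("source") in AUTHORITATIVE_SOURCES]


def clean_text(text):
    """Normalize to NFC, drop non-whitespace control characters, collapse whitespace."""
    text = unicodedata.normalize("NFC", text)
    text = "".join(
        ch for ch in text
        if ch.isspace() or unicodedata.category(ch)[0] != "C"
    )
    return re.sub(r"\s+", " ", text).strip()


def clean_pairs(records):
    """Clean both sides of each pair, then drop empty and duplicate pairs."""
    seen = set()
    cleaned = []
    for record in records:
        en, zh = clean_text(record["en"]), clean_text(record["zh"])
        if not en or not zh or (en, zh) in seen:
            continue
        seen.add((en, zh))
        cleaned.append({**record, "en": en, "zh": zh})
    return cleaned
```

A call such as `clean_pairs(screen(records))` applies the two passes in order. NFC is used rather than NFKC because NFKC would rewrite full-width Chinese punctuation as ASCII. Automated cleaning of this kind does not replace the manual proofreading the step describes.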

Abstract

The invention belongs to the technical field of data processing and relates in particular to a method for constructing a multimodal bilingual parallel corpus for construction engineering. The method comprises corpus screening, corpus extraction and proofreading, corpus segmentation, alignment, and denoising to obtain a parallel corpus, followed by updating the corpus and expanding its capacity. The resulting corpus provides rich comparison samples for building vocabulary: segmentation is fine and precision is high, the meanings of retrieved words or syntax all relate to building, irrelevant meanings are excluded, and users get a large number of bilingual building translation samples.
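The denoising and retrieval ideas in the abstract can be sketched as follows. The length-ratio heuristic, its thresholds, and the function names are assumptions chosen for illustration; the abstract does not state which denoising criterion or retrieval mechanism the invention uses.

```python
def length_ratio_ok(en, zh, low=0.5, high=6.0):
    """Heuristic noise check on English characters per Chinese character.

    The thresholds are illustrative guesses, not values from the source.
    """
    if not en or not zh:
        return False
    return low <= len(en) / len(zh) <= high


def denoise(pairs, low=0.5, high=6.0):
    """Drop aligned pairs whose length ratio suggests a misalignment."""
    return [(en, zh) for en, zh in pairs if length_ratio_ok(en, zh, low, high)]


def lookup(pairs, term):
    """Return pairs whose English side contains the term, ignoring case."""
    needle = term.lower()
    return [(en, zh) for en, zh in pairs if needle in en.lower()]


if __name__ == "__main__":
    corpus = denoise([
        ("The shear wall resists lateral loads.", "剪力墙抵抗侧向荷载。"),
        ("Click here to subscribe to our newsletter and get the latest updates.", "订阅"),
    ])
    for en, zh in lookup(corpus, "shear wall"):
        print(en, "|", zh)
```

Because every pair in such a corpus comes from construction texts, a plain substring lookup already returns only domain-relevant renderings of a term, which is the effect the abstract describes.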

Description

Technical field

[0001] The invention belongs to the technical field of data processing, and in particular relates to a method for constructing a multimodal bilingual parallel corpus of construction engineering.

Background technique

[0002] Architectural English combines the construction industry with English and covers all aspects of the industry, such as pre-qualification, bidding, construction, and quality assessment. Stylistically, architectural English belongs to scientific and technical writing: it has its own professional vocabulary and expression habits, a written style, and a formal tone. With the continuous expansion of China's share of the foreign construction market and the integration of the domestic and foreign construction markets, construction English is used more and more widely, and translations of construction English are also appearing ...

Application Information

IPC(8): G06F16/36, G06F16/953
CPC: G06F16/36, G06F16/953
Inventor: 张晓红, 王薇, 张聪颖, 丁玫, 高金岭, 鲍玉平
Owner: SHANDONG JIANZHU UNIV