Multilingual literature classification method and device and storage medium

A multilingual document classification technology, applied in the field of information processing, that addresses problems such as the limited coverage of single-language knowledge bases and the language barrier between Chinese and foreign-language documents

Active Publication Date: 2021-04-06
中科大数据研究院

AI Technical Summary

Problems solved by technology

[0003] However, building a knowledge base requires first classifying documents and then assembling the classified documents into the knowledge base. The documents on the network include both Chinese documents and foreign-language documents. Because these documents are written in different languages, they cannot be compared directly, and it is difficult to classify multilingual documents at the same time. As a result, the academic knowledge bases built by companies and enterprises are usually single-language knowledge bases, and their scope is limited.

Embodiment Construction

[0087] Exemplary embodiments will be described in detail herein, and examples thereof are shown in the drawings. Where the following description refers to the drawings, the same numbers in different drawings represent the same or similar elements unless otherwise indicated. The implementations described in the following exemplary embodiments do not represent all implementations consistent with the present application. Rather, they are merely examples of apparatus and methods consistent with some aspects of the present application, as detailed in the appended claims.

[0088] The terms used in this application are only for the purpose of describing particular embodiments and are not intended to limit the invention. The singular forms "a", "an", and "the" as used in the present application and the appended claims are also intended to include plural forms unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used herein refers to any or all possible combinations of one or more of the associated listed items.

[0...

Abstract

The invention provides a multilingual literature classification method, device, and storage medium. The method comprises: receiving literature, which includes Chinese literature and foreign-language literature; representative word extraction, in which at least one related word is extracted from each piece of literature according to its content and the related words are clustered to obtain the representative words of that literature; receiving a literature category table that defines a plurality of basic categories; and literature classification, in which the representative words are converted into representative word vectors, the basic categories are converted into category word vectors, the relevancy between the representative word vectors and the category word vectors is calculated, and the literature is classified according to that relevancy. Because representative words are extracted separately from the Chinese literature and the foreign-language literature, and both are classified by the relevancy between representative word vectors and category word vectors, Chinese and foreign-language literature can be classified at the same time.
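The classification step described in the abstract can be illustrated with a minimal Python sketch. Several things here are assumptions rather than details taken from the patent: the toy `EMBEDDINGS` table stands in for a shared cross-lingual word-vector space, cosine similarity is used as the relevancy measure, the per-word scores are averaged, and the function names `cosine`, `relevancy`, and `classify` are hypothetical. The representative words are assumed to have been extracted and clustered already.

```python
import math

# Hypothetical toy embeddings (assumption: Chinese and English words with
# similar meanings lie close together in one shared vector space).
EMBEDDINGS = {
    "neural": (0.9, 0.1, 0.0),
    "网络": (0.8, 0.2, 0.1),
    "protein": (0.0, 0.9, 0.1),
    "蛋白质": (0.1, 0.8, 0.0),
    "computer science": (1.0, 0.0, 0.0),
    "biology": (0.0, 1.0, 0.0),
}

# Stand-in for the literature category table with its basic categories.
CATEGORIES = ["computer science", "biology"]


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def relevancy(representative_words, category):
    """Mean cosine similarity between each representative word vector
    and the category word vector (an assumed relevancy measure)."""
    category_vector = EMBEDDINGS[category]
    scores = [cosine(EMBEDDINGS[word], category_vector)
              for word in representative_words]
    return sum(scores) / len(scores)


def classify(representative_words, categories=CATEGORIES):
    """Assign the basic category with the highest relevancy."""
    return max(categories, key=lambda c: relevancy(representative_words, c))


if __name__ == "__main__":
    # A document whose representative words mix English and Chinese.
    print(classify(["neural", "网络"]))
    # A Chinese-only document.
    print(classify(["蛋白质"]))
```

Because both languages are mapped into the same vector space in this sketch, a single `classify` call handles Chinese and foreign-language representative words without a separate translation step. A real system would replace the toy table with trained multilingual embeddings.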

Description

Technical field

[0001] The present application relates to the field of information processing, and more particularly to a multilingual document classification method, device, and storage medium.

Background technique

[0002] With the rapid development of science and technology, a large number of theses, patents, and other scientific literature is constantly emerging. Some companies or enterprises need to search across multiple online libraries, so ordinary literature retrieval on the Internet can no longer meet the needs of these users. Faced with this massive body of literature, more and more companies, enterprises, and groups are beginning to build their own academic knowledge bases.

[0003] However, constructing a knowledge base requires classifying the literature and then assembling the classified literature into the knowledge base. The literature in the network includes Chinese literature and foreign literature, due to the literature of Chinese literature and foreign literat...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F16/35, G06F40/205, G06F40/258, G06F40/284, G06F40/289
CPC: G06F16/355, G06F40/205, G06F40/258, G06F40/284, G06F40/289
Inventor: 贾士杨, 冯凯, 王元卓
Owner: 中科大数据研究院