
A method and device for automatic disambiguation of multiple documents with industrial security topics

The technology concerns multi-document disambiguation for industrial security topics and is applied in the field of document disambiguation. It addresses two problems of existing methods: the failure to consider differences between different types of topics, and the resulting inaccurate inheritance structure.

Active Publication Date: 2020-09-11
BEIHANG UNIV
3 Cites, 0 Cited by

AI Technical Summary

Problems solved by technology

The DAG topic structure graph used in the existing method does not consider the differences between different types of topics, so the inheritance structure of the graph is correspondingly inaccurate.




Embodiment Construction

[0053] In order to understand the characteristics and technical contents of the embodiments of the present invention in more detail, the implementation of the embodiments of the present invention will be described in detail below in conjunction with the accompanying drawings. The attached drawings are only for reference and description, and are not intended to limit the embodiments of the present invention.

[0054] The following is an explanation of key terms related to the embodiments of the present invention:

[0055] Submodular function: A set function f() is submodular if, for any set A that is a subset of B and any element e not in B, f(A+e)-f(A) ≥ f(B+e)-f(B) holds, where A+e denotes the set A with e added. A submodular function therefore has a diminishing marginal effect: the increment brought by a single element decreases as the base set under consideration grows.
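The inequality in [0055] can be checked by brute force on a small example. The following is a minimal sketch, not taken from the patent: the coverage function, the toy data in `COVERS`, and the helper names are all assumptions chosen for illustration.

```python
from itertools import combinations

# Hypothetical toy data (not from the patent): each item covers a set of points.
COVERS = {
    "a": {1, 2, 3},
    "b": {3, 4},
    "c": {4, 5, 6},
}


def coverage(selected):
    """f(S) = number of distinct points covered by the items in S."""
    covered = set()
    for item in selected:
        covered |= COVERS[item]
    return len(covered)


def marginal_gain(f, base, e):
    """Increment f(base + e) - f(base) brought by adding element e."""
    return f(base | {e}) - f(base)


def is_submodular(f, ground):
    """Brute-force check of f(A+e)-f(A) >= f(B+e)-f(B) for all A ⊆ B, e ∉ B."""
    ground = frozenset(ground)
    subsets = [
        frozenset(c)
        for r in range(len(ground) + 1)
        for c in combinations(sorted(ground), r)
    ]
    for b in subsets:
        for a in subsets:
            if not a <= b:
                continue  # only pairs with A ⊆ B are relevant
            for e in ground - b:
                if marginal_gain(f, a, e) < marginal_gain(f, b, e):
                    return False
    return True


if __name__ == "__main__":
    print(is_submodular(coverage, COVERS))
    print(marginal_gain(coverage, frozenset(), "b"))
    print(marginal_gain(coverage, frozenset({"a", "c"}), "b"))
```

In this toy setting, adding "b" to the empty set covers two new points, while adding it to {"a", "c"} covers none, which is the diminishing marginal effect described above. The exhaustive check is exponential in the size of the ground set and is only meant for small illustrations.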

[0056] Multi-submodular function: The multi-submodular function maintains the property of the submodula...


Abstract

The invention discloses a method and device for automatic disambiguation of multiple documents on industrial security topics. The method includes the following steps: a multi-dimensional DAG topic structure graph is created, and all topics in the graph form a topic set; input keywords are acquired, the documents corresponding to the keywords are collected, and these documents form a document set; each document in the document set is tagged with a corresponding tag; the DAG topic structure graph and the tagged document set are input into a multi-submodular function, and the function is optimized; a target topic subset, which is a subset of the topic set, is determined from the optimization result; the topics corresponding to the document tags are determined based on the DAG topic structure graph; and, for each topic in the target topic subset, the documents corresponding to that topic are classified into one group.
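The steps in the abstract can be illustrated with a small sketch. The excerpt does not give the multi-submodular function, so the code below substitutes a standard facility-location objective (a known monotone submodular function) optimized greedily. The topic DAG in `PARENTS`, the tagged documents in `DOC_TAGS`, the similarity rule, and the budget `k` are all hypothetical and should not be read as the patented method.

```python
from collections import deque

# Hypothetical topic DAG (not from the patent): topic -> list of parent topics.
PARENTS = {
    "security": [],
    "network": ["security"],
    "physical": ["security"],
    "firewall": ["network"],
    "plc": ["network", "physical"],
}

# Hypothetical tagged document set: document id -> tag (a topic in the DAG).
DOC_TAGS = {"d1": "firewall", "d2": "plc", "d3": "firewall", "d4": "physical"}


def hops_to_ancestors(tag):
    """Breadth-first walk up the DAG: topic -> minimum hops from the tag."""
    dist = {tag: 0}
    queue = deque([tag])
    while queue:
        topic = queue.popleft()
        for parent in PARENTS[topic]:
            if parent not in dist:
                dist[parent] = dist[topic] + 1
                queue.append(parent)
    return dist


def similarity(doc, topic):
    """Assumed score: 1 for the tag itself, decaying for more general topics."""
    dist = hops_to_ancestors(DOC_TAGS[doc])
    return 1.0 / (1 + dist[topic]) if topic in dist else 0.0


def objective(topics):
    """Facility-location stand-in: each document counts its best selected topic."""
    if not topics:
        return 0.0
    return sum(max(similarity(d, t) for t in topics) for d in DOC_TAGS)


def greedy_topic_subset(k):
    """Pick up to k topics, each time adding the one with the largest gain."""
    selected = []
    for _ in range(k):
        base = objective(selected)
        best_topic, best_gain = None, 0.0
        for topic in sorted(PARENTS):
            if topic in selected:
                continue
            gain = objective(selected + [topic]) - base
            if gain > best_gain:
                best_topic, best_gain = topic, gain
        if best_topic is None:
            break  # no remaining topic improves the objective
        selected.append(best_topic)
    return selected


def group_documents(selected):
    """Assign each document to the selected topic it matches best."""
    groups = {topic: [] for topic in selected}
    for doc in sorted(DOC_TAGS):
        best = max(selected, key=lambda t: similarity(doc, t))
        if similarity(doc, best) > 0:
            groups[best].append(doc)
    return groups


if __name__ == "__main__":
    target = greedy_topic_subset(k=2)
    print(target)
    print(group_documents(target))
```

With this toy data and k=2, the greedy step selects "firewall" and then "physical", and the documents tagged "firewall" form one group while those tagged "plc" and "physical" form the other. Greedy selection is used here because it is the usual way to maximize a monotone submodular function under a size budget; the patent's own optimization procedure may differ.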

Description

Technical field

[0001] The invention relates to the technical field of document disambiguation, in particular to a method and device for automatic disambiguation of multiple documents on industrial security topics based on a multi-submodular optimization method.

Background technique

[0002] In recent years, machine learning has developed rapidly and has been applied in many fields. Machine learning applications often involve grouping experimental objects. In natural language processing in particular, many applications need to classify multiple input documents. Disambiguation technology can replace traditional manual methods to classify documents efficiently and accurately, and therefore plays an important role in advancing machine learning.

[0003] At present, the existing disambiguation technology is to combine the topic structure diagram of a directed acyclic graph (DAG) and design t...


Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F16/35, G06F40/30
CPC: G06F16/35, G06F40/30
Inventor: 李博, 陈汉腾, 冯岩, 符式定, 李建欣
Owner: BEIHANG UNIV