
Structured model training method, text structuring method and related devices

A text structuring technology, applied in the field of data processing, that addresses problems such as the labor cost of structuring text by hand

Active Publication Date: 2019-04-05
BEIJING HEXIANG WISDOM TECH CO LTD

AI Technical Summary

Problems solved by technology

[0003] In the traditional approach, users must manually represent such professional text information in a structured form according to their own understanding, which consumes considerable labor.



Examples


Embodiment 1

[0120] With reference to Figure 1, the text structuring method provided in this embodiment is described in detail below. The method has two main parts: the first trains the structured model, and the second produces a structured representation of the text.

[0121] First, train the structured model;

[0122] The structured model includes an entity extraction model for extracting entities and a relation extraction model for extracting the relationships between those entities. The training method includes the following steps:

[0123] Step 101: Obtain a labeled first corpus set, the first corpus set being obtained by performing entity corpus labeling on each text in the first text set according to a first preset rule.

[0124] The first text set includes, but is not limited to, technical documents, patents, and academic papers. In the embodiments of the present application, the first text set is described using patents as an example. F...
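Step 101 can be made concrete with a small labeled sample. The sketch below is illustrative only: the source does not disclose the "first preset rule", so the BIO tagging scheme, the entity types COMPONENT and DATA, and the helper name spans_to_bio are all assumptions made here for illustration, not the patent's labeling rule.

```python
from typing import List, Tuple

# A span is (start_token, end_token_exclusive, entity_type).
# The entity types used below are hypothetical; the source does not list any.
Span = Tuple[int, int, str]


def spans_to_bio(tokens: List[str], spans: List[Span]) -> List[str]:
    """Convert annotated entity spans into per-token BIO tags.

    Assumes BIO tagging as a stand-in for the undisclosed "first preset rule".
    """
    tags = ["O"] * len(tokens)
    for start, end, entity_type in spans:
        tags[start] = f"B-{entity_type}"      # first token of the entity
        for i in range(start + 1, end):
            tags[i] = f"I-{entity_type}"      # remaining tokens of the entity
    return tags


# One labeled sentence of a hypothetical first corpus set.
tokens = ["The", "server", "receives", "the", "target", "text"]
spans = [(1, 2, "COMPONENT"), (4, 6, "DATA")]
labeled_sample = list(zip(tokens, spans_to_bio(tokens, spans)))
```

Each (token, tag) pair in labeled_sample is the kind of record an entity extraction model could be trained on; a real corpus would hold many such sentences per patent.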

Embodiment 2

[0220] As shown in Figure 5, the embodiment of the present application also provides a method for determining text similarity. The method in this example is applied to an electronic device, which may be a server or a terminal. The method may include the following steps:

[0221] Step 301: Obtain a target text and a candidate data set. The candidate data set includes a plurality of arrays, each of which represents the semantic vector of an entity contained in a candidate text.

[0222] The server may receive the target text sent by the terminal, for example, the target text may be a patent.
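The candidate data set of Step 301 can be pictured as a list of numeric arrays, one per entity. The sketch below shows that data shape only. The vector values are made up, and comparing vectors by cosine similarity is an assumption introduced here; the visible text does not state which similarity measure the method uses.

```python
import math
from typing import List

# Each inner list is the semantic vector of one entity from a candidate text.
# The three-dimensional values are invented purely for illustration.
CandidateDataSet = List[List[float]]

candidate_data_set: CandidateDataSet = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.6, 0.8, 0.0],
]


def cosine_similarity(u: List[float], v: List[float]) -> float:
    """Cosine of the angle between two vectors (assumed measure, not from the source)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # treat a zero vector as having no similarity to anything
    return dot / (norm_u * norm_v)


# Hypothetical semantic vector of one entity from the target text.
target_entity_vector = [1.0, 0.0, 0.0]
scores = [cosine_similarity(target_entity_vector, v) for v in candidate_data_set]
```

Here scores holds one value per candidate entity, so a later step could rank or threshold candidates against the target entity.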

[0223] The server can obtain the candidate data set in at least the following two ways:

[0224] In the first possible implementation:

[0225] First, obtain a text collection. The text collection includes n candidate texts, where n is an integer greater than or equal to 2. It is understandable that the text collection...

Embodiment 3

[0327] As shown in Figure 7, the embodiment of the present application also provides a method for determining the novelty of a text. The method is applied to an electronic device, which may be a server or a terminal. In this embodiment, taking a terminal as the electronic device, the method specifically includes the following steps:

[0328] Step 401: Determine the target text.

[0329] For example, the target text may be a patent or a paper. In this embodiment, the target text is explained by taking a patent as an example.

[0330] Step 402: Extract multiple target entities in the target text to obtain a set of target entities.

[0331] In this example, multiple target entities in the target text are extracted through the entity extraction model of Embodiment 1. Specifically, the target text is input to the entity extraction model, which identifies the multiple target entities in the target text; the multiple target entit...


Abstract

The embodiment of the invention provides a text structuring method. The method comprises: acquiring a target text to be structured; inputting the target text into an entity extraction model, and identifying a target entity set in the target text through the entity extraction model; inputting the target text, with the target entity set identified, into a relation extraction model, and extracting the relationships between the target entities through the relation extraction model; and performing a structured representation of the target text according to the target entities in the target entity set and the relationships between them, to generate a target structure. In the embodiment of the invention, the conversion speed is high and labor cost is saved.
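The flow in the abstract (entity extraction, then relation extraction, then structured output) can be sketched as a two-stage pipeline. This is a minimal illustration under stated assumptions, not the patent's implementation: the function signatures, the Relation type, the dictionary output format, and both toy extractors are hypothetical stand-ins for the trained models.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Relation:
    head: str
    label: str
    tail: str


# Hypothetical model interfaces; the source does not define them.
EntityExtractor = Callable[[str], List[str]]
RelationExtractor = Callable[[str, List[str]], List[Relation]]


def structure_text(text: str,
                   extract_entities: EntityExtractor,
                   extract_relations: RelationExtractor) -> Dict[str, list]:
    """Run entity extraction, then relation extraction, and assemble a structure."""
    entities = extract_entities(text)
    relations = extract_relations(text, entities)
    return {
        "entities": entities,
        "relations": [(r.head, r.label, r.tail) for r in relations],
    }


def toy_entity_extractor(text: str) -> List[str]:
    """Toy stand-in for a trained model: matches a fixed term list."""
    vocabulary = ["entity extraction model", "relation extraction model", "target text"]
    lowered = text.lower()
    return [term for term in vocabulary if term in lowered]


def toy_relation_extractor(text: str, entities: List[str]) -> List[Relation]:
    """Toy stand-in: links consecutive entities with a placeholder label."""
    return [Relation(a, "related_to", b) for a, b in zip(entities, entities[1:])]


target_structure = structure_text(
    "The target text is input into the entity extraction model.",
    toy_entity_extractor,
    toy_relation_extractor,
)
```

Passing the two models in as callables keeps the pipeline independent of how each model is trained, which mirrors the separation between the training part and the structuring part in the description.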

Description

Technical field

[0001] The invention relates to the field of data processing, and in particular to a structured model training method, a text structuring method, and related devices.

Background technique

[0002] In today's information age, retrieving the text information they need has become a routine part of users' daily work and study. Currently, the text information that users can retrieve is unstructured, such as academic documents or patents. Texts of this kind are specialized, logically dense, and heavy with technical terms, which makes them difficult for users to understand. Converting such texts into structured representations (such as tables, structure diagrams, and flowcharts) can help reveal the structure hidden in the text and express its information more clearly.

[0003] In the traditional way, users need to manually structure such profession...


Application Information

IPC(8): G06F16/36, G06F17/27
CPC: G06F40/295
Inventor: 姜庭欣王志强王希桢李静毅刘乾楠郭永红何佳陈伟然杨冠梅段博超
Owner: BEIJING HEXIANG WISDOM TECH CO LTD