Institution named entity normalization method and system based on LEAM model

A named entity normalization technology, applied in unstructured text data retrieval, text database querying, etc., that addresses the narrow applicability and low accuracy of existing approaches.

Active Publication Date: 2021-01-12
SHANGHAI JIAO TONG UNIV

AI Technical Summary

Problems solved by technology

[0005] Although rule-based normalization performs well on some examples, it depends on authors following particular naming conventions, so it cannot be applied broadly and its accuracy is limited. Most normalization algorithms therefore use a knowledge-based approach.

Examples

Embodiment

[0084] The present invention designs and implements a system for normalizing institution named entities. It covers collecting and sorting institution named entity data, screening and denoising that data, constructing a classification model based on LEAM, and using the data to train and tune the model. Specifically, as shown in Figure 1, the method of the present invention includes the following steps:

[0085] Step S1: Use statistical rules to screen all academic institution information data and remove obviously wrong data.

[0086] Step S2: In the screened data, use regular expressions or other rules to remove noise.

[0087] Step S3: Divide the processed data into a training set, a validation set and a test set by category and according to a corresponding proportion.

[0088] Step S4: Input the training set and the validation set into the LEAM model, and train a model for the normalization of the organization's named en...
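Steps S1 to S3 above can be sketched roughly as follows. This is a minimal illustration, not the patent's implementation: the record shape (`(alias, label)` pairs), the frequency threshold `min_count`, the specific noise patterns in `_NOISE`, and the 80/10/10 split ratios are all assumptions chosen for the example, since the source does not state them.

```python
import random
import re
from collections import Counter, defaultdict


def screen(records, min_count=2):
    """S1 (assumed rule): drop empty aliases and labels seen fewer than min_count times."""
    counts = Counter(label for _, label in records)
    return [(a, l) for a, l in records if a.strip() and counts[l] >= min_count]


# Assumed noise patterns: e-mail addresses, long digit runs (e.g. postcodes),
# and any character that is not a word character, whitespace, &, ' or -.
_NOISE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+|\d{5,}|[^\w\s&'-]")


def denoise(alias):
    """S2: remove noise with a regular expression, then collapse whitespace."""
    cleaned = _NOISE.sub(" ", alias)
    return re.sub(r"\s+", " ", cleaned).strip().lower()


def stratified_split(records, ratios=(0.8, 0.1, 0.1), seed=0):
    """S3: split per category so every label keeps roughly the same proportions."""
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for alias, label in records:
        by_label[label].append((alias, label))
    train, val, test = [], [], []
    for items in by_label.values():
        rng.shuffle(items)
        n_train = int(len(items) * ratios[0])
        n_val = int(len(items) * ratios[1])
        train += items[:n_train]
        val += items[n_train:n_train + n_val]
        test += items[n_train + n_val:]  # remainder goes to the test set
    return train, val, test
```

Splitting inside each category, rather than over the whole data set, keeps rare institutions represented in all three sets, which is one plausible reading of "by category" in step S3.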

Abstract

The invention provides an institution named entity normalization method and system based on a LEAM model. The method comprises the following steps: S1, screening all academic institution information data through a preset statistical rule, and removing data that does not meet a preset condition; S2, in the screened data, removing noise according to a regular expression; S3, dividing the denoised data into a training set, a validation set and a test set according to categories and a preset proportion; S4, inputting the training set and the validation set into a LEAM model, and training a model for institution named entity normalization; and S5, inputting the test set into the trained model, testing the effect of the model, and fine-tuning it. With the method, the number of papers published by each academic institution can be counted, so that the academic strength of an institution can be judged more scientifically and intuitively.
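To give a feel for the classification model in step S4, the sketch below shows a simplified forward pass in the style of a label-embedding attentive model, where institution labels and alias tokens share one embedding space and label-to-word similarity drives the attention. It is an illustration under stated assumptions, not the patent's model: the learned window weights are replaced by a uniform average over the window, the final classifier is replaced by a dot product with the label embeddings, and the names `leam_attention`, `encode` and `predict` are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def leam_attention(word_vecs, label_vecs, radius=1):
    """Attention weights over tokens, derived from label-word compatibility."""
    n = len(word_vecs)
    # G[k][l]: cosine similarity between label k and token l.
    G = [[cosine(c, v) for v in word_vecs] for c in label_vecs]
    scores = []
    for l in range(n):
        lo, hi = max(0, l - radius), min(n, l + radius + 1)
        # Assumption: a uniform window average stands in for learned weights; then ReLU.
        u = [max(0.0, sum(row[lo:hi]) / (hi - lo)) for row in G]
        scores.append(max(u))  # max-pool over labels
    # Numerically stable softmax over token positions.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def encode(word_vecs, beta):
    """Attention-weighted sum of token embeddings."""
    dim = len(word_vecs[0])
    return [sum(b * v[d] for b, v in zip(beta, word_vecs)) for d in range(dim)]


def predict(tokens, word_emb, label_emb, radius=1):
    """Return (best label, attention weights) for a tokenized alias."""
    word_vecs = [word_emb[t] for t in tokens if t in word_emb]
    if not word_vecs:
        raise ValueError("no known tokens in alias")
    labels = list(label_emb)
    beta = leam_attention(word_vecs, [label_emb[k] for k in labels], radius)
    z = encode(word_vecs, beta)
    # Assumption: score each label by its dot product with z instead of a trained classifier.
    scores = {k: sum(a * b for a, b in zip(z, label_emb[k])) for k in labels}
    return max(scores, key=scores.get), beta
```

In a trained model the word and label embeddings would be learned from the training set; here they would have to be supplied by the caller, for example as small hand-written dictionaries for experimentation.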

Description

Technical field

[0001] The present invention relates to the technical field of institution named entity normalization, and in particular to a method and system for institution named entity normalization based on a LEAM model.

Background technique

[0002] The main purpose of institution named entity normalization in academic big data is to identify the various aliases of an institution and map them to the real institutional entity. Institution named entity normalization is essential for evaluating the capability of academic institutions, building institution cooperation networks, disambiguating scholar names, tracking scholar trajectories, analyzing talent flow, managing academic papers, and producing academic rankings. As the number of academic papers keeps growing, normalizing institution named entities is also an indispensable step in constructing an academic network knowledge graph.

[0003] With the advancement of modern science and technology, the number of scientific research papers has increased sharply. In recent ye...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F40/295, G06F16/33, G06F16/35
CPC: G06F40/295, G06F16/3344, G06F16/35
Inventors: 亓杰星, 彭金波, 傅洛伊, 王新兵, 陈贵海
Owner: SHANGHAI JIAO TONG UNIV