Method and device for author name disambiguation, and electronic equipment
An author name disambiguation technology, applied in metadata text retrieval, special data processing applications, unstructured text data retrieval, etc., that can solve problems of existing name disambiguation such as slow speed, insufficient precision, and complexity.
Examples
Embodiment 1
[0040] As shown in Figure 1, an embodiment of the present invention provides a method for author name disambiguation, including:
[0041] S101: according to the relevant information of a paper, use a pre-trained classification model to determine the unique author of the paper from the academic data set;
[0042] S102: for a paper whose unique author cannot be determined, search the academic data set using the relevant information of the paper to obtain a candidate paper set;
[0043] S103: cluster the papers in the candidate paper set to obtain multiple categories, perform reverse classification on the papers in the candidate paper set to determine their categories, and create a unique author for the papers according to the categories.
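The flow of steps S101 to S103 can be illustrated with a minimal Python sketch. It is not the patented implementation: the `Paper` record, the co-author Jaccard score standing in for the pre-trained classification model, the name-string search, the greedy clustering, and the 0.5 threshold are all assumptions made for illustration. The reverse classification step is not detailed in this excerpt, so the sketch omits it and simply labels each paper with the cluster it falls into.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Paper:
    # Hypothetical record; the patent only speaks of "relevant information".
    paper_id: str
    author_name: str
    coauthors: frozenset
    author_id: Optional[str] = None  # None until the author is determined


def jaccard(a: frozenset, b: frozenset) -> float:
    """Overlap of two co-author sets, in [0, 1]."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def classify(paper: Paper, dataset: List[Paper], threshold: float = 0.5) -> Optional[str]:
    """S101 stand-in: best-matching known author id, or None if below threshold.

    A co-author overlap score plays the role of the pre-trained classifier.
    """
    best_id, best_score = None, 0.0
    for known in dataset:
        if known.author_id is None or known.author_name != paper.author_name:
            continue
        score = jaccard(paper.coauthors, known.coauthors)
        if score > best_score:
            best_id, best_score = known.author_id, score
    return best_id if best_score >= threshold else None


def search_candidates(paper: Paper, dataset: List[Paper]) -> List[Paper]:
    """S102 stand-in: unassigned papers that share the author name string."""
    return [p for p in dataset
            if p.author_name == paper.author_name and p.author_id is None]


def cluster(papers: List[Paper], threshold: float = 0.5) -> List[List[Paper]]:
    """S103 stand-in: greedy grouping by co-author overlap."""
    clusters: List[List[Paper]] = []
    for p in papers:
        for group in clusters:
            if any(jaccard(p.coauthors, q.coauthors) >= threshold for q in group):
                group.append(p)
                break
        else:
            clusters.append([p])
    return clusters


def disambiguate(paper: Paper, dataset: List[Paper]) -> str:
    """Classify first; fall back to candidate search and clustering."""
    author_id = classify(paper, dataset)
    if author_id is not None:
        paper.author_id = author_id
        return author_id
    pool = search_candidates(paper, dataset)
    if paper not in pool:
        pool.append(paper)
    for group in cluster(pool):
        # Create one new author per cluster, named after its first paper.
        new_id = f"{paper.author_name}:{group[0].paper_id}"
        for p in group:
            p.author_id = new_id
    return paper.author_id
```

In this sketch the classifier is tried first, and the more expensive search and clustering path runs only for papers it cannot place, which mirrors the ordering of S101 to S103.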
[0044] The method provided in this embodiment integrates classification and clustering processing for disambiguation. Classification processing is used as the threshold of clustering processing, which effectively solves...
Embodiment 2
[0075] As shown in Figure 2, another aspect of the present invention provides a functional module architecture corresponding fully to the aforementioned method flow; that is, an embodiment of the present invention also provides a device for author name disambiguation, including:
[0076] a unique author determination module 201, used to determine the unique author of a paper from the academic data set by using the pre-trained classification model according to the relevant information of the paper;
[0077] a candidate paper set acquisition module 202, used, for a paper whose unique author cannot be determined, to search the academic data set using the relevant information of the paper to obtain a candidate paper set;
[0078] a unique author creation module 203, used to cluster the papers in the candidate paper set to obtain multiple categories, and perform reverse classification on the papers in the candidate paper set to deter...
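The way modules 201 to 203 fit together can be sketched as a small Python class that wires three interchangeable callables. The class name, the callable signatures, and the dictionary-based paper records are hypothetical; the excerpt does not specify any interfaces, and the description of module 203 is truncated, so its internals are left to the caller.

```python
from typing import Any, Callable, List, Optional

PaperInfo = dict  # hypothetical stand-in for "relevant information of the paper"


class AuthorDisambiguationDevice:
    """Wires together stand-ins for modules 201, 202 and 203."""

    def __init__(
        self,
        determine_unique_author: Callable[[PaperInfo], Optional[str]],   # module 201
        acquire_candidates: Callable[[PaperInfo], List[PaperInfo]],      # module 202
        create_unique_author: Callable[[PaperInfo, List[PaperInfo]], str],  # module 203
    ) -> None:
        self.determine_unique_author = determine_unique_author
        self.acquire_candidates = acquire_candidates
        self.create_unique_author = create_unique_author

    def process(self, paper: PaperInfo) -> str:
        # Module 201: try to resolve the author directly.
        author = self.determine_unique_author(paper)
        if author is not None:
            return author
        # Module 202: gather candidate papers only when 201 fails.
        candidates = self.acquire_candidates(paper)
        # Module 203: create a new author from the candidate set.
        return self.create_unique_author(paper, candidates)
```

Passing the modules in as callables keeps each one replaceable, for example swapping in a different classifier for module 201 without touching the control flow in `process`.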