Data searching and capturing method of information of a plurality of high-end talents
A technique for capturing talent information data, applied in the fields of electrical digital data processing, special-purpose data processing applications, and instruments, that addresses the problem of low accuracy in search results.
Examples
Embodiment 1
[0017] Step 1: Prepare real resume pages.
[0018] Provide 5000 resume pages, divided into ten groups of 500 resume pages each. All of these resumes are in English. They can be crawled from the Internet by computer using existing web crawler technology, or retrieved and screened manually from the Internet.
[0019] A pre-prepared list of resume URLs is shown in Figure 1.
[0020] Step 2: Obtain the body text of each resume in the first group of resumes.
[0021] Manually extract the body text of each resume web page, that is, remove advertisements, page headers, page footers, and other non-body content from each page; then remove the markup tags programmatically.
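The programmatic part of Step 2 (removing the markup tags once the non-body regions have been cut away by hand) could be sketched as follows. This is a minimal illustration using Python's standard `html.parser` module, not the program used in the patent; the names `TagStripper` and `strip_tags` are hypothetical, and skipping `script` and `style` bodies is an added assumption.

```python
from html.parser import HTMLParser


class TagStripper(HTMLParser):
    """Collects text nodes and drops markup (hypothetical helper)."""

    # Assumption: script and style bodies are never part of the resume text.
    SKIPPED = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep text only when we are not inside a skipped element.
        if self._skip_depth == 0:
            self._chunks.append(data)

    def text(self):
        # Join the text nodes and collapse runs of whitespace.
        return " ".join(" ".join(self._chunks).split())


def strip_tags(html):
    """Return the plain text of an HTML fragment with all tags removed."""
    parser = TagStripper()
    parser.feed(html)
    parser.close()
    return parser.text()
```

For example, `strip_tags("<p>Senior <b>Engineer</b></p>")` yields the plain string `Senior Engineer`. Text nodes are joined with a space, so a word split across inline tags would come out as two tokens.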
[0022] Step 3: Count the total number of words, T, in each resume.
[0023] Use word segmentation technology (or manual processing) to further process the body text obtained in Step 2, that is, remove function words and retain content words. Save all the words...
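As a rough sketch of Step 3 for English text, the count T could be computed by tokenizing, dropping function words, and counting what remains. The function-word list below is a deliberately tiny, hypothetical stand-in (the source does not specify one), and the regular-expression tokenizer is an assumption in place of the unspecified word segmentation technology.

```python
import re

# Hypothetical, deliberately small function-word list for illustration only;
# a real system would use a much fuller list.
FUNCTION_WORDS = frozenset({
    "a", "an", "the", "and", "or", "of", "in", "on", "at", "to",
    "for", "with", "is", "are", "was", "were",
})


def content_words(text):
    """Lowercase the text, split it into words, and drop function words."""
    tokens = re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())
    return [t for t in tokens if t not in FUNCTION_WORDS]


def total_word_count(text):
    """Return T, the number of content words retained from one resume."""
    return len(content_words(text))
```

With this list, `total_word_count("Managed a team of engineers in the London office")` counts five content words: managed, team, engineers, london, office.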
Embodiment 2
[0066] Steps 1 to 16 of Embodiment 2 are exactly the same as in Embodiment 1. After Step 16, the following steps, Step 17' to Step 21', are also included.
[0067] Step 17': Calculate the final negative evaluation score A of the new web page.
[0068] Following the same principle, take 10 groups of web pages that are not resumes, 500 pages per group, and for each group compute the 100 most frequent words according to Steps 2 to 5. The scores of the 100 most frequent words in each of the 10 groups of non-resume pages are defined as negative scores. Using the 100 words obtained in Step 5 from the first group of non-resume pages, score the new web page captured in Step 16 according to the method in Step 6 to obtain the first negative evaluation score A1 of the new web page, and so on, using the first 100 ...
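The per-group part of this step might be sketched as below: find the most frequent words in each group of non-resume pages, then score the new page once against each group. Step 6 is not reproduced in this excerpt, so the scoring rule is passed in as a parameter; `overlap_count` is only a placeholder assumption, not the patent's method, and all function names here are hypothetical. The sketch stops at the list of per-group scores and does not attempt to combine them into the final score A.

```python
from collections import Counter


def top_words(group_texts, n=100):
    """Return the n most frequent words across all pages in one group."""
    counts = Counter()
    for text in group_texts:
        counts.update(text.lower().split())
    return [word for word, _ in counts.most_common(n)]


def overlap_count(page_text, words):
    """Placeholder for Step 6 (assumption): how many of the listed words
    appear at least once on the page."""
    page_tokens = set(page_text.lower().split())
    return sum(1 for w in words if w in page_tokens)


def negative_scores(page_text, non_resume_groups, score_fn, n=100):
    """Score the page against each group's top-n words, one score per group."""
    return [score_fn(page_text, top_words(group, n)) for group in non_resume_groups]
```

Passing the scoring rule as `score_fn` keeps the sketch independent of how Step 6 actually weights each word, so the placeholder can be swapped for the real rule without changing the rest.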
Embodiment 3
[0078] Step 1: Prepare real resumes.
[0079] Provide 5000 resumes, divided into ten groups of 500 resumes each. These resumes are all Chinese resumes, or Japanese resumes, or Korean resumes, or resumes in any other language. They can be crawled by computer using existing web crawler technology, or manually retrieved and screened from the Internet. The remaining steps are the same as Steps 2 to 19 of Embodiment 1.