Method and device for recommending series documents
A series-document recommendation technology, applied in the field of network communication, that addresses problems such as user inconvenience and a degraded reading experience, so as to meet users' reading needs and improve the reading experience.
Embodiment 1
[0064] In step 101 above, the document titles of uploaded documents may be obtained by fetching more than one document title from the document metadata (Meta) database that stores the uploaded documents.
[0065] When crawling document titles from the document metadata database, crawling strategies including, but not limited to, the following may be adopted to increase the probability that the crawled titles belong to a series of documents:
[0066] 1) Fetch the document titles of documents uploaded by the same user.
[0067] This may further specifically include: fetching the document titles of documents uploaded by the same user within a time interval; or fetching documents uploaded by the same user within two or more regularly spaced time intervals.
[0068] Users usually upload the documents of one series within a single time interval. Therefore, fetching documents uploaded by the same user within a time interval has a high probability of yielding a document series. In addition, for serialized documents...
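The same-user, same-time-interval strategy of paragraphs [0066] to [0068] can be illustrated with a minimal Python sketch. The record layout `(user_id, title, uploaded_at)`, the function name `titles_by_user_window`, and the seven-day default window are assumptions made for illustration only; the source does not specify the schema of the metadata database or the length of the interval.

```python
from collections import defaultdict
from datetime import datetime, timedelta

def titles_by_user_window(records, window=timedelta(days=7)):
    """Group titles by uploader, splitting each user's uploads into batches.

    records: iterable of (user_id, title, uploaded_at) tuples (assumed layout).
    A batch holds uploads made within `window` of the first upload in it.
    Returns a list of (user_id, [titles]) candidate batches.
    """
    by_user = defaultdict(list)
    for user_id, title, uploaded_at in records:
        by_user[user_id].append((uploaded_at, title))

    batches = []
    for user_id, items in by_user.items():
        items.sort()  # chronological order
        current = [items[0]]
        for item in items[1:]:
            if item[0] - current[0][0] <= window:
                current.append(item)
            else:
                batches.append((user_id, [t for _, t in current]))
                current = [item]
        batches.append((user_id, [t for _, t in current]))
    return batches

# Hypothetical sample data.
sample = [
    ("u1", "Course notes 1", datetime(2024, 1, 1)),
    ("u1", "Course notes 2", datetime(2024, 1, 2)),
    ("u1", "Unrelated report", datetime(2024, 1, 20)),
    ("u2", "Manual", datetime(2024, 1, 1)),
]
result = titles_by_user_window(sample)
```

Each returned batch is only a candidate set of titles; whether the titles actually form a series is decided by the later normalization and pattern matching steps.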
Embodiment 2
[0076] The process of character normalization of the document title is shown in Figure 2 and specifically includes the following steps:
[0077] Step 201: Remove from the document title the characters that are irrelevant to pattern matching.
[0078] The characters irrelevant to pattern matching can be set in advance. For example, all symbols other than text symbols (such as Chinese characters, English letters, and digits) and region-marking symbols (such as book title marks and brackets) may be set as symbols irrelevant to pattern matching.
[0079] In this way, symbols in the document title that may interfere with pattern matching, such as redundant spaces, dots, and meaningless symbols, can be removed. Symbols that are meaningful to the content of the document title can be retained. For example, the form "3-4" may be used to represent a serial number, where the dash is meaningful to the serial number, here you can be res...
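Step 201 can be sketched in Python as follows. This is an illustration under stated assumptions, not the implementation of the invention: the retained character set `KEEP`, the rule that a dash is kept only between two digits, and the function name `strip_irrelevant` are all hypothetical choices based on the examples in paragraphs [0078] and [0079].

```python
import re

# Assumed set of retained characters: digits, English letters, common
# Chinese characters, book title marks, and several kinds of brackets.
KEEP = r"0-9A-Za-z\u4e00-\u9fff《》()（）\[\]【】"

# A hyphen, en dash, or em dash between two digits, with optional spaces.
_DASH_BETWEEN_DIGITS = re.compile(r"(?<=\d)\s*[-\u2013\u2014]\s*(?=\d)")
# Anything that is neither a retained character nor a plain hyphen.
_IRRELEVANT = re.compile(rf"[^{KEEP}\-]")
# A hyphen that is not flanked by digits on both sides.
_STRAY_HYPHEN = re.compile(r"(?<!\d)-|-(?!\d)")

def strip_irrelevant(title: str) -> str:
    """Remove characters assumed to be irrelevant to pattern matching."""
    # Normalize "3 - 4" style ranges to "3-4" so the dash survives.
    title = _DASH_BETWEEN_DIGITS.sub("-", title)
    # Drop spaces, dots, and other symbols outside the retained set.
    title = _IRRELEVANT.sub("", title)
    # Drop hyphens that do not join two digits.
    return _STRAY_HYPHEN.sub("", title)

cleaned = strip_irrelevant("  Java Tutorial ... 3 - 4 !!")
```

With these assumed rules, the dash in a serial number such as "3-4" is preserved while spaces, dots, and exclamation marks are removed. In practice the retained set would be configured in advance, as paragraph [0078] describes.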
Embodiment 3
[0088] Figure 3 is a flowchart of the pattern matching process provided by the present invention. In the present invention, pattern matching may be carried out by means of regular expression matching. As shown in Figure 3, the process mainly includes the following steps:
[0089] Step 301: Determine the pattern of the serial number identification of each document title after character normalization.
[0090] Various patterns of document titles may be set in advance. The document titles after character normalization are then matched against the preset patterns to determine the matching pattern, and the ID of the determined pattern is recorded.
[0091] For example, various patterns of document titles may be pre-configured, and these patterns are set according to the serial number identification after normalization, as shown in Table 1. It should be noted that Table 1 is only an example, and the present invention does not ...