Method, device and apparatus for checking duplication of text
A text duplicate-checking method and device technology, applied in the field of computer-readable storage media, that addresses the low efficiency and heavy computation of duplicate checking, reducing the amount of calculation and improving the efficiency of text duplicate checking.
Examples
Embodiment 1
[0081] The following introduces Embodiment 1 of a text duplicate-checking method provided by the present application. Referring to Figure 1, Embodiment 1 includes:
[0082] Step S101: Obtain the target text.
[0083] The target text may specifically be text input by a user. The main purpose of this embodiment is to find, among multiple texts to be checked for duplicates, the text similar to the target text.
[0084] Step S102: Segment the target text to obtain a target text sequence including multiple words.
[0085] Text segmentation refers to the process of automatically identifying the boundaries between fragments with independent meaning in a text. As an optional implementation, in this embodiment the target text may be cut in an interleaved manner according to a preset text interval, and the size of the preset text interval may be determined according to actual requirements.
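As an illustration of interleaved cutting, the following minimal Python sketch cuts a text into fixed-length fragments whose start positions are a fixed distance apart, so that adjacent fragments overlap. The function name `interleaved_segments` and the parameters `window` and `step` are hypothetical; the source only states that the cut follows a preset text interval, so treating that interval as a window length plus a step size is an assumption.

```python
def interleaved_segments(text: str, window: int = 4, step: int = 2) -> list[str]:
    """Cut `text` into overlapping fragments of length `window`.

    `window` and `step` are illustrative parameters (assumptions), standing in
    for the "preset text interval" mentioned in the description.
    """
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    if not text:
        return []
    if len(text) <= window:
        # Text shorter than one window: keep it as a single fragment.
        return [text]

    # Start positions of each fragment, `step` characters apart.
    starts = list(range(0, len(text) - window + 1, step))
    # If the last fragment does not reach the end of the text,
    # add one final fragment so that no trailing characters are lost.
    if starts[-1] + window < len(text):
        starts.append(len(text) - window)
    return [text[i:i + window] for i in starts]


if __name__ == "__main__":
    print(interleaved_segments("abcdefgh", window=4, step=2))
```

With `step` smaller than `window`, neighboring fragments share characters, so a small insertion or deletion in the text changes only the fragments around it.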
[0086] Step S103: Calculate ...
Embodiment 2
[0099] Embodiment 2 is mainly used to find text similar to the target text among a large number of texts to be checked for duplicates. As shown in Figure 2, Embodiment 2 specifically includes the following steps:
[0100] Step S201: Create a text fingerprint database in advance.
[0101] Specifically, a plurality of texts to be checked for duplicates are determined in advance, the fingerprint sequence of each text to be checked is obtained by calculation, and the fingerprint sequence is stored in the text fingerprint database. As shown in Figure 3, the fingerprint sequences of text to be checked 1 through text to be checked M are calculated respectively, and the calculated fingerprint sequences are stored in the text fingerprint database, where M is the number of texts to be checked. This facilitates the subsequent retrieval tasks.
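The precomputation in step S201 can be sketched as follows. This is only an illustrative sketch: the fingerprint calculation itself is not shown in this excerpt, so the hypothetical `fingerprint` function below substitutes a truncated SHA-1 hash of each fragment, and the hypothetical `segment` helper assumes overlapping fixed-length fragments. The in-memory dictionary stands in for the text fingerprint database.

```python
import hashlib


def segment(text: str, window: int = 4, step: int = 2) -> list[str]:
    """Hypothetical helper: overlapping fixed-length fragments (an assumption)."""
    if not text:
        return []
    if len(text) <= window:
        return [text]
    return [text[i:i + window] for i in range(0, len(text) - window + 1, step)]


def fingerprint(fragment: str) -> int:
    """Stand-in fingerprint: first 8 bytes of SHA-1 as an integer.

    The source's actual fingerprint calculation is not shown; this hash is
    swapped in purely for illustration.
    """
    digest = hashlib.sha1(fragment.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def build_fingerprint_db(texts: dict[str, str]) -> dict[str, list[int]]:
    """Compute and store the fingerprint sequence of each text to be checked."""
    return {
        text_id: [fingerprint(frag) for frag in segment(body)]
        for text_id, body in texts.items()
    }


if __name__ == "__main__":
    # M = 3 texts to be checked, keyed by hypothetical identifiers.
    db = build_fingerprint_db({
        "text_1": "abcdefgh",
        "text_2": "abcdefgh",
        "text_3": "xyz",
    })
    for text_id, sequence in db.items():
        print(text_id, len(sequence))
```

Because the database is built once in advance, a later query only needs to compute the fingerprint sequence of the target text and compare it against the stored sequences.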
[0102] Step S202: Obtain the target text A input by the user.
[0103] Step S203: Interleave and c...