Method for discovering sensitive data in text big data
The invention relates to sensitive data and big data technology, applied in electrical digital data processing, special data processing applications, instruments, etc. It addresses the difficulty of discovering sensitive information in large volumes of text, and achieves comprehensive and accurate analysis, easy implementation, and improved efficiency.
Embodiment Construction
[0029] As shown in Figure 1, the basic idea of the method proposed by the present invention for discovering potentially sensitive data in text big data is as follows:
[0030] (1) First, establish a sensitive word information database. It contains the standardized description of each predefined sensitive word together with its artificially disguised or deformed variants, including character splitting, internet slang, typographical manipulation, pinyin transliteration, etc. (a standardized description and its corresponding variant descriptions belong to the same sensitive information). At the same time, the weight coefficient of each sensitive word is determined in the lexicon according to the context in which the word appears and its semantics;
[0031] (2) Then, build a sensitive word search tree over all sensitive words in the sensitive word lexicon;
[0032] (3) Preprocess the text, including removing punctuation marks, auxiliary words, stop words, etc.;
...
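Steps (1) to (3) above can be illustrated with a minimal Python sketch. It is not the patented implementation: the lexicon entries, the weights, the stop-word list, and the names `LEXICON`, `build_trie`, `preprocess`, and `lookup` are all hypothetical, and a plain character trie is assumed for the "search tree". The remaining steps of the method are elided in this excerpt, so the sketch only checks single tokens against the tree and does not attempt to show how the text is scanned or scored.

```python
import string

# Hypothetical lexicon: standardized form -> (weight, variant forms).
# Entries and weights are invented for illustration only.
LEXICON = {
    "password": (0.9, ["passwd", "pwd", "mima"]),  # slang and pinyin variants
    "salary": (0.6, ["gongzi"]),
}

# Assumed stop-word list; a real system would use a language-specific one.
STOP_WORDS = {"the", "a", "an", "of", "is", "my"}


class TrieNode:
    """One node of the character trie."""

    __slots__ = ("children", "entry")

    def __init__(self):
        self.children = {}  # character -> TrieNode
        self.entry = None   # (standardized form, weight) at a word end


def build_trie(lexicon):
    """Step (2): insert every standardized word and variant into one trie.

    Each variant's terminal node stores the standardized form, so a variant
    resolves to the same sensitive item as its standardized description.
    """
    root = TrieNode()
    for canonical, (weight, variants) in lexicon.items():
        for form in [canonical, *variants]:
            node = root
            for ch in form:
                node = node.children.setdefault(ch, TrieNode())
            node.entry = (canonical, weight)
    return root


def preprocess(text):
    """Step (3): lowercase, strip ASCII punctuation, drop stop words."""
    cleaned = text.lower().translate(str.maketrans("", "", string.punctuation))
    return [tok for tok in cleaned.split() if tok not in STOP_WORDS]


def lookup(root, token):
    """Return (standardized form, weight) if token is a lexicon word, else None."""
    node = root
    for ch in token:
        node = node.children.get(ch)
        if node is None:
            return None
    return node.entry


trie = build_trie(LEXICON)
tokens = preprocess("My pwd, is the secret!")
hits = [lookup(trie, tok) for tok in tokens]
```

Storing the standardized form at every terminal node is one simple way to make variants and their standardized description count as the same sensitive information. Note that `string.punctuation` covers ASCII punctuation only; Chinese text would need its own punctuation set and a word segmenter.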