Character string processing method and apparatus

A technology concerning character strings and character string lengths, applied in the fields of electrical digital data processing, special data processing applications, and natural language data processing, which addresses the problem that character strings absent from the dictionary cannot appear as mask candidates.

Status: Inactive
Publication Date: 2007-06-27
Owner: IBM CORP
Cites: 1; Cited by: 6

AI Technical Summary

Problems solved by technology

[0006] Conventional techniques have always had the problem that character strings not in the dictionary cannot appear as mask candidates.



Embodiment Construction

[0025] Hereinafter, specific embodiments of the present invention (hereinafter simply referred to as "embodiments") will be described in detail with reference to the accompanying drawings. In the embodiments, each partial character string may be a morpheme, word, clause, sentence, or display letter type; whichever unit is used, the embodiments can be carried out without affecting the essence of the present invention.

[0026] FIG. 1 is a diagram showing the system configuration of the embodiment. Document 110 is a document composed mainly of text. The text contains character strings that should be kept secret, and these strings are ultimately masked according to the invention. The partial character string analysis section 120 analyzes the read text into partial character strings. Known analysis methods split text into morphemes, words, clauses, sentences, or display letter types. Ideally, the text should be parsed int...

Abstract

In order to solve the above problem, disclosed as a first aspect is a method including the steps of analyzing a character string in a document into partial character strings; calculating, with respect to each of the partial character strings, a score incorporating appearance frequency of the partial character string; presenting the partial character strings and the scores to a user; determining which ones of the partial character strings have been selected by the user; storing the selected partial character strings as a safe partial character string list; and replacing, with predetermined replacement character strings, the partial character strings excluding the partial character strings existing in the safe partial character string list.
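The steps listed in the abstract can be sketched roughly as follows. This is only an illustrative sketch, not the patented implementation: it assumes that partial character strings are word-like tokens, that the score is the raw appearance count, and that the replacement string is `****`. The function names (`analyze`, `score_parts`, `present`, `mask_document`) are hypothetical and do not come from the patent.

```python
import re
from collections import Counter

MASK = "****"          # assumed replacement character string
TOKEN = re.compile(r"\w+")  # assumed analysis unit: word-like tokens


def analyze(text):
    """Analyze the document text into partial character strings."""
    return TOKEN.findall(text)


def score_parts(parts):
    """Score each partial string; here the score is simply its frequency."""
    return Counter(parts)


def present(scores):
    """Order (string, score) pairs for display: highest score first."""
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


def mask_document(text, safe_list):
    """Replace every partial string that is NOT in the safe list."""
    return TOKEN.sub(
        lambda m: m.group(0) if m.group(0) in safe_list else MASK, text
    )


document = "Alice paid the bill and Bob paid the tip"
candidates = present(score_parts(analyze(document)))

# Stand-in for the user's selection from the presented list;
# the selected strings are stored as the safe partial character string list.
safe_list = {"paid", "the", "and", "bill", "tip"}

masked = mask_document(document, safe_list)
print(masked)
```

Because everything outside the safe list is replaced, a string that no dictionary contains (here, the names) is masked by default unless the user explicitly marks it as safe.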

Description

Technical field

[0001] The present invention relates to a method, device and program for replacing information that should be kept secret in a document with different information.

Background art

[0002] In recent years, techniques for masking (replacing) character strings in documents have needed to be enhanced from the viewpoint of personal information protection. One known technique meets this need by using a dictionary that stores the strings to be masked, so that those words are not displayed. For example, Patent Document 1 employs the following masking technique. First, based on the dictionary, the parts to be masked are detected from the input document. The detected parts are then presented to the user as a list of mask results so that the user can correct the list, and the contents of the corrected list serve as the parts to be masked.

[0003] With the described method, there is a possibility that there are mask candidates...
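The conventional dictionary-based technique described in [0002] can be illustrated with a short sketch. It is an assumption-laden illustration, not the method of Patent Document 1: the word-level tokenization, the `****` replacement, and the names `detect_with_dictionary` and `apply_mask` are all hypothetical.

```python
import re

TOKEN = re.compile(r"\w+")  # assumed unit of detection: word-like tokens


def detect_with_dictionary(text, dictionary):
    """Return the parts of the text that the dictionary marks for masking."""
    return [w for w in TOKEN.findall(text) if w in dictionary]


def apply_mask(text, to_mask, replacement="****"):
    """Replace only the strings that appear in the mask list."""
    return TOKEN.sub(
        lambda m: replacement if m.group(0) in to_mask else m.group(0), text
    )


dictionary = {"Alice"}               # strings known to require masking
text = "Alice met Zorblat"           # "Zorblat" is not in the dictionary

mask_list = detect_with_dictionary(text, dictionary)  # list shown to the user
result = apply_mask(text, set(mask_list))
print(mask_list, result)
```

In this sketch the out-of-dictionary string never enters the list shown to the user, which is the limitation the invention targets.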

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/22, G06F17/28
CPC: G06F17/276, G06F40/274
Inventor: 伊川洋平, 金山博, 宅间大介
Owner: IBM CORP