Detection of Obscured Copying Using Discovered Translation Files and Other Operation Data

A technology that uses operation data and translation files, applied to detecting obscured copying. It addresses three problems: diff-like programs are limited in detecting illegal copying, small insignificant changes defeat them, and they then give a false indication that no copying occurred. The aim is a program that is easier to maintain and more reliable.

Inactive Publication Date: 2011-12-29
ROMAN KENDYL A +1

AI Technical Summary

Benefits of technology

[0050] Such a program could be used “as is” on many projects without custom programming for each project. It would therefore be much easier to maintain and enhance, would be more reliable, and could be used without internal programming knowledge or effort.

Problems solved by technology

However, all of these diff-like programs are limited in detecting illegal copying because they only report lines that match exactly.
Small insignificant changes can easily be made to each copied line and these diff-like programs will report that no lines are identical, giving a false indication that there is no copying.
These diff-like programs cannot detect such global changes.
Further, the diff program algorithm is limited.
If a block of code is copied but moved out of order, the diff program may fail to detect the identical lines simply because they have been rearranged within the file.
This makes it difficult for conventional comparison programs to detect the copying.
This makes individual line-by-line comparison impossible because the equivalent elements may be split across non-contiguous lines.
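The exact-match limitation described above can be illustrated with a minimal sketch. It is not the patent's program: the function name `exact_matching_lines` and both code snippets are hypothetical, invented only to show how renaming identifiers hides a copy from a comparison that requires identical lines.

```python
def exact_matching_lines(a_lines, b_lines):
    """Return the lines of A that appear verbatim in B (diff-style exact matching)."""
    b_set = set(b_lines)
    return [line for line in a_lines if line in b_set]


# Hypothetical original code.
original = [
    "int jump(int height) {",
    "    int total = height * 2;",
    "    return total;",
    "}",
]

# Hypothetical copy in which every identifier has been renamed.
renamed = [
    "int leap(int elevation) {",
    "    int sum = elevation * 2;",
    "    return sum;",
    "}",
]

matches = exact_matching_lines(original, renamed)
print(matches)
```

In this sketch only the trivial closing brace matches, so a reviewer relying on exact line matches would see no meaningful overlap even though the structure of the two snippets is the same.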
While these situation specific test programs validated this basic approach, and saved a significant amount of time preparing exhibits that could be edited by hand for completeness, it was clear that I had not yet developed a complete solution that would meet the needs of general use over a wide range of situations.
One problem was that the translation rules and terms were built into each custom program.
The required repeated modification of the program resulted in multiple versions and constant changing of the program.
Another problem was that each project required its own custom program so that the program could never be finished.
Another problem was maintaining a growing set of custom programs.
It was difficult to fix software defects or to add general enhancements.
Further, testing with a broader range of test cases revealed that many techniques for hiding illicit copying were still not covered by these simple test programs.
For example, a situation where the illicit copier added carriage returns, words or comments that didn't change the essential function of the code, still defeated my early test programs.
My early test programs could not handle multiple translations of the same words.
Also, the process of finding pairs of files to be compared was still a time-consuming manual process.
However, protective orders may often prevent that person from seeing both sides of the comparison.
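One way to picture the problem of added carriage returns and comments is to compare token streams instead of raw lines. The sketch below is an assumption for illustration, not the method the patent describes: the regular expressions, the `tokens` helper, and the sample strings are all hypothetical, and the comment pattern is deliberately naive (it does not handle comment markers inside string literals).

```python
import re

# Naive C-style comment pattern: /* ... */ blocks and // line comments.
COMMENT_RE = re.compile(r"/\*.*?\*/|//[^\n]*", re.DOTALL)
# Identifiers, integer literals, or any single non-whitespace character.
TOKEN_RE = re.compile(r"[A-Za-z_]\w*|\d+|\S")


def tokens(source):
    """Strip comments, then split into tokens, ignoring all whitespace and line breaks."""
    return TOKEN_RE.findall(COMMENT_RE.sub(" ", source))


# Hypothetical original statement.
a = "int total = height * 2;\n"
# Hypothetical copy padded with a comment, extra spaces, and a carriage return.
b = "int total =   /* padding */\r\n    height * 2; // same thing\n"

same_tokens = tokens(a) == tokens(b)
print(same_tokens)
```

Because the comparison runs on tokens, the statement split across two lines in `b` still lines up with the single line in `a`.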

Examples

Example Files

[0118] FIGS. 2A and 2B show example files. In this example, as shown in FIG. 2A, file A 110 is named jump.c, and as shown in FIG. 2B, file B 120 is named leap.c. Both files are written in the same computer programming language, the C Programming Language, or just C. At first glance, the two files do not appear to be similar, nor does one appear to be a copy of the other. The present invention provides a way to automatically detect the true similarity between these two files and format a report that shows it.

Discovered Translations

[0119] FIG. 2C shows an example of discovered translations list 2300 data. The original words 2300a from file A are shown in the first column. The translation equivalents 2300b found in file B are shown in the second column. Each row of data represents a correlated pair of words, which the user (typically, a computer forensic expert) discovers and confirms have been used to obscure copying. The first line 2310 contains a correlated pair of words...
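A two-column list of this kind can be sketched as a lookup table applied to tokens before comparison. The word pairs below are hypothetical (the actual contents of FIG. 2C are not reproduced here), and `translate_line` is an illustrative helper, not a function from the patent. Keying the table by the file B word lets several file B words map to the same file A word.

```python
import re

# Identifiers, integer literals, or any single non-whitespace character.
TOKEN_RE = re.compile(r"[A-Za-z_]\w*|\d+|\S")

# Hypothetical discovered translations: (original word in file A, equivalent in file B).
DISCOVERED = [("jump", "leap"), ("height", "elevation"), ("total", "sum")]


def translate_line(line, pairs):
    """Tokenize a file-B line and replace each known equivalent with its file-A word."""
    b_to_a = {b_word: a_word for a_word, b_word in pairs}
    return [b_to_a.get(tok, tok) for tok in TOKEN_RE.findall(line)]


# Hypothetical corresponding lines from the two files.
line_a = "int total = jump(height) * 2;"
line_b = "int sum = leap(elevation) * 2;"

same = TOKEN_RE.findall(line_a) == translate_line(line_b, DISCOVERED)
print(same)
```

Once the file B line is rewritten in file A vocabulary, the two token sequences are identical, which is the kind of match an exact line comparison would miss.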

Abstract

Systems and methods that automatically compare sets of files to determine what has been copied even when sophisticated techniques for hiding or obscuring the copying have been employed. The file compare system comprises a file compare program that uses various operational data and user interface options to detect illicit copying, highlight and align matching lines, and produce a formatted report. A discovered translations file is used to match translated tokens. Other operation data files specify rules that the file compare program then uses to improve its results. The generated report contains statistics and full disclosures of the discovered translations used and the other methods used in creating the exhibits. The system includes a bulk compare program that automatically detects likely file pairings and candidates for validation as suspected translations, which can be used on iterative runs. The user is given full control over the final output, and the system automatically reformats the reports and recalculates the statistics for consistent and accurate final presentation.
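The abstract does not say how the bulk compare program scores candidate file pairings. As a rough sketch under that caveat, the example below substitutes a simple Jaccard similarity over the identifiers in each file. The function names, the 0.5 threshold, and the file contents are assumptions made for illustration.

```python
import re
from itertools import product

WORD_RE = re.compile(r"[A-Za-z_]\w*")


def jaccard(a_text, b_text):
    """Share of distinct words common to both texts (0.0 when both are empty)."""
    a, b = set(WORD_RE.findall(a_text)), set(WORD_RE.findall(b_text))
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def likely_pairs(set_a, set_b, threshold=0.5):
    """Score every A/B file combination and keep those at or above the threshold."""
    scored = [
        (jaccard(text_a, text_b), name_a, name_b)
        for (name_a, text_a), (name_b, text_b) in product(set_a.items(), set_b.items())
    ]
    return sorted((s for s in scored if s[0] >= threshold), reverse=True)


# Hypothetical file sets: name -> contents.
set_a = {
    "jump.c": "int jump(int h) { return h * 2; }",
    "io.c": "void print(char *msg) { puts(msg); }",
}
set_b = {
    "leap.c": "int leap(int h) { return h * 2; }",
    "net.c": "int connect_to(char *host) { return 0; }",
}

pairs = likely_pairs(set_a, set_b)
print([(name_a, name_b) for _, name_a, name_b in pairs])
```

Pairs that score above the threshold would then be candidates for the detailed comparison, and words that differ between two otherwise similar files would be candidates for the user to validate as suspected translations.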

Description

RELATED APPLICATIONS

[0001] This application claims priority of U.S. provisional application Ser. No. 60/635,908, filed Dec. 10, 2004, entitled “DETECTION OF OBSCURED COPYING USING KNOWN TRANSLATIONS FILES AND OTHER OPERATIONAL DATA”, which is hereby incorporated by reference, and U.S. provisional application Ser. No. 60/635,562, filed Dec. 13, 2004, entitled “DETECTION OF OBSCURED COPYING USING KNOWN TRANSLATIONS FILES AND OTHER OPERATIONAL DATA”, which is hereby incorporated by reference.

[0002] This application also claims priority of U.S. application Ser. No. 11/299,529, filed Dec. 12, 2005, entitled “DETECTION OF OBSCURED COPYING USING KNOWN TRANSLATION FILES AND OTHER OPERATIONAL DATA,” which is expressly incorporated herein by reference.

BACKGROUND

Field of the Invention

[0003] This invention relates to systems and methods for comparing files to detect the use of copied information, and more particularly to such systems and methods that detect copying where the copying has been obscu...

Application Information

Patent Type & Authority: Applications (United States)
IPC: IPC(8): G06F17/30
CPC: G06F8/70, G06F2221/2101, G06F17/2211, G06F21/10, G06F40/194, G06F21/6218
Inventor: ROMAN, KENDYL A.; RAPOSO, PAUL
Owner: ROMAN KENDYL A