89 results about How to "Improve deduplication efficiency" patented technology

Image replication removing method and image replication removing system

The invention discloses an image replication removing (deduplication) method and system. The method comprises the following steps: S1, converting two images to grayscale and scaling them to a standard size according to their initial width-to-height ratio; S2, calculating a degree of repetition between the two images by means of a local binary (two-value) feature of the images; S3, determining whether the two images are duplicates and, if so, performing step S4; and S4, comparing the quality of the two images and removing the lower-quality image. The method and system combine local and global image information when determining duplicates, so that the different kinds of image information complement each other, which helps ensure high deduplication speed, effectiveness and completeness.
Owner:CTRIP COMP TECH SHANGHAI

Storage apparatus and duplicate data detection method

An optimum chunk cutout method is selected according to the type of content. A storage apparatus is a storage apparatus for storing content in a backup volume in response to a content storage request from a host system connected to the storage apparatus via a network and includes a chunk cutout unit for cutting out the content into one or more chunks and a duplication judgment unit for managing a duplicate state of the chunk or chunks which have been cut out by the chunk cutout unit; wherein the chunk cutout unit selects a method for cutting out the chunk based on content type identification information indicating a type of the content.
Owner:HITACHI LTD +1

Data identification method and data identification system

The invention provides a data identification method and a data identification system. The method comprises: extracting a matched field from the data to be deduplicated; computing a key field contained in that data and obtaining a hash value of the key field; obtaining the duplicate removal file corresponding to the matched field; locating already-deduplicated data in the duplicate removal file according to the hash value; and judging whether the data to be deduplicated and the already-deduplicated data are identical and, if so, identifying the data to be deduplicated as duplicate data. Thus, for each datum to be identified, only the already-deduplicated data in the duplicate removal file relevant to that datum needs to be fetched, which reduces the amount of data fetched, that is, the number of comparisons, and improves deduplication efficiency. Further, if a downstream system needs to analyze the data in the duplicate removal file, that analysis is also faster because of the improved deduplication efficiency.
Owner:GUANGZHOU SUNRISE ELECTRONICS DEV
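
The lookup described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the class `DedupStore`, the in-memory dictionaries standing in for the duplicate removal files, and the choice of SHA-256 are all assumptions made for the example.

```python
import hashlib
from collections import defaultdict


class DedupStore:
    """Hypothetical in-memory stand-in for the per-field duplicate removal files."""

    def __init__(self):
        # matched field value -> {hash of key field -> records already kept}
        self._files = defaultdict(lambda: defaultdict(list))

    def is_duplicate(self, record, match_field, key_field):
        """Return True if an identical record was already stored, else store it."""
        # Pick the duplicate removal "file" that belongs to the matched field.
        dedup_file = self._files[record[match_field]]
        # Hash the key field to position the candidates inside that file.
        digest = hashlib.sha256(str(record[key_field]).encode("utf-8")).hexdigest()
        candidates = dedup_file[digest]
        # Only the few candidates sharing the hash are compared in full.
        if any(existing == record for existing in candidates):
            return True
        candidates.append(record)
        return False
```

Because each record is compared only against candidates in one file and one hash slot, the number of full comparisons stays small even when the total amount of stored data grows.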

Increase in deduplication efficiency for hierarchical storage system

Exemplary embodiments provide improvement of deduplication efficiency for hierarchical storage systems. In one embodiment, a storage system comprises a storage controller; and a plurality of first volumes and a plurality of external volumes which are configured to be mounted to external devices. The storage controller controls to store related data which are derived from one of the plurality of first volumes in a first external volume of the plurality of external volumes. In another embodiment, the storage controller receives object data from a server and allocates the object data to the plurality of pool volumes. The plurality of pool volumes include a plurality of external volumes which are configured to be mounted to external devices. The storage controller controls to store the object data to the plurality of pool volumes based on object allocation information received from a backup server.
Owner:HITACHI LTD

Method for rapidly removing repeated list through a memory

The invention discloses a method for rapidly removing repeated list through a memory. The method includes: step 1, reading a history list information table in a data base, uploading the history list information table to the memory and storing the history list information table in a history list collection, step 2, uploading a list which is needed to be led to a temporary table of the data base, step 3, reading a data item which is needed to remove a repeated list in the temporary table, uploading the data item to the memory and storing the data item in a current leading list collection, step 4, removing repeated lists in bulk through an operation between the current leading list collection and the history list collection, updating the history list collection and updating the history list information table, and step 5, removing the temporary table. According to the method, in a set operation bulk removing repeated lists mode, the lists are led and repeated lists in the lists are removed so that the speed of removing the repeated lists is increased.
Owner:北京讯鸟软件有限公司
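
As a rough sketch of the set-based bulk removal in step 4, assuming list entries can be represented as hashable values (for example phone numbers as strings), and with the function name `import_batch` invented for illustration:

```python
def import_batch(history, incoming):
    """Return the entries of `incoming` not yet in `history`, updating `history` in place."""
    # Loading the batch into a set also drops repeats inside the batch itself.
    current = set(incoming)
    # One set difference removes every entry already imported in the past.
    fresh = current - history
    # The history collection is updated so later batches see these entries.
    history |= fresh
    return fresh
```

A single set difference replaces one database lookup per entry, which is where the speed-up claimed in the abstract would plausibly come from.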

Block-level data de-duplication method for supporting dynamic ownership management in fog storage

The invention belongs to the technical field of fog computing and information security, and discloses a block-level data de-duplication method for supporting dynamic ownership management in fog storage. The method provides an improved block-level client de-duplication technology, and solves the problem of data sensitive information revealing of the current block-level client during de-duplication while saving network bandwidth. The method also provides a secondary ownership list and a key update mechanism, can realize the access control of fine grain at a lower cost while effectively saving storage space, and fills up the blank of no compatibility between ownership management technologies and the block-level client de-duplication technologies. In addition, the method also introduces a data block dynamic storage mechanism, data blocks in a system can be transferred in the system according to service needs, and therefore, service costs and file access delay can be reduced, system security and user service experience can be enhanced, and the problem of low system resource utilization rates in a current fog storage de-duplication scheme can be solved.
Owner:XIDIAN UNIV

Laser de-weight dynamic balance device and method applied to rotation workpiece

The invention provides a laser de-weight dynamic balance device applied to a rotation workpiece. The laser de-weight dynamic balance device is provided with a frame. The frame is provided with a laser and is internally provided with a first control unit connected with the laser. The frame is provided with a clamp connected with the first control unit in irradiation direction of the laser, or a dynamic detection device arranged in a split or one-piece manner. The laser de-weight dynamic balance device is provided with an updraft ventilator. The invention also provides a laser de-weight dynamic balance method applied to the rotation workpiece. The laser is used as a de-weight device, and de-weight powder is timely exhausted by the updraft ventilator. According to the invention, the de-weight precision and efficiency are high, the structure is simple, the operation is convenient, automation is easy, the automation degree is high, and the like.
Owner:邱玉兰

Duplication eliminating method based on multidimensional lattice data spatial model

The invention discloses a duplication eliminating method based on a multidimensional lattice data spatial model. The method includes following steps: loading local cache data and building the multidimensional lattice data spatial model; transferring the data into customized data format and cutting the data into data points; searching the data one by one, positioning coordinates of each data point on dimensions corresponding to the data model, searching each data point from a first digit down digit by digit, characterizing each data point if the same does not exist in the data model, and marking the data as absence; and traversing the data points of the data, outputting the cache if the data is marked as absence, and searching next data until all the data is searched. The duplication eliminating method based on the multidimensional lattice data spatial model is suitable for filtration and duplication eliminating of various data, high in duplication eliminating efficiency and has fine application value in engineering. In addition, by the method, the problem of severe resource consumption caused by length difference of the data is solved.
Owner:苏州云端信息科技有限公司

Data deduplication method and device

The invention provides a data deduplication method and device. The method comprises the following steps: determining a first area, wherein the first area is an area of which the data writing frequency is lower than a preset frequency threshold value, and the area comprises at least one data block; calculating the Hash value of a first data block in the first area; judging whether the Hash value of the first data block is the same as the Hash value in a deduplication mapping table item or not; when the Hash value of the first data block is the same as the Hash value in the deduplication mapping table item, obtaining the physical address of a deduplication data block in the deduplication mapping table item; reading data in the deduplication data block; when the data in the deduplication data block is the same as the data in the first data block, changing a mapping relationship between the logic address, which is recorded in the data mapping table item, of the first data block and the physical address of the first data block into a mapping relationship between the logic address of the first data block and the physical address of the deduplication data block; and recovering the first data block. By use of the data deduplication method and device, deduplication frequency can be improved, data writing time delay is lowered, and the working efficiency of a memory system is improved.
Owner:MACROSAN TECH
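
The remapping step in this abstract might look roughly like the sketch below. It assumes a toy `BlockStore` with dictionaries in place of real mapping tables, uses SHA-256 as the hash, and leaves out how a low-write-frequency area is selected and how overwrites are handled; all names are hypothetical.

```python
import hashlib


class BlockStore:
    """Toy block store: logical addresses map to physical blocks (illustrative only)."""

    def __init__(self):
        self.physical = {}     # physical address -> block bytes
        self.logical = {}      # logical address -> physical address (data mapping table)
        self.dedup_index = {}  # hash -> physical address (deduplication mapping table)
        self._next = 0

    def write(self, laddr, data):
        # Writes land immediately, with no deduplication on the write path.
        paddr = self._next
        self._next += 1
        self.physical[paddr] = data
        self.logical[laddr] = paddr

    def dedup_cold_block(self, laddr):
        """Deduplicate one block from a rarely written area; True if space was reclaimed."""
        paddr = self.logical[laddr]
        data = self.physical[paddr]
        digest = hashlib.sha256(data).digest()
        kept = self.dedup_index.get(digest)
        if kept is None:
            self.dedup_index[digest] = paddr  # first block seen with this hash
            return False
        if kept == paddr or self.physical[kept] != data:
            return False  # already the retained block, or a hash collision
        self.logical[laddr] = kept  # point the logical address at the retained block
        del self.physical[paddr]    # recover the now unreferenced block
        return True
```

Running deduplication only on cold blocks after the fact keeps hashing and comparison off the write path, which is consistent with the lower write latency the abstract claims.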

Distributed storage system data storage method, apparatus, system, and storage medium

The invention discloses a data storage method of a distributed storage system, which comprises the following steps: dividing a file to be stored into blocks to obtain a plurality of file blocks to be stored; comparing a file block to be stored with a file block stored in advance to judge whether there is a file block matched with the content of the file block to be stored in the system. If yes, obtaining a data storage location of a file block whose content matches in the system; indexing the matching file block to be stored according to the data storage location. By dividing the file to be stored into blocks and comparing the data to determine the redundant data, the method can improve the detection probability of the partial duplicate data in the file and realize the accurate data deletion. The invention also provides a distributed storage system data storage device, a system and a readable storage medium, which have the beneficial effects.
Owner:ZHENGZHOU YUNHAI INFORMATION TECH CO LTD

A repeated data deleting method and device

The invention provides a repeated data deleting method and device. The method comprises the steps of dividing data flow into data blocks of a preset block size; performing fingerprint calculation on each data block and adding calculated fingerprint information to the attributes of data block structures; acquiring a fixed length prefix of the calculated fingerprints and distributing the data blocks into different processing queues according to the fixed length prefix. The working threads in the processing queues perform repetition checking operations in a parallel manner to delete repeated data in the data blocks. The method distributes received data blocks into different processing queues based on a fixed length prefix of fingerprints of data blocks and a single thread is used for processing data blocks in each processing queue; repetition checking is only performed from repetition deletion metadata block sub-lists corresponding to the fixed length prefix of fingerprints, so that the expenses of uniformity locks are avoided; the working threads of the processing queues realize the repetition checking operations in a parallel manner, so that the consumption of system resources in repetition removal computing is reduced and the data repetition deletion efficiency is increased.
Owner:ZHENGZHOU YUNHAI INFORMATION TECH CO LTD
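
A minimal sketch of the prefix-based partitioning, assuming SHA-256 hex digests as fingerprints and a thread pool in which each queue is handled by exactly one task. The function name `dedup_stream` and its parameters are invented for this example.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor


def dedup_stream(data, block_size=4, prefix_len=1, workers=4):
    """Split `data` into fixed-size blocks and return the unique (index, block) pairs."""
    blocks = [data[i:i + block_size] for i in range(0, len(data), block_size)]

    # Route every block to a queue chosen by a fixed-length fingerprint prefix.
    queues = {}
    for idx, block in enumerate(blocks):
        fingerprint = hashlib.sha256(block).hexdigest()
        queues.setdefault(fingerprint[:prefix_len], []).append((idx, fingerprint, block))

    def process(queue):
        # Each queue has its own private fingerprint table, so no lock is needed:
        # two blocks with the same fingerprint always land in the same queue.
        seen = set()
        unique = []
        for idx, fingerprint, block in queue:
            if fingerprint not in seen:
                seen.add(fingerprint)
                unique.append((idx, block))
        return unique

    with ThreadPoolExecutor(max_workers=workers) as pool:
        parts = list(pool.map(process, queues.values()))
    return sorted(item for part in parts for item in part)
```

For example, `dedup_stream(b"aaaabbbbaaaacccc")` keeps the blocks at indices 0, 1 and 3 and drops the repeated `b"aaaa"` at index 2.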

Ciphertext image deduplication method used in cloud environment and cloud server

The invention belongs to the technical field of image deduplication and discloses a ciphertext image deduplication method used in the cloud environment and a cloud server. An image in a database is partitioned, characteristic values of image blocks are calculated, and the image blocks and the characteristic values are encrypted with any one rapid symmetrical encryption algorithm; the encrypted image blocks, a sequence matrix of the image blocks and the encrypted characteristic values are uploaded to the cloud server, and ciphertext image deduplication operation is executed by the server; if other authorized users upload the image again, the encrypted image blocks and the encrypted characteristic values are required to be sent to the cloud server, the cloud server performs retrieval operation in an encrypted image library, and storage of the image blocks or deduplication is decided according to the fact that whether the same characteristic values of the image blocks or the characteristic values of the image blocks in a set threshold value range exist. On the basis of conventional image deduplication, safe deduplication of ciphertext images is realized, deduplication of different images is realized, the expected safety purpose is achieved, and besides, the storage efficiency is improved under the condition that the accuracy is guaranteed.
Owner:XIDIAN UNIV

Test question duplicate removal method and test question duplicate removal system

The invention relates to the technical field of education, and discloses a test question duplicate removal method and system, which can reduce the calculated amount of the system and improve the test question duplicate removal efficiency. The method comprises the steps of obtaining a target test question; determining a question type of the target test question and an involved knowledge point set; generating a feature code of the target test question according to the question type and the knowledge point set and a preset coding rule, and obtaining each test question with the feature code from a test question resource library; if the content similarity between the target test question and any test question in the test questions is greater than a preset threshold, judging that the target test question is a repeated test question and deleting the target test question, and otherwise, storing the target test question and the feature code of the target test question into the test question resource library.
Owner:浙江蓝鸽科技有限公司
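
The idea of narrowing the comparison with a feature code can be sketched as below. The coding rule (question type joined with the sorted knowledge points), the use of `difflib.SequenceMatcher` as the similarity measure, and the 0.9 threshold are assumptions chosen for illustration; the abstract does not specify any of them.

```python
from difflib import SequenceMatcher


class QuestionBank:
    """Hypothetical test question resource library keyed by feature code."""

    def __init__(self, threshold=0.9):
        self.threshold = threshold
        self._by_code = {}  # feature code -> list of stored question texts

    @staticmethod
    def feature_code(question_type, knowledge_points):
        # Assumed coding rule: type plus the sorted knowledge point set.
        return question_type + "|" + ",".join(sorted(knowledge_points))

    def add_if_new(self, text, question_type, knowledge_points):
        """Store the question unless a sufficiently similar one shares its feature code."""
        code = self.feature_code(question_type, knowledge_points)
        bucket = self._by_code.setdefault(code, [])
        # Similarity is computed only against questions with the same feature code.
        for stored in bucket:
            if SequenceMatcher(None, text, stored).ratio() > self.threshold:
                return False  # judged a repeated question and discarded
        bucket.append(text)
        return True
```

Restricting the expensive similarity check to one bucket is what reduces the amount of computation compared with checking the whole library.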

Distributed storage apparatus, and distributed storage de-duplication, writing, deletion and reading methods and systems

The invention discloses a distributed storage apparatus, and distributed storage de-duplication, writing, deletion and reading methods and systems. The methods and the systems are applied to the distributed storage apparatus. The distributed storage de-duplication method comprises the steps of obtaining a target object data fingerprint of a target data object in a unified storage layer, and storing the target object data fingerprint in a corresponding OSD (Object Storage Device); calculating the target object data fingerprint by utilizing a preset algorithm to obtain a target OSD of the target data object; judging whether the target OSD stores a historical data object or not; and if the historical data object is stored, adding 1 to a count of reference counting of the historical data object. The target OSD is found by directly utilizing the target object data fingerprint; a corresponding relationship between the object data fingerprint and the OSD is established, so that whether repeated data exists or not is directly judged; and therefore, the problem of low efficiency caused by performing matching query in a distributed storage network by utilizing a fingerprint library is avoided and the working efficiency of distributed storage de-duplication is improved.
Owner:ZHENGZHOU YUNHAI INFORMATION TECH CO LTD
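
A rough sketch of fingerprint-directed placement with reference counting follows. The abstract only says a preset algorithm maps the fingerprint to a target OSD; the modulo placement used here, the SHA-256 fingerprint, the `delete` behavior and the class name `DedupCluster` are assumptions for the example.

```python
import hashlib


class DedupCluster:
    """Toy cluster in which the data fingerprint alone selects the target OSD."""

    def __init__(self, num_osds=4):
        # One dict per OSD: fingerprint -> [data, reference count]
        self.osds = [{} for _ in range(num_osds)]

    def locate(self, fingerprint):
        # Assumed placement rule: fingerprint value modulo the number of OSDs.
        return int(fingerprint[:16], 16) % len(self.osds)

    def write(self, data):
        fingerprint = hashlib.sha256(data).hexdigest()
        osd = self.osds[self.locate(fingerprint)]
        if fingerprint in osd:
            osd[fingerprint][1] += 1  # duplicate: only bump the reference count
        else:
            osd[fingerprint] = [data, 1]
        return fingerprint

    def refcount(self, fingerprint):
        entry = self.osds[self.locate(fingerprint)].get(fingerprint)
        return entry[1] if entry else 0

    def delete(self, fingerprint):
        osd = self.osds[self.locate(fingerprint)]
        entry = osd.get(fingerprint)
        if entry is None:
            return
        entry[1] -= 1
        if entry[1] == 0:
            del osd[fingerprint]  # last reference gone, reclaim the object
```

Since identical data always hashes to the same OSD, the duplicate check is a local lookup on one device rather than a query against a cluster-wide fingerprint library.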

Method and device for obtaining maximum conversion step number of session

The invention discloses a method and device for obtaining the maximum conversion step number of a session. The method comprises the steps that route configuration information of a conversion route chain configured in advance is obtained; route information generated by a browsed webpage during a session process of a user is received; the route information and the route configuration information are matched, the conversion step number matched on each webpage in the session and a previous conversion step number are obtained; according to the conversion step number matched on each webpage and the previous conversion step number, the reaching step number of each webpage in the session is set; the difference value of the matched previous conversion step number and the reaching step number of each webpage is computed, the conversion step number difference value of each webpage in the session is generated; the webpage with the conversion step number difference value of 1 is extracted, and according to the conversion step number corresponding to the webpage with the conversion step number difference value of 1, the maximum conversion step number of the session is obtained. According to the method and device, overmuch performance loss during the process of obtaining the optimized conversion step number by screening processing is lowered, and duplicate removal efficiency is improved.
Owner:BEIJING GRIDSUM TECH CO LTD

Method and device for de-repetition selection of repeated data based on cloud computing

The invention discloses a method for de-repetition selection of repeated data based on cloud computing. The method comprises the steps that at the step S10, when to-be-stored data containing the repeated data is acquired, a load value of a client side and a load value of a server side existing in a storage system at present as well as a current network bandwidth value are acquired; at the step S11, whether the load value of the client side, the load value of the server side and the current network bandwidth value satisfy preset conditions is judged, and the step S12 can be started if the conditions are satisfied; and at the step S12, under a preset de-repetition selection mode, a manner for the de-repetition selection of the repeated data in the to-be-stored data is determined. The load value of the client side, the load value of the server side and the current network bandwidth value greatly influence the selection of the de-repetition manner of the to-be-stored data, so that the three parameters are taken as reference objects, and thus de-repetition efficiency can be increased, and an overall utilization rate of the storage system can be increased. In addition, the invention also discloses a device for the de-repetition selection of the repeated data based on the cloud computing. The device has the same effects.
Owner:INSPUR BEIJING ELECTRONICS INFORMATION IND

Method and system for data deduplication in cloud backup process

The invention is suitable for the field of data processing, and provides a method for data deduplication in cloud backup process. The method comprises the following steps: classifying the data to be backed up by a cloud backup client; switching the classified data to be backed up by the cloud backup client through a preset switching algorithm; storing the fingerprint information of the switched data to be backed up by the cloud backup client through a secondary database and a primary database, and sending the fingerprint information to a cloud backup server; and globally searching a local data of the cloud backup server by the cloud backup server according to the fingerprint information, and carrying out subsequent processing according to the searching result. The method provided by the invention has the advantage of improving the data deduplication efficiency.
Owner:ZHEJIANG GONGSHANG UNIVERSITY

IP address duplication eliminating method and device

The invention provides an IP address duplication eliminating method and device. The method comprises the steps of extracting boundary IP addresses in a preset IP address range, wherein the boundary IP addresses comprise left boundary IP addresses and right boundary IP addresses; converting the boundary IP addresses into corresponding address parameters based on a preset rule; arranging the address parameters corresponding to the boundary IP addresses sequentially according to a magnitude sequence and dividing the boundary IP addresses into one or more address groups based on an arrangement result, wherein the number of the left boundary IP addresses is equal to that of the right boundary IP addresses in each address group; and selecting the boundary IP addresses arranged at the first place and the last place in each address group and taking the selected boundary IP addresses as the final boundary IP addresses of the IP address range. The technical scheme of the method and the device is easy to realize and the IP address duplication eliminating efficiency can be effectively improved.
Owner:杭州迪普信息技术有限公司
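
The boundary-sorting procedure reads like a sweep over sorted endpoints, which can be sketched as below. The sketch assumes IPv4 addresses, inclusive ranges with left not greater than right, and integer conversion as the preset rule; `merge_ip_ranges` is a name made up for the example.

```python
import ipaddress


def merge_ip_ranges(ranges):
    """Merge overlapping inclusive IPv4 ranges given as (left, right) address strings."""
    LEFT, RIGHT = 0, 1  # LEFT sorts first, so ranges touching at one address merge
    boundaries = []
    for left, right in ranges:
        boundaries.append((int(ipaddress.ip_address(left)), LEFT))
        boundaries.append((int(ipaddress.ip_address(right)), RIGHT))
    boundaries.sort()

    merged = []
    open_count = 0
    start = None
    for value, kind in boundaries:
        if kind == LEFT:
            if open_count == 0:
                start = value  # first boundary of a new address group
            open_count += 1
        else:
            open_count -= 1
            if open_count == 0:
                # Left and right boundaries are balanced: the group is complete,
                # so its first and last boundaries form the final range.
                merged.append((str(ipaddress.ip_address(start)),
                               str(ipaddress.ip_address(value))))
    return merged
```

Ranges that are merely adjacent (for example one ending at `10.0.0.5` and the next starting at `10.0.0.6`) are kept separate in this sketch; whether the patented method joins them is not stated in the abstract.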

Method for judging repeatability of data reported by edge computing node by cloud monitoring center

A method for judging repeatability of data reported by edge computing nodes by a cloud monitoring center belongs to the technical field of network security and comprises the following steps: S1, enabling each edge computing node to be in signal connection with a central cloud platform; the central cloud platform is provided with a monitoring center, and the monitoring center receives report information from the edge computing node; S2, comparing the reported information with recently received data by the monitoring center, and judging the repeatability of the reported information; S3, according to the comparison result in step S2, if the report information is repeated, that is, it has already been reported by other edge computing nodes, entering step S4; otherwise, entering step S5; S4, directly discarding the repeated reported information without disposal; and S5, updating the reported data into the central cloud platform. According to the method, repeated data reported by a plurality of edge computing nodes are filtered, the data computing amount of the cloud platform is greatly reduced while the duplicate removal efficiency is improved, and the resource utilization rate of the central cloud platform is indirectly improved.
Owner:杭州御安数科信息技术有限公司

Repeated data deleting method targeted at backup task

Active | CN105786651A | Solve the query bottleneck problem | Narrow down the scope of the duplicate check | Redundant operation error correction | Theoretical computer science | Fingerprint
The invention discloses a repeated data deleting method targeted at a backup task. The method includes the steps that firstly, the backup task is divided; a fingerprint storehouse which completes the whole duplicate checking process on a hard disk is placed into a set B-bucket; then, local caching and global caching are established in the internal storage; elements in the B-bucket are placed into the global caching; all fingerprints of the current backup task are sequentially placed into a fingerprint storehouse C-bucket; the C-bucket is updated after reaching a filled state, and the updated biggest fingerprint and the smallest fingerprint are traversed and recorded; then, the fingerprint storehouse containing the two fingerprints is searched for in the B-bucket, and the local caching is added; after each updated fingerprint is searched for and marked in the local caching and the global caching, the unmarked fingerprints are preserved to a fingerprint storehouse N-bucket; the marked fingerprints are all deleted; finally, the N-bucket is replaced after reaching a filled state, the local caching is added, and the global caching is updated. The repeated data deleting method has the advantages that the problem of fingerprint duplicate checking bottleneck is solved, the duplicate checking range is reduced, and duplicate checking efficiency is improved; a high throughput rate is maintained.
Owner:BEIHANG UNIV

Distributed web crawler performance optimization system for mass data acquisition

The invention belongs to the technical field of software engineering, and particularly relates to a distributed web crawler performance optimization system for mass data acquisition. In the system, an initialization module is used for newly establishing a deduplication character string and a junk link feature character string. The main node crawler is used for reading the initial URL address, and the crawling module crawls the initial URL address to generate a URL task queue. The crawling module is used for crawling the webpage according to the URL task queue to finish crawling work. Compared with the prior art, the crawling performance bottleneck of the distributed web crawler is broken through, and the crawling performance is improved by 50% or above. The duplicate removal efficiency of the URL task queue is improved, and the efficiency requirement of mass data collection is met. The storage space of the URL task queue is optimized, and server memory resources are greatly saved. A junk link filtering link is added, so that server memory resources are saved, and crawler efficiency is remarkably improved.
Owner:北京京航计算通讯研究所

Video processing method and device, electronic equipment and storage medium

The invention relates to a video processing method and device, electronic equipment and a storage medium, and the method comprises the steps: segmenting a received target video into a plurality of video clips which comprise at least one target object; performing deduplication operation on at least N continuous video frames with the same target object in the plurality of video clips in parallel to obtain a first deduplication result; executing the deduplication operation on at least N continuous video frames at the joint of the adjacent video clips to obtain a second deduplication result; and combining the first duplicate removal result and the second duplicate removal result to obtain a duplicate removal result of each target object in the target video. According to the embodiment of the invention, the deduplication efficiency can be improved, and accurate deduplication of the target video is realized.
Owner:BEIJING SENSETIME TECH DEV CO LTD

Data deduplication method and device

The invention discloses a data deduplication method and a data deduplication device. The method comprises the steps of sorting the multiple pieces of data in a first file and a second file according to the preset sorting conditions, wherein the corresponding pointers are arranged in the first file and the second file, and the pointers are used for indicating the sorting bits of the rows where the data in the files are located; judging whether the first character string data and the second character string data are the same or not according to a sorting result; if it is judged that the first character string data are the same as the second character string data, recording the position information of the same character string data in the corresponding file; and performing duplicate removal processing on the same character string data in the first file and the second file according to the recorded position information. According to the method and the device, the technical problem that the efficiency is relatively lower when the repeated data is matched for files with relatively larger data volumes in related technologies is solved.
Owner:BEIJING GRIDSUM TECH CO LTD
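
One way to read the sorted, pointer-driven comparison is as a two-pointer merge, sketched below. It assumes each file is a list of strings held in memory and that duplicates are removed from the second file only; the function names are hypothetical and the abstract does not fix these details.

```python
def find_common_rows(first, second):
    """Return (row in first, row in second) pairs for strings present in both files."""
    # Sort each file while remembering every string's original row position.
    a = sorted((value, pos) for pos, value in enumerate(first))
    b = sorted((value, pos) for pos, value in enumerate(second))

    matches = []
    i = j = 0  # one pointer per sorted file
    while i < len(a) and j < len(b):
        if a[i][0] < b[j][0]:
            i += 1
        elif a[i][0] > b[j][0]:
            j += 1
        else:
            # Same string: record both positions, then advance only the second
            # pointer so further copies in the second file are caught as well.
            matches.append((a[i][1], b[j][1]))
            j += 1
    return matches


def remove_from_second(first, second):
    """Drop from `second` every row whose string also occurs in `first`."""
    drop = {pos for _, pos in find_common_rows(first, second)}
    return [row for pos, row in enumerate(second) if pos not in drop]
```

After sorting, each pointer moves forward only, so the matching pass is linear in the combined file size instead of comparing every row of one file with every row of the other.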

Large-data-volume secret key duplication removal method and system based on Bloom filter

The invention discloses a large-data-volume secret key deduplication method based on a Bloom filter. The method comprises the following steps: obtaining data to be subjected to deduplication; initializing a deduplication system; dividing and storing the data; performing Bloom deduplication on the data; performing traversal statistics on positive data; performing accurate duplicate removal on the data; and completing precise duplicate removal of the large-data-volume key data. The invention further provides a large-data-volume secret key duplicate removal system based on the Bloom filter. The accurate duplicate removal of the large-data-volume key data is completed. Compared with the prior art, a divide-and-conquer storage method and an accurate duplicate removal method based on positive data are provided for large-data-volume key duplicate removal, the large-data-volume keys are uniformly guided and stored to different storage units according to hash remainder, it is guaranteed that the duplicate keys are in the same data set, the BitSet space occupation and deduplication operation consumption required by a single Bloom filter are reduced, that is, the space and time efficiency of the Bloom filter during deduplication operation is improved, accurate deduplication of key data is realized based on positive data HashSet set traversal statistics, and the deduplication accuracy and the key quality are improved.
Owner:ZHEJIANG QUANTUM TECH CO LTD
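
The divide-and-conquer flow can be sketched as follows. This is one possible reading of the abstract, not the patented algorithm: keys are partitioned by hash remainder, a small Bloom filter per partition separates certainly new keys from "positive" keys, and only the positives are resolved exactly with a hash set. The `BloomFilter` class, the SHA-256 based hashing and all parameter values are assumptions.

```python
import hashlib


class BloomFilter:
    """Minimal Bloom filter backed by a bytearray (illustrative sizing only)."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, key):
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for p in self._positions(key):
            self.bits[p // 8] |= 1 << (p % 8)

    def might_contain(self, key):
        return all(self.bits[p // 8] & (1 << (p % 8)) for p in self._positions(key))


def dedup_keys(keys, num_partitions=4, bloom_bits=1024):
    """Return each distinct key exactly once (output order is not the input order)."""
    # Divide: identical keys share a hash remainder, so they meet in one partition.
    partitions = [[] for _ in range(num_partitions)]
    for key in keys:
        h = int.from_bytes(hashlib.sha256(key.encode("utf-8")).digest()[:8], "big")
        partitions[h % num_partitions].append(key)

    unique = []
    for part in partitions:
        bloom = BloomFilter(num_bits=bloom_bits)
        negatives, positives = [], []
        for key in part:
            if bloom.might_contain(key):
                positives.append(key)   # true duplicate or Bloom false positive
            else:
                bloom.add(key)
                negatives.append(key)   # certainly not seen before
        # Exact pass over the positive data only, using an ordinary hash set.
        positive_set = set(positives)
        true_duplicates = {k for k in negatives if k in positive_set}
        unique.extend(negatives)
        unique.extend(sorted(positive_set - true_duplicates))
    return unique
```

The hash set holds only the positive keys of one partition at a time, so the exact pass stays small even when the Bloom filters are deliberately undersized.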

Data security deduplication method based on auto-encoder

The invention discloses a data security deduplication method based on an auto-encoder, relates to the field of information security and artificial intelligence, solves the problem of low efficiency of an existing data deduplication method based on random message lock encryption, introduces abstract tags in efficiency, quickly screens out a very small subset from a tag library by means of the similarity of the tags, and executes bilinear mapping calculation on the subset, so that the frequency of bilinear mapping calculation is greatly reduced, and the label comparison efficiency is improved. According to the method, a self-encoding technology commonly used in image processing is introduced, the deduplication efficiency is improved by greatly reducing the number of times of label comparison, and the deduplication efficiency is improved by nearly 10 times compared with a data deduplication method based on random message lock encryption. According to the method, the non-monotonic function is introduced, so that similar labels can be possibly generated even if data with large difference exists, namely, the similar labels can be generated by the similar data but cannot be established in turn, and the difficulty of deducing the data by an attacker according to the labels is further improved.
Owner:CHANGCHUN UNIV OF SCI & TECH

Video content repeated judgment method and device

The invention discloses a video content repetition judgment method and a device. The method comprises the following steps: establishing a picture similarity judgment model comprising a picture comparison value calculation process and a picture similarity judgment process; calculating sample frame comparison value information of each video sample content by utilizing a picture similarity judgment model; generating a video content comparison data set; calculating picture frame comparison value information of the target video content by utilizing a picture similarity judgment model; and finally, comparing the comparison value information of the target frame picture with the comparison value information of the sample frame picture of the video sample content, and judging the repetition condition between the target video content and the video content comparison data set according to a video deduplication strategy. The picture similarity judgment method is quick and high in accuracy, video content duplication elimination is summarized into similarity judgment of different frames of pictures, for massive video content, key information is extracted, duplication elimination workloads are reduced, and the duplication elimination efficiency of the video content is greatly improved.
Owner:XIAMEN MEET YOU INFORMATION TECH