154 results about how to "Improve clustering efficiency" patented technology

Clustering method based on mobile object spatiotemporal information trajectory subsections

The invention discloses a clustering method based on mobile object spatiotemporal information trajectory subsections. The method introduces three attributes (time, speed and direction) and provides a similarity calculation formula over them for analyzing the internal and external structure of a mobile object trajectory. First, the trajectory is divided into a plurality of trajectory subsections according to its spatial density; then the similarity of the trajectory subsections is judged by calculating their differences in space, time, speed and direction; finally, on the basis of the first clustering result, trajectory subsections in non-significant clusters are deleted or merged into adjacent significant clusters, so that the overall movement pattern is displayed in the spatial form of the clusters. The method improves the clustering result and offers higher application value. A spatial quadtree is used to index the trajectory subsections, which greatly improves clustering efficiency on large-scale trajectory data sets and allows trajectories to be clustered effectively.
Owner:胡宝清

Distributed data stream clustering method and system

The invention discloses a distributed data stream clustering method and system that overcome the defects of most existing data stream clustering algorithms, which cannot run in a distributed cloud environment, are hard to extend and have low run-time efficiency. The method includes: summarizing data streams to obtain a plurality of eigenvectors of the data streams; applying a locality-sensitive hashing algorithm to obtain a plurality of clusters, each comprising at least one eigenvector, and selecting at least one cluster as a candidate cluster; and periodically using the candidate cluster to cluster the eigenvectors of newly arrived data streams. Because the method and system are based on the locality-sensitive hashing algorithm, they achieve better real-time performance than the prior art.
Owner:CHINA INFORMATION TECH SECURITY EVALUATION CENT +1
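
The locality-sensitive hashing step in the entry above can be illustrated with a minimal sketch. It assumes random-hyperplane (sign) hashing and assumes that the "candidate cluster" is simply the most populated bucket; the abstract specifies neither, and the names `make_hyperplanes`, `lsh_signature`, `bucket_vectors` and `pick_candidates` are hypothetical.

```python
import random


def make_hyperplanes(dim, n_bits, seed=0):
    """Draw n_bits random hyperplanes (assumed hash family: sign of a projection)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]


def lsh_signature(vec, planes):
    """One bit per hyperplane: which side of the plane the vector falls on."""
    return tuple(
        1 if sum(p * v for p, v in zip(plane, vec)) >= 0 else 0
        for plane in planes
    )


def bucket_vectors(vectors, planes):
    """Group vector indices by signature; each bucket acts as a rough cluster."""
    buckets = {}
    for idx, vec in enumerate(vectors):
        buckets.setdefault(lsh_signature(vec, planes), []).append(idx)
    return buckets


def pick_candidates(buckets, top_k=1):
    """Assumption: candidate clusters are the top_k most populated buckets."""
    return sorted(buckets.values(), key=len, reverse=True)[:top_k]


planes = make_hyperplanes(dim=2, n_bits=8)
vectors = [[1.0, 2.0], [2.0, 4.0], [-1.0, -2.0]]
buckets = bucket_vectors(vectors, planes)
candidates = pick_candidates(buckets)
```

Vectors that point in the same direction always share a signature, so newly arrived eigenvectors could be compared against the candidate buckets only, rather than against every stored vector.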

Power system load data identification and recovery method

The invention discloses a power system load data identification and recovery method. Firstly, according to user historical load data, the number of clusters and initial cluster centers of sample data are determined on the basis of the hill climbing method; secondly, the final cluster center and the characteristic curve of the historical load data are obtained on the basis of the fuzzy C-means clustering algorithm; thirdly, each kind of load characteristic curve is processed, and the feasible region interval where normal data of the load curve is located is obtained; fourthly, according to correlation coefficients with the load characteristic curves, the category to which a to-be-tested load curve belongs is determined; finally, on the basis of the feasible region interval and the to-be-tested load curve whose category is judged, bad data of to-be-tested load data is identified and corrected. According to the method, the fuzzy C-means algorithm serves as the basis, the hill climbing function method is used, the number of clusters and the initial cluster centers are determined at the same time to improve clustering efficiency, and the initial cluster center determination problem and identification effect judgment randomness problem of bad data are solved.
Owner:TIANJIN UNIV

CoMP downlink dynamic cooperative cluster selection method based on SINR threshold and token

The invention provides a CoMP downlink dynamic cooperative cluster selection method based on an SINR threshold and a token, which belongs to the field of wireless communication and mobile communication. The method comprises the following steps: a cooperative cluster is determined according to the signal-to-interference-plus-noise ratio (SINR) of the cell-edge user (i.e., the user with the poorest SINR) to ensure the performance of edge users; the neighboring cell that causes the most interference to the user with the poorest SINR is selected for cooperation, so that the cooperative gain is maximal; the size of the cooperative cluster is determined by requiring the SINR after cooperation to meet a given threshold, and the threshold can be flexibly adjusted to trade off performance against complexity; in addition, a cell is divided into different areas, a token is issued in each area and the cooperative cluster is selected based on the token, which avoids collisions when different users select a cooperative cell at the same time.
Owner:BEIHANG UNIV
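
The threshold rule in the entry above can be sketched as a greedy loop. This is an illustration under stated assumptions, not the patented procedure: received powers are in linear units (mW), a cooperating cell's power is assumed to move from the interference term to the useful signal, and the token mechanism is not modelled. The function name `select_cooperative_cluster` is hypothetical.

```python
import math


def select_cooperative_cluster(serving_mw, interferers_mw, noise_mw, threshold_db):
    """Add the strongest interfering cell until the user's SINR meets the threshold.

    interferers_mw maps a neighbouring cell id to its received power at the user.
    noise_mw must be positive. Returns (cooperating cell ids, final SINR in dB).
    """
    remaining = dict(interferers_mw)
    cluster = []
    signal = serving_mw
    while True:
        sinr_db = 10 * math.log10(signal / (sum(remaining.values()) + noise_mw))
        if sinr_db >= threshold_db or not remaining:
            return cluster, sinr_db
        # Cooperate with the neighbour that currently interferes the most.
        strongest = max(remaining, key=remaining.get)
        signal += remaining.pop(strongest)
        cluster.append(strongest)


neighbours = {"B": 2.0, "C": 0.5, "D": 0.1}
small_cluster, small_sinr = select_cooperative_cluster(1.0, neighbours, 0.1, 3.0)
large_cluster, large_sinr = select_cooperative_cluster(1.0, neighbours, 0.1, 10.0)
```

Raising the threshold from 3 dB to 10 dB enlarges the cluster from one cooperating cell to two, which is the performance versus complexity trade-off the abstract describes.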

Modeling method for parallel smart case recommendation model

The invention relates to a modeling method for a parallel smart case recommendation model. The method comprises the following steps: obtaining existing patient cases from an electronic case database, carrying out denoising, clustering and word segmentation on the patient cases, and establishing a patient case corpus database; defining TFIDF_{i,j} as the importance of a word or expression in a case of the patient case corpus database, establishing an LSI vector space model according to TFIDF_{i,j}, and establishing a bag-of-words (BOW) model from all words and expressions in the patient case corpus database; calculating history case vectors and to-be-processed case vectors in the patient case corpus database using the LSI vector space model and the BOW model; calculating and storing the cosine similarity among the history patient cases; and calculating the cosine similarity between the to-be-processed case vectors and the history patient case vectors, and searching for cases similar to the to-be-processed cases according to the cosine similarity. The model established by this method has high accuracy and low error, and the recommendation results are of high quality.
Owner:QINGDAO ACADEMY OF INTELLIGENT IND
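
The retrieval idea in the entry above (TF-IDF weighting followed by cosine similarity) can be shown with a small standard-library sketch. It deliberately omits the LSI projection, which would need a singular value decomposition, and it assumes cases are already segmented into word lists; the helper names and the toy cases are hypothetical.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one sparse {term: weight} vector per tokenized document."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append(
            {t: (c / len(doc)) * math.log(n / df[t]) for t, c in tf.items()}
        )
    return vectors


def cosine(a, b):
    """Cosine similarity of two sparse vectors; 0.0 if either is all zeros."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def most_similar(query_idx, vectors):
    """Index of the case most similar to vectors[query_idx], excluding itself."""
    scores = [
        (cosine(vectors[query_idx], v), i)
        for i, v in enumerate(vectors)
        if i != query_idx
    ]
    return max(scores)[1]


cases = [
    ["fever", "cough", "chest", "pain"],
    ["fever", "cough", "headache"],
    ["fracture", "leg", "pain"],
    ["fever", "cough", "chest", "xray"],  # treated as the to-be-processed case
]
vectors = tfidf_vectors(cases)
best_match = most_similar(3, vectors)
```

Here the last case is matched to the first one, which shares the rarer term "chest" in addition to the common symptoms.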

Cluster partitioning processing method and cluster partitioning processing device for virus files

The invention discloses a cluster partitioning processing method and a cluster partitioning processing device for virus files. The method comprises the following steps: (A) statically analyzing the binary data of the virus files to be partitioned, and extracting the portable executable (PE) structure data of the virus files from the binary data; and (B) comparing the PE structure data of the virus files to be partitioned, and partitioning the virus files whose PE structure data meet a specified similarity into the same category. The device comprises a first data analyzing module and a first cluster partitioning module, wherein the first data analyzing module statically analyzes the binary data of the virus files to be partitioned and extracts the PE structure data of the virus files from the binary data, and the first cluster partitioning module compares the PE structure data of the virus files to be partitioned and partitions the virus files whose PE structure data meet the specified similarity into the same category. The method and device improve the cluster partitioning efficiency for virus files on a computer, reduce resource consumption, and eliminate the infection risk that arises when virus files are run dynamically.
Owner:TENCENT TECH (SHENZHEN) CO LTD

Balance clustering compression method based on data similarity

The invention discloses a cluster compression method based on data similarity. By analyzing file data, a structural characteristic vector of characteristic fingerprints is extracted from the files to calculate data similarity; the files are clustered using a constrained graph partitioning method, so that a plurality of categories of even size are formed; finally, each category is compressed separately using a compression method such as BMCOM to remove redundant data within the category. The invention adopts a clustering method based on data sampling, in which key data with high compressibility serve as sample data. Clustering is first performed on the sample data; the remaining data are then classified through a stable marriage method, which improves clustering efficiency without reducing the compression effect. As a compression and archiving method, the invention can be applied to a distributed storage system, solving the problems of uneven data dependence and uneven load in prior methods.
Owner:ZHEJIANG UNIV

Time-sensitive self-adaptive on-line subtopic detecting method and system

The invention relates to a time-sensitive self-adaptive on-line subtopic detecting method and system. The method comprises the following steps: (1) vectorizing each document in a document flow; (2) carrying out incremental clustering on the documents and adjusting the central weight of subtopics according to the document weight decaying with time; (3) combining the subtopics or deleting the meaningless subtopics when the quantity of the subtopics generated by clustering or the weight ratio of a certain subtopic meets the threshold conditions or the subtopics meet the long tail detection conditions; and (4) generating a summary of the new subtopics, outputting and displaying according to the weight of each new subtopic and the internal document distribution thereof. The system comprises a document representing module, an incremental clustering module, a new subtopic discovering module and a summary generating module. According to the method and the system which are disclosed by the invention, the historical document weight decays with time and the dynamic update on the quantity and the content of the subtopics is carried out on the basis of the threshold judgment and the long tail detection, so that the subtopic detecting efficiency can be effectively increased.
Owner:INST OF INFORMATION ENG CAS +1
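
Steps (2) and (3) of the subtopic detection entry above can be sketched as follows. The exponential decay law, the cosine threshold and the pruning rule are assumptions chosen for illustration, and the long tail detection and summary generation are not modelled; the class name `SubtopicDetector` is hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity of two dense vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


class SubtopicDetector:
    """Incremental clustering in which subtopic weights decay with time."""

    def __init__(self, sim_threshold=0.8, decay=0.1, min_weight=0.05):
        self.sim_threshold = sim_threshold
        self.decay = decay
        self.min_weight = min_weight
        self.subtopics = []  # each item: {"center": [...], "weight": float}
        self.last_time = None

    def add(self, vec, t):
        # Decay every existing subtopic by the time elapsed since the last document.
        if self.last_time is not None:
            factor = math.exp(-self.decay * (t - self.last_time))
            for s in self.subtopics:
                s["weight"] *= factor
        self.last_time = t
        # Delete subtopics whose weight has decayed below the threshold.
        self.subtopics = [s for s in self.subtopics if s["weight"] >= self.min_weight]
        best = max(self.subtopics, key=lambda s: cosine(s["center"], vec), default=None)
        if best is not None and cosine(best["center"], vec) >= self.sim_threshold:
            # Weighted update of the subtopic center with the new document.
            w = best["weight"]
            best["center"] = [(w * c + v) / (w + 1.0) for c, v in zip(best["center"], vec)]
            best["weight"] = w + 1.0
        else:
            self.subtopics.append({"center": list(vec), "weight": 1.0})


detector = SubtopicDetector(sim_threshold=0.8, decay=1.0, min_weight=0.05)
detector.add([1.0, 0.0], t=0.0)
detector.add([0.9, 0.1], t=0.0)   # joins the first subtopic
detector.add([0.0, 1.0], t=1.0)   # starts a second subtopic
count_before = len(detector.subtopics)
detector.add([0.0, 1.0], t=3.8)   # the older subtopic has decayed away by now
```

Because old documents lose weight, a subtopic that stops receiving documents eventually disappears, while an active one keeps its weight topped up.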

Farmland wireless self-organizing network topology density correlation path selecting and optimizing method

The invention relates to a farmland wireless self-organizing network topology density correlation path selecting and optimizing method. The method includes the steps that firstly, node energy is calculated, voting is carried out, and cluster heads are determined; secondly, nodes within a preset threshold value are received by the cluster heads and added, and clustering operation is carried out; thirdly, nodes corresponding to which clustering operation is not completed become intra-cluster sub-nodes according to a pre-established method; fourthly, energy advantage nodes are selected from the intra-cluster sub-nodes to serve as fake cluster heads; fifthly, reverse scattering and networking are carried out according to the cluster heads and the fake cluster heads; sixthly, environment information collecting and reporting are carried out on all nodes through a networking route; seventhly, whether a cluster head arrives at an energy approximating target or not is judged; eighthly, if a cluster head arrives at the energy approximating target, the next round is marked to carry out cluster head election, collection of the current round ends, and otherwise collection of the round ends. By means of the method, clustering efficiency is improved, cluster head selection frequency is lowered, and overall network energy efficiency is improved.
Owner:北京市农林科学院信息技术研究中心

Improved Mean Shift-based extraction method and system of road marking lines of Lidar (Light Detection And Ranging) point cloud data

The invention relates to an extraction method of road marking lines, belongs to the field of GIS information technology, and particularly relates to an improved Mean Shift-based extraction method and system of road marking lines of Lidar (Light Detection And Ranging) point cloud data. According to the method, a vehicle-mounted Lidar collection system and an inertial measurement system are utilized to acquire the road point cloud data, adjacency-graph relationships are established for the point cloud data through an octree, and supervoxel clustering is carried out; a Mean Shift algorithm fused with multiple features of intensity, normal vectors and the like is utilized to carry out road marking line extraction on the point cloud data; and road marking line points can be accurately extracted, the calculation amount of the Mean Shift algorithm is also greatly decreased at the same time, and thus the purpose of improving precision and efficiency of automated extraction of the road marking lines of the Lidar data is achieved.
Owner:HUBEI UNIV

Clustering method and device for webpage

Active; CN104504086A; Solve the problem of low clustering efficiency; Improve clustering efficiency; Web data indexing; Special data processing applications; Web page clustering; Information retrieval
The invention discloses a clustering method and a clustering device for webpages. The clustering method of the webpage comprises the following steps: acquiring a first block element of a web page to be compared; sequentially calculating the similarity index values of the web page to be compared and each page type according to the first block element and a second block element of each page type in a page type set; when calculating that the similarity index value of the web page to be compared and a present page type is greater than a preset threshold, clustering the web page to be compared into the present page type, and updating the second block element of the present page set so as to acquire an updated page type of the present page type; when the similarity index value of the web page to be compared and each page type of the present page type set is smaller than the preset threshold, adding the web page to be compared into the page type set as a new page. Through the adoption of the clustering method, the problem that the web page clustering efficiency in the prior art is low is solved, and an effect of improving the web page clustering efficiency is achieved.
Owner:BEIJING GRIDSUM TECH CO LTD
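
The threshold-based assignment in the web page clustering entry above can be sketched in a few lines. The similarity index is assumed to be the Jaccard overlap of block-element sets, and "updating" a page type is assumed to mean taking the union of block elements; the abstract does not define either, so both are illustrative choices, and the function names are hypothetical.

```python
def jaccard(a, b):
    """Jaccard overlap of two sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_pages(pages, threshold=0.5):
    """Assign each page to the first page type whose similarity exceeds threshold.

    pages is a list of (name, block_elements) pairs. A page that matches no
    existing page type becomes a new page type.
    """
    page_types = []  # each item: {"blocks": set, "members": [names]}
    for name, blocks in pages:
        blocks = set(blocks)
        for ptype in page_types:
            if jaccard(blocks, ptype["blocks"]) > threshold:
                ptype["members"].append(name)
                ptype["blocks"] |= blocks  # assumed update rule: union of blocks
                break
        else:
            page_types.append({"blocks": blocks, "members": [name]})
    return page_types


pages = [
    ("a", ["header", "nav", "article", "footer"]),
    ("b", ["header", "nav", "article", "comments", "footer"]),
    ("c", ["header", "product", "price", "cart"]),
]
page_types = cluster_pages(pages)
```

Each page is compared against page types rather than against every earlier page, which is where the efficiency gain in this kind of scheme comes from.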

Grid-based spatial multi-scale fast clustering method

The invention discloses a grid-based spatial multi-scale fast clustering method. The method includes the following steps: S1, a data scale is selected, the size of the grids is determined, the sample data are gridded, and the density value of each grid is counted; S2, an initial density threshold is specified, all grids satisfying the threshold condition are retained, and a preliminary density matrix is obtained; S3, a filter template is specified according to an observation scale, and a convolution operation is performed on the global grid space; S4, connected regions are generated through neighborhood search and taken as the preliminary clustering result, an integration operation is performed on the grids, the grid space is mapped back onto the original point set, and a clustering result for the original point set is obtained; S5, the observation scale is adjusted, and a transformed new filter is used to repeat S3 and S4 on the result matrix, giving the clustering result of the next observation scale; and S6, the data scale is changed and S1 to S5 are repeated, giving clustering results under different data scales. The algorithm has low complexity, high clustering efficiency and high precision, and can meet the requirements of real-time multi-scale clustering and visual analysis of massive point sets.
Owner:WUHAN UNIV
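
Steps S1, S2 and S4 of the grid-based entry above can be sketched for 2-D points as follows. The filter convolution of S3 and the multi-scale loops of S5 and S6 are left out, and 8-neighbour connectivity is an assumption; the function name `grid_cluster` is hypothetical.

```python
from collections import Counter, deque


def grid_cluster(points, cell_size, density_threshold):
    """Label each 2-D point with a cluster id, or -1 if its cell is too sparse."""
    # S1: map every point to a grid cell and count points per cell.
    cell_of = [(int(x // cell_size), int(y // cell_size)) for x, y in points]
    density = Counter(cell_of)
    # S2: keep only the cells that reach the density threshold.
    dense = {c for c, n in density.items() if n >= density_threshold}
    # S4: connected regions of dense cells (8-neighbourhood) become clusters.
    label_of_cell = {}
    next_label = 0
    for start in sorted(dense):
        if start in label_of_cell:
            continue
        label_of_cell[start] = next_label
        queue = deque([start])
        while queue:
            cx, cy = queue.popleft()
            for dx in (-1, 0, 1):
                for dy in (-1, 0, 1):
                    nb = (cx + dx, cy + dy)
                    if nb in dense and nb not in label_of_cell:
                        label_of_cell[nb] = next_label
                        queue.append(nb)
        next_label += 1
    # Map the cell labels back onto the original points.
    return [label_of_cell.get(c, -1) for c in cell_of]


points = [(0.1, 0.1), (0.2, 0.3), (1.1, 0.2), (1.3, 0.4),
          (5.1, 5.2), (5.3, 5.4), (9.5, 0.5)]
labels = grid_cluster(points, cell_size=1.0, density_threshold=2)
```

The cost depends on the number of occupied cells rather than on pairwise point distances, which is why grid methods scale to large point sets.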

Professional field-oriented on-line theme detection method

Active; CN107066555A; Solve the difficulty of satisfying the user's need for more professional information; Solve needs; Character and pattern recognition; Special data processing applications; State of art; Algorithm
The invention discloses a professional field-oriented on-line theme detection method. The method comprises the following steps: obtaining a text vector matrix of a preprocessed text set, and extracting a dictionary from the text set; modeling the text vector matrix; calculating the mixing weight p(θ_k|d) of a text d over each theme θ_k and the frequency p(w|θ_k) with which a feature word w appears in each theme θ_k; obtaining the similarity between two texts d_i and d_j by defining the theme-model-based distance between texts as the relative entropy distance of their text vectors, and calculating a similarity matrix; compressing the text set to obtain a new text sample set; calculating a similarity matrix of the new text sample set, and selecting a deviation parameter p according to the similarity matrix; combining clustering results to generate a new clustering result; calculating the distances between all texts in the original text set and the compressed classified texts, and performing classification; and outputting the text set themes and the final clustering result. Compared with the prior art, the method improves the accuracy and efficiency of clustering by adopting an optimized clustering algorithm.
Owner:TIANJIN UNIV
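
The relative entropy distance mentioned in the entry above can be written out for topic mixtures p(θ_k|d). The sketch assumes a symmetrized Kullback-Leibler divergence and defines similarity as the negative distance; the abstract does not say which variant is used, and the function names and example mixtures are hypothetical.

```python
import math


def kl(p, q, eps=1e-12):
    """Relative entropy D(p || q) of two discrete distributions (eps avoids log 0)."""
    return sum(
        pi * math.log((pi + eps) / (qi + eps))
        for pi, qi in zip(p, q)
        if pi > 0
    )


def sym_kl(p, q):
    """Assumed symmetrization: the mean of D(p || q) and D(q || p)."""
    return 0.5 * (kl(p, q) + kl(q, p))


def similarity_matrix(topic_mix):
    """Similarity of every text pair, taken here as the negative distance."""
    return [[-sym_kl(p, q) for q in topic_mix] for p in topic_mix]


# Each row is one text's mixture over three themes.
topic_mix = [
    [0.7, 0.2, 0.1],
    [0.6, 0.3, 0.1],
    [0.1, 0.1, 0.8],
]
S = similarity_matrix(topic_mix)
```

A matrix like `S` is the usual input to similarity-based clustering algorithms, with the diagonal or a separate preference value controlling how many clusters emerge.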

Stochastic gradient Bayesian SAR image segmentation method based on sketch structure

The invention discloses a stochastic gradient Bayesian SAR image segmentation method based on a sketch structure, mainly used for solving the problem that SAR image segmentation in the prior art is inaccurate. The stochastic gradient Bayesian SAR image segmentation method comprises the following implementation steps of: (1), sketching an SAR image to obtain a sketch image of the SAR image; (2), according to an area chart of the SAR image, and dividing a pixel subspace of the SAR image; (3), performing hybrid aggregation structured surface feature pixel subspace segmentation through a method based on a stochastic gradient variational Bayesian network model; (4), performing independent target segmentation based on the sketch line aggregation feature; (5), performing line target segmentation based on a visual semantic rule; (6), performing segmentation of a pixel subspace in a homogeneous area by adopting a polynomial-based logistic regression prior model; and (7), combining segmentation results to obtain a segmentation result of the SAR image. By means of the stochastic gradient Bayesian SAR image segmentation method based on the sketch structure disclosed by the invention, the good segmentation effect of the SAR image is obtained; and the stochastic gradient Bayesian SAR image segmentation method can be used for semantic segmentation of the SAR image.
Owner:XIDIAN UNIV

Travel trajectory clustering method, apparatus and device

Embodiments of the invention disclose a travel trajectory clustering method, apparatus and device. The calculation amount of travel trajectory clustering is reduced and the travel trajectory clustering efficiency is improved. The method comprises the steps of obtaining multiple travel trajectories of a user, wherein each travel trajectory comprises a starting point, an ending point and a middle point located between the starting point and the ending point; by utilizing the starting points and / or the ending points of the travel trajectories, clustering the travel trajectories to obtain a first travel trajectory set, wherein the first travel trajectory set comprises the travel trajectories with the matched starting points and / or ending points, and the number of the travel trajectories in the first travel trajectory set is greater than or equal to a first threshold; and by utilizing the middle points in the travel trajectories, clustering the travel trajectories in the first travel trajectory set to obtain a second travel trajectory set, wherein the second travel trajectory set comprises the travel trajectories with the matched starting points and middle points, and / or, the travel trajectories with the matched ending points and middle points.
Owner:NEUSOFT CORP
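
The two-stage idea in the travel trajectory entry above (cluster on endpoints first, then on middle points) can be sketched as below. "Matched" points are assumed to mean points falling in the same grid cell, and both the start and the end must match in stage one; the abstract allows either or both, so this is one possible reading. The function names are hypothetical.

```python
from collections import defaultdict


def cell(point, size):
    """Grid cell containing a 2-D point."""
    return (int(point[0] // size), int(point[1] // size))


def cluster_trajectories(trajs, size=1.0, min_count=2):
    """Return groups of trajectory indices after the two clustering stages."""
    # Stage 1: group by the cells of the starting and ending points.
    first = defaultdict(list)
    for idx, t in enumerate(trajs):
        first[(cell(t[0], size), cell(t[-1], size))].append(idx)
    result = []
    for members in first.values():
        if len(members) < min_count:  # the "first threshold" in the abstract
            continue
        # Stage 2: within a group, split by the cells visited by middle points.
        second = defaultdict(list)
        for idx in members:
            key = frozenset(cell(p, size) for p in trajs[idx][1:-1])
            second[key].append(idx)
        result.extend(second.values())
    return result


trajs = [
    [(0.1, 0.1), (2.2, 2.3), (5.5, 5.5)],
    [(0.3, 0.2), (2.6, 2.1), (5.1, 5.9)],
    [(0.4, 0.4), (3.5, 0.5), (5.2, 5.2)],
    [(8.0, 8.0), (9.0, 9.0), (9.9, 9.9)],
]
groups = cluster_trajectories(trajs)
```

The cheap endpoint comparison discards the isolated fourth trajectory before any middle points are examined, which is how the calculation amount is reduced.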

Cluster analysis method and system for consumer power consumption behavior based on regulating potential index

The invention relates to a method and a system for clustering analysis of user power consumption behavior based on a regulation potential index, comprising the following steps: 1) constructing a user load transfer rate model that considers the peak-valley time-of-use price, according to user daily load curve information and peak-valley time-of-use price information obtained in advance, and calculating a user regulation potential index; 2) taking the user regulation potential index as the sample space set and applying the K-means clustering algorithm to obtain consumer behavior clustering results based on the regulation potential index. The invention aggregates well when clustering users with obvious peak-valley characteristics, preserves clustering quality while improving clustering efficiency when clustering large numbers of users, and can be widely used in power system data analysis under peak-valley time-of-use electricity pricing.
Owner:STATE GRID CORP OF CHINA +3

Trajectory clustering method and device, and storage medium

A trajectory clustering method comprises the steps of acquiring a target trajectory set, and the target trajectory set comprises a plurality of trajectories; dividing the tracks in the target track set according to the position data of each track in the target track set to obtain a plurality of target track subsets; respectively calculating the similarity between different tracks in each target track subset; and according to the similarity between different trajectories in each target trajectory subset and a preset similarity threshold, clustering the trajectories in each target trajectory subset to obtain a clustering result. According to the method, the tracks in the target track set are divided into the corresponding different target track subsets according to the position data of the tracks in the target track set. When trajectory clustering is carried out on different target trajectory subsets, other trajectories except the target trajectory subsets do not need to be considered, the trajectory similarity can be quickly calculated, and the discovery overhead of similar trajectories is reduced, so that the overall calculation overhead of trajectory clustering is reduced.
Owner:CHENGDU HUAWEI TECH

Method and system for clustering network files

The invention provides a method for network file clustering and a system thereof. The method includes inputting a plurality of network files, collecting link relationship and a directory structure of the network files, extracting a hierarchical structure of the network files according to the link relationship and the directory structure, further, outputting one or multiple clusters for the network files based on the hierarchical structure. In some embodiments, hierarchical relationship among the clusters can be output simultaneously. Compared with the prior art, the method for network file clustering can greatly increase accuracy and efficiency of network file clustering.
Owner:NEC (CHINA) CO LTD

A method and system for early diagnosis and early warning of transformer fault

The invention discloses a transformer fault early diagnosis and early warning method and system, comprising: S1, acquiring the current monitoring data of the transformer and the historical monitoring data of known fault types; S2, standardizing the acquired data; S3, clustering the standardized eigenvalues by using the classical ant colony clustering algorithm; S4, merging the outliers and the clusters with high similarity in the clustering result to obtain the final clustering result; S5, finding the cluster in which the current monitoring data is located, and judging the fault type corresponding to the current monitoring data according to the fault type to which most of the fault data in that cluster belong; S6, outputting the judgment result of the fault type. The invention dynamically analyzes the gas characteristic data of transformer faults based on the clustering algorithm, so as to accurately judge the health condition of the equipment and give early warning of equipment faults.
Owner:NANJING NARI GROUP CORP +1

Grid-based data clustering method

A grid-based data clustering method performed by a computer system includes a setup step, a dividing step, a categorizing step and an expanding / clustering step. The setup step sets a grid quantity and a threshold value. The dividing step divides a space containing a data set having a plurality of data points into a two-dimensional matrix. The matrix has a plurality of grids G(i,j) comprising a plurality of target sequences and a plurality of non-target sequences interlaced with the plurality of target sequences. The indices “i” and “j” of each grid G(i,j) represents the coordinate thereof. The categorizing step determines whether each of the grids is valid based on the threshold value. The expanding / clustering step respectively retrieves each of the grids of the target sequences, performs an expansion operation on each of the grids retrieved and clusters the plurality grids G(i,j).
Owner:NAT PINGTUNG UNIV OF SCI & TECH

Air conditioner reliability influence factor-based regional clustering method

The invention discloses an air conditioner reliability influence factor-based regional clustering method. The method includes the following steps that: a system analyzes the regional differences of working environment factors and user use habit factors that influence the reliability of air conditioners and extracts working environment reliability key influence factors and user use habit reliability key influence factors; an air conditioner reliability influence factor-based regional clustering analysis comprehensive evaluation model is constructed; judgment criterion of air conditioner startup refrigeration and heating are formulated, the average consumption tendency indexes of the air conditioners are accurately quantified; and a weighted Ward clustering algorithm is adopted to carry out clustering analysis in the aspects of the working environment influence factors and the user use habit factor influence factors, so that an working environment influence factor clustering analysis result and a user use habit factor influence factor clustering analysis result are obtained, a secondary clustering method is adopted to integrate the working environment influence factor clustering analysis result and the user use habit factor influence factor clustering analysis result, so that final regional distribution can be obtained. With the air conditioner reliability influence factor-based regional clustering method of the invention adopted, the use reliability difference of air conditioners distributed in different areas can be minimum, and more scientific and more precise regional classification results can be obtained.
Owner:NANCHANG HANGKONG UNIVERSITY

Cluster and outlier detection method based on multi-agent evolution

The invention discloses a cluster and outlier detection method based on multi-agent evolution, which mainly addresses the difficulty that traditional outlier detection algorithms have in clustering data efficiently and detecting outliers on data sets of different densities. The method comprises the steps of S1, initializing, S2, running the K-means clustering algorithm on each intelligent agent, S3, calculating the energy of each intelligent agent, S4, performing a neighborhood competition operator, S5, performing a neighborhood crossover operator, S6, performing a mutation operator, S7, running the K-means clustering algorithm, S8, performing a self-learning operator, S9, updating the global optimal agent, S10, detecting outliers, S11, obtaining a judgment result, S12, exporting the outlier data, and S13, exporting the data points with their categories. The method can effectively improve clustering efficiency and outlier detection precision on data of different densities, reduces calculation time, and is applicable to data sets of different densities.
Owner:XIDIAN UNIV

Hybrid clustering algorithm based on adaptive cellular inheritance and optimal fuzzy C-means

The invention discloses a hybrid clustering algorithm based on adaptive cellular inheritance and optimal fuzzy C-means. The algorithm comprises the following steps that: utilizing Arnold Cat mapping to generate an initial population, and constructing a fitness function on the basis of the clustering criteria of the C-means; decoding individuals in the population to obtain a corresponding clustering center, distributing a degree of membership, and calculating a fitness value and the entropy of the population; carrying out state evolution on each individual, carrying out selection, dynamic intersection and a combined variation operation based on the entropy; automatically determining the fusion opportunity of the fuzzy C-means, and utilizing an implementation criteria to carry out a fuzzy C-means iterative operation; and judging whether an end condition is achieved or not, and outputting a final clustering result if the end condition is met. By use of the algorithm, the characteristics of the high global search capability of an adaptive cellular genetic algorithm and the high local search capability of a fuzzy C-means algorithm are further utilized. Compared with the prior art, the algorithm is characterized in that higher clustering efficiency and accuracy can be obtained.
Owner:NANCHANG HANGKONG UNIVERSITY

Self-adapting preferred fuzzy kernel clustering based naphtha attribute clustering method

The invention relates to a fuzzy kernel clustering method for self-adapting preferred naphtha attribute clustering numbers based on Gaussian nucleation validity indexes. The density and distance based initial clustering center selection method successfully solves the problems that fuzzy kernel clustering methods are sensitive to initial values, the operation speed is low and the implementing time is long. According to the method, a density measurement method is defined to select data objects with high densities to serve as initial clustering centers, the problem that the method is sensitive to initial values is effectively solved, and simultaneously, a two-dimensional array is set to store distances among naphtha attribute data, so that the calculation time is greatly shortened, and the clustering efficiency is improved.
Owner:EAST CHINA UNIV OF SCI & TECH

Clustering model training method and device, electronic equipment and computer storage medium

The embodiment of the invention discloses a clustering model training method and device, electronic equipment and a computer storage medium. The method comprises the steps that a global clustering model is acquired based on master nodes; according to any one of at least one slave node in a distributed system, the global clustering model is acquired from the master nodes in the distributed system, clustering estimation is performed based on the global clustering model and calculation data allocated to any corresponding slave node, and a local clustering model corresponding to any slave node is obtained; the distributed system comprises the master nodes and at least one slave node in communicating connection with the master nodes, wherein the master nodes are in communicating connection with all the slave nodes; and the global clustering model is trained based on the obtained local clustering models corresponding to all the slave nodes. Through the clustering model training method in the embodiment, the synchronization rate among the calculation nodes is lowered, and clustering efficiency is improved.
Owner:BEIJING SENSETIME TECH DEV CO LTD
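
The master/slave flow in the abstract above can be illustrated with a single-process simulation. This is a sketch under assumptions, not the patented method: the clustering model is assumed to be a set of k-means centroids, each "slave" is a plain array shard, and the local model is assumed to be per-cluster sums and counts. The names `local_estimate` and `train_global` are hypothetical.

```python
import numpy as np

def local_estimate(centroids, shard):
    """Slave side: assign local points to the global centroids, return sums and counts."""
    d = ((shard[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    labels = d.argmin(axis=1)
    k, dim = centroids.shape
    sums = np.zeros((k, dim))
    counts = np.zeros(k)
    for j in range(k):
        members = shard[labels == j]
        sums[j] = members.sum(axis=0)
        counts[j] = len(members)
    return sums, counts

def train_global(centroids, shards, rounds=5):
    """Master side: merge the local statistics of all slaves once per round."""
    centroids = np.asarray(centroids, dtype=float).copy()
    for _ in range(rounds):
        results = [local_estimate(centroids, s) for s in shards]
        total_sums = sum(r[0] for r in results)
        total_counts = sum(r[1] for r in results)
        nonempty = total_counts > 0  # leave centroids with no members unchanged
        centroids[nonempty] = total_sums[nonempty] / total_counts[nonempty][:, None]
    return centroids
```

In this sketch the slaves exchange only small summary statistics with the master, once per round, rather than raw data points.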

A machine learning method for locally missing multi-view clustering based on matrix-guided regularization

The invention relates to a machine learning method for locally missing multi-view clustering based on matrix-guided regularization. The method fuses filling and clustering: it fills the missing kernels under the guidance of clustering, clusters with the filled kernels, and introduces matrix-guided regularization when filling the missing kernels. The method comprises the following steps: 1) obtaining target data samples and the target number of clusters, and mapping the target data samples to a multi-kernel space; 2) introducing matrix-guided regularization to establish a regularized locally missing multi-kernel k-means clustering optimization objective function; 3) solving the regularized locally missing multi-kernel k-means clustering optimization objective function in a cyclic manner to realize clustering. Compared with the prior art, the present invention has the advantages of good clustering effect, low calculation amount and the like.
Owner:聚时科技(上海)有限公司

Method for establishing electrical power system clustering load model

The invention discloses a method for establishing an electrical power system clustering load model. The time-sequence load curves are first sorted to obtain load duration curves. According to the contribution of each load level to the adequacy index, the load duration curves are then divided into three subareas: a high contribution degree area, a medium contribution degree area and a low contribution degree area. For the high contribution degree area, a hierarchical clustering algorithm is adopted to select the initial values of the clustering centers; for the medium contribution degree area, a mean value-standard deviation method is adopted; and for the low contribution degree area, the initial values are set according to experience or at random. An improved efficiency index is defined and used as the convergence condition, and the number of clusters in the K-means clustering algorithm is then determined. The clustering load model obtained through the above method has high computational accuracy and rapid convergence when used in power system adequacy evaluation.
Owner:HOHAI UNIV
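
The first two steps of the abstract above (sorting into a load duration curve and dividing it into three areas) can be sketched as follows. The abstract does not say how the contribution to the adequacy index is computed, so the split points here are hypothetical fixed percentages (`high_pct`, `mid_pct`), and the function names are illustrative.

```python
def load_duration_curve(loads):
    """Sort a time-sequence load curve from highest to lowest load."""
    return sorted(loads, reverse=True)

def split_zones(ldc, high_pct=10, mid_pct=30):
    """Split a load duration curve into high, medium and low areas.

    The percentages are assumed stand-ins for the contribution-based
    boundaries, which the abstract does not specify.
    """
    n = len(ldc)
    high_end = n * high_pct // 100
    mid_end = high_end + n * mid_pct // 100
    return ldc[:high_end], ldc[high_end:mid_end], ldc[mid_end:]
```

Each area would then receive its own initial clustering centers, as described in the abstract, before K-means is run.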

Method and device of determining webpage clustering mode

The invention discloses a method and a device for determining a webpage clustering mode. The method comprises the steps of: obtaining the main prefix of the uniform resource locator of a webpage to be clustered; segmenting the main prefix to obtain a plurality of fields; matching the plurality of fields against at least one preset reserve field in a reserve field dictionary and the position information of each preset reserve field; taking the fields that are identical to a preset reserve field and located at the corresponding position as reserve fields; and generating the clustering mode of the uniform resource locator of the webpage to be clustered according to the reserve fields among the plurality of fields and the position information of the reserve fields. The invention also discloses a device for realizing the method. According to the technical scheme, more webpages can be clustered under one clustering mode, the clustering effect of the webpages can be effectively optimized, and the clustering efficiency of the webpages is improved.
Owner:BEIJING QIHOO TECH CO LTD +1
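
A minimal sketch of the field-matching idea in the abstract above, under several assumptions: the "main prefix" is taken to be the host plus path, fields are path segments, the reserve field dictionary is a set of hypothetical `(position, field)` pairs, and non-reserve fields are replaced by a `*` wildcard. None of these details are given in the abstract.

```python
from urllib.parse import urlsplit

# Hypothetical reserve field dictionary: (position in path, field text).
RESERVED = {(0, "news"), (1, "sports")}

def clustering_pattern(url, reserved=RESERVED, wildcard="*"):
    """Keep fields that match a reserve field at the same position, mask the rest."""
    parts = urlsplit(url)
    fields = [f for f in parts.path.split("/") if f]
    kept = [f if (i, f) in reserved else wildcard for i, f in enumerate(fields)]
    return parts.netloc + "/" + "/".join(kept)
```

URLs that produce the same pattern string would fall into the same cluster, so masking variable fields such as article IDs lets more pages share one pattern.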