53 results about "Precision and recall" patented technology

In pattern recognition, information retrieval and classification (machine learning), precision (also called positive predictive value) is the fraction of relevant instances among the retrieved instances, while recall (also known as sensitivity) is the fraction of all relevant instances that were actually retrieved. Both precision and recall are therefore based on an understanding and measure of relevance.
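
As a small illustration of the two definitions above, the following Python sketch computes precision and recall from a set of retrieved items and a set of relevant items. The function name, the document identifiers and the convention of returning 0.0 for an empty denominator are assumptions made for this example.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for two collections of item identifiers."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)  # relevant items that were actually retrieved
    # Assumed convention: an empty denominator yields 0.0 instead of an error.
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical example: 4 documents retrieved, 3 of them relevant, 6 relevant overall.
p, r = precision_recall({"d1", "d2", "d3", "d4"},
                        {"d1", "d2", "d3", "d5", "d6", "d7"})
print(p, r)  # 0.75 0.5
```

Retrieving more items can only keep recall the same or raise it, while precision usually falls, which is why many of the entries below report both figures together.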

Automated evaluation systems & methods

This invention uses linguistic principles, which together can be called Collocational Cohesion (CC), to evaluate and sort documents automatically into one or more user-defined categories, with a specified level of precision and recall. Human readers are not required to review all of the documents in a collection, so this invention can save time and money for any manner of large-scale document processing, including legal discovery, Sarbanes-Oxley compliance, creation and review of archives, and maintenance and monitoring of electronic and other communications. Categories for evaluation are user-defined, not pre-set, so that users can adopt either traditional categories (such as different business activities) or custom, highly specific categories (such as perceived risks or sensitive matters or topics). While the CC process is not itself a general tool for text searches, the application of the CC process to large collections of documents will result in classifications that allow for more efficient indexing and retrieval of information. This invention works by means of linguistic principles. Everyday communication (letters, reports, emails: all kinds of communication in language) does follow the grammatical patterns of a language, but forms of communication also follow other patterns that analysts can specify but that are not obvious to their authors. The CC process uses that additional information for the purposes of its users. Any communication exchange that can be recognized as a particular kind of discourse may be used as a category for classification and assessment. Specific linguistic characteristics that belong to the kind of discourse under study can be asserted and compared with a body of general language, both by inspection and by mathematical tests of significance. These characteristics can then be used to form the roster of words and collocations that specifies the discourse type and defines the category.
When such a roster is applied to collections of documents, any document with a sufficient number of connections to the roster will be deemed to be a member of the category. Larger documents can be evaluated for clusters of connections, either to identify portions of the larger document for further review, or to subcategorize portions with different linguistic characteristics. The CC process may be extended to create a roster of rosters belonging to many categories, thereby increasing the specificity of evaluation by multilevel application of this invention. The CC process works better than other processes used for document management that rely on non-linguistic means to characterize documents. Simple keyword searches either retrieve too many documents (for general keywords), or not the right documents (because a few keywords cannot adequately define a category), no matter how complex the logic of the search. Application of statistical analysis without attention to linguistic principles cannot be as effective as this invention, because the words of a language are not randomly distributed. The assumptions of statistics, whether simple inferential tests or advanced neural network analysis, are thus not a good fit for language. This invention puts basic principles of language first, and only then applies the speed of computer searches and the power of inferential statistics to the problem of evaluation and categorization of textual documents.
Owner:TEXT TECH

Method and apparatus for semantic search of schema repositories

Mechanisms for searching XML repositories for semantically related schemas from a variety of structured metadata sources, including web services, XSD documents and relational tables, in databases and Internet applications. A search is formulated as a problem of computing a maximum matching in pairwise bipartite graphs formed from query and repository schemas. The edges of such a bipartite graph capture the semantic similarity between corresponding attributes of the schema based on their name and type semantics. Tight upper and lower bounds are also derived on the maximum matching that can be used for fast ranking of matchings whilst still maintaining specified levels of precision and recall. Schema indexing is performed by ‘attribute hashing’, in which matching schemas of a database are found by indexing using query attributes, performing lower bound computations for maximum matching and recording peaks in the resulting histogram of hits.
Owner:IBM CORP
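
The schema-search entry above frames search as a maximum matching in a bipartite graph between query attributes and repository attributes. The Python sketch below shows one standard way to compute such a matching (augmenting paths, often called Kuhn's algorithm). It is not the patented method: the string-ratio similarity from `difflib`, the 0.6 threshold and the attribute names are all stand-in assumptions, and the upper and lower bounds and attribute hashing mentioned in the abstract are not implemented.

```python
from difflib import SequenceMatcher


def similar(a, b, threshold=0.6):
    """Assumed stand-in for semantic similarity: case-insensitive string ratio."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio() >= threshold


def max_matching(query_attrs, repo_attrs, threshold=0.6):
    """Return (size, pairs) of a maximum matching found with augmenting paths."""
    # Edge i -> j exists when the two attribute names are similar enough.
    edges = [[j for j, r in enumerate(repo_attrs) if similar(q, r, threshold)]
             for q in query_attrs]
    match_of_repo = [-1] * len(repo_attrs)  # repo index -> matched query index

    def try_augment(i, seen):
        for j in edges[i]:
            if j in seen:
                continue
            seen.add(j)
            # Take j if it is free, or if its current partner can be re-matched.
            if match_of_repo[j] == -1 or try_augment(match_of_repo[j], seen):
                match_of_repo[j] = i
                return True
        return False

    size = sum(try_augment(i, set()) for i in range(len(query_attrs)))
    pairs = [(query_attrs[i], repo_attrs[j])
             for j, i in enumerate(match_of_repo) if i != -1]
    return size, pairs


# Hypothetical query schema and repository schema.
query = ["customer_name", "zip_code", "phone"]
repo = ["CustomerName", "zipcode", "color"]
size, pairs = max_matching(query, repo)
print(size, pairs)
```

A larger matching relative to the number of query attributes suggests a closer schema, so the matching size can serve as a ranking score across repository schemas.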

Method and apparatus for automatic information filtering using URL hierarchical structure and automatic word weight learning

Disclosed method and apparatus for automatic information filtering are capable of improving both precision and recall and accurately judging inappropriateness of the content even for a page that contains very few or no text information and only displays images, by utilizing an upper level URL of a URL given in a hierarchical structure. Also, disclosed method and apparatus for automatic information filtering are capable of setting weights of words easily and accurately and judging inappropriateness of the information by utilizing these weights, by using an automatic learning based on a linear discrimination function that can discriminate the inappropriate information and the appropriate information on a vector space.
Owner:KDD CORP

System and method for indexing electronic discovery data

Systems and methods for efficiently processing electronically stored information (ESI) are described. The systems and methods describe processing ESI in preparation for, or association with, litigation. The invention preserves the contextual relationships among documents when processing and indexing data, allowing for increased precision and recall during data analytics.
Owner:PLANET DATA SOLUTIONS INC

Agricultural field ontology library based semantic retrieval system and method

Active | CN102073692A | Semantic retrieval is accurate and efficient | Improve accuracy | Special data processing applications | Data Origin | Extension set
The invention relates to an agricultural field ontology library based semantic retrieval system and method, belonging to the technical field of intelligent retrieval. In order to improve the accuracy and the efficiency of an agricultural field information semantic retrieval process, only the useful structured data in a webpage are extracted by using an information extraction technology and used as the basic resource for retrieving, thus the structural property and the accuracy of the retrieval data source are greatly ensured in the stage of the basic resource of data; and then the comprehensive and professional agricultural-industry oriented ontology library is established, the semantic extension and inference is carried out according to the inquiry request of the user on the basis of a semantic ontology inference engine through the participation of the user, and the natural language submitted by the user is processed or the extension result is returned to the user once again so that the weight of each ontology example in a semantic extension set can be determined accurately in the participation process of the user, the extended ontology example set meets the inquiry requirement of the user, and further the final retrieval precision and recall rate are improved.
Owner:BEIJING RES CENT FOR INFORMATION TECH & AGRI

Phishing mail inspection method based on text characteristic analysis

The invention provides a phishing mail inspection method based on text characteristic analysis, which is characterized by comprising the following steps: eliminating non-text contents in mails; utilizing a mail analyzer to analyze the mails; utilizing a regular expression algorithm to extract site links in the mails; utilizing the regular expression algorithm again to extract relevant characteristics from the site links; and using the domain name to query a registration search engine to obtain the site registration date characteristics. The extracted text characteristics are the characteristic vectors of the mails. Tests show that the method improves the precision and recall of phishing mail detection and saves program running time and overhead. In the method, original text characteristics are screened, so that a plurality of characteristics with preferable effects are selected. These characteristics are combined with the characteristics of phishing mails and the current research base so as to provide several new text characteristics aimed at the inspection of phishing mails. The method is utilized to inspect suspicious mails.
Owner:NANJING UNIV OF POSTS & TELECOMM

Method and apparatus for incremental computation of the accuracy of a categorization-by-example system

Disclosed are methods for incrementally updating the accuracy provided by documents in a training set used for automatic categorization. A k-nearest neighbor database includes the documents in the training set, categories, category assignments of the documents and category scores for the documents. A list of the nearest neighbors of the documents and the corresponding similarity scores is maintained by the method. On adding or deleting documents or category assignments, the documents influenced by the changed documents or category assignments are identified. The category scores of the identified documents are updated to be consistent with the updated training set, and new precision and recall curves are computed for the categories with updated category scores. The precision and recall curves may be used to determine an optimal number of documents to maximize the return of relevant documents while minimizing the total number of documents.
Owner:SAP AMERICA
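
The entry above uses precision and recall curves to pick how many documents to return. A minimal Python sketch of that idea follows, assuming the documents are already sorted by category score and that the cutoff is chosen by the highest F1 value. The F1 criterion and the sample labels are assumptions for illustration; the abstract does not say how the optimal number is chosen.

```python
def pr_curve(ranked_labels):
    """Precision and recall after each cutoff k of a ranked list.

    ranked_labels: booleans, best-scoring document first; True means the
    document really belongs to the category.
    """
    total_relevant = sum(ranked_labels)
    curve, hits = [], 0
    for k, is_relevant in enumerate(ranked_labels, start=1):
        hits += is_relevant
        precision = hits / k
        recall = hits / total_relevant if total_relevant else 0.0
        curve.append((k, precision, recall))
    return curve


def best_cutoff(curve):
    """Pick the cutoff with the highest F1 (an assumed criterion)."""
    def f1(point):
        _, p, r = point
        return 2 * p * r / (p + r) if p + r else 0.0
    return max(curve, key=f1)[0]


# Hypothetical ranked list: True marks a document that is truly in the category.
ranked = [True, True, False, True, False, False]
curve = pr_curve(ranked)
k = best_cutoff(curve)
print(k, curve[k - 1])  # 4 (4, 0.75, 1.0)
```

Returning the top four documents here captures every relevant document while keeping three of the four results relevant.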

An Improved YOLOV3 Target Recognition Algorithm Embedded in SENet Structure

The invention relates to the field of deep learning, in particular to an improved YOLOV3 target recognition algorithm embedded with a SENet structure, comprising the steps of: S100: collecting characteristic information of an object to be recognized and making a data set, the feature information including image information; S300: taking a part of the data set as a training set and the rest of the data set as a test set; S500: embedding the SE structure in the YOLOV3 algorithm to obtain the SE-YOLOV3 algorithm; S600: training SE-YOLOV3 on the training set; S700: testing SE-YOLOV3 performance on the test set. The improved YOLOV3 target recognition algorithm embedded in a SENet structure can accurately identify target parts even when many incomplete parts interfere in a sample picture, so as to obtain higher precision and recall.
Owner:SHENZHEN GRADUATE SCHOOL TSINGHUA UNIV

Remote sensing target detection method based on boundary constraint CenterNet

The invention provides a remote sensing target detection method based on boundary constraint CenterNet, which is used for solving the technical problems of relatively low detection precision and recall rate of dense small targets in the prior art. The method comprises the following implementation steps: obtaining a training sample set; constructing a boundary constraint CenterNet network; obtaining a prediction label and an embedded vector of the training sample set; calculating the loss of the boundary constraint CenterNet network; training the boundary constraint CenterNet network; and obtaining a target detection result based on the trained boundary constraint CenterNet network. By performing maximum pooling in the constrained pooling area through the corner constraint pooling layer, the fine features around the target are extracted and the detection precision and recall rate of dense small targets are effectively improved. Meanwhile, the boundary constraint label generated by the boundary constraint convolutional network is used to constrain the prediction box, so that a more accurate target prediction box is obtained and the detection precision of the target is further improved.
Owner:XIDIAN UNIV

A network attack type recognition method based on multi-layer detection

The invention relates to a network attack type identification method based on multi-layer detection, belonging to the technical field of information security. The specific operation steps are as follows: Step 1, acquire and preprocess the original training data. Step 2, construct an ensemble classification model. Step 3, train the ensemble classification model. Step 4, preprocess the test data. Step 5, classify the test data. Compared with the existing technology, the network attack type identification method based on multi-layer detection proposed in this patent has the following advantages: (1) A smart algorithm upsamples the classes with few samples and downsamples the classes with many samples, which solves the problem of imbalance of data set samples. (2) Using the ensemble model, the precision and recall rate of the detection are improved. (3) The fruit fly optimization algorithm (FOA) is combined with a support vector machine (SVM) to realize the optimal and adaptive selection of the parameters C and gamma in the SVM.
Owner:BEIJING INSTITUTE OF TECHNOLOGY

Disease prediction system using open source data

Described is a disease prediction system using open source data. The system includes a preprocessing module, a learning module, and a prediction module. The preprocessing module receives a dataset of N trend results related to a disease event and generates an enhanced filter signal (EFS) curve related to the disease event. The learning module receives the EFS curve and generates a predicted number of cases of the disease event and, using a plurality of machine learning methods, generates a plurality of predictions that the disease event will happen within a future time period. The prediction module determines precision and recall for each of the plurality of predictions and, based on the precision and recall, provides a likelihood that the disease event will occur.
Owner:HRL LAB

A large-scale knowledge map fusion method based on reduced anchor points

The invention provides a large-scale knowledge map fusion method based on reduced anchor points, which comprises the following steps: large-scale knowledge map analysis and pretreatment; construction of a reduction set: calculating the similarity of semantic description documents between entities in two knowledge maps; determining positive reduction anchor points and negative reduction anchor points; hybrid matching algorithm: predicting a large number of matching positions in subsequent matching computation according to the reduced anchor points; matching result extraction. The invention can effectively handle the large-scale knowledge fusion task in practical application, and has good effect and performance. The invention does not need to divide the large knowledge map in the matching process, thereby avoiding the semantic information loss caused by division failure of the large knowledge map, ensuring the accuracy and recall rate of the matching result, and having a matching efficiency equal to that of the divide-and-conquer method that partitions the knowledge map.
Owner:SOUTHEAST UNIV

Convolutional neural network model-based violence and terrorism video detection method

The invention discloses a convolutional neural network model-based violence and terrorism video detection method, and relates to computer vision and machine learning. The method comprises the following steps of 1) training a deep neural network model; and 2) detecting violence and terrorism videos online. By utilizing a low-level feature of a deep learning model combination, a more abstract high-level representation attribute or feature is formed to discover distributed feature representation of data. Video image feature descriptors with good description capabilities can be obtained through the model. The feature descriptors cover feature information, at all levels from low to high, of video images, so that the detection precision and recall rate of violence and terrorism videos are greatly improved and increased. A deep convolutional network is trained through a small amount of samples to obtain excellent detection performance. The detection precision of terrorism pictures reaches more than 99%, and the recall rate of the terrorism pictures reaches more than 98%. The detection precision of terrorism videos reaches 95%, and the recall rate of the terrorism videos reaches 99%. The training process is free from artificial participation, and massive data is generated automatically according to a small amount of the samples.
Owner:XIAMEN UNIV

Building recognition method based on multi-feature fusion

The invention provides a building identification method based on multi-feature fusion, which comprises the following steps: extracting the Gabor-HOG feature from an input multi-spectral image; fusing the extracted Gabor-HOG feature and the RGB color feature to form a low-level feature vector. The low-level feature vectors are input into the trained deep belief network model to extract the high-level feature vectors and generate the posterior probability of each pixel. The posterior probability of each pixel is input into the trained conditional random field model, the contextual feature of the neighborhood information of each pixel is extracted, and the building target is identified according to the maximum posterior probability. Because the low-level visual features are designed, a deep belief network is used to extract high-level building features, and a conditional random field is used to extract building context features, the problem of low building recognition rates caused by oversimplified building feature extraction in traditional methods is solved, and the precision and recall rate of building recognition can be improved.
Owner:NORTH CHINA UNIVERSITY OF TECHNOLOGY

Method for detecting same name of document writers

Active | CN106021424A | Avoid situations where less than desired results are achieved | Avoid Over-Identification Problems | Special data processing applications | Text database clustering/classification | Pattern recognition | Name disambiguation
The present invention discloses a method for detecting the same name of document writers, belonging to the technical field of data mining. The method makes full use of the characteristics of same-name disambiguation by single-characteristic similarity and by single-characteristic fusion in scientific literature. The method includes the steps of first modeling the documents to be used, then calculating the similarity of every two single characteristics by using a single-characteristic similarity detection method, and calculating the identification capability of each single characteristic by using a disambiguation method based on the single-characteristic similarity, so as to design multi-characteristic fusion disambiguation rules and provide a method for detecting the same name of the document writers. The detection method integrates the advantages of the single characteristics in disambiguating the physical writer names, so that the method has high accuracy and recall rate in identification.
Owner:NANJING UNIV OF POSTS & TELECOMM

Method for detecting the similarity of the patent documents on the basis of new kernel function Luke kernel

A method for detecting the similarity of the patent documents based on a new kernel function Luke kernel comprises: dividing a patent document into five elements, i.e. patent title, abstract, claims, description, and main classification, constructing a new kernel function Luke kernel, calculating the similarity of the first four elements of two patent documents by using the Luke kernel, calculating the similarity between the main classifications of the two patent documents by means of character string matching, and then performing a weighted summation of the similarities of the five elements of the two patent documents to obtain an overall similarity of the patent documents. The method further improves the precision and recall in detecting the similarity of the patent documents, and can be applied to detection for the similarity of the patent documents.
Owner:JIANGSU UNIV +1
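
The final step of the entry above, a weighted summation of five per-element similarities, can be sketched in Python as follows. The Luke kernel itself is not defined in the abstract, so a token-level Jaccard similarity is swapped in as a placeholder. The weights, the field names and the exact-equality test for the main classification are likewise assumptions for illustration only.

```python
def jaccard(a, b):
    """Placeholder text similarity (token Jaccard); NOT the Luke kernel."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / len(ta | tb) if ta | tb else 0.0


# Hypothetical weights for the five elements; they sum to 1.
WEIGHTS = {"title": 0.2, "abstract": 0.2, "claims": 0.3,
           "description": 0.2, "main_class": 0.1}


def patent_similarity(p, q, weights=WEIGHTS):
    """Weighted sum of per-element similarities of two patent records (dicts)."""
    total = 0.0
    for field, w in weights.items():
        if field == "main_class":
            # Assumed string matching: 1.0 if the classifications are identical.
            sim = 1.0 if p[field] == q[field] else 0.0
        else:
            sim = jaccard(p[field], q[field])
        total += w * sim
    return total


# Hypothetical records that share only their main classification.
a = {"title": "image sensor", "abstract": "pixel array readout",
     "claims": "a sensor comprising pixels", "description": "detailed optics",
     "main_class": "H04N"}
b = {"title": "engine valve", "abstract": "fuel injection timing",
     "claims": "an injector with nozzle", "description": "combustion chamber",
     "main_class": "H04N"}
print(patent_similarity(a, a), patent_similarity(a, b))
```

Keeping the per-element similarity as a separate function means a real kernel could replace `jaccard` without touching the weighting logic.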

Scenarized merchant recall method and device, electronic equipment and readable storage medium

The invention discloses a scenarized merchant recall method and device, electronic equipment and a readable storage medium. The method comprises the steps that given scene words are acquired and expanded into a keyword set containing N words, where N is a natural number; initial seed merchants are determined from platform merchants based on each keyword in the keyword set; the initial seed merchants are filtered according to the keyword set to obtain high-precision seed merchants; and the final recalled merchants for the given scene words are determined according to the high-precision seed merchants. This solves the technical problems that existing scenarized merchant recall methods are high in labor cost and time cost, cannot achieve both precision and recall rate, and are not highly adaptable. The method reduces time cost and labor cost and improves the recall precision and the recall rate.
Owner:BEIJING SANKUAI ONLINE TECH CO LTD

Intelligent advertisement identifying method

The invention discloses an intelligent advertisement identifying method and especially relates to a method for identifying advertisements in mass information. The method comprises the following steps: establishing a word stock and a stop-word stock, wherein the stop-word stock contains adverbs and modal particles with a high probability of occurrence; selecting samples including advertisements and common information; respectively extracting the characteristics of the advertisements and of the common information; calculating the two-class characteristic probabilities according to the Bayesian algorithm and generating a model; continuously optimizing the model during use, thereby increasing the accuracy and recall rate of the model in judging advertisements; and, if the probability that a piece of information is an advertisement is higher than the probability that it is common information, judging the information to be an advertisement.
Owner:SHENZHEN INVENO TECH
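
The decision rule in the entry above (label a text as an advertisement when its advertisement probability exceeds its common-information probability) matches a two-class naive Bayes classifier. The Python sketch below is one minimal way to write such a classifier, not the patented model: the stop-word list, the training samples, whitespace tokenization and add-one smoothing are all assumptions made for illustration.

```python
import math
from collections import Counter

STOP_WORDS = {"very", "really", "oh", "the", "a"}  # hypothetical stop-word stock


def tokens(text):
    """Lowercase, split on whitespace and drop stop words."""
    return [w for w in text.lower().split() if w not in STOP_WORDS]


def train(samples):
    """samples: list of (text, label) pairs with label 'ad' or 'info'."""
    word_counts = {"ad": Counter(), "info": Counter()}
    doc_counts = Counter()
    for text, label in samples:
        doc_counts[label] += 1
        word_counts[label].update(tokens(text))
    vocab = set(word_counts["ad"]) | set(word_counts["info"])
    return word_counts, doc_counts, vocab


def log_score(text, label, model):
    """log P(label) + sum of log P(word | label), with add-one smoothing."""
    word_counts, doc_counts, vocab = model
    total_words = sum(word_counts[label].values())
    score = math.log(doc_counts[label] / sum(doc_counts.values()))
    for w in tokens(text):
        score += math.log((word_counts[label][w] + 1) / (total_words + len(vocab)))
    return score


def is_advertisement(text, model):
    """True when the advertisement score beats the common-information score."""
    return log_score(text, "ad", model) > log_score(text, "info", model)


# Hypothetical training samples.
model = train([
    ("buy cheap watches now", "ad"),
    ("limited offer buy now", "ad"),
    ("meeting notes for the project", "info"),
    ("weather report for tomorrow", "info"),
])
print(is_advertisement("buy now offer", model))           # True
print(is_advertisement("project meeting report", model))  # False
```

Retraining with newly labeled samples is the simplest form of the continuous optimization the abstract mentions, since the model consists only of word and document counts.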

Vibe moving target detection method based on gray level image feature matching

Active | CN109978916A | Solve the problem of incomplete suppression | Solve ghosting | Image analysis | 2D-image generation | Match algorithms | Imaging Feature
The invention provides a Vibe moving target detection method based on gray level image feature matching. The method is used for solving the problem that in the prior art, the moving target detection precision and recall rate are low. The method comprises the following implementation steps: (1) inputting a video A; (2) converting the first frame of image of the video A into a grayscale image G0; (3) constructing a Vibe background model of the grey-scale map G0; (4) labeling a foreground point area in each frame of image behind the first frame of image of the video A; (5) based on an image feature matching algorithm, performing ghosting area discrimination on a T-2R-1 frame grey-scale image; (6) updating the Vibe background model of the grayscale image G0; and (7) obtaining a moving target area which does not contain the ghosting area. According to the method, the foreground area is discriminated and the ghosting area and the noise area are eliminated by adopting a grayscale image feature matching algorithm, so that high-precision detection on the moving target is realized, and the method can be used for tracking the moving target and analyzing behaviors in a monitoring video.
Owner:XIDIAN UNIV

Spinous slow complex wave detection model construction method and system

The invention discloses a method for constructing a spinous slow complex wave detection model based on a recurrent neural network and prior knowledge. The method comprises the following steps: processing samples to obtain a training set, a verification set and a test set; performing artifact discrimination on the test set through an artifact filter, and outputting brain waves which are not artifacts to form a target test set; inputting the training set into a long short-term memory model for training, calculating the probability that each input is a spinous slow complex wave, and outputting the corresponding data labels whose probability value is greater than a set probability threshold T; performing verification with the verification set to obtain a target long short-term memory model; and performing target model detection. According to the invention, neurons of a recurrent neural network are used to autonomously learn non-linear features of spinous slow complex wave classification that are not easy to design and describe by hand; artifact filtering is carried out before detection, so that the accuracy of the model is improved; and by setting the threshold value, detection results with different precision and recall rates are output according to different requirements.
Owner:CHINA ELECTRONIC TECH GRP CORP NO 38 RES INST