361 results about "Counter propagation" patented technology

Relevance score assignment for artificial neural networks

The task of relevance score assignment to a set of items onto which an artificial neural network is applied is obtained by redistributing an initial relevance score derived from the network output, onto the set of items by reversely propagating the initial relevance score through the artificial neural network so as to obtain a relevance score for each item. In particular, this reverse propagation is applicable to a broader set of artificial neural networks and / or at lower computational efforts by performing same in a manner so that for each neuron, preliminarily redistributed relevance scores of a set of downstream neighbor neurons of the respective neuron are distributed on a set of upstream neighbor neurons of the respective neuron according to a distribution function.
Owner:FRAUNHOFER GESELLSCHAFT ZUR FOERDERUNG DER ANGEWANDTEN FORSCHUNG EV +1
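
The reverse redistribution described in this abstract can be illustrated for a single fully connected layer. This is a minimal sketch and not the patented distribution function: it assumes a proportional rule in which each input receives a share of an output's relevance equal to its share of that output's pre-activation. The names `redistribute_relevance` and `eps` are hypothetical.

```python
def redistribute_relevance(inputs, weights, relevance_out, eps=1e-9):
    """Redistribute relevance from a layer's outputs back onto its inputs.

    weights[i][j] connects input i to output j. Input i receives a share of
    output j's relevance proportional to its contribution inputs[i] * weights[i][j].
    The proportional rule and the stabiliser eps are assumptions of this sketch.
    """
    n_in, n_out = len(inputs), len(relevance_out)
    relevance_in = [0.0] * n_in
    for j in range(n_out):
        contributions = [inputs[i] * weights[i][j] for i in range(n_in)]
        total = sum(contributions)
        # Keep the denominator away from zero without changing its sign.
        denom = total + (eps if total >= 0 else -eps)
        for i in range(n_in):
            relevance_in[i] += contributions[i] / denom * relevance_out[j]
    return relevance_in


x = [1.0, 2.0]
w = [[0.5, 1.0], [0.25, 0.5]]
r_out = [1.0, 2.0]  # relevance already assigned to the layer's outputs
r_in = redistribute_relevance(x, w, r_out)
```

Under this rule the total relevance is approximately conserved from one layer to the next, so repeating the step layer by layer yields one score per input item.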

Fiber Coupling Technique on a Waveguide

Active | US20130022316A1 | Easy to manufacture | Reduce the difficulty of polishing | Coupling light guides | Tip position | Optical coupling
An optical coupling assembly for coupling light from an optical fiber including an angled tip into a planar waveguide via a waveguide coupling element is provided. In one embodiment, the optical fiber extends along the planar waveguide with the angled tip positioned such that light propagating in the optical fiber is coupled by the waveguide coupling element to propagate in the planar waveguide in counter propagation with respect to a fiber propagation direction. In another embodiment, the optical fiber includes a tapered peripheral portion tapering toward the angled tip and is disposed over the planar waveguide with the tapered peripheral portion extending therealong such that light propagating in the optical fiber is coupled to propagate in the planar waveguide with either forward or counter propagation. Embodiments of the present invention may be part of various photonic integrated circuits and may be manufactured more easily than known optical coupling assemblies.
Owner:CIENA

Character detection method and device based on deep learning

The invention discloses a character detection method and device based on deep learning. The method comprises the steps: designing a multilayer convolution neural network structure, and enabling each character to serve as a class, thereby forming a multi-class classification problem; employing a counter propagation algorithm for the training of a convolution neural network, so as to recognize a single character; minimizing a target function of the network in a supervision manner, and obtaining a character recognition model; finally employing a front-end feature extracting layer for weight initialization, changing the node number of a last full-connection layer into two, enabling a network to become a two-class classification model, and employing character and non-character samples for training the network. Through the above steps, one character detection classifier can complete all operation. During testing, the full-connection layer is converted into a convolution layer. A given input image needs to be scanned through a multi-dimension sliding window, and a character probability graph is obtained. A final character region is obtained through non-maximum-value inhibition.
Owner:INST OF AUTOMATION CHINESE ACAD OF SCI +1
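
The final step of this abstract, non-maximum suppression over candidate character regions, can be sketched as follows. The box format `(x1, y1, x2, y2)`, the greedy strategy, and the `iou_threshold` value are assumptions for illustration; the abstract does not specify them.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def non_max_suppression(boxes, scores, iou_threshold=0.5):
    """Greedily keep the highest-scoring boxes, dropping heavy overlaps."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    kept = []
    for i in order:
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in kept):
            kept.append(i)
    return kept


# Three candidate windows with character probabilities from the classifier.
boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
kept = non_max_suppression(boxes, scores)
```

Here the second window overlaps the first too strongly and is suppressed, so only the first and third windows remain.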

Image analysis neural network systems

A method includes determining object class probabilities of pixels in a first input image by examining the first input image in a forward propagation direction through layers of artificial neurons of an artificial neural network. The object class probabilities indicate likelihoods that the pixels represent different types of objects in the first input image. The method also includes selecting, for each of two or more of the pixels, an object class represented by the pixel by comparing the object class probabilities of the pixels with each other, determining an error associated with the object class that is selected for each pixel of the two or more pixels, determining one or more image perturbations by back-propagating the errors associated with the object classes selected for the pixels of the first input image through the layers of the neural network without modifying the neural network, and modifying a second input image by applying the one or more image perturbations to one or more of the first input image or the second input image prior to providing the second input image to the neural network for examination by the neurons in the neural network for automated object recognition in the second input image.
Owner:GENERAL ELECTRIC CO
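
The idea of back-propagating an error to the input while leaving the network unchanged can be shown on a toy model. This sketch assumes a single sigmoid unit with a cross-entropy error, which is far simpler than the multi-layer pixel classifier in the abstract; `input_perturbation` and `step` are hypothetical names.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(x, w, b):
    """Probability output of a single sigmoid unit."""
    return sigmoid(sum(xi * wi for xi, wi in zip(x, w)) + b)


def input_perturbation(x, w, b, target, step=0.5):
    """Perturbation obtained by back-propagating the error to the input.

    For a sigmoid unit with cross-entropy error, dLoss/dz = prediction - target
    and dz/dx_i = w_i. The weights w and bias b are read but never modified.
    """
    error = predict(x, w, b) - target
    return [-step * error * wi for wi in w]


w, b = [1.0, -2.0], 0.0
x = [0.2, 0.4]
delta = input_perturbation(x, w, b, target=1.0)
x_perturbed = [xi + di for xi, di in zip(x, delta)]
```

Applying `delta` moves the input so that the unchanged model assigns a higher probability to the target class.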

Image marking method based on multi-mode deep learning

The invention discloses an image marking method based on multi-mode deep learning. The method comprises the following steps: firstly, a depth neural network is trained by utilization of images without labels; secondly, each single mode is optimized by utilization of counter propagation; finally, weights among different modes are optimized by utilization of on-line learning power gradient algorithm. The method employs a convolution neural network technology to optimize parameters of the depth neural network, and the marking precision is raised. Experiments of public data sets show that the method can raise the image marking performance effectively.
Owner:NANJING UNIV OF POSTS & TELECOMM

Deformable convolutional neural network-based infrared image object identification method

The invention discloses a deformable convolutional neural network-based infrared image object identification method. The method comprises the steps of constructing a training set and a test set; establishing a convolutional neural network architecture; adding a softmax classifier to the last layer, and setting an objective function; performing sampling by adopting a convolution kernel of linear or nonlinear deformation; performing pooling operation in a pooling layer by adopting a rule block sampling-based ROI pooling method which is the best in the industry at present; setting learning rate parameters according to experience; and easily performing standard back propagation end-to-end training, thereby obtaining a deformable convolutional neural network. An experiment proves that the spatial geometric deformation learning capability is introduced in the convolutional neural network, so that an identification task of an image with spatial deformation is better finished; the geometric transformation modeling capability of the convolutional neural network and the effectiveness of target detection and visual task identification are improved; and dense geometric deformation in space is successfully learnt.
Owner:GUANGDONG POWER GRID CO LTD +1

Training device for memristor-based neural network and training method thereof

The invention discloses a training device for a memristor-based neural network and a training method thereof. The neural network comprises N neuron layers which are connected one by one. The training method comprises the following steps of: inputting input data into a first neuron layer of the neural network so as to output an output result of the neural network at the Nth neuron layer, and calculating an output error of the Nth neuron layer; and counter-propagating the output error of the Nth neuron layer in a layer-by-layer manner so as to correct weight parameters between the neuron layers; and in the layer-by-layer counter-propagation process, three-valuing an output error of the mth neuron layer, and reversely inputting a voltage signal corresponding to an output result of the three-valuing operation to the mth neuron layer so as to correct a weight parameter of the mth neuron layer, wherein N is an integer greater than or equal to 3, and m is an integer greater than 1 and smaller than N. According to the training method, the calculation ability of the memristor-based neural network is improved.
Owner:TSINGHUA UNIV
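
The three-valuing step in this abstract can be read as mapping each layer error to one of three levels. The sketch below assumes a symmetric threshold and the levels -1, 0 and +1; the abstract does not state the threshold or how the levels map to voltages, so `three_value` and `threshold` are hypothetical.

```python
def three_value(errors, threshold):
    """Quantise each error to -1, 0 or +1 using a symmetric dead zone.

    The threshold is an assumed hyperparameter, not taken from the abstract.
    """
    levels = []
    for e in errors:
        if e > threshold:
            levels.append(1)
        elif e < -threshold:
            levels.append(-1)
        else:
            levels.append(0)
    return levels


layer_errors = [0.3, -0.02, -0.5]
signals = three_value(layer_errors, threshold=0.1)
```

Each resulting level could then select a positive, zero, or negative programming pulse for the corresponding memristor row, which keeps the write circuitry simple.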

APT attack detection method based on deep belief network-support vector data description

The invention discloses an advanced persistent threat (APT) attack detection method based on deep belief network-support vector data description. A deep belief network (DBN) is used for feature dimension-reduction and excellent feature vector extraction; and support vector data description (SVDD) is used for the data classification and detection. At the DBN training stage, the feature dimension-reduction is performed by using the DBN model after obtaining a standard data set; after the initial dimension-reduction, the high-level restricted Boltzmann machine (RBM) receives the simple representation transmitted from the low-level RBM so as to learn a more abstract and complex representation, and back propagation of a back propagation (BP) neural network is used for repeatedly adjusting a weight value until the data with excellent feature is extracted. The data processed by the DBN is divided into a training set and a testing set, and the data set is provided for the SVDD to perform training and identification detection, thereby obtaining the detection result. The attack detection method disclosed by the invention is suitable for the unsupervised attack data detection with large data size and high-dimension feature, is fit for the APT attack detection and can obtain an excellent detection result.
Owner:SHANGHAI MARITIME UNIVERSITY

Text classification method and device

The invention discloses a text classification method and device. The method comprises the following steps of: receiving a plurality of training texts, the categories of which are known; constructing graph structures of the training texts by adoption of a collinear relationship of words after preprocessing the training texts; training parameters of a convolutional neural network through a counter-propagation algorithm according to the graph structures of the training texts, so as to obtain a trained convolutional neural network; receiving an input to-be-classified text; constructing a graph structure of the to-be-classified text by adoption of the collinear relationship of the words after preprocessing the to-be-classified text; and predicting a category technical scheme of the to-be-classified text through the trained convolutional neural network according to the graph structure of the to-be-classified text. According to the method and device, the convolutional neural network is used for carrying out text classification, so that the text classification correctness and credibility are improved.
Owner:GUANGZHOU HKUST FOK YING TUNG RES INST

Super-short-term prediction method of photovoltaic power station irradiance

Active | CN103559561A | Ultra-short-term forecasting is effectively completed | Forecast effectively done | Forecasting | Algorithm | Short terms
The invention discloses a super-short-term prediction method of photovoltaic power station irradiance. The method includes the steps that irradiance data are extracted from a history database, data of a night time quantum are removed, corresponding extraterrestrial theoretical irradiance is calculated, data abnormal detection is carried out based on the preceding operations, and the data are normalized in the difference value ratio method of an extraterrestrial irradiance theoretical value and practical irradiance; a training sample set is extracted according to input and output dimensionality of a model; a model of an irradiance time sequence is built through an ANFIS, the rule quantity and an initial parameter of the ANFIS model are determined in a subtractive clustering method, and a fuzzy model parameter is optimized in a counter propagation algorithm and a least square method; a prediction sample is input, and a prediction value is obtained through calculation; the prediction value is added to form a new sample set, and multiple steps of prediction are achieved in a cycling mode; counter normalization processing is carried out on the prediction value. Super-short-term prediction of the irradiance can be achieved only by means of a history irradiance time sequence, prediction accuracy is good and the method is easy to carry out.
Owner:SHANGHAI ELECTRIC GROUP CORP

Deep belief network model based cement clinker free calcium content prediction method

Inactive | CN106202946A | Accurately reflect actual operating conditions | Quality assurance | Informatics | Special data processing applications | Deep belief network | Real-time data
The invention relates to a deep belief network model based cement clinker fCaO prediction method. The method comprises the steps that major variables capable of reflecting the firing situation of a cement clinker are preliminarily selected to form an auxiliary variable set, and a prediction variable is the cement clinker fCaO content; a field instrument and an operator recorder respectively acquire auxiliary variables and field data of the cement clinker fCaO content, a grey relational analysis method is adopted to conduct dimensionality reduction on the auxiliary variable set; parameters in a deep belief network structure, namely parameters training the deep belief network are determined according to a deep belief network algorithm and sample data volume, and further optimization of weighting and bias of the whole network is achieved; a counter-propagation algorithm is adopted to conduct error correction on the determined parameters in a deep belief network structure, and further a prediction model of the cement clinker fCaO is determined; real-time data of the auxiliary variable set is acquired, and errors of the obtained real-time data of the auxiliary variable set are eliminated according to 3delta criterions; further, the cement clinker fCaO content is predicted.
Owner:YANSHAN UNIV

Overlay convolutional network-based rolling bearing failure mode recognition method and device

The invention discloses an overlay convolutional network-based rolling bearing failure mode recognition method and device, and relates to the field of rolling bearing failure diagnosis. The method comprises the following steps of: extracting a time-frequency domain feature of a vibration signal of a state-known rolling bearing; normalizing the obtained time-frequency domain feature of the state-known rolling bearing into a feature pixel according to a CNN network input format; inputting the feature pixel into a CNN network, and adjusting a model parameter of the CNN network through carrying out forward self-learning and gradient descent-based counter-propagation on the CNN network so as to obtain a trained CNN network; and during the recognition of a practical rolling bearing failure mode, extracting high-order features capable of reflecting intrinsic information layer by layer by utilizing the trained CNN network by taking a time-frequency domain feature of a vibration signal of a state-unknown rolling bearing, and inputting results of the feature self-learning into a top classifier layer by layer, so as to realize failure mode recognition of the rolling bearings under multiple working conditions and strong noises.
Owner:北京恒兴易康科技有限公司 (Beijing Hengxing Yikang Technology Co., Ltd.)

Multivariable logistics freight volume prediction method based on LSTM network

The invention discloses a multivariable logistics freight volume prediction method based on an LSTM network, which is used for solving the technical problem of low prediction precision in time series data prediction in the prior art. The method comprises the following steps of: screening logistics freight volume influence factors and preprocessing influence factor data; converting the time series data set into a supervised learning format; normalizing the time series data variables of the supervised learning format; dividing a data training set and a test set; setting parameters of an LSTM prediction model and carrying out forward training on the model; and performing back propagation of the model and back normalization of a logistics freight volume prediction value. According to the method, the long-term memorability of the LSTM network for the flow data is fully utilized, the relation between variables can be effectively explored through supervised learning, and the logistics freight volume prediction precision is improved.
Owner:HOHAI UNIV +1
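
The data preparation steps in this abstract (supervised-learning conversion, normalization, and back normalization) can be sketched without the LSTM itself. The example is univariate for brevity, whereas the method is multivariable, and it assumes min-max scaling and a fixed lag window; all function names and the sample values are hypothetical.

```python
def to_supervised(series, n_lags):
    """Turn a series into (previous n_lags values, next value) training pairs."""
    return [(series[t - n_lags:t], series[t]) for t in range(n_lags, len(series))]


def normalize(values, lo, hi):
    """Min-max scale values into [0, 1] (assumed normalization scheme)."""
    return [(v - lo) / (hi - lo) for v in values]


def denormalize(values, lo, hi):
    """Invert the min-max scaling to recover freight volumes."""
    return [v * (hi - lo) + lo for v in values]


freight = [120.0, 135.0, 150.0, 160.0, 180.0]  # illustrative volumes only
lo, hi = min(freight), max(freight)
scaled = normalize(freight, lo, hi)
pairs = to_supervised(scaled, n_lags=2)
restored = denormalize(scaled, lo, hi)
```

A model trained on `pairs` predicts in the scaled range, so its outputs would pass through `denormalize` before being reported as freight volumes.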

General disturbance generation method based on generative adversarial network

Active | CN111461307A | Small disturbance range | Adversarial examples should not be detected | Neural architectures | Neural learning methods | Network generation | Generative adversarial network
The invention discloses a general disturbance generation method based on a generative adversarial network. Firstly, the generative adversarial network generates general disturbance to obtain an adversarial sample; discriminating the adversarial sample and the original sample by a discriminating network, calculating a discriminating network objective function, and performing back propagation for optimization; and finally, predicting adversarial sample classification by a deep learning model, discriminating adversarial samples by a discriminant network, calculating and generating a network objective function, and performing back propagation for optimization. The GAN-based general disturbance generation method provided by the invention can provide a thought of machine learning model safety research for users in the fields of computer vision, deep learning and the like.
Owner:WUHAN UNIV

Deep migration learning system and method based on entropy minimization

The invention provides a deep migration learning method based on entropy minimization, and the method comprises the steps: S1) dividing a source field and a target field according to different migration learning tasks, constructing a migration learning network, and initializing hyper-parameters of the migration learning network; S2) inputting respective data samples of the source domain and the target domain into a transfer learning network and carrying out forward propagation to obtain a network prediction label; training the whole network by using a stochastic gradient descent method according to the proposed loss function, completing updating of network parameters by using back propagation, and stopping training until the model converges or reaches the maximum number of iterations; S3) storing the network model and a training result. According to the target domain label prediction method based on the entropy minimization technology, by maximizing the distribution diversity of each batch of data prediction categories in the target domain, a common solution result occurring when only the entropy minimization technology is used is avoided, and the reliability of transfer learning is guaranteed.
Owner:NANJING UNIV OF POSTS & TELECOMM
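
The interplay between entropy minimization and batch-level diversity can be made concrete with a small calculation. The abstract does not give the loss function, so this sketch assumes one plausible form: the mean per-sample prediction entropy minus a weighted entropy of the batch-averaged prediction. `target_loss` and `diversity_weight` are hypothetical names.

```python
import math


def entropy(p):
    """Shannon entropy of a probability vector (natural logarithm)."""
    return -sum(q * math.log(q) for q in p if q > 0)


def target_loss(batch_probs, diversity_weight=1.0):
    """Assumed target-domain loss: confident per-sample predictions are
    rewarded, while a batch that collapses onto one class is penalised."""
    n = len(batch_probs)
    n_classes = len(batch_probs[0])
    mean_entropy = sum(entropy(p) for p in batch_probs) / n
    mean_pred = [sum(p[c] for p in batch_probs) / n for c in range(n_classes)]
    return mean_entropy - diversity_weight * entropy(mean_pred)


collapsed = [[1.0, 0.0], [1.0, 0.0]]  # every sample assigned to class 0
diverse = [[1.0, 0.0], [0.0, 1.0]]    # confident and spread across classes
```

Both batches have zero per-sample entropy, but only the diverse batch lowers the loss, which is how the diversity term discourages predicting a single class for every target sample.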

Language task model training method and device, electronic equipment and storage medium

The invention provides a language task model training method and device, electronic equipment and a storage medium. The method comprises the steps of performing hierarchical pre-training in a language model based on corpus samples of corresponding language tasks in a pre-training sample set; carrying out forward propagation on corpus samples corresponding to language tasks in a training sample set in the language task model; fixing parameters of the language model, and performing back propagation in the language task model to update the parameters of the task model; and performing forward propagation and reverse propagation on corpus samples corresponding to the language tasks in the training sample set in the language task model so as to update parameters of the language model and the task model. By means of the method and device, the catastrophic forgetting phenomenon of the language model can be prevented, and meanwhile it is guaranteed that the language model and the task model can achieve the training effect meeting the corresponding learning rate.
Owner:TENCENT TECH (SHENZHEN) CO LTD

Base-band digital pre-distortion-based method for improving efficiency of rf power amplifier

The present invention relates to a BDPD-based method for improving efficiency of RF power amplifier, comprising: first, choose key neural network architecture and scale and input initial values of modeling data and network parameters necessary for establishing the neural network model for RF power amplifier; second, correct network parameters with back propagation method and output the neural network model for RF power amplifier when the error meets the criterion; next, solve the pre-distortion algorithm of the RF power amplifier with said model and then carry out pre-distortion processing for the input with the pre-distortion algorithm and feed the input to the RF power amplifier. The present invention can be used to establish a neural network model with adequate accuracy and easy to solve corresponding pre-distortion algorithm for RF power amplifier, in order to improve RF power amplifier efficiency, reduce costs, and suppress out-of-band spectrum leakage effectively through base-band digital pre-distortion technology.
Owner:HUAWEI TECH CO LTD

BP neural network-based state estimation bad data identification method

The invention discloses a BP neural network-based state estimation bad data identification method. The method comprises the following steps of: aiming at relatively high requirement, for training samples, of BP neural network-based bad data identification method, establishing a BP neural network model for carrying out bad data identification on the basis of state estimation results; carrying out training by taking an online state estimation calculation result section as a sample; taking a measured value as input data and taking a state estimation value as an expected output; correcting a connection weight value and a threshold value on the basis of repeated iteration of the sample through counter-propagation of errors between the input and the output; training a measurement-based neural network; detecting a new measured section through the trained neural network; and when deviation between the measured value and a predicted value is relatively large, judging the data as bad data. According to the method, state estimation calculation results are directly utilized as the samples to carry out training, and samples with relatively high correctness are provided, so that the bad data identification precision of neural network methods is improved.
Owner:NARI TECH CO LTD +2

Collaborative multi-agent reinforcement learning method

Pending | CN112364984A | Relaxation of unreasonable assumptions | Neural architectures | Neural learning methods | Multi-agent system | Engineering
The invention discloses a collaborative multi-agent reinforcement learning method. The method comprises the following steps: obtaining observation information of each agent and a global state of a system; transmitting the obtained observation information of each intelligent agent into a deep neural network to calculate and obtain state action values of all actions of the intelligent agent; performing action selection by utilizing a greedy rule; transmitting the state action value corresponding to the adopted action and the global observation information into a reward highway network; performing information fusion in the reward highway network to obtain a combined state action value; and performing gradient back transmission by utilizing a reward signal given by the environment and updating parameters of the neural network so as to obtain a strategy model of each intelligent agent. The data volume required in the training process of the multi-agent system can be reduced, and the invention is suitable for being popularized to large-scale multi-agent systems.
Owner:NANJING UNIV OF AERONAUTICS & ASTRONAUTICS

Method and apparatus for frequency domain reverse-time migration with source estimation

Provided is seismic imaging, particularly, reverse-time migration for generating a real subsurface image from modeling parameters calculated by waveform inversion, etc. A frequency-domain reverse-time migration apparatus includes: a source estimator configured to estimate sources from data measured on a plurality of receivers; and a migration unit configured to receive information about the sources estimated by the source estimator and to perform reverse-time migration in the frequency domain. The source estimator estimates the sources by updating an initial source vector using incremental changes according to a full Newton method. In more detail, the migration unit includes: a back-propagation unit configured to back-propagate the measured data; a virtual source estimator configured to estimate virtual sources from the sources estimated by the source estimator; and a convolution unit configured to convolve the back-propagated measured data with the virtual sources and to output the results of the convolution.
Owner:SEOUL NAT UNIV R&DB FOUND

Federated neural network model training method, device and equipment and storage medium

The embodiment of the invention provides a federated neural network model training method, device and equipment and a storage medium, and relates to the technical field of artificial intelligence and the technical field of cloud. The method comprises the following steps: inputting sample data into a federated neural network model to process the sample data through a first lower-layer model to obtain a lower-layer model output value; respectively inputting the lower-layer model output value, an interaction layer model parameter generated by the first participant and an encryption model parameter obtained by encrypting the interaction layer model parameter based on an RIAC encryption mode into an interaction layer to obtain an output vector of the interaction layer; inputting the output vector into the upper-layer model to obtain an output value of a federated neural network model; inputting the output value into a preset loss function to obtain a loss result; and performing back propagation processing on the federated neural network model according to the loss result. Through the embodiment of the invention, the calculation complexity can be greatly reduced, the calculation amount is reduced, and the time consumption is reduced.
Owner:TENCENT TECH (SHENZHEN) CO LTD

Neural network video deblurring method based on multi-attention mechanism fusion

The invention discloses a neural network video deblurring method based on multi-attention mechanism fusion. The method comprises the following steps: S1, constructing a video deblurring model; S2, acquiring an original video sequence, and extracting spatial local and global information of different positions between video frames and similarity information between continuous video frames by using a spatial-temporal attention module in video deblurring; S3, capturing low-frequency and high-frequency different types of information of the input fuzzy video sequence by using a channel attention module in the video deblurring model; S4, fusing the extracted different information to obtain deblurred features, and mapping the deblurred features into an image from a feature space by using an image reconstruction module to obtain a clear intermediate frame; and S5, calculating content loss and perception loss of the recovered intermediate frame and the corresponding clear image, and performing back propagation to train a network model. Effective deblurring processing can be carried out on the blurred video to obtain clear and real video data.
Owner:WENZHOU UNIVERSITY

Dataflow all-reduce for reconfigurable processor systems

Roughly described, a system for data parallel training of a neural network on multiple reconfigurable units configured by a host with dataflow pipelines to perform different steps in the training. CGRA units are configured to evaluate first and second sequential sections of neural network layers based on a respective subset of training data, and to back-propagate the error through the sections to calculate parameter gradients for the respective subset. Gradient synchronization and reduction are performed by one or more units having finer grain reconfigurability, such as an FPGA. The FPGA performs synchronization and reduction of the gradients for the second section while the CGRA units perform back-propagation through the first sequential section. Intermediate results are transmitted using a P2P message passing protocol layer. Execution of dataflow segments in the different units is triggered by receipt of data, rather than by a command from any host system.
Owner:SAMBANOVA SYST INC
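
The gradient reduction at the heart of data parallel training can be sketched in a few lines. This example only shows the arithmetic of an averaging all-reduce over per-worker gradients; it assumes averaging as the reduction and does not model the CGRA/FPGA pipeline or the overlap with back-propagation described in the abstract. `all_reduce_mean` is a hypothetical name.

```python
def all_reduce_mean(worker_grads):
    """Average gradients element-wise across workers.

    worker_grads[k] is the gradient vector computed by worker k on its own
    subset of the training data. Every worker would receive the same result.
    """
    n_workers = len(worker_grads)
    return [sum(vals) / n_workers for vals in zip(*worker_grads)]


# Gradients for one network section from two workers (illustrative values).
section_grads = [[0.25, -0.5], [0.75, 0.0]]
averaged = all_reduce_mean(section_grads)
```

In the described system this reduction for the later section would run while back-propagation through the earlier section is still in progress, hiding the communication cost.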

Small adversarial patch generation method and device

The invention discloses a small adversarial patch generation method and device, and the method comprises the steps: carrying out the random initialization of an adversarial patch image, adding the initialized adversarial patch image to a selected pasting region on a target object in training data, and manufacturing an adversarial sample; transmitting the adversarial samples into a deep learning model for adversarial feature extraction, and transmitting benign samples without adversarial patch images into the deep learning model for benign feature extraction; jointly inputting the adversarial features and the benign features into a feature enhancement loss function for loss calculation to obtain a loss result; adding a loss result into a model loss function, and updating a pixel value of the adversarial patch through an optimizer after back propagation; and after preset times of iteration, enabling the adversarial patch to enable the deep learning model to output an error result, and ending the adversarial patch processing process. According to the method, the size of the anti-patch in the physical world can be smaller, the manufacturing cost is reduced, the identifiability of the anti-patch is reduced, and a defense method based on detection is broken through more easily.
Owner:BEIJING REALAI TECH CO LTD

Unsupervised domain adaptive method combining deep attention features and conditional adversarial

The invention belongs to the technical field of artificial intelligence, and relates to an unsupervised domain adaptive method combining deep attention features and conditional adversarial. The method comprises the following steps: dividing a to-be-processed image data set into a source domain and a target domain; designing a network capable of migrating attention and conditional confrontation; preprocessing the image source domain and the target domain before the image source domain and the target domain are input to a network capable of migrating attention and conditional adversarial; importing the preprocessed source domain and the preprocessed target domain into the designed network in batches in sequence, obtaining weighted feature maps through a migratable attention network, inputting the weighted feature maps into a conditional adversarial network for training, and finally performing probability operation through a full connection layer; respectively calculating the image classification accuracy of the source domain and the target domain; and finally, directly applying the network which is trained on the source domain and can migrate attention and conditional adversarial to the target domain to perform image classification through iteration and back propagation training. According to the method, the generalization ability of the unsupervised domain adaptive network is greatly improved.
Owner:BEIJING INSTITUTE OF TECHNOLOGY

Runtime extension for neural network training with heterogeneous memory

Systems, apparatuses, and methods for managing buffers in a neural network implementation with heterogeneous memory are disclosed. A system includes a neural network coupled to a first memory and a second memory. The first memory is a relatively low-capacity, high-bandwidth memory while the second memory is a relatively high-capacity, low-bandwidth memory. During a forward propagation pass of the neural network, a run-time manager monitors the usage of the buffers for the various layers of the neural network. During a backward propagation pass of the neural network, the run-time manager determines how to move the buffers between the first and second memories based on the monitored buffer usage during the forward propagation pass. As a result, the run-time manager is able to reduce memory access latency for the layers of the neural network during the backward propagation pass.
Owner:ADVANCED MICRO DEVICES INC
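
The buffer management described in the abstract above can be illustrated with a minimal sketch. The abstract does not state how the run-time manager chooses which buffers to move, so the greedy "most-accessed first" policy, the `Buffer` class, the function names, and the buffer names and sizes below are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Buffer:
    name: str
    size_mb: int
    accesses: int = 0  # filled in while monitoring the forward pass


def record_forward_usage(buffers, access_log):
    """Count how often each buffer is touched during the forward pass."""
    for name in access_log:
        buffers[name].accesses += 1


def plan_placement(buffers, fast_capacity_mb):
    """Assumed greedy policy: most-accessed buffers go to the fast memory."""
    placement = {}
    free_mb = fast_capacity_mb
    ranked = sorted(buffers.values(), key=lambda b: b.accesses, reverse=True)
    for buf in ranked:
        if buf.size_mb <= free_mb:
            placement[buf.name] = "fast"   # low-capacity, high-bandwidth memory
            free_mb -= buf.size_mb
        else:
            placement[buf.name] = "slow"   # high-capacity, low-bandwidth memory
    return placement


# Hypothetical buffers and a hypothetical forward-pass access log.
buffers = {
    "conv1_act": Buffer("conv1_act", 40),
    "conv2_act": Buffer("conv2_act", 60),
    "fc_weights": Buffer("fc_weights", 50),
}
forward_log = ["conv1_act", "conv2_act", "conv2_act",
               "fc_weights", "conv2_act", "conv1_act"]
record_forward_usage(buffers, forward_log)

# Plan used before the backward pass: which memory each buffer should live in.
placement = plan_placement(buffers, fast_capacity_mb=100)
```

A real run-time manager would likely also weigh transfer cost and the order in which layers are visited during the backward pass; this sketch only shows the monitor-then-place idea.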

Automatic picture toning method based on generative adversarial network

The invention discloses an automatic picture toning method based on a generative adversarial network. The method comprises the following steps: 1) obtaining training groups; 2) carrying out trimming processing on each training group by utilizing a generator network; 3) calculating the adversarial loss between the reconstructed picture and the target picture by using a discriminator network; 4) feeding back the adversarial loss to the discriminator network to update the weights of the discriminator network; 5) calculating the perceptual loss by using a VGG network; 6) taking the weighted sum of the perceptual loss and the adversarial loss as the total loss, back-propagating it to the generator model, and guiding the generator model to adjust the parameters it uses for trimming processing of the original picture; 7) repeating the operations above until training is finished; and 8) performing picture toning by means of the trained generator model. The invention further discloses an automatic picture color matching system based on the generative adversarial network. With the method and the system, color matching is automatic, the color adjusting effect is uniform, the color adjusting style is stable, the finished picture has a high pixel count, and the amount of computation is small.
Owner:HANGZHOU HUOSHAOYUN TECH CO LTD
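
Step 6 of the abstract above, the weighted sum of the perceptual loss and the adversarial loss, can be sketched as follows. The weights, the mean-squared feature distance standing in for VGG features, and the non-saturating generator-side form of the adversarial loss are assumptions made for illustration; the abstract does not give these details.

```python
import math


def perceptual_loss(feat_generated, feat_target):
    """Mean squared difference between two feature vectors.

    The vectors stand in for VGG feature maps (assumption)."""
    n = len(feat_target)
    return sum((g - t) ** 2 for g, t in zip(feat_generated, feat_target)) / n


def generator_adversarial_loss(d_score_on_generated):
    """Assumed non-saturating form: small when the discriminator is fooled.

    d_score_on_generated is the discriminator's probability, in (0, 1],
    that the generated picture is a real target picture."""
    return -math.log(d_score_on_generated)


def total_loss(l_perc, l_adv, w_perc=1.0, w_adv=0.01):
    """Weighted sum that is back-propagated to the generator (weights assumed)."""
    return w_perc * l_perc + w_adv * l_adv


# Hypothetical feature vectors and discriminator score.
l_perc = perceptual_loss([0.5, 1.0, 2.0], [0.0, 1.0, 1.0])
l_adv = generator_adversarial_loss(0.5)
loss = total_loss(l_perc, l_adv)
```

In a full training loop, `loss` would be computed with an automatic differentiation framework so that its gradient can update the generator parameters.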

Face search method and system

The invention relates to a face search method and system. The method includes the following steps: a network framework for face retrieval is constructed and optimized; parameters in the network framework are adjusted by back propagation; training sample pictures are input into the adjusted network framework to generate test cases; quantization is performed on the test cases with a sign function to obtain their binary codes, and the Hamming distances between the training sample pictures are calculated from the binary codes; the training samples are ranked by degree of approximation according to the Hamming distances, which completes the training of the network framework for face retrieval; and face pictures to be retrieved are input into the trained network framework, so that they are retrieved and returned ranked by degree of approximation. According to the face search method and system of the invention, the network and its output features are optimized in these respects, so that the same accuracy can be maintained on a large-scale face database, or face images can be retrieved quickly with as small a decrease in accuracy as possible.
Owner:GUILIN UNIV OF ELECTRONIC TECH
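
The quantization and ranking steps in the abstract above can be sketched as follows. Mapping the sign function to 0/1 bits, the function names, and the four-dimensional feature vectors are assumptions for illustration; real features would come from the trained network.

```python
def binarize(features):
    """Quantize real-valued features with the sign function.

    Non-negative values map to bit 1, negative values to bit 0 (assumed mapping)."""
    return [1 if x >= 0 else 0 for x in features]


def hamming(code_a, code_b):
    """Number of positions at which two equal-length binary codes differ."""
    return sum(a != b for a, b in zip(code_a, code_b))


def rank_by_hamming(query_features, gallery):
    """Return gallery names ordered from most to least similar to the query."""
    query_code = binarize(query_features)
    scored = [(hamming(query_code, binarize(feats)), name)
              for name, feats in gallery.items()]
    return [name for _, name in sorted(scored)]


# Hypothetical network outputs for three stored faces and one query face.
gallery = {
    "face_a": [0.9, -0.2, 0.4, -0.7],
    "face_b": [-0.5, 0.3, 0.1, 0.8],
    "face_c": [0.7, -0.1, -0.6, -0.3],
}
query = [0.8, -0.4, 0.2, -0.9]
ranking = rank_by_hamming(query, gallery)
```

Comparing short binary codes with Hamming distance is cheap, which is what makes this kind of retrieval practical on large face databases.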

Multi-model knowledge distillation method and device, electronic equipment and storage medium

The invention relates to a multi-model knowledge distillation method and device, electronic equipment and a storage medium. The method comprises the steps of extracting features of a training image to obtain training data; inputting the training data into a plurality of sub-models included in a teacher model for operation, and obtaining a first feature output by the teacher model according to sub-features output by the plurality of sub-models; inputting the training data into a student model for operation to obtain a second feature output by the student model; determining a loss function of the student model according to the first feature and the second feature; and performing back propagation on the student model according to the loss function. According to the embodiment of the invention, different feature representations in the training data can be obtained by utilizing the plurality of sub-models in the teacher model, and the student model can learn features in the teacher model by utilizing a knowledge distillation mode, so that the problem of limited expression capability of a single model is solved, and the model precision of the student model is improved.
Owner:BEIJING SENSETIME TECH DEV CO LTD
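
The distillation steps in the abstract above can be sketched as follows. The abstract does not say how the sub-features are combined or which loss is used, so the element-wise average and the mean squared error below are assumptions, as are all function names and numbers.

```python
def combine_sub_features(sub_features):
    """Average the sub-model outputs element-wise (assumed combination rule)."""
    n = len(sub_features)
    return [sum(vals) / n for vals in zip(*sub_features)]


def distillation_loss(teacher_feature, student_feature):
    """Mean squared error between the first and second feature (assumed loss)."""
    d = len(teacher_feature)
    return sum((s - t) ** 2 for s, t in zip(student_feature, teacher_feature)) / d


def loss_gradient(teacher_feature, student_feature):
    """Gradient of the MSE loss with respect to the student feature."""
    d = len(teacher_feature)
    return [2 * (s - t) / d for s, t in zip(student_feature, teacher_feature)]


# Hypothetical sub-features from two teacher sub-models, and a student output.
sub_features = [[1.0, 2.0], [3.0, 4.0]]
teacher = combine_sub_features(sub_features)   # first feature
student = [1.0, 3.0]                           # second feature

loss = distillation_loss(teacher, student)
grad = loss_gradient(teacher, student)

# One gradient step applied directly to the student feature for illustration.
learning_rate = 0.5
student_updated = [s - learning_rate * g for s, g in zip(student, grad)]
```

In actual training the gradient would be propagated further back into the student model's parameters; the step on the feature itself only shows that the loss pulls the student output toward the teacher output.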