403 results about "Audio data clustering/classification" patented technology

Remote control system for connected devices

Systems and methods utilize a display to provide universal remote control functionality. A menu of options appears on a display of a display device. A user can navigate the various options to control devices that are separate from the display device. The menu may be dynamically configured according to the context in which the display device is being used. The menu may be activated using a remote control device or other device and the menu may include options for controlling one or more devices that are absent from the remote control device.
Owner:LOGITECH EURO SA

Chinese song emotion classification method based on multi-modal fusion

The invention discloses a Chinese song emotion classification method based on multi-modal fusion. The Chinese song emotion classification method comprises the steps: firstly obtaining a spectrogram from an audio signal, extracting audio low-level features, and then carrying out the audio feature learning based on an LLD-CRNN model, thereby obtaining the audio features of a Chinese song; for lyrics and comment information, firstly constructing a music emotion dictionary, then constructing emotion vectors based on emotion intensity and part-of-speech on the basis of the dictionary, so that text features of Chinese songs are obtained; and finally, performing multi-modal fusion by using a decision fusion method and a feature fusion method to obtain emotion categories of the Chinese songs. The Chinese song emotion classification method is based on an LLD-CRNN music emotion classification model, and the model uses a spectrogram and audio low-level features as an input sequence. The LLD is concentrated in a time domain or a frequency domain, and for the audio signal with associated change of time and frequency characteristics, the spectrogram is a two-dimensional representation of the audio signal in frequency, and loss of information amount is less, so that information complementation of the LLD and the spectrogram can be realized.
Owner:BEIJING UNIV OF TECH
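
As a rough illustration of the decision-fusion step mentioned in the abstract above, the sketch below takes a weighted average of per-class probabilities from an audio model and a text model. The class names, the weights and the function name `decision_fusion` are hypothetical; the abstract does not say how the two outputs are actually combined.

```python
def decision_fusion(audio_probs, text_probs, audio_weight=0.5):
    """Weighted average of two per-class probability dicts (late fusion).

    Assumption: both models score the same set of emotion classes.
    """
    if audio_probs.keys() != text_probs.keys():
        raise ValueError("both models must score the same emotion classes")
    text_weight = 1.0 - audio_weight
    fused = {
        label: audio_weight * audio_probs[label] + text_weight * text_probs[label]
        for label in audio_probs
    }
    # The predicted emotion is the class with the highest fused score.
    best = max(fused, key=fused.get)
    return best, fused


# Hypothetical model outputs for one song.
audio = {"happy": 0.6, "sad": 0.3, "calm": 0.1}
text = {"happy": 0.2, "sad": 0.7, "calm": 0.1}
label, scores = decision_fusion(audio, text, audio_weight=0.4)
```

With these made-up inputs the text model's stronger "sad" score outweighs the audio model's "happy" score.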

Method of training a neural network and related system and method for categorizing and recommending associated content

A property vector representing extractable measurable properties, such as musical properties, of a file is mapped to semantic properties for the file. This is achieved by using artificial neural networks “ANNs” in which weights and biases are trained to align a distance dissimilarity measure in property space for pairwise comparative files back towards a corresponding semantic distance dissimilarity measure in semantic space for those same files. The result is that, once optimised, the ANNs can process any file, parsed with those properties, to identify other files sharing common traits reflective of emotional perception, thereby rendering a more reliable and true-to-life result of similarity/dissimilarity. This contrasts with simply training a neural network to consider extractable measurable properties that, in isolation, do not provide a reliable contextual relationship into the real-world.
Owner:EMOTIONAL PERCEPTION AI LTD

Systems and methods for the identification and/or distribution of music and other forms of useful information

The present invention relates generally to the field of telecommunications systems and methods. More specifically, the present invention is directed to systems and methods for identifying and/or distributing music and other types of useful information for users in a very simple and convenient manner. A variety of systems and methods are disclosed which provide users with quick and convenient access to various forms of information, such as, for example, audio information including music and news items as well as coupons and other information. The systems and methods allow users to store data representative of a time of transmission and preferably a source of transmission so that data of interest may be identified for ordering and/or downloading.
Owner:DEPKE BERNADETTE +2

Music recommendation method, device and system

The embodiment of the invention provides a music recommendation method and device, an electronic device and a storage medium. The method specifically comprises the following steps: receiving video data and to-be-matched multi-segment music data, and inputting the video data and the music data into a pre-trained multi-modal matching network model for processing to obtain a video embedding vector of the video data and an audio embedding vector of each segment of music data; inputting the video embedding vector and the audio embedding vector into a pre-trained personalized sorting model for calculation to obtain a matching degree between the video data and each section of music data; and finally, outputting the music data of which the matching degree conforms to a preset standard as target music to be matched, so that the user can utilize the target music to carry out music matching on the video. The target music is obtained through objective calculation, and the result is objective and reliable, so that the matching degree between the dubbing music and the corresponding video work is effectively improved, and finally the satisfactory audio and video work is obtained.
Owner:BEIJING DAJIA INTERNET INFORMATION TECH CO LTD

Three-dimensional generalized space

According to one embodiment, audio and non-audio data can be represented as sound sources in a three-dimensional sound space adapted to also provide visual data. Non-audio data can be associated with audio sound sources presented in the sound space. Navigation within this combined three-dimensional audio / visual space can be based primarily on the audio aspects of the sound sources with the details of the non-audio data being presented on demand, for example, when the listener navigates through the combined three-dimensional audio / visual space to a particular sound source at which point the non-audio data associated with that sound source can be presented.
Owner:AVAYA INC

Method for making music recommendations and related computing device, and medium thereof

This application discloses a method for making music recommendations. The method for making music recommendations is performed by a server device. The method includes obtaining a material for which background music is to be added; determining at least one visual semantic tag of the material, the at least one visual semantic tag describing at least one characteristic of the material; identifying a matched music matching the at least one visual semantic tag from a candidate music library; sorting the matched music according to user assessing information of a user corresponding to the material; screening the matched music based on a sorting result and according to a preset music screening condition; and recommending matched music obtained through the screening as candidate music of the material.
Owner:TENCENT TECH (SHENZHEN) CO LTD

Objection detection by robot using sound localization and sound based object classification bayesian network

An object detection system includes at least one sound receiving element, a processing unit, a storage element and a sound database. The sound receiving element receives sound waves emitted from an object. The sound receiving element transforms the sound waves into a signal. The processing unit receives the signal from the sound receiving unit. The sound database is stored in the storage element. The sound database includes a plurality of sound types and a plurality of attributes associated with each sound type. Each attribute has a predefined value. Each sound type is associated with each attribute in accordance with Bayes' rule, such that a conditional probability of each sound type is defined for an occurrence of each attribute.
Owner:TOYOTA MOTOR ENGINEERING & MANUFACTURING NORTH AMERICA
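
To make the conditional-probability idea in the abstract above concrete, here is a minimal sketch of Bayes' rule applied to a single observed attribute. The sound types, attribute names, priors and likelihood values are invented for illustration, and the patent's actual network structure is not described in the abstract.

```python
def posterior_over_sound_types(priors, likelihoods, attribute):
    """Return P(sound_type | attribute) for every sound type via Bayes' rule.

    priors:      {sound_type: P(sound_type)}
    likelihoods: {sound_type: {attribute: P(attribute | sound_type)}}
    """
    # Numerator of Bayes' rule: P(attribute | type) * P(type).
    joint = {
        sound_type: priors[sound_type] * likelihoods[sound_type].get(attribute, 0.0)
        for sound_type in priors
    }
    # Denominator: total probability of observing the attribute.
    evidence = sum(joint.values())
    if evidence == 0:
        raise ValueError(f"attribute {attribute!r} has zero probability")
    return {sound_type: p / evidence for sound_type, p in joint.items()}


# Hypothetical sound database with two sound types and one attribute family.
priors = {"door_slam": 0.5, "glass_break": 0.5}
likelihoods = {
    "door_slam": {"low_pitch": 0.8, "high_pitch": 0.2},
    "glass_break": {"low_pitch": 0.2, "high_pitch": 0.8},
}
posterior = posterior_over_sound_types(priors, likelihoods, "high_pitch")
```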

A twin network model training method, a twin network model measuring method, a twin network model training device, a twin network model measuring device, a medium and equipment

The invention relates to a twin network model training method, a twin network model measuring method, a twin network model training device, a twin network model measuring device, a medium and equipment. The method comprises the steps of pre-training a label classification model; then constructing a twin network model by using the trained label classification model in a mode of increasing coding neural network branches; therefore, a twin network model used for data similarity measurement in a recommendation system can be obtained through training based on a multi-task learning mode including label classification learning and measurement learning. Through a mode of staged training and multi-task learning constraint, the stability and generalization ability of the model can be effectively improved, and the accuracy of the trained twin network model for data similarity measurement in the recommendation system is improved. Furthermore, data similarity measurement can be carried out based on the trained twin network model, and the accuracy of data similarity measurement is effectively improved. And the trained twin network model is used for song similarity measurement, so that the accuracy of song similarity measurement can be effectively improved.
Owner:HANGZHOU NETEASE CLOUD MUSIC TECH CO LTD

Music recommendation method and apparatus

A music recommendation method may include obtaining the music belongingness function of music, which is the set of granularity of music in different dimensions, wherein the dimension is the classification of music and the granularity is the classification of the dimension; obtaining the user belongingness function of a user, which is the set of granularity indicating likes of user in different dimensions; calculating a granularity correlation function by using the music belongingness function and the user belongingness function; calculating the value of the probability function indicating likes of user for music by using the granularity correlation function and a dimension weighting coefficient; and recommending the music to the user when the value of the probability function indicating likes of user for music is greater than a preset threshold. An apparatus applying to the method comprises corresponding modules.
Owner:BEIJING RUIXIN ONLINE SYST TECH
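
The scoring pipeline in the abstract above (belongingness functions, a granularity correlation, a dimension-weighted probability and a threshold) can be sketched as follows. The abstract gives no formulas, so the dot-product correlation, the weighted sum, and all names and numbers here are assumptions made purely for illustration.

```python
def granularity_correlation(music, user):
    """Per-dimension overlap between music and user memberships.

    Assumption: the correlation is a dot product over granularities.
    Both arguments map dimension -> {granularity: degree in [0, 1]}.
    """
    return {
        dim: sum(deg * user.get(dim, {}).get(gran, 0.0) for gran, deg in grans.items())
        for dim, grans in music.items()
    }


def like_probability(music, user, weights):
    """Combine per-dimension correlations with dimension weighting coefficients."""
    corr = granularity_correlation(music, user)
    return sum(weights.get(dim, 0.0) * value for dim, value in corr.items())


def should_recommend(music, user, weights, threshold=0.5):
    """Recommend only when the score exceeds the preset threshold."""
    return like_probability(music, user, weights) > threshold


# Hypothetical dimensions ("genre", "mood") and granularities.
music = {"genre": {"rock": 1.0}, "mood": {"calm": 0.25, "energetic": 0.75}}
user = {"genre": {"rock": 0.5, "jazz": 0.5}, "mood": {"energetic": 1.0}}
weights = {"genre": 0.5, "mood": 0.5}
score = like_probability(music, user, weights)
```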

Audio labelling-processing method and device and computing device

Pending; CN109493881A; Labeling is accurate; Improve search hit rate; Speech analysis; Audio data clustering/classification; Feature vector; Labelling
The invention provides an audio labelling-processing method and device. The method comprises the steps that original audio signals are obtained; the original audio signals are discretized to obtain target audios; characteristics of the target audios are extracted through a timing-sequence convolutional neural network (CNN) to obtain characteristic vectors of the target audios; clustering analysis is conducted on the characteristic vectors to obtain different categories of original audios corresponding to the characteristic vectors; according to the different categories of the original audios corresponding to the characteristic vectors, keywords in titles corresponding to the original audios in the same category are extracted, and one or more keywords are selected from the keywords according to a predetermined rule to serve as audio labels of the category. By adopting the scheme, efficient and accurate audio classification is achieved, high-accuracy and comprehensive audio labelling is achieved, and accordingly the search hit rate and recommending accuracy of audios can be improved.
Owner:BEIJING QIHOO TECH CO LTD
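
The final labelling step described above (picking keywords from the titles of audios that fell into the same cluster) might look roughly like this. The clustering itself is assumed to have been done already; the stop-word list, the "most frequent keyword" rule and the name `label_clusters` are illustrative assumptions, since the abstract only refers to "a predetermined rule".

```python
from collections import Counter, defaultdict

# Hypothetical stop-word list; a real system would use a proper tokenizer.
STOPWORDS = {"the", "a", "of", "and", "for", "in"}


def label_clusters(titles, cluster_ids, top_k=2):
    """Pick the top_k most frequent title keywords in each cluster as its labels."""
    words_by_cluster = defaultdict(Counter)
    for title, cluster in zip(titles, cluster_ids):
        # Count each keyword at most once per title.
        keywords = {w for w in title.lower().split() if w not in STOPWORDS}
        words_by_cluster[cluster].update(keywords)
    return {
        cluster: [
            word
            # Sort by descending count, then alphabetically to break ties.
            for word, _ in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:top_k]
        ]
        for cluster, counts in words_by_cluster.items()
    }


titles = [
    "Rain sounds for sleep",
    "Gentle rain in the forest",
    "Morning news briefing",
    "Evening news briefing",
]
cluster_ids = [0, 0, 1, 1]  # assumed output of the clustering step
labels = label_clusters(titles, cluster_ids, top_k=1)
```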

Music audio classification method based on convolutional recurrent neural network

The invention discloses a music audio classification method based on a convolutional recurrent neural network. The method comprises the following steps: S1, annotating music audios to obtain a music annotation data set; S2, enhancing the training data of the data set by adopting a music data enhancement method; S3, framing and windowing the audio signals of the music in the data set, and obtaining a Mel sound spectrum corresponding to the audio through short-time Fourier transform and Mel scale transform; S4, constructing a music audio classification model based on a convolutional recurrent neural network; S5, inputting the Mel sound spectrum of the training data into the music audio classification model based on a convolutional recurrent neural network for iterative training; and S6, inputting a Mel sound spectrum corresponding to the music, and predicting the label of the music. The method provided by the invention can improve the ability of the network to extract the sound spectrum features, and obtain better music overall feature representation, thereby improving the accuracy of music audio classification.
Owner:SOUTH CHINA UNIV OF TECH
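
Step S3 above (framing, windowing, short-time Fourier transform, Mel-scale transform) can be sketched in plain Python as below. This is a slow, standard-library-only illustration rather than the patent's implementation: the frame length, hop size, number of Mel bands, Hann window and the common 2595·log10(1 + f/700) Mel formula are all assumed choices.

```python
import cmath
import math


def hz_to_mel(hz):
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def frames(signal, frame_len, hop):
    """Split the signal into overlapping frames and apply a Hann window."""
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / (frame_len - 1)) for n in range(frame_len)]
    return [
        [signal[start + n] * window[n] for n in range(frame_len)]
        for start in range(0, len(signal) - frame_len + 1, hop)
    ]


def power_spectrum(frame):
    """Naive DFT power for bins 0..N/2 (fine for a demo, slow for real audio)."""
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame))) ** 2
        for k in range(n // 2 + 1)
    ]


def mel_filterbank(n_mels, n_fft, sample_rate):
    """Triangular filters whose edges are evenly spaced on the Mel scale."""
    top = hz_to_mel(sample_rate / 2)
    edges_hz = [mel_to_hz(top * i / (n_mels + 1)) for i in range(n_mels + 2)]
    bin_hz = [k * sample_rate / n_fft for k in range(n_fft // 2 + 1)]
    bank = []
    for m in range(1, n_mels + 1):
        lo, mid, hi = edges_hz[m - 1], edges_hz[m], edges_hz[m + 1]
        bank.append(
            [max(0.0, min((f - lo) / (mid - lo), (hi - f) / (hi - mid))) for f in bin_hz]
        )
    return bank


def mel_spectrogram(signal, sample_rate, frame_len=64, hop=32, n_mels=8):
    """Return one list of n_mels band energies per frame."""
    bank = mel_filterbank(n_mels, frame_len, sample_rate)
    spec = []
    for frame in frames(signal, frame_len, hop):
        power = power_spectrum(frame)
        spec.append([sum(w * p for w, p in zip(row, power)) for row in bank])
    return spec


# Synthetic 1 kHz tone sampled at 8 kHz (hypothetical test signal).
sample_rate = 8000
tone = [math.sin(2 * math.pi * 1000 * t / sample_rate) for t in range(256)]
spec = mel_spectrogram(tone, sample_rate)
```

In practice a library routine with an FFT would replace the naive DFT; the sketch only shows how the four operations in S3 chain together.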

Scene annotation using machine learning

A system enhances existing audio-visual content with audio describing the setting of the visual content. A scene annotation module classifies scene elements from an image frame received from a host system and generates a caption describing the scene elements. A text to speech synthesis module may then convert the caption to synthesized speech data describing the scene elements within the image frame.
Owner:SONY COMPUTER ENTERTAINMENT INC

Information interaction method, apparatus, computer device and storage medium

The embodiment of the invention discloses an information interaction method, a device, a computer device and a storage medium. The method comprises the following steps: generating a dialogue scene database according to actual conversations between users in real scenes; acquiring the interaction question input by a user, searching the dialogue scene database according to the interaction question, and taking the scene question that best matches the interaction question as the target scene question; and obtaining the answer corresponding to the target scene question from the dialogue scene database and sending it to the user. The technical solution provided by the embodiment of the invention solves the prior-art problem of low identification accuracy caused by matching answers through a manually maintained semantic template: because the dialogue scene database is built from dialogues that actually occur in real scenes, it is closer to the interaction questions input by users, which improves the accuracy of question matching.
Owner:SHANGHAI XIAOI ROBOT TECH CO LTD

Image classification method and device, electronic equipment and storage medium

The invention relates to an image classification method and device, electronic equipment and a storage medium, relates to the technical field of computers, and is used for solving the problem of relatively low accuracy of an image classification technology in related technologies, and the method comprises the steps: classifying images in a to-be-recognized data set, and determining category labels of the images in the to-be-recognized data set; extracting text features of each image and text features of the category label of each image, wherein the text features of the images are used for representing the state of an object in the images; determining the matching degree of each image and the corresponding category label according to the text feature of each image and the text feature of the category label of the corresponding image; and determining a target image corresponding to the category label from the images corresponding to the same category label according to the determined matching degree. According to the embodiment of the invention, after the images are classified, the images of the same class label are further screened according to the matching degree of the state of the object in the image and the class label of the image, so that the classification accuracy is improved.
Owner:BEIJING DAJIA INTERNET INFORMATION TECH CO LTD

Voiceprint retrieval method based on deep Hash

The invention discloses a voiceprint retrieval method based on deep hash by which the effects of low storage space and efficient retrieval in a voiceprint retrieval task are achieved. The method comprises a step of training a deep voiceprint hash model, a step of constructing a hash coding database and a step of retrieving the query voice in the database, and is characterized by firstly constructing an end-to-end deep neural network structure, and training the deep neural network model by utilizing the voice data marked with a speaker identity to obtain a deep voiceprint hash function, and then calculating the hash codes corresponding to the training set through the deep voiceprint hash function, and constructing a database; for the newly inputted voice data, using the deep voiceprint hash function to calculate a corresponding hash code, and adding the hash code to a database in real time. During the retrieval process, for the given voice, the deep voiceprint hash function is used for calculating the corresponding hash code, and finally a retrieval result is obtained in the database based on index or Hamming distance sorting.
Owner:NANJING UNIV
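
The last step of the abstract above, ranking database entries by Hamming distance to the query's hash code, can be sketched as follows. The 8-bit codes and utterance names are made up, and the deep hash function that would produce such codes is outside the scope of this sketch.

```python
def hamming_distance(code_a, code_b):
    """Number of differing bits between two integer-encoded hash codes."""
    return bin(code_a ^ code_b).count("1")


def retrieve(query_code, database, top_k=3):
    """Return the top_k (utterance, distance) pairs closest to the query code."""
    ranked = sorted(
        database.items(),
        # Sort by distance, then by name so that ties are deterministic.
        key=lambda item: (hamming_distance(query_code, item[1]), item[0]),
    )
    return [(utt, hamming_distance(query_code, code)) for utt, code in ranked[:top_k]]


# Hypothetical hash-code database keyed by utterance ID.
database = {
    "spk1_utt1": 0b10110010,
    "spk1_utt2": 0b10110011,
    "spk2_utt1": 0b01001101,
}
results = retrieve(0b10110110, database, top_k=2)
```

Because the codes are short binary strings, each comparison is a single XOR plus a bit count, which is where the storage and retrieval savings claimed in the abstract come from.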

Music style classification method and device, computer device and storage medium

Pending; CN110188235A; Fast classification; Addressing the Limitations of Manual Classification; Neural architectures; Special data processing applications; Shock wave; Frequency spectrum
The invention discloses a music style classification method and device, a computer device and a storage medium. The method comprises: acquiring a data set; preprocessing audio in the data set, and inputting the preprocessed audio into a preset deep convolutional neural network for training to obtain a trained network model; preprocessing the to-be-classified audio, and inputting the to-be-classified audio into the network model to obtain a music style recognition result of the to-be-classified audio; wherein the preprocessing comprises the step of separating a harmonic sound source and a shock wave sound source of the processed audio; and converting the original sound source, the harmonic sound source and the shock wave sound source of the processed audio into spectrograms. According to the music style classification method, the computer device and the storage medium provided by the invention, the audio is converted into the spectrogram, the deep convolutional neural network is trained by using the spectrogram, and the to-be-classified audio frequency is classified and identified by using the trained network model, so that the high-precision classification of the audio frequency can be successfully realized, the classification speed is high, and the limitation of manual classification is solved.
Owner:PING AN TECH (SHENZHEN) CO LTD

Sound categorization system

A system, method and computer program product for hierarchical categorization of sound comprising one or more neural networks implemented on one or more processors. The one or more neural networks are configured to categorize a sound into a two or more tiered hierarchical coarse categorization and a finest level categorization in the hierarchy. The categorization of a sound may be used to search a database for similar or contextually related sounds.
Owner:SONY COMPUTER ENTERTAINMENT INC

Multimedia resource classification method, apparatus, computer device, and storage medium

The invention discloses a multimedia resource classification method, a device, a computer device and a storage medium, belonging to the computer technical field. The method comprises the following steps of: acquiring multimedia resources and extracting a plurality of characteristic information of the multimedia resources; clustering a plurality of feature information to obtain at least one clustering set, determining clustering description information of each clustering set, each clustering set comprising at least one feature information, and each clustering description information for indicating a feature of a clustering set; determining at least one target feature description information of the multimedia resource based on the clustering description information of each clustering set, each target feature description information representing an association between one clustering description information and the rest of the clustering description information; and classifying the multimedia resources based on at least one target feature description information of the multimedia resources to obtain the classification result of the multimedia resources. By adopting the invention, the accuracy of multimedia resource classification can be improved.
Owner:TENCENT TECH (SHENZHEN) CO LTD

Song tag prediction method and device, server and storage medium

The invention discloses a song tag prediction method and device, a server and a storage medium, and the method comprises the steps: obtaining a target lyric text of a song to be predicted, the target lyric text comprising a plurality of target words; determining a target word vector of the target word according to a mapping relationship between each word and a word vector in the preset sample word library and a mapping relationship between stroke elements and stroke vectors of each word in the preset sample word library; and obtaining a text classification result according to the target lyric text, the target word vector and a trained text classification model, the text classification result being a song tag corresponding to the target lyric text. According to the scheme, the target word vector of each target word in the target lyric text can be acquired, so that the meaning of each target word in the lyric text can be comprehensively considered during label prediction, and the label prediction accuracy of songs is improved.
Owner:ADVANCED NEW TECH CO LTD

Music emotion classification method based on multi-modal learning

The invention discloses a music emotion classification method based on multi-modal learning, and the method comprises the following steps: data preprocessing: carrying out the preprocessing of the audio, lyrics and comments of music according to the needed modal information, so as to obtain the effective input of a model; representation learning: mapping each modality to a respective representation space by using different modeling modes; feature extraction: extracting feature vectors of the different modalities after model mapping, and reducing them to the same dimension; multi-modal fusion: carrying out cascade early fusion on the features of the three different modalities, and establishing a more comprehensive feature representation; and emotion classification decision making: performing supervised emotion classification on the music by using the fused features. According to the music sentiment classification method, a method based on multi-modal joint learning is provided, the defect that noise or data loss exists in a current mainstream single-modal model method can be effectively reduced, and the accuracy and stability of music sentiment classification are improved.
Owner:HOHAI UNIV

A method and system for classifying speech by word segmentation

The invention provides a method and system for classifying speech by word segmentation. The method comprises the following steps of acquiring a corpus sample library; establishing an audio library and a semantic slot according to the corpus sample in the corpus sample library; acquiring voice audio; comparing the speech audio with the word segmentation audio in the audio library, and generating the matching word segmentation audio in the speech audio; merging the same segmentation audio, and counting the frequency of each segmentation audio in the speech audio after merging; obtaining the word segment semantics corresponding to the word segment audio according to the semantic slot; selecting the semantics corresponding to one or more semantic sets according to the semantics and frequency of word segmentation fragments as classification tags of speech and audio; and classifying the voice and audio according to the classification label. The invention classifies the contents of the voice and audio quickly and accurately through word segmentation, so that the voice and audio can be clearly stored for subsequent searching.
Owner:GUANGDONG XIAOTIANCAI TECH CO LTD
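
The counting and tagging steps in the abstract above (merge identical segments, count frequencies, map segments to semantics through the semantic slot, pick the dominant semantics as tags) can be sketched as below. Matching audio against the audio library is assumed to have already produced a list of recognised segments; the slot contents and the "highest total frequency" selection rule are illustrative assumptions.

```python
from collections import Counter


def classify_speech(matched_segments, semantic_slot, top_k=1):
    """Return the top_k semantics by total segment frequency as classification tags."""
    # Merging identical segments and counting them is a single Counter call.
    segment_counts = Counter(matched_segments)
    semantic_counts = Counter()
    for segment, count in segment_counts.items():
        semantics = semantic_slot.get(segment)
        if semantics is not None:  # ignore segments with no slot entry
            semantic_counts[semantics] += count
    # Sort by descending frequency, then alphabetically to break ties.
    ranked = sorted(semantic_counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [semantics for semantics, _ in ranked[:top_k]]


# Hypothetical semantic slot mapping word segments to semantics.
semantic_slot = {"homework": "study", "exam": "study", "football": "sports"}
segments = ["homework", "football", "exam", "homework", "um"]
tags = classify_speech(segments, semantic_slot)
```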

A music automatic labeling method based on label depth analysis

Inactive; CN109918535A; Improve performance; Overcoming problems such as poor learning effect; Metadata audio data retrieval; Neural architectures; Learning based; Data set
The invention discloses a music automatic labeling method based on label depth analysis. The method comprises the following steps: S1, collecting music data and cleaning the data by combining a music label system; S2, sampling the music data, converting the music data into a Mel-frequency spectrogram, and slicing the Mel-frequency spectrogram; S3, constructing an audio multi-level feature extraction network based on the one-dimensional convolutional network, and performing parameter pre-training through supervised learning; S4, performing music label vector representation learning based on the two-dimensional convolutional network, and obtaining music label characteristics; S5, realizing feature aggregation of the audio multi-level features and the music tag features; and S6, performing final music label prediction based on the aggregation characteristics. According to the method, the difficulty that a traditional music labeling mode cannot be applied to a large-scale music data set is overcome, the music is automatically labeled according to the audio content, the workload of manually maintaining a music label library is reduced, and the method has very good usability.
Owner:SOUTH CHINA UNIV OF TECH

Model generation method, audio processing method and device, terminal and storage medium

The embodiment of the invention provides a model generation method, an audio processing method and device, a terminal and a computer readable storage medium, and the method comprises the steps: carrying out the marking of sample audio data according to a preset music style label, and generating a marked audio sample; cutting the labeled audio sample into a plurality of labeled audio data segments with preset lengths; processing each labeled audio data segment into a plurality of labeled sample audio segment feature vectors with preset dimensions, and taking the labeled sample audio segment feature vectors as labeled sample sets; updating the preset music style labels of the annotation sample audio segment feature vectors in the annotation sample set to obtain an annotation sample audio training set; and training the annotated sample audio training set by using a deep learning method to obtain a first music style annotation model. The purpose of inputting the target audio data into the first music style labeling model to obtain the music style label is achieved.
Owner:BEIJING QIYI CENTURY SCI & TECH CO LTD

Music recommendation method and computer readable recording medium storing computer program performing the method

A music recommendation method and a computer readable recording medium storing a computer program performing the method are provided. In the music recommendation method, music items and rating data matrix comprising ratings and user IDs are first provided. Then, the ratings of each music item are classified into positive ratings and negative ratings. Thereafter, a pre-processing phase comprising a frame-based clustering step and a sequence-based clustering step is performed to transform the music items into perceptual patterns. Then, a prediction phase is performed to determine an interest value of a plurality of target music items for an active user. Thereafter, the target music items are arranged into a music recommendation list in accordance with the first interest value and the second interest values, wherein the music recommendation list is a reference for the active user to select one of the target items.
Owner:NAT CHENG KUNG UNIV

Control method for scene sound effect and electronic equipment

Embodiments of the invention disclose a control method for a scene sound effect and electronic equipment. The method comprises the following steps that the electronic equipment starts a service with a monitoring function after the electronic equipment is started; the electronic equipment monitors an audio track of the electronic equipment through the service with the monitoring function and determines whether the audio track of the electronic equipment has audio output; the audio track of the electronic equipment and an application in the electronic equipment have a mapping relation; if the electronic equipment determines that the audio track of the electronic equipment has audio output, the electronic equipment determines an application having the mapping relation with the audio track of the electronic equipment according to the mapping relation; and the electronic equipment obtains the scene sound effect corresponding to the application and sets the current sound effect of the electronic equipment to the scene sound effect. The process does not need a person to participate in setting of the scene sound effect, so that the operations are simplified and the using efficiency of the electronic equipment is improved on the premise of guaranteeing the higher accuracy rate of the scene sound effect.
Owner:GUANGDONG OPPO MOBILE TELECOMM CORP LTD