41 results about "UTF-8" patented technology

UTF-8 is a variable-width character encoding capable of encoding all 1,112,064 valid code points in Unicode using one to four 8-bit bytes. The encoding is defined by the Unicode Standard and was originally designed by Ken Thompson and Rob Pike. The name is derived from Unicode (or Universal Coded Character Set) Transformation Format – 8-bit.
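As a rough illustration of the "one to four bytes" rule, the sketch below encodes a single code point by hand. The function name `utf8_encode` is hypothetical and chosen only for this example; in practice Python's built-in `str.encode("utf-8")` does the same job.

```python
def utf8_encode(code_point: int) -> bytes:
    """Encode one Unicode code point as UTF-8 (hypothetical helper)."""
    # Surrogates (U+D800..U+DFFF) and values above U+10FFFF are not valid.
    if code_point < 0 or code_point > 0x10FFFF or 0xD800 <= code_point <= 0xDFFF:
        raise ValueError("not a valid Unicode code point for UTF-8")
    if code_point < 0x80:        # 1 byte:  0xxxxxxx
        return bytes([code_point])
    if code_point < 0x800:       # 2 bytes: 110xxxxx 10xxxxxx
        return bytes([0xC0 | (code_point >> 6),
                      0x80 | (code_point & 0x3F)])
    if code_point < 0x10000:     # 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        return bytes([0xE0 | (code_point >> 12),
                      0x80 | ((code_point >> 6) & 0x3F),
                      0x80 | (code_point & 0x3F)])
    # 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    return bytes([0xF0 | (code_point >> 18),
                  0x80 | ((code_point >> 12) & 0x3F),
                  0x80 | ((code_point >> 6) & 0x3F),
                  0x80 | (code_point & 0x3F)])
```

The figure of 1,112,064 valid code points follows from the 1,114,112 values in the range U+0000 to U+10FFFF minus the 2,048 surrogates, which the range check above rejects.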

Embedded type terminal and UTF-8 and GB2312 code conversion method thereof

Inactive · CN101655836A · Troubleshoot character handling issues · Solve the problem that characters cannot be processed · Special data processing applications · UTF-8 · Operational system
The invention discloses an embedded terminal and a method for converting between UTF-8 and GB2312 codes on it. The conversion method, based on the embedded mobile terminal, comprises the following steps: receiving a request from an application program to convert a GB2312 code into a UTF-8 code; reading a GB2312-coded character in the embedded terminal according to the request; converting the GB2312-coded character into a Unicode character; directly converting that Unicode character into a UTF-8-coded character; and returning the resulting UTF-8-coded character to the application program. In an environment without a Windows or Linux operating system and without other usable APIs, the method solves the character-handling problem in communication between the embedded mobile terminal and a background server.
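The two-step data flow described here (GB2312 to Unicode, then Unicode to UTF-8) can be sketched as follows. This is only an illustration of the pipeline, not the patented method: it relies on Python's built-in `gb2312` codec, whereas the patent targets terminals where no such API is available and a lookup table would presumably be needed. The function name `gb2312_to_utf8` is hypothetical.

```python
def gb2312_to_utf8(gb_bytes: bytes) -> bytes:
    """Convert GB2312-encoded bytes to UTF-8 via Unicode (hypothetical helper)."""
    # Step 1: GB2312 bytes -> Unicode code points (a Python str).
    text = gb_bytes.decode("gb2312")
    # Step 2: Unicode code points -> UTF-8 bytes.
    return text.encode("utf-8")


# Example: the two characters of "中文" as GB2312 bytes.
converted = gb2312_to_utf8(b"\xd6\xd0\xce\xc4")
```

ASCII bytes pass through unchanged, since both encodings represent ASCII characters as the same single byte.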
Owner:XIAMEN STELCOM INFORMATION & TECH

Method, apparatus and server for processing data packet

The present invention provides a method, an apparatus and a server for processing a data packet. The method comprises the steps of: generating XMPP (Extensible Messaging and Presence Protocol)-based message data to be transmitted, wherein the message data comprises a message header and a message body; applying custom packaging to the message header to obtain a packaged message header; converting the message body into a message body conforming to a preset protocol format; packaging the packaged message header and the converted message body to obtain a new message packet; and transmitting the new message packet. The invention communicates by means of data protocol packets: an XMPP data packet is wrapped with a protocol header, and the length field of that header ensures data integrity. In addition, a communication quality is defined for each type of data packet, so that the quality mode of each type can be guaranteed, and the original packet body is converted into a UTF-8 byte stream for processing, which facilitates compression with a compression algorithm.
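A minimal sketch of the framing idea, under stated assumptions: the abstract does not give the header layout, so the format below (one byte for packet type, one byte for communication quality, four bytes for body length) is invented for illustration, as are the function names. The compression step uses `zlib` as a stand-in for whichever algorithm the patent intends.

```python
import struct
import zlib

# Hypothetical header: packet type (1 byte), quality level (1 byte),
# body length (4 bytes), all in network byte order.
HEADER_FORMAT = "!BBI"
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)


def pack_message(packet_type: int, quality: int, body: str) -> bytes:
    """Wrap a message body in a length-prefixed packet."""
    # Convert the body to a UTF-8 byte stream, then compress it.
    payload = zlib.compress(body.encode("utf-8"))
    header = struct.pack(HEADER_FORMAT, packet_type, quality, len(payload))
    return header + payload


def unpack_message(packet: bytes):
    """Check the length field and recover (packet_type, quality, body)."""
    if len(packet) < HEADER_SIZE:
        raise ValueError("packet shorter than header")
    packet_type, quality, length = struct.unpack(
        HEADER_FORMAT, packet[:HEADER_SIZE])
    payload = packet[HEADER_SIZE:]
    # The length field lets the receiver detect truncated or padded packets.
    if len(payload) != length:
        raise ValueError("length field does not match payload size")
    return packet_type, quality, zlib.decompress(payload).decode("utf-8")
```

A receiver that sees a mismatch between the length field and the actual payload can discard the packet, which is one plausible reading of how the header "ensures integrity".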
Owner:CHINA MOBILE GRP GUANGDONG CO LTD

Method and Apparatus for XML Data Processing

Method and apparatus for at least one of coding or decoding of data. The method comprises retrieving Extensible Markup Language (“XML”)-Unicode Transformation Format 8 (“UTF-8”) data, confirming that the XML-UTF-8 data is in a proper format, converting a prolog located within said XML-UTF-8 data, initializing a tag and attribute lookup table, comparing a current character to a plurality of multi-character patterns, determining whether said current character can be converted to a multi-character pattern in said plurality and Unicode, converting said current character to one of ASCII and Unicode when said current character cannot be converted to said multi-character pattern in said plurality, comparing at least one subsequent character to said plurality of multi-character patterns to determine conversion of at least the current character when said current character can be converted more than one way, and determining whether there are more characters.
Owner:TEXAS INSTR INC

Voice recognition character string processing comparison method based on Pinyin

The invention relates to a voice recognition character string processing and comparison method based on Pinyin. When existing voice recognition technology is applied to special cases such as person name recognition and equipment name recognition, errors arise easily because of incorrect comparison. The method is a "secondary processing" step on top of a general Chinese character recognition algorithm: recognized Chinese character strings are converted into Pinyin strings, which are then compared with target Pinyin strings. The method comprises the following steps: 1, Pinyin coding: coding all Chinese character Pinyin, in a manner similar to Unicode coding, and enumerating all Chinese character Pinyin combinations; 2, code conversion: converting character strings that express Chinese characters in encodings such as GBK, Unicode and UTF-8 into Pinyin strings; and 3, polyphone processing: enumerating the polyphones among all family names, applying special processing, and assigning them the same Pinyin codes. The method achieves fast and accurate recognition and thereby avoids misjudgment.
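The comparison step can be sketched as below. Everything here is an assumption made for illustration: the tiny `PINYIN_TABLE` stands in for the full enumeration the abstract mentions, plain Pinyin strings stand in for the patent's numeric Pinyin codes, and the surname override is only one possible reading of the "special processing" for polyphonic family names.

```python
# Hypothetical, deliberately tiny character-to-Pinyin table.
PINYIN_TABLE = {"张": "zhang", "章": "zhang", "伟": "wei", "单": "dan"}

# Hypothetical polyphone handling: as a family name, 单 is read "shan".
SURNAME_OVERRIDES = {"单": "shan"}


def to_pinyin(raw: bytes, encoding: str) -> list:
    """Decode bytes in the given encoding and map each character to Pinyin."""
    text = raw.decode(encoding)  # GBK, UTF-8, ... -> Unicode
    result = []
    for index, char in enumerate(text):
        # Treat the first character as the family name.
        if index == 0 and char in SURNAME_OVERRIDES:
            result.append(SURNAME_OVERRIDES[char])
        else:
            result.append(PINYIN_TABLE[char])
    return result


def same_pronunciation(a: bytes, enc_a: str, b: bytes, enc_b: str) -> bool:
    """Compare two encoded names by their Pinyin rather than their characters."""
    return to_pinyin(a, enc_a) == to_pinyin(b, enc_b)
```

With this table, a recognized name 章伟 in UTF-8 matches a target name 张伟 stored in GBK, because both map to the same Pinyin sequence even though the characters and byte encodings differ.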
Owner:深圳市艾塔文化科技有限公司

Universal forum text extraction method

The invention relates to a universal forum text extraction method. The method comprises the following steps: the complete HTML code of a web page is extracted, the page's encoding is detected, and the page is uniformly re-encoded as UTF-8; the HTML tag types are analyzed, a DOM tree of the page is obtained, the title information and the div tag content containing publishing time information are extracted, and, after useless information is filtered out, the extracted information is classified to generate a list; the data length of the list is calculated, and the information is grouped by time and output in a formatted mode. The extraction method is highly general, can be applied to most forums, and can accurately extract the data fields for main posts, replies, titles and posting times and output them in a formatted mode, so that forum information is better utilized.
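The first step, detecting a page's encoding and normalizing it to UTF-8, might look like the sketch below. The abstract does not say how detection is done; reading the `charset` declaration near the top of the page is an assumption, and the function name and fallback parameter are hypothetical.

```python
import re

# Matches charset declarations such as <meta charset="gbk"> or
# content="text/html; charset=gb2312".
CHARSET_PATTERN = re.compile(rb"""charset\s*=\s*["']?([\w-]+)""", re.IGNORECASE)


def normalize_to_utf8(raw_html: bytes, fallback: str = "utf-8") -> bytes:
    """Re-encode a fetched page as UTF-8 (hypothetical helper)."""
    # Only the head of the document is searched for a declaration.
    match = CHARSET_PATTERN.search(raw_html[:2048])
    encoding = match.group(1).decode("ascii") if match else fallback
    try:
        text = raw_html.decode(encoding, errors="replace")
    except LookupError:
        # The declared charset name is unknown to Python; use the fallback.
        text = raw_html.decode(fallback, errors="replace")
    return text.encode("utf-8")
```

Note that the returned bytes still carry the original `charset` declaration in the markup, so a later parsing stage should be told explicitly that the input is UTF-8 rather than re-reading the declaration.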
Owner:NORTHEASTERN UNIV

Migration support device

A work PC (1) (migration support device) includes: a character code converting unit (11) that converts an EBCDIK+KEIS code into a UTF-8 code; a program converting unit (12) that converts an input program (22) into an output program (32); an exchange information generating unit (13) that causes the output program (32) to read character data to which the UTF-8 code is allocated, and that generates exchange information E defining, for the read character data, the number of memory areas specified by the output program (32) so that it equals the number of bytes in the byte sequence expressing the EBCDIK+KEIS code allocated to that character data; and an area storing unit (14) that stores the UTF-8 code allocated to the read character data in the areas, the number of which is defined by the exchange information E.
Owner:HITACHI SOCIAL INFORMATION SERVICES LTD