Unicode Decoder

Decode Unicode to plain text

Unicode to Text

How it works

This tool converts Unicode escape sequences into plain text. It supports three common formats:

  • JavaScript escape format: \u4e2d\u6587
  • HTML hex format: &#x4e2d;&#x6587;
  • HTML decimal format: &#20013;&#25991;
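
To make the relationship between the three formats concrete, the sketch below writes a single code point in each notation. The helper name describeCodePoint is hypothetical and is not part of the tool; it only illustrates that all three formats carry the same number, written in hexadecimal or decimal.

```javascript
// Hypothetical helper (not part of the tool): writes the first code point
// of `char` in each of the three notations listed above.
function describeCodePoint(char) {
  const cp = char.codePointAt(0);
  const hex = cp.toString(16);
  return {
    // \uXXXX takes exactly four hex digits; larger values need the \u{...} form
    jsEscape: cp <= 0xffff ? '\\u' + hex.padStart(4, '0') : '\\u{' + hex + '}',
    htmlHex: '&#x' + hex + ';',
    htmlDecimal: '&#' + cp + ';',
  };
}

const forms = describeCodePoint('中');
console.log(forms.jsEscape);    // \u4e2d
console.log(forms.htmlHex);     // &#x4e2d;
console.log(forms.htmlDecimal); // &#20013;
```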

Examples

Unicode                 Text
\u4e2d\u6587            中文
&#x4e2d;&#x6587;        中文
&#20013;&#25991;        中文
\u0041\u0042\u0043      ABC

JavaScript Code Explanation

The conversion logic handles three Unicode formats:

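// 1. JavaScript escape format: \u4e2d\u6587 (the braced form \u{4e2d} is also accepted)
const jsFormat = input.replace(/\\u([0-9a-fA-F]{4}|\{[0-9a-fA-F]+\})/g, (_, code) => {
  // Strip the braces from the \u{...} form, leaving only the hex digits
  const hex = code.replace(/^\{|\}$/g, '');
  // fromCodePoint also handles code points above 0xFFFF (it throws a RangeError above 0x10FFFF)
  return String.fromCodePoint(parseInt(hex, 16));
});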

// 2. HTML hex format: &#x4e2d;&#x6587;
const htmlHexFormat = input.replace(/&#x([0-9a-fA-F]+);/g, (_, hex) =>
  String.fromCodePoint(parseInt(hex, 16))
);

// 3. HTML decimal format: &#20013;&#25991;
// The capture group passes the decimal digits to the callback as `dec`
const htmlDecFormat = input.replace(/&#([0-9]+);/g, (_, dec) =>
  String.fromCodePoint(parseInt(dec, 10))
);

How it works:

  1. Regex Matching: Each format uses a regular expression to find Unicode sequences in the input string
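  2. Hex/Decimal Parsing: parseInt(hex, 16) parses the hexadecimal digits into a number; parseInt(dec, 10) does the same for decimal digits
  3. Character Conversion: The number is converted to the actual character. String.fromCharCode(code) is only correct for values up to 0xFFFF; String.fromCodePoint(code) also handles larger code points such as emoji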
  4. Replacement: The matched Unicode sequence is replaced with the decoded character
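
The four steps above can be combined into one function. The following is a minimal sketch, not necessarily how the tool itself is implemented: the function name decodeUnicode and the single combined regular expression are assumptions made for illustration.

```javascript
// Hypothetical helper: decodes all three formats in a single pass.
function decodeUnicode(input) {
  // One alternative per format; each has its own capture group:
  //   \u{...}   |   \uXXXX   |   &#x...;   |   &#...;
  const pattern =
    /\\u\{([0-9a-fA-F]+)\}|\\u([0-9a-fA-F]{4})|&#x([0-9a-fA-F]+);|&#([0-9]+);/g;

  return input.replace(pattern, (match, braced, fixed, htmlHex, htmlDec) => {
    // Only the group belonging to the matched alternative is defined
    const code = htmlDec !== undefined
      ? parseInt(htmlDec, 10)
      : parseInt(braced || fixed || htmlHex, 16);
    // Leave the sequence untouched if it is outside the Unicode range,
    // because String.fromCodePoint would throw a RangeError
    return code <= 0x10ffff ? String.fromCodePoint(code) : match;
  });
}

console.log(decodeUnicode('\\u4e2d\\u6587'));       // 中文
console.log(decodeUnicode('&#x4e2d;&#x6587;'));     // 中文
console.log(decodeUnicode('&#20013;&#25991;'));     // 中文
console.log(decodeUnicode('\\u0041\\u0042\\u0043')); // ABC
```

Decoding in a single pass means the output of one format is never re-scanned by another, so text that merely looks like an escape after decoding is left alone.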

Key functions:

  • String.fromCharCode() - converts a UTF-16 code unit (0 to 0xFFFF) to a character; String.fromCodePoint() accepts any code point up to 0x10FFFF
  • parseInt(string, radix) - parses string to integer in given base (16=hex, 10=decimal)
  • RegExp with capture groups - extracts the numeric code from each format
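
A short illustration of how these functions behave, using the emoji code point U+1F600 as an assumed example value. It shows why the choice between String.fromCharCode and String.fromCodePoint matters for characters outside the Basic Multilingual Plane.

```javascript
// parseInt with an explicit radix: both calls yield the same number
const fromHex = parseInt('4e2d', 16);
const fromDec = parseInt('20013', 10);
console.log(fromHex === fromDec); // true

// A code point above 0xFFFF (example value, not taken from the tool)
const emojiCode = parseInt('1F600', 16);

// fromCharCode keeps only the low 16 bits, so the result is the wrong character
const truncated = String.fromCharCode(emojiCode);
console.log(truncated.charCodeAt(0).toString(16)); // f600

// fromCodePoint produces the correct character as a surrogate pair
const full = String.fromCodePoint(emojiCode);
console.log(full);        // 😀
console.log(full.length); // 2
```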