Articles & Guides
In-depth guides on Unicode, character encoding, HTML entities, emoji, and international text. Whether you're trying to understand UTF-8, demystify emoji sequences, or learn how confusable characters work, these articles have you covered.
Control Characters: The Hidden Unicode Code Points
The first 32 code points in Unicode (U+0000–U+001F) and code point U+007F are control characters—inherited directly from ASCII and the older…
Currency Symbols in Unicode: A Complete Reference
Unicode contains over 60 currency symbols, from the ubiquitous dollar sign to obscure historical currencies. Using the correct Unicode symbo…
Mathematical Characters in Unicode
Unicode contains a rich collection of mathematical symbols, operators, and alphanumeric characters spread across several blocks. Whether you…
Right-to-Left Text in Unicode: Arabic, Hebrew, and Beyond
Unicode supports writing systems that run right-to-left (RTL), including Arabic, Hebrew, Persian, Thaana, Syriac, and several others. Handli…
The History of ASCII and Its Relationship to Unicode
Before Unicode, before UTF-8, there was ASCII—the American Standard Code for Information Interchange. Published in 1963 and revised in 1967 …
How to Use HTML Character References in Your Web Pages
Every HTML document can reference any Unicode character—whether or not you can type it on your keyboard—using character references. Masterin…
Unicode Planes Explained: From BMP to Supplementary Characters
Unicode organises its entire code space—1,114,112 possible code points—into 17 planes. Each plane contains 65,536 code points (U+xx0000 to U…
Understanding Unicode Categories
Every Unicode character is assigned a General Category—a two-letter classification code that describes the character's basic type and intend…
Confusable Characters: Why Look-Alike Letters Are a Security Risk
The Unicode standard assigns code points to characters from hundreds of scripts. Many of these characters look visually identical—or nearly …
Unicode Scripts: How Writing Systems Are Classified
Unicode assigns every character not just to a block but also to a script—a named writing system. Scripts are a fundamental concept in Unicod…
How Emoji Work: Skin Tones, ZWJ Sequences, and Modifiers
Emoji look like simple pictures, but under the hood many of them are complex sequences of multiple Unicode code points. Understanding how em…
HTML Entities vs Unicode Code Points: What's the Difference?
When inserting special characters in HTML, you have two main options: HTML entities (like &amp;amp; or &amp;copy;) and Unicode numeric refer…