咢

CJK Compatibility Ideograph-2F840

U+2F840
SIP Unicode 3.1
Character 咢
Decimal &#194624;
Hex &#x2F840;

Classification

Unicode properties assigned to this character by the Unicode Consortium. The codepoint is its unique numeric identifier. Category, block, and script determine how text systems render and process it.

Codepoint
U+2F840
Decimal
194624
Plane
SIP — Supplementary Ideographic Plane
Category
Other Letter (Lo)
Script
Han
Bidi class
L Left-to-Right
East Asian Width
W Wide
Properties
Alphabetic, ID Start, ID Continue
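The properties listed above can be checked programmatically. The following is a minimal sketch using Python's standard `unicodedata` module; the values it reports depend on the Unicode version bundled with the interpreter, and the dictionary keys are illustrative names, not part of any API.

```python
import unicodedata

ch = "\U0002F840"  # the character described on this page

props = {
    "codepoint": f"U+{ord(ch):04X}",
    "decimal": ord(ch),
    # Plane index is the code point divided by 0x10000; plane 2 is the SIP.
    "plane": ord(ch) >> 16,
    "name": unicodedata.name(ch),
    "category": unicodedata.category(ch),
    "bidi": unicodedata.bidirectional(ch),
    "east_asian_width": unicodedata.east_asian_width(ch),
}

for key, value in props.items():
    print(f"{key}: {value}")
```

The script property is not exposed by `unicodedata`, so it is left out of this sketch.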

Looks Like (Confusables)

Characters that are visually similar — relevant for security, font design, and homoglyph detection.

Encodings & Escape Sequences

Every Unicode character can be represented in multiple ways depending on context. HTML entities let you embed it safely in web pages. UTF-8 bytes are what typically gets stored on disk and sent over the network. Escape sequences let you reference it in source code without pasting the raw glyph. All formats below refer to the same character: CJK Compatibility Ideograph-2F840.


Format Value
HTML Decimal
&#194624;
HTML Hex
&#x2F840;
UTF-8 Hex Bytes
F0 AF A1 80
UTF-16 Hex Bytes
D8 7E DC 40
UTF-32 Hex
0002F840
CSS Escape
\2F840
JavaScript Escape
\uD87E\uDC40
Python Escape
\U0002F840
URL Encoded
%F0%AF%A1%80
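Each value in the table can be derived from the code point alone. The sketch below shows one way to do that in Python (3.8 or later, for `bytes.hex` with a separator); the `encodings` dictionary and its keys simply mirror the table and are not a standard API. The UTF-16 and UTF-32 values assume big-endian byte order, as in the table.

```python
from urllib.parse import quote

ch = "\U0002F840"
cp = ord(ch)

# Surrogate pair arithmetic for code points above U+FFFF:
# subtract 0x10000, then split the remaining 20 bits into two 10-bit halves.
offset = cp - 0x10000
high = 0xD800 + (offset >> 10)
low = 0xDC00 + (offset & 0x3FF)

encodings = {
    "HTML Decimal": f"&#{cp};",
    "HTML Hex": f"&#x{cp:X};",
    "UTF-8 Hex Bytes": ch.encode("utf-8").hex(" ").upper(),
    "UTF-16 Hex Bytes": ch.encode("utf-16-be").hex(" ").upper(),
    "UTF-32 Hex": f"{cp:08X}",
    "CSS Escape": f"\\{cp:X}",
    "JavaScript Escape": f"\\u{high:04X}\\u{low:04X}",
    "Python Escape": f"\\U{cp:08X}",
    "URL Encoded": quote(ch),  # percent-encodes the UTF-8 bytes
}

for label, value in encodings.items():
    print(f"{label}: {value}")
```

The JavaScript escape needs two `\u` units because the code point lies outside the Basic Multilingual Plane. The high and low surrogates computed here are the same two 16-bit units that appear in the UTF-16 row.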

Unihan Data

Readings and dictionary data from the Unicode Han Database (Unihan).

Cantonese (Jyutping)
ngok6

Normalization Forms

Unicode defines four normalization forms that affect how characters with diacritics, compatibility variants, and combining marks are represented. This character has a non-trivial normalization: the forms below differ from its codepoint. Mismatched normalization is a common cause of failed string comparisons across systems.

NFC = Canonical Decomposition then Canonical Composition (preferred for storage) · NFD = Canonical Decomposition · NFKC/NFKD = Compatibility forms (fold variants like the ligature ﬁ, U+FB01, → fi)
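To illustrate how normalization affects comparisons, the sketch below applies all four forms to this character with Python's `unicodedata.normalize` and compares the strings before and after. The helper `same_text` is a hypothetical name introduced here for illustration. It assumes both strings should be treated as equal whenever their NFC forms match.

```python
import unicodedata

compat = "\U0002F840"

# Apply each of the four normalization forms to the character.
forms = {
    form: unicodedata.normalize(form, compat)
    for form in ("NFC", "NFD", "NFKC", "NFKD")
}
for form, text in forms.items():
    print(form, [f"U+{ord(c):04X}" for c in text])


def same_text(a: str, b: str, form: str = "NFC") -> bool:
    """Compare two strings after normalizing both to the same form."""
    return unicodedata.normalize(form, a) == unicodedata.normalize(form, b)


# A plain comparison looks at code points and treats the two as different.
raw_equal = compat == forms["NFC"]
# Normalizing both sides first makes them compare equal.
normalized_equal = same_text(compat, forms["NFC"])
print(raw_equal, normalized_equal)
```

Normalizing at the boundaries of a system (on input, before storage, before comparison) is one common way to avoid mismatches, though which form to choose depends on the application.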

Decomposition

This character has a canonical decomposition: the character and its decomposed form are semantically identical and interchangeable. Because the mapping is canonical, all four normalization forms (NFC, NFD, NFKC, NFKD) replace the character with its decomposed form.
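The decomposition mapping itself can be read from the Unicode Character Database. A minimal sketch, assuming Python's `unicodedata.decomposition`, which returns the mapping as space-separated hexadecimal code points, prefixed with a tag such as `<compat>` when the mapping is a compatibility one. The variable names are illustrative.

```python
import unicodedata

ch = "\U0002F840"

# Empty string if the character has no decomposition mapping.
mapping = unicodedata.decomposition(ch)

# Canonical mappings carry no "<tag>" prefix; compatibility mappings do.
is_canonical = bool(mapping) and not mapping.startswith("<")

# Convert the hex fields into the actual component characters.
components = [
    chr(int(field, 16))
    for field in mapping.split()
    if not field.startswith("<")
]

print("mapping:", mapping)
print("canonical:", is_canonical)
print("components:", [f"U+{ord(c):04X}" for c in components])
```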