Em Dash

U+2014
BMP Unicode 1.1
Character
Entity &mdash;
Decimal &#8212;
Hex &#x2014;

Classification

Unicode properties assigned to this character by the Unicode Consortium. The codepoint is its unique numeric identifier. Category, block, and script determine how text systems render and process it.

Codepoint: U+2014
Decimal: 8212
Plane: BMP (Basic Multilingual Plane)
Category: Dash Punctuation (Pd)
Script: Common
Bidi class: ON (Other Neutral)
East Asian Width: A (Ambiguous)
Properties: Dash
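
Most of these properties can be read back from a Unicode database at runtime. The following is a minimal sketch using Python's standard `unicodedata` module; the helper name `describe` and the dictionary keys are illustrative choices, not part of any API. The standard module does not expose the script or the binary Dash property, so those are left out.

```python
import unicodedata

EM_DASH = "\u2014"

def describe(ch):
    """Collect a few Unicode properties for a single character (hypothetical helper)."""
    cp = ord(ch)
    return {
        "codepoint": f"U+{cp:04X}",                    # hexadecimal identifier
        "decimal": cp,                                 # same value in base 10
        "name": unicodedata.name(ch),                  # official character name
        "category": unicodedata.category(ch),          # general category, e.g. Pd
        "bidi_class": unicodedata.bidirectional(ch),   # bidirectional class, e.g. ON
        "east_asian_width": unicodedata.east_asian_width(ch),
        "in_bmp": cp <= 0xFFFF,                        # plane 0 covers U+0000..U+FFFF
    }

info = describe(EM_DASH)
for key, value in info.items():
    print(f"{key}: {value}")
```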

Looks Like (Confusables)

Characters that are visually similar — relevant for security, font design, and homoglyph detection.

Encodings & Escape Sequences

Every Unicode character can be represented in multiple ways depending on context. HTML entities let you embed it safely in web pages. UTF-8 bytes are what is typically stored on disk and sent over the network. Escape sequences let you reference it in source code without pasting the raw glyph. All formats below refer to the same character: Em Dash.

Format                          Value
HTML Named Entity               &mdash;
HTML Decimal                    &#8212;
HTML Hex                        &#x2014;
UTF-8 Hex Bytes                 E2 80 94
UTF-16 Hex Bytes (big-endian)   20 14
UTF-32 Hex                      00002014
CSS Escape                      \2014
JavaScript Escape               \u2014
Python Escape                   \u2014
URL Encoded                     %E2%80%94
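
The values in this table can be derived from the codepoint rather than memorized. A minimal sketch, assuming Python 3 and only the standard library; the `forms` dictionary, its key names, and the `hex_bytes` helper are illustrative, not a fixed API:

```python
from html.entities import codepoint2name
from urllib.parse import quote

EM_DASH = "\u2014"
cp = ord(EM_DASH)

def hex_bytes(data):
    """Format raw bytes as space-separated uppercase hex pairs."""
    return " ".join(f"{b:02X}" for b in data)

forms = {
    "html_named": f"&{codepoint2name[cp]};",          # lookup in the HTML entity table
    "html_decimal": f"&#{cp};",
    "html_hex": f"&#x{cp:X};",
    "utf8": hex_bytes(EM_DASH.encode("utf-8")),
    "utf16_be": hex_bytes(EM_DASH.encode("utf-16-be")),  # big-endian, no BOM
    "utf32": f"{cp:08X}",
    "css": f"\\{cp:X}",                               # backslash followed by hex digits
    "js_python": f"\\u{cp:04X}",                      # \uXXXX form for BMP characters
    "url": quote(EM_DASH),                            # percent-encoded UTF-8 bytes
}

for name, value in forms.items():
    print(f"{name}: {value}")
```

The `\uXXXX` escape shown here only covers BMP characters; codepoints above U+FFFF need a different form in both languages.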

Characters That Include This

These characters decompose to a sequence that includes Em Dash as a component. They are effectively precomposed versions or compounds built on this base character.
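
Such characters can be found by scanning the decomposition mappings in the Unicode Character Database. The sketch below uses Python's `unicodedata.decomposition`; the function name `chars_decomposing_to` is a hypothetical helper, and the results depend on the Unicode version bundled with the interpreter. With current data the scan is expected to include U+FE31 (the vertical presentation form) and U+FE58 (Small Em Dash).

```python
import sys
import unicodedata

TARGET = "2014"  # hex codepoint of Em Dash, as it appears in decomposition strings

def chars_decomposing_to(target_hex):
    """Return codepoints whose decomposition mapping contains target_hex."""
    found = []
    for cp in range(sys.maxunicode + 1):
        # decomposition() returns a string such as "<small> 2014",
        # or an empty string when the character has no mapping.
        mapping = unicodedata.decomposition(chr(cp))
        if target_hex in mapping.split():
            found.append(cp)
    return found

hits = chars_decomposing_to(TARGET)
for cp in hits:
    print(f"U+{cp:04X} {unicodedata.name(chr(cp), '<unnamed>')}")
```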