Html — entities

Parameter Values

Parameter Description
string Required. Specifies the string to decode
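flags Optional. Specifies how to handle quotes and which document type to use. As of PHP 8.1, the default is ENT_QUOTES | ENT_SUBSTITUTE | ENT_HTML401; earlier versions default to ENT_COMPAT | ENT_HTML401.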

The available quote styles are:

  • ENT_COMPAT — Default. Decodes only double quotes
  • ENT_QUOTES — Decodes double and single quotes
  • ENT_NOQUOTES — Does not decode any quotes

Additional flags specify the document type:

  • ENT_HTML401 — Default. Handle code as HTML 4.01
  • ENT_HTML5 — Handle code as HTML 5
  • ENT_XML1 — Handle code as XML 1
  • ENT_XHTML — Handle code as XHTML
character-set Optional. A string that specifies which character-set to use.

Allowed values are:

  • UTF-8 — Default. ASCII compatible multi-byte 8-bit Unicode
  • ISO-8859-1 — Western European
  • ISO-8859-15 — Western European (adds the Euro sign + French and Finnish letters missing in ISO-8859-1)
  • cp866 — DOS-specific Cyrillic charset
  • cp1251 — Windows-specific Cyrillic charset
  • cp1252 — Windows specific charset for Western European
  • KOI8-R — Russian
  • BIG5 — Traditional Chinese, mainly used in Taiwan
  • GB2312 — Simplified Chinese, national standard character set
  • BIG5-HKSCS — Big5 with Hong Kong extensions
  • Shift_JIS — Japanese
  • EUC-JP — Japanese
  • MacRoman — Character-set that was used by Mac OS

Note: Unrecognized character sets are ignored and replaced by ISO-8859-1 in versions prior to PHP 5.4. As of PHP 5.4, they are ignored and replaced by UTF-8.
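
The effect of the character-set parameter can be seen in the bytes that come back. The following is a minimal sketch, assuming the parameters above belong to PHP's html_entity_decode() (the section does not name the function); it decodes the same entity into two different character sets and prints the resulting bytes as hex.

```php
<?php
// Assumption: the parameter list above describes html_entity_decode().
$entity = '&pound;';

// UTF-8 (the default since PHP 5.4): the pound sign takes two bytes.
$utf8 = html_entity_decode($entity, ENT_QUOTES | ENT_HTML401, 'UTF-8');
echo bin2hex($utf8), "\n";   // c2a3

// ISO-8859-1: the same character is the single byte 0xA3.
$latin1 = html_entity_decode($entity, ENT_QUOTES | ENT_HTML401, 'ISO-8859-1');
echo bin2hex($latin1), "\n"; // a3
```

Choosing a character set that does not match the page's declared encoding is a common cause of garbled output, so it is usually safest to pass the encoding explicitly.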

ISO 8859-1 Symbol Entities

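Result Description Entity Name Number Code
  non-breaking space &nbsp; &#160;
¡ inverted exclamation mark &iexcl; &#161;
¤ currency &curren; &#164;
¢ cent &cent; &#162;
£ pound &pound; &#163;
¥ yen &yen; &#165;
¦ broken vertical bar &brvbar; &#166;
§ section &sect; &#167;
¨ spacing diaeresis &uml; &#168;
© copyright &copy; &#169;
ª feminine ordinal indicator &ordf; &#170;
« angle quotation mark (left) &laquo; &#171;
¬ negation &not; &#172;
­ soft hyphen &shy; &#173;
® registered trademark &reg; &#174;
™ trademark &trade; &#8482;
¯ spacing macron &macr; &#175;
° degree &deg; &#176;
± plus-or-minus &plusmn; &#177;
² superscript 2 &sup2; &#178;
³ superscript 3 &sup3; &#179;
´ spacing acute &acute; &#180;
µ micro &micro; &#181;
¶ paragraph &para; &#182;
· middle dot &middot; &#183;
¸ spacing cedilla &cedil; &#184;
¹ superscript 1 &sup1; &#185;
º masculine ordinal indicator &ordm; &#186;
» angle quotation mark (right) &raquo; &#187;
¼ fraction 1/4 &frac14; &#188;
½ fraction 1/2 &frac12; &#189;
¾ fraction 3/4 &frac34; &#190;
¿ inverted question mark &iquest; &#191;
× multiplication &times; &#215;
÷ division &divide; &#247;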

What is URL encoding?

URL encoding means encoding certain characters in a URL by replacing them with one or more character triplets, each consisting of the percent character "%" followed by two hexadecimal digits. The two hexadecimal digits of each triplet represent the numeric value of one byte of the replaced character.

The term URL encoding is a bit inexact because the encoding procedure is not limited to URLs (Uniform Resource Locators), but can also be applied to any other URIs (Uniform Resource Identifiers) such as URNs (Uniform Resource Names). Therefore, the term percent-encoding should be preferred.
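
The triplet format can be illustrated with a few lines of PHP. This is only a sketch: percentEncodeAllBytes() is a hypothetical helper written for this example, and it encodes every byte, which a real encoder would not do. rawurlencode() and rawurldecode() are PHP's built-in functions.

```php
<?php
// Hypothetical helper: turn every byte of $text into a "%XX" triplet.
function percentEncodeAllBytes(string $text): string
{
    if ($text === '') {
        return '';
    }
    $out = '';
    foreach (str_split($text) as $byte) {
        // "%" followed by the byte value as two uppercase hex digits
        $out .= sprintf('%%%02X', ord($byte));
    }
    return $out;
}

echo percentEncodeAllBytes('a b'), "\n"; // %61%20%62
echo rawurlencode('a b'), "\n";          // a%20b
echo rawurldecode('%61%20%62'), "\n";    // a b
```

The built-in encoder leaves "a" and "b" alone and only encodes the space, yet both encoded forms decode to the same text.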

Which Characters Are Allowed in a URL?

The characters allowed in a URI are either reserved or unreserved (or a percent character as part of a percent-encoding). Reserved characters are those that sometimes have special meaning, while unreserved characters have no such meaning. Using percent-encoding, characters that would otherwise not be allowed are represented using allowed characters.

The sets of reserved and unreserved characters, and the circumstances under which certain reserved characters have special meaning, have changed slightly with each revision of the specifications that govern URIs and URI schemes.

According to RFC 3986, the characters in a URL have to be taken from a defined set of unreserved and reserved ASCII characters. Any other characters are not allowed in a URL.

The unreserved characters can be encoded, but should not be encoded. The unreserved characters are:

A-Z a-z 0-9 - . _ ~

The reserved characters have to be encoded only under certain circumstances. The reserved characters are:

! # $ & ' ( ) * + , / : ; = ? @ [ ]
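
As a rough illustration of the two sets defined by RFC 3986, the sketch below passes them through PHP's rawurlencode(). The host example.com and the query text are placeholders chosen for this example. Note that rawurlencode() always encodes reserved characters, so it should be applied to individual components (such as a query value), not to a complete URL.

```php
<?php
$unreserved = 'AZaz09-._~';
$reserved   = ':/?#[]@!$&\'()*+,;=';

// Unreserved characters pass through unchanged.
echo rawurlencode($unreserved), "\n"; // AZaz09-._~

// Every reserved character becomes a "%XX" triplet.
$encodedReserved = rawurlencode($reserved);
echo $encodedReserved, "\n";

// Typical use: encode only the value, so "&" and "=" inside it
// cannot be mistaken for query-string delimiters.
$url = 'https://example.com/search?q=' . rawurlencode('cats & dogs=fun');
echo $url, "\n"; // https://example.com/search?q=cats%20%26%20dogs%3Dfun
```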

Encoding/Decoding a Piece of Text

RFC 3986 does not define which character encoding table should be used to encode non-ASCII characters (e.g. the umlauts ä, ö, ü). As URL encoding involves a pair of hexadecimal digits, and a pair of hexadecimal digits is equivalent to 8 bits, it would theoretically be possible to use one of the 8-bit code pages for non-ASCII characters (e.g. ISO-8859-1 for umlauts).

On the other hand, as many languages have their own 8-bit code page, handling all these different code pages would be quite cumbersome. Some languages do not even fit into an 8-bit code page (e.g. Chinese). Therefore, UTF-8 (specified in RFC 3629) is the usual choice for non-ASCII characters.
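
The difference between the two approaches shows up in the encoded output. In this minimal sketch the input strings are written as explicit byte escapes, so the result does not depend on how the PHP source file itself is saved; the variable names are only illustrative.

```php
<?php
$umlautUtf8   = "\xC3\xA4";     // "ä" in UTF-8 (two bytes)
$umlautLatin1 = "\xE4";         // "ä" in ISO-8859-1 (one byte)
$chineseUtf8  = "\xE4\xB8\xAD"; // "中" (U+4E2D) in UTF-8 (three bytes)

echo rawurlencode($umlautUtf8), "\n";   // %C3%A4
echo rawurlencode($umlautLatin1), "\n"; // %E4
echo rawurlencode($chineseUtf8), "\n";  // %E4%B8%AD
```

The same character yields different triplets depending on the encoding, which is why sender and receiver have to agree on one.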

Range: Decimal 128-255. Hex 0080-00FF.

If you want any of these characters displayed in HTML, you can use the HTML entity found in the table below.

If the character does not have an HTML entity, you can use the decimal (dec) or hexadecimal (hex) reference.

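For example, the entity name, the decimal reference, and the hexadecimal reference:

I will display &pound;
I will display &#163;
I will display &#x00A3;

Will display as:

I will display £
I will display £
I will display £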

Char Dec Hex Entity Name
  160 00A0 &nbsp; NO-BREAK SPACE
¡ 161 00A1 &iexcl; INVERTED EXCLAMATION MARK
¢ 162 00A2 &cent; CENT SIGN
£ 163 00A3 &pound; POUND SIGN
¤ 164 00A4 &curren; CURRENCY SIGN
¥ 165 00A5 &yen; YEN SIGN
¦ 166 00A6 &brvbar; BROKEN BAR
§ 167 00A7 &sect; SECTION SIGN
¨ 168 00A8 &uml; DIAERESIS
© 169 00A9 &copy; COPYRIGHT SIGN
ª 170 00AA &ordf; FEMININE ORDINAL INDICATOR
« 171 00AB &laquo; LEFT-POINTING DOUBLE ANGLE QUOTATION MARK
¬ 172 00AC &not; NOT SIGN
­ 173 00AD &shy; SOFT HYPHEN
® 174 00AE &reg; REGISTERED SIGN
¯ 175 00AF &macr; MACRON
° 176 00B0 &deg; DEGREE SIGN
± 177 00B1 &plusmn; PLUS-MINUS SIGN
² 178 00B2 &sup2; SUPERSCRIPT TWO
³ 179 00B3 &sup3; SUPERSCRIPT THREE
´ 180 00B4 &acute; ACUTE ACCENT
µ 181 00B5 &micro; MICRO SIGN
¶ 182 00B6 &para; PILCROW SIGN
· 183 00B7 &middot; MIDDLE DOT
¸ 184 00B8 &cedil; CEDILLA
¹ 185 00B9 &sup1; SUPERSCRIPT ONE
º 186 00BA &ordm; MASCULINE ORDINAL INDICATOR
» 187 00BB &raquo; RIGHT-POINTING DOUBLE ANGLE QUOTATION MARK
¼ 188 00BC &frac14; VULGAR FRACTION ONE QUARTER
½ 189 00BD &frac12; VULGAR FRACTION ONE HALF
¾ 190 00BE &frac34; VULGAR FRACTION THREE QUARTERS
¿ 191 00BF &iquest; INVERTED QUESTION MARK
À 192 00C0 &Agrave; LATIN CAPITAL LETTER A WITH GRAVE
Á 193 00C1 &Aacute; LATIN CAPITAL LETTER A WITH ACUTE
Â 194 00C2 &Acirc; LATIN CAPITAL LETTER A WITH CIRCUMFLEX
Ã 195 00C3 &Atilde; LATIN CAPITAL LETTER A WITH TILDE
Ä 196 00C4 &Auml; LATIN CAPITAL LETTER A WITH DIAERESIS
Å 197 00C5 &Aring; LATIN CAPITAL LETTER A WITH RING ABOVE
Æ 198 00C6 &AElig; LATIN CAPITAL LETTER AE
Ç 199 00C7 &Ccedil; LATIN CAPITAL LETTER C WITH CEDILLA
È 200 00C8 &Egrave; LATIN CAPITAL LETTER E WITH GRAVE
É 201 00C9 &Eacute; LATIN CAPITAL LETTER E WITH ACUTE
Ê 202 00CA &Ecirc; LATIN CAPITAL LETTER E WITH CIRCUMFLEX
Ë 203 00CB &Euml; LATIN CAPITAL LETTER E WITH DIAERESIS
Ì 204 00CC &Igrave; LATIN CAPITAL LETTER I WITH GRAVE
Í 205 00CD &Iacute; LATIN CAPITAL LETTER I WITH ACUTE
Î 206 00CE &Icirc; LATIN CAPITAL LETTER I WITH CIRCUMFLEX
Ï 207 00CF &Iuml; LATIN CAPITAL LETTER I WITH DIAERESIS
Ð 208 00D0 &ETH; LATIN CAPITAL LETTER ETH
Ñ 209 00D1 &Ntilde; LATIN CAPITAL LETTER N WITH TILDE
Ò 210 00D2 &Ograve; LATIN CAPITAL LETTER O WITH GRAVE
Ó 211 00D3 &Oacute; LATIN CAPITAL LETTER O WITH ACUTE
Ô 212 00D4 &Ocirc; LATIN CAPITAL LETTER O WITH CIRCUMFLEX
Õ 213 00D5 &Otilde; LATIN CAPITAL LETTER O WITH TILDE
Ö 214 00D6 &Ouml; LATIN CAPITAL LETTER O WITH DIAERESIS
× 215 00D7 &times; MULTIPLICATION SIGN
Ø 216 00D8 &Oslash; LATIN CAPITAL LETTER O WITH STROKE
Ù 217 00D9 &Ugrave; LATIN CAPITAL LETTER U WITH GRAVE
Ú 218 00DA &Uacute; LATIN CAPITAL LETTER U WITH ACUTE
Û 219 00DB &Ucirc; LATIN CAPITAL LETTER U WITH CIRCUMFLEX
Ü 220 00DC &Uuml; LATIN CAPITAL LETTER U WITH DIAERESIS
Ý 221 00DD &Yacute; LATIN CAPITAL LETTER Y WITH ACUTE
Þ 222 00DE &THORN; LATIN CAPITAL LETTER THORN
ß 223 00DF &szlig; LATIN SMALL LETTER SHARP S
à 224 00E0 &agrave; LATIN SMALL LETTER A WITH GRAVE
á 225 00E1 &aacute; LATIN SMALL LETTER A WITH ACUTE
â 226 00E2 &acirc; LATIN SMALL LETTER A WITH CIRCUMFLEX
ã 227 00E3 &atilde; LATIN SMALL LETTER A WITH TILDE
ä 228 00E4 &auml; LATIN SMALL LETTER A WITH DIAERESIS
å 229 00E5 &aring; LATIN SMALL LETTER A WITH RING ABOVE
æ 230 00E6 &aelig; LATIN SMALL LETTER AE
ç 231 00E7 &ccedil; LATIN SMALL LETTER C WITH CEDILLA
è 232 00E8 &egrave; LATIN SMALL LETTER E WITH GRAVE
é 233 00E9 &eacute; LATIN SMALL LETTER E WITH ACUTE
ê 234 00EA &ecirc; LATIN SMALL LETTER E WITH CIRCUMFLEX
ë 235 00EB &euml; LATIN SMALL LETTER E WITH DIAERESIS
ì 236 00EC &igrave; LATIN SMALL LETTER I WITH GRAVE
í 237 00ED &iacute; LATIN SMALL LETTER I WITH ACUTE
î 238 00EE &icirc; LATIN SMALL LETTER I WITH CIRCUMFLEX
ï 239 00EF &iuml; LATIN SMALL LETTER I WITH DIAERESIS
ð 240 00F0 &eth; LATIN SMALL LETTER ETH
ñ 241 00F1 &ntilde; LATIN SMALL LETTER N WITH TILDE
ò 242 00F2 &ograve; LATIN SMALL LETTER O WITH GRAVE
ó 243 00F3 &oacute; LATIN SMALL LETTER O WITH ACUTE
ô 244 00F4 &ocirc; LATIN SMALL LETTER O WITH CIRCUMFLEX
õ 245 00F5 &otilde; LATIN SMALL LETTER O WITH TILDE
ö 246 00F6 &ouml; LATIN SMALL LETTER O WITH DIAERESIS
÷ 247 00F7 &divide; DIVISION SIGN
ø 248 00F8 &oslash; LATIN SMALL LETTER O WITH STROKE
ù 249 00F9 &ugrave; LATIN SMALL LETTER U WITH GRAVE
ú 250 00FA &uacute; LATIN SMALL LETTER U WITH ACUTE
û 251 00FB &ucirc; LATIN SMALL LETTER U WITH CIRCUMFLEX
ü 252 00FC &uuml; LATIN SMALL LETTER U WITH DIAERESIS
ý 253 00FD &yacute; LATIN SMALL LETTER Y WITH ACUTE
þ 254 00FE &thorn; LATIN SMALL LETTER THORN
ÿ 255 00FF &yuml; LATIN SMALL LETTER Y WITH DIAERESIS
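
The three ways of writing a character reference (entity name, decimal, hexadecimal) are interchangeable. The following sketch, which assumes PHP's html_entity_decode() as the decoder, checks that all three forms of the pound sign decode to the same character, and shows one way to build the numeric forms from a code point with sprintf().

```php
<?php
$flags = ENT_QUOTES | ENT_HTML5;

// Decode the three reference forms of U+00A3 (POUND SIGN).
$named   = html_entity_decode('&pound;', $flags, 'UTF-8');
$decimal = html_entity_decode('&#163;', $flags, 'UTF-8');
$hex     = html_entity_decode('&#x00A3;', $flags, 'UTF-8');

var_dump($named === $decimal && $decimal === $hex); // bool(true)

// Build the numeric references from the code point (Dec and Hex columns).
$codePoint = 0xA3;
$decRef = sprintf('&#%d;', $codePoint);
$hexRef = sprintf('&#x%04X;', $codePoint);
echo $decRef, "\n"; // &#163;
echo $hexRef, "\n"; // &#x00A3;
```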

More Examples

Example

Convert some predefined HTML entities to characters:

<?php
$str = "Jane &amp; &#039;Tarzan&#039;";
echo htmlspecialchars_decode($str, ENT_COMPAT); // Will only convert double quotes
echo "<br>";
echo htmlspecialchars_decode($str, ENT_QUOTES); // Converts double and single quotes
echo "<br>";
echo htmlspecialchars_decode($str, ENT_NOQUOTES); // Does not convert any quotes
?>

The HTML output of the code above will be (View Source):

<!DOCTYPE html>
<html>
<body>
Jane & &#039;Tarzan&#039;<br>
Jane & 'Tarzan'<br>
Jane & &#039;Tarzan&#039;
</body>
</html>

The browser output of the code above will be:

Jane & 'Tarzan'
Jane & 'Tarzan'
Jane & 'Tarzan'

Example

Convert the predefined HTML entities to double quotes:

<?php
$str = 'I love &quot;PHP&quot;.';
echo htmlspecialchars_decode($str, ENT_QUOTES); // Converts double and single quotes
?>

The HTML output of the code above will be (View Source):

<!DOCTYPE html>
<html>
<body>
I love "PHP".
</body>
</html>

The browser output of the code above will be:

I love "PHP".

