
HTML Encoding (Character Sets)


The HTML charset Attribute

To display an HTML page correctly, a web browser must know which character set to use.

The character set is declared with a <meta> tag inside the <head> element:

<meta charset="UTF-8">

The HTML specification encourages web developers to use the UTF-8 character set.

UTF-8 can encode every Unicode character, which covers almost all of the characters and symbols in the world.
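
As a minimal sketch of where the declaration goes, the page below puts the <meta> tag first inside <head>, before any text such as the title. The title and body text are placeholders chosen for illustration, and the file itself is assumed to be saved as UTF-8 by the editor.

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <!-- Declare the encoding before any other text content in the page -->
  <meta charset="UTF-8">
  <title>Charset demo (placeholder title)</title>
</head>
<body>
  <!-- Non-ASCII characters display correctly only if the file is really saved as UTF-8 -->
  <p>Café, naïve, 10 €</p>
</body>
</html>
```

The declaration only tells the browser how to decode the bytes. It does not convert the file, so the declared character set and the encoding the file was saved in have to match.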

Learn More

Full UTF-8 Reference

The ASCII Character Set

ASCII was the first widely used character encoding standard, and it was the starting point for the character sets later used on the web.

It defined 128 values (0 to 127): 33 control codes and 95 printable characters, including:

  • English letters (a-z and A-Z)
  • Numbers (0-9)
  • Some special characters: ! $ + - ( ) @ < > . # ?
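
One way to see the ASCII numbers from inside a page is a numeric character reference, written as &#NUMBER;. The sketch below is only an illustration (the title is a placeholder). It relies on the fact that Unicode code points 0 to 127 match ASCII.

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="UTF-8">
  <title>ASCII codes (placeholder title)</title>
</head>
<body>
  <!-- 65 is "A", 97 is "a", 48 is "0" -->
  <p>&#65; &#97; &#48;</p>
  <!-- 60 and 62 are the less-than and greater-than signs, so this shows <p> as text -->
  <p>&#60;p&#62;</p>
</body>
</html>
```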

The ANSI Character Set

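ANSI (Windows-1252) was the original Windows character set for Western European languages:

  • Identical to ASCII for the values from 0 to 127
  • Special characters from 128 to 159 (for example, the euro sign at 128)
  • Identical to ISO-8859-1 and to the Unicode code points from 160 to 255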
<meta charset="Windows-1252">
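
The values from 128 to 159 are where Windows-1252 departs from ISO-8859-1 and Unicode. As a hedged sketch, the page below writes the euro sign (value 128 in Windows-1252) by its Unicode code point 8364 and by its named entity, so the markup does not depend on that byte value. The title and prices are placeholders.

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="Windows-1252">
  <title>Euro sign demo (placeholder title)</title>
</head>
<body>
  <!-- &#8364; is the Unicode code point of the euro sign (U+20AC) -->
  <p>Price: 10 &#8364;</p>
  <!-- &euro; is the named entity for the same character -->
  <p>Price: 10 &euro;</p>
</body>
</html>
```

Because this file contains only ASCII characters, it displays the same way whichever of the character sets in this lesson it is saved in.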

The ISO-8859-1 Character Set

The default character set for HTML 4 was ISO-8859-1.

It uses one byte per character, giving 256 possible values:

  • Identical to ASCII for the values from 0 to 127
  • Does not assign printable characters to the values from 128 to 159
  • Identical to ANSI and to the Unicode code points from 160 to 255

HTML 4 Example

<meta http-equiv="Content-Type" content="text/html;charset=ISO-8859-1">

HTML 5 Example

<meta charset="ISO-8859-1">

The UTF-8 Character Set

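  • Identical to ASCII for the values from 0 to 127 (each stored as a single byte)
  • Does not assign printable characters to the values from 128 to 159
  • Same characters as ANSI and 8859-1 for the values from 160 to 255, although UTF-8 stores each of them as two bytes
  • Continues from the value 256 up to 1,114,111 (U+10FFFF), with more than 100,000 characters assigned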
<meta charset="UTF-8">
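
To illustrate values above 255, the sketch below references three characters by their Unicode code points. The choice of characters and the title are assumptions made for this example only.

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="UTF-8">
  <title>Beyond 255 (placeholder title)</title>
</head>
<body>
  <!-- 937 is the Greek capital omega (U+03A9) -->
  <p>&#937;</p>
  <!-- 20013 is the Chinese character for "middle" (U+4E2D) -->
  <p>&#20013;</p>
  <!-- 128512 is the grinning face emoji (U+1F600) -->
  <p>&#128512;</p>
</body>
</html>
```

With the file saved as UTF-8, the same characters can also be typed directly into the page instead of being written as numeric references.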


HTML UTF-8 Characters

Basic Latin

Latin Extended A

Latin Extended B

Latin Extended C

Latin Extended D

Latin Extended E

IPA Extensions

Spacing Modifiers

Diacritical Marks

General Punctuation

Super and Subscript

Braille
