Updated 2013-07-02 @ 10:00 EDT (UT-4)

Set the pull-down menus to the Unicode page you want to view. US-ASCII occupies the first half of page "000", which is the page displayed when this page is first loaded or reloaded. For each Unicode character, the table shows your browser's rendering of that character. Click on any of the Unicode symbols and it will be displayed in a large size at the top. The enlarged character is already selected, so a simple "copy" operation will put it on the clipboard, ready for pasting into any document that supports Unicode. Within the table, the least significant hex digit runs across the columns, and the next more significant hex digit runs down the rows.

The pull-downs at the upper left of the table set the high-order hexadecimal digits for the code table. For example, with the pull-downs set to their default, and hexadecimal numbers written with the standard "0x" prefix, character 0x00050 is an upper-case "P" and character 0x00051 is an upper-case "Q". With the pull-downs set to 0x026 you can see that character 0x02600 is a filled sun graphic and 0x02602 is an umbrella. "Frosty the Snowman" is character 0x02603.
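The way the pull-down setting combines with a table position can be sketched in Python. This is only an illustration: it assumes each page holds 256 characters (consistent with US-ASCII filling the first half of page "000"), so the pull-downs supply the top three hex digits and the table position supplies the bottom two. The function name `code_point` is hypothetical and not part of this page.

```python
import unicodedata


def code_point(page: int, high_digit: int, low_digit: int) -> int:
    """Combine a three-hex-digit page with the two low hex digits.

    Assumption: a page covers 256 code points, so the page number is
    shifted left by 8 bits and each low digit is in the range 0x0-0xF.
    """
    return (page << 8) | (high_digit << 4) | low_digit


# Default page 0x000, digits 5 and 0 -> character 0x00050
p = code_point(0x000, 0x5, 0x0)
print(f"0x{p:05x}", unicodedata.name(chr(p)))

# Page 0x026, digits 0 and 3 -> character 0x02603
s = code_point(0x026, 0x0, 0x3)
print(f"0x{s:05x}", unicodedata.name(chr(s)))
```

The names are printed with `unicodedata.name` rather than the characters themselves so the output does not depend on the terminal's font or encoding.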

Binary information in modern computers is represented as groups of 8-bit bytes, or "octets" as they are sometimes called. Originally, at least in the US, computers used "US-ASCII" to represent text. Each character was represented by a single byte with a value from 0 to 127. Today the UTF-8 encoding of Unicode is the most widely used text encoding, particularly on the web.

The code numbers of Unicode characters are normally written in the hexadecimal (base 16) number system, which uses the digits 0 through 9 followed by the letters A through F for the values 10 through 15.
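As a small illustration of hexadecimal notation, the following Python sketch converts a code number between base 16 and base 10 using only the built-in `int` and `format` functions. The variable names are chosen for this example.

```python
# Hexadecimal uses the digits 0-9 plus A-F for the values 10-15.
snowman_hex = "2603"
snowman_dec = int(snowman_hex, 16)  # parse the text as a base-16 number
print(snowman_dec)

# Format the number back as five hex digits, in the 0x02603 style.
print("0x" + format(snowman_dec, "05x"))
```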


In UTF-8 all of the old US-ASCII characters are still represented by the same single bytes with the same values, because US-ASCII is a subset of Unicode and UTF-8 encodes those characters unchanged. Other characters, including nearly all characters used in living languages and most of the common "dead" languages, are represented using 2, 3, or 4 bytes. These multi-byte sequences use only bytes with values from 128 to 255, although a few of those values are never used.
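The byte counts described above can be checked with Python's built-in `str.encode`. This is a minimal sketch; the four sample characters are chosen for illustration, and the last one (U+1F600) is not taken from this page but is included to show a 4-byte sequence. `bytes.hex` with a separator argument assumes Python 3.8 or later.

```python
def utf8_bytes(ch: str) -> bytes:
    """Return the UTF-8 encoding of a single character."""
    return ch.encode("utf-8")


samples = ["P", "\u00f1", "\u2603", "\U0001F600"]
for ch in samples:
    encoded = utf8_bytes(ch)
    # Show the code number, the byte count, and the bytes in hex.
    print(f"0x{ord(ch):05x}", len(encoded), encoded.hex(" "))
```

Only the US-ASCII character "P" produces a byte below 128; every byte of the longer sequences is 128 or above.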

UTF-8 is self-synchronizing. That means that a program reading a fragment of a file, even one that starts in the middle of a character, is properly synchronized as soon as it encounters the beginning byte of any UTF-8 character, including a US-ASCII character.
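Self-synchronization can be sketched as follows. In UTF-8, every byte that continues a multi-byte character has the bit pattern 10xxxxxx, so a reader can skip such bytes until it reaches a byte that begins a character. The helper name `next_boundary` is hypothetical, and the sketch assumes the fragment is otherwise valid UTF-8.

```python
def next_boundary(data: bytes, start: int = 0) -> int:
    """Return the index of the first byte at or after `start`
    that begins a character (that is, is not a continuation byte)."""
    i = start
    while i < len(data) and (data[i] & 0xC0) == 0x80:
        i += 1
    return i


text = "a\u00f1\u2603b".encode("utf-8")  # bytes: 61 c3 b1 e2 98 83 62
fragment = text[2:]                      # starts in the middle of the second character
start = next_boundary(fragment)
recovered = fragment[start:].decode("utf-8")
print(start, recovered == "\u2603b")
```

Only the partial character at the start of the fragment is lost; everything after it decodes normally.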

To use UTF-8 on a web page, consult Web Browser Support or just type the "HTML-Equivalent" formulation. For example:
For ñ type &#xf1; (character 0xf1).
For Å type &#xc5; (character 0xc5).
For ∜ type &#x221c; (character 0x221c).
For ⌘ type &#x2318; (character 0x2318).
For ♖ type &#x2656; (character 0x2656).
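An HTML numeric character reference can be generated directly from a character's code number. The sketch below assumes the hexadecimal form `&#x...;` and uses a hypothetical helper named `to_numeric_reference`; `html.unescape` from the Python standard library is used only to confirm that each reference decodes back to the original character.

```python
import html


def to_numeric_reference(ch: str) -> str:
    """Build a hexadecimal HTML numeric character reference."""
    return f"&#x{ord(ch):x};"


# The five example characters: 0xf1, 0xc5, 0x221c, 0x2318, 0x2656
for ch in "\u00f1\u00c5\u221c\u2318\u2656":
    ref = to_numeric_reference(ch)
    # Round-trip check: the reference must decode to the same character.
    assert html.unescape(ref) == ch
    print(ref)
```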

Alternatively, and more directly, you can just copy the large character from the top of the chart and paste it in, assuming that the default character set for the page is UTF-8, as below:
ñ Å ∜ ⌘ ♖

For a good introduction to Unicode and its UTF-8 encoding, start with the Unicode article on Wikipedia.

The actual standards are maintained by The Unicode Consortium at unicode.org.

