Saturday, February 6, 2010

24 Character entity references in HTML 4

24.1 Introduction to character entity references

A character entity reference is an SGML construct that references a character of the document character set.

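This version of HTML supports several sets of character entity references: ISO 8859-1 (Latin-1) characters; symbols, mathematical symbols, and Greek letters; and markup-significant and internationalization characters.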

The following sections present the complete lists of character entity references. Although by [ISO10646] convention the comments following each entry are usually written in uppercase letters, we have converted them to lowercase in this specification for readability.

24.2 Character entity references for ISO 8859-1 characters

The character entity references in this section produce characters whose numeric equivalents should already be supported by conforming HTML 2.0 user agents. Thus, the character entity reference &divide; is a more convenient form than &#247; for obtaining the division sign (÷).
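The equivalence between the named and the numeric form can be checked with a short sketch. It assumes Python's standard `html` module as a stand-in for a user agent; the variable names are illustrative only.

```python
import html
from html.entities import name2codepoint

# Named and numeric references for the division sign (U+00F7, decimal 247).
named = html.unescape("&divide;")
numeric = html.unescape("&#247;")

# Both forms resolve to the same single character.
same_character = named == numeric == "\u00f7"

# The name table maps the entity name to its decimal code point.
divide_codepoint = name2codepoint["divide"]
```

The named form is only a convenience: the numeric reference carries the same information and works without an entity table.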

To support these named entities, user agents need only recognize the entity names and convert them to characters that lie within the repertoire of [ISO88591].
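As a rough illustration of that requirement, the sketch below resolves entity names to code points and encodes each result as a single ISO 8859-1 byte. It assumes that Python's `html.entities.name2codepoint` table reflects the HTML 4 entity sets; `latin1_entities` and `to_latin1_byte` are hypothetical names introduced for this example.

```python
from html.entities import name2codepoint

# Entity names whose code points fall in the upper half of ISO 8859-1 (160..255).
latin1_entities = {
    name: cp for name, cp in name2codepoint.items() if 160 <= cp <= 255
}

def to_latin1_byte(name: str) -> bytes:
    """Resolve an entity name and encode the character as one ISO 8859-1 byte."""
    return chr(latin1_entities[name]).encode("latin-1")

nbsp_byte = to_latin1_byte("nbsp")  # code point 160
yuml_byte = to_latin1_byte("yuml")  # code point 255
```

Every name in this set maps to exactly one byte, which is why a user agent limited to ISO 8859-1 can still support the whole set.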

Character 65533 (FFFD hexadecimal) is the last valid character in UCS-2. 65534 (FFFE hexadecimal) is unassigned and reserved as the byte-swapped version of ZERO WIDTH NON-BREAKING SPACE for byte-order detection purposes. 65535 (FFFF hexadecimal) is unassigned.
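The byte-order remark can be made concrete with a small sketch. It only checks the upper limits stated above and ignores any other restrictions; the constant and function names are assumptions made for illustration.

```python
LAST_VALID_UCS2 = 0xFFFD   # 65533
BYTE_SWAPPED_BOM = 0xFFFE  # 65534, reserved for byte-order detection
UNASSIGNED = 0xFFFF        # 65535

# U+FEFF (ZERO WIDTH NON-BREAKING SPACE) serialized big-endian is FE FF.
bom_big_endian = "\ufeff".encode("utf-16-be")

# A reader that assumes the wrong byte order sees the two bytes swapped.
swapped_value = int.from_bytes(bom_big_endian[::-1], "big")

def within_ucs2_limit(code_point: int) -> bool:
    """Check a numeric reference against the upper limit stated in the text."""
    return 0 <= code_point <= LAST_VALID_UCS2
```

Because 65534 is never a character, reading it at the start of a stream signals that the byte order is reversed.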

24.2.1 The list of characters




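<!ENTITY nbsp CDATA "&#160;" -- no-break space = non-breaking space, U+00A0 ISOnum -->
<!ENTITY iexcl CDATA "&#161;" -- inverted exclamation mark, U+00A1 ISOnum -->
<!ENTITY cent CDATA "&#162;" -- cent sign, U+00A2 ISOnum -->
<!ENTITY pound CDATA "&#163;" -- pound sign, U+00A3 ISOnum -->
<!ENTITY curren CDATA "&#164;" -- currency sign, U+00A4 ISOnum -->
<!ENTITY yen CDATA "&#165;" -- yen sign = yuan sign, U+00A5 ISOnum -->
<!ENTITY brvbar CDATA "&#166;" -- broken bar = broken vertical bar, U+00A6 ISOnum -->
<!ENTITY sect CDATA "&#167;" -- section sign, U+00A7 ISOnum -->
<!ENTITY uml CDATA "&#168;" -- diaeresis = spacing diaeresis, U+00A8 ISOdia -->
<!ENTITY copy CDATA "&#169;" -- copyright sign, U+00A9 ISOnum -->
<!ENTITY ordf CDATA "&#170;" -- feminine ordinal indicator, U+00AA ISOnum -->
<!ENTITY laquo CDATA "&#171;" -- left-pointing double angle quotation mark = left pointing guillemet, U+00AB ISOnum -->
<!ENTITY not CDATA "&#172;" -- not sign, U+00AC ISOnum -->
<!ENTITY shy CDATA "&#173;" -- soft hyphen = discretionary hyphen, U+00AD ISOnum -->
<!ENTITY reg CDATA "&#174;" -- registered sign = registered trade mark sign, U+00AE ISOnum -->
<!ENTITY macr CDATA "&#175;" -- macron = spacing macron = overline = APL overbar, U+00AF ISOdia -->
<!ENTITY deg CDATA "&#176;" -- degree sign, U+00B0 ISOnum -->
<!ENTITY plusmn CDATA "&#177;" -- plus-minus sign = plus-or-minus sign, U+00B1 ISOnum -->
<!ENTITY sup2 CDATA "&#178;" -- superscript two = superscript digit two = squared, U+00B2 ISOnum -->
<!ENTITY sup3 CDATA "&#179;" -- superscript three = superscript digit three = cubed, U+00B3 ISOnum -->
<!ENTITY acute CDATA "&#180;" -- acute accent = spacing acute, U+00B4 ISOdia -->
<!ENTITY micro CDATA "&#181;" -- micro sign, U+00B5 ISOnum -->
<!ENTITY para CDATA "&#182;" -- pilcrow sign = paragraph sign, U+00B6 ISOnum -->
<!ENTITY middot CDATA "&#183;" -- middle dot = Georgian comma = Greek middle dot, U+00B7 ISOnum -->
<!ENTITY cedil CDATA "&#184;" -- cedilla = spacing cedilla, U+00B8 ISOdia -->
<!ENTITY sup1 CDATA "&#185;" -- superscript one = superscript digit one, U+00B9 ISOnum -->
<!ENTITY ordm CDATA "&#186;" -- masculine ordinal indicator, U+00BA ISOnum -->
<!ENTITY raquo CDATA "&#187;" -- right-pointing double angle quotation mark = right pointing guillemet, U+00BB ISOnum -->
<!ENTITY frac14 CDATA "&#188;" -- vulgar fraction one quarter = fraction one quarter, U+00BC ISOnum -->
<!ENTITY frac12 CDATA "&#189;" -- vulgar fraction one half = fraction one half, U+00BD ISOnum -->
<!ENTITY frac34 CDATA "&#190;" -- vulgar fraction three quarters = fraction three quarters, U+00BE ISOnum -->
<!ENTITY iquest CDATA "&#191;" -- inverted question mark = turned question mark, U+00BF ISOnum -->
<!ENTITY Agrave CDATA "&#192;" -- latin capital letter A with grave = latin capital letter A grave, U+00C0 ISOlat1 -->
<!ENTITY Aacute CDATA "&#193;" -- latin capital letter A with acute, U+00C1 ISOlat1 -->
<!ENTITY Acirc CDATA "&#194;" -- latin capital letter A with circumflex, U+00C2 ISOlat1 -->
<!ENTITY Atilde CDATA "&#195;" -- latin capital letter A with tilde, U+00C3 ISOlat1 -->
<!ENTITY Auml CDATA "&#196;" -- latin capital letter A with diaeresis, U+00C4 ISOlat1 -->
<!ENTITY Aring CDATA "&#197;" -- latin capital letter A with ring above = latin capital letter A ring, U+00C5 ISOlat1 -->
<!ENTITY AElig CDATA "&#198;" -- latin capital letter AE = latin capital ligature AE, U+00C6 ISOlat1 -->
<!ENTITY Ccedil CDATA "&#199;" -- latin capital letter C with cedilla, U+00C7 ISOlat1 -->
<!ENTITY Egrave CDATA "&#200;" -- latin capital letter E with grave, U+00C8 ISOlat1 -->
<!ENTITY Eacute CDATA "&#201;" -- latin capital letter E with acute, U+00C9 ISOlat1 -->
<!ENTITY Ecirc CDATA "&#202;" -- latin capital letter E with circumflex, U+00CA ISOlat1 -->
<!ENTITY Euml CDATA "&#203;" -- latin capital letter E with diaeresis, U+00CB ISOlat1 -->
<!ENTITY Igrave CDATA "&#204;" -- latin capital letter I with grave, U+00CC ISOlat1 -->
<!ENTITY Iacute CDATA "&#205;" -- latin capital letter I with acute, U+00CD ISOlat1 -->
<!ENTITY Icirc CDATA "&#206;" -- latin capital letter I with circumflex, U+00CE ISOlat1 -->
<!ENTITY Iuml CDATA "&#207;" -- latin capital letter I with diaeresis, U+00CF ISOlat1 -->
<!ENTITY ETH CDATA "&#208;" -- latin capital letter ETH, U+00D0 ISOlat1 -->
<!ENTITY Ntilde CDATA "&#209;" -- latin capital letter N with tilde, U+00D1 ISOlat1 -->
<!ENTITY Ograve CDATA "&#210;" -- latin capital letter O with grave, U+00D2 ISOlat1 -->
<!ENTITY Oacute CDATA "&#211;" -- latin capital letter O with acute, U+00D3 ISOlat1 -->
<!ENTITY Ocirc CDATA "&#212;" -- latin capital letter O with circumflex, U+00D4 ISOlat1 -->
<!ENTITY Otilde CDATA "&#213;" -- latin capital letter O with tilde, U+00D5 ISOlat1 -->
<!ENTITY Ouml CDATA "&#214;" -- latin capital letter O with diaeresis, U+00D6 ISOlat1 -->
<!ENTITY times CDATA "&#215;" -- multiplication sign, U+00D7 ISOnum -->
<!ENTITY Oslash CDATA "&#216;" -- latin capital letter O with stroke = latin capital letter O slash, U+00D8 ISOlat1 -->
<!ENTITY Ugrave CDATA "&#217;" -- latin capital letter U with grave, U+00D9 ISOlat1 -->
<!ENTITY Uacute CDATA "&#218;" -- latin capital letter U with acute, U+00DA ISOlat1 -->
<!ENTITY Ucirc CDATA "&#219;" -- latin capital letter U with circumflex, U+00DB ISOlat1 -->
<!ENTITY Uuml CDATA "&#220;" -- latin capital letter U with diaeresis, U+00DC ISOlat1 -->
<!ENTITY Yacute CDATA "&#221;" -- latin capital letter Y with acute, U+00DD ISOlat1 -->
<!ENTITY THORN CDATA "&#222;" -- latin capital letter THORN, U+00DE ISOlat1 -->
<!ENTITY szlig CDATA "&#223;" -- latin small letter sharp s = ess-zed, U+00DF ISOlat1 -->
<!ENTITY agrave CDATA "&#224;" -- latin small letter a with grave = latin small letter a grave, U+00E0 ISOlat1 -->
<!ENTITY aacute CDATA "&#225;" -- latin small letter a with acute, U+00E1 ISOlat1 -->
<!ENTITY acirc CDATA "&#226;" -- latin small letter a with circumflex, U+00E2 ISOlat1 -->
<!ENTITY atilde CDATA "&#227;" -- latin small letter a with tilde, U+00E3 ISOlat1 -->
<!ENTITY auml CDATA "&#228;" -- latin small letter a with diaeresis, U+00E4 ISOlat1 -->
<!ENTITY aring CDATA "&#229;" -- latin small letter a with ring above = latin small letter a ring, U+00E5 ISOlat1 -->
<!ENTITY aelig CDATA "&#230;" -- latin small letter ae = latin small ligature ae, U+00E6 ISOlat1 -->
<!ENTITY ccedil CDATA "&#231;" -- latin small letter c with cedilla, U+00E7 ISOlat1 -->
<!ENTITY egrave CDATA "&#232;" -- latin small letter e with grave, U+00E8 ISOlat1 -->
<!ENTITY eacute CDATA "&#233;" -- latin small letter e with acute, U+00E9 ISOlat1 -->
<!ENTITY ecirc CDATA "&#234;" -- latin small letter e with circumflex, U+00EA ISOlat1 -->
<!ENTITY euml CDATA "&#235;" -- latin small letter e with diaeresis, U+00EB ISOlat1 -->
<!ENTITY igrave CDATA "&#236;" -- latin small letter i with grave, U+00EC ISOlat1 -->
<!ENTITY iacute CDATA "&#237;" -- latin small letter i with acute, U+00ED ISOlat1 -->
<!ENTITY icirc CDATA "&#238;" -- latin small letter i with circumflex, U+00EE ISOlat1 -->
<!ENTITY iuml CDATA "&#239;" -- latin small letter i with diaeresis, U+00EF ISOlat1 -->
<!ENTITY eth CDATA "&#240;" -- latin small letter eth, U+00F0 ISOlat1 -->
<!ENTITY ntilde CDATA "&#241;" -- latin small letter n with tilde, U+00F1 ISOlat1 -->
<!ENTITY ograve CDATA "&#242;" -- latin small letter o with grave, U+00F2 ISOlat1 -->
<!ENTITY oacute CDATA "&#243;" -- latin small letter o with acute, U+00F3 ISOlat1 -->
<!ENTITY ocirc CDATA "&#244;" -- latin small letter o with circumflex, U+00F4 ISOlat1 -->
<!ENTITY otilde CDATA "&#245;" -- latin small letter o with tilde, U+00F5 ISOlat1 -->
<!ENTITY ouml CDATA "&#246;" -- latin small letter o with diaeresis, U+00F6 ISOlat1 -->
<!ENTITY divide CDATA "&#247;" -- division sign, U+00F7 ISOnum -->
<!ENTITY oslash CDATA "&#248;" -- latin small letter o with stroke = latin small letter o slash, U+00F8 ISOlat1 -->
<!ENTITY ugrave CDATA "&#249;" -- latin small letter u with grave, U+00F9 ISOlat1 -->
<!ENTITY uacute CDATA "&#250;" -- latin small letter u with acute, U+00FA ISOlat1 -->
<!ENTITY ucirc CDATA "&#251;" -- latin small letter u with circumflex, U+00FB ISOlat1 -->
<!ENTITY uuml CDATA "&#252;" -- latin small letter u with diaeresis, U+00FC ISOlat1 -->
<!ENTITY yacute CDATA "&#253;" -- latin small letter y with acute, U+00FD ISOlat1 -->
<!ENTITY thorn CDATA "&#254;" -- latin small letter thorn, U+00FE ISOlat1 -->
<!ENTITY yuml CDATA "&#255;" -- latin small letter y with diaeresis, U+00FF ISOlat1 -->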

24.3 Character entity references for symbols, mathematical symbols, and Greek letters

The character entity references in this section produce characters that may be represented by glyphs in the widely available Adobe Symbol font, including Greek characters, various bracketing symbols, and a selection of mathematical operators such as gradient, product, and summation symbols.

To support these entities, user agents may support full [ISO10646] or use other means. Display of glyphs for these characters may be obtained by being able to display the relevant [ISO10646] characters or by other means, such as internally mapping the listed entities, numeric character references, and characters to the appropriate position in some font that contains the requisite glyphs.

When to use Greek entities. This entity set contains all the letters used in modern Greek. However, it does not include Greek punctuation, precomposed accented characters nor the non-spacing accents (tonos, dialytika) required to compose them. There are no archaic letters, Coptic-unique letters, or precomposed letters for Polytonic Greek. The entities defined here are not intended for the representation of modern Greek text and would not be an efficient representation; rather, they are intended for occasional Greek letters used in technical and mathematical works.
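The limits of the Greek set can be probed with a short sketch. It assumes that Python's `html.entities.codepoint2name` table mirrors the HTML 4 entity sets; `entity_name` is a hypothetical helper written for this example.

```python
from html.entities import codepoint2name

def entity_name(character: str):
    """Return the HTML 4 entity name for a character, or None if there is none."""
    return codepoint2name.get(ord(character))

alpha_name = entity_name("\u03b1")        # greek small letter alpha
final_sigma_name = entity_name("\u03c2")  # greek small letter final sigma
# Precomposed accented letter: greek small letter alpha with tonos.
alpha_tonos_name = entity_name("\u03ac")
# Greek punctuation: greek ano teleia.
ano_teleia_name = entity_name("\u0387")
```

Plain letters have names, while the accented letter and the punctuation mark do not, so running Greek text would have to fall back on numeric references or direct encoding.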

24.3.1 The list of characters










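<!-- Latin Extended-B -->
<!ENTITY fnof CDATA "&#402;" -- latin small f with hook = function = florin, U+0192 ISOtech -->

<!-- Greek -->
<!ENTITY Alpha CDATA "&#913;" -- greek capital letter alpha, U+0391 -->
<!ENTITY Beta CDATA "&#914;" -- greek capital letter beta, U+0392 -->
<!ENTITY Gamma CDATA "&#915;" -- greek capital letter gamma, U+0393 ISOgrk3 -->
<!ENTITY Delta CDATA "&#916;" -- greek capital letter delta, U+0394 ISOgrk3 -->
<!ENTITY Epsilon CDATA "&#917;" -- greek capital letter epsilon, U+0395 -->
<!ENTITY Zeta CDATA "&#918;" -- greek capital letter zeta, U+0396 -->
<!ENTITY Eta CDATA "&#919;" -- greek capital letter eta, U+0397 -->
<!ENTITY Theta CDATA "&#920;" -- greek capital letter theta, U+0398 ISOgrk3 -->
<!ENTITY Iota CDATA "&#921;" -- greek capital letter iota, U+0399 -->
<!ENTITY Kappa CDATA "&#922;" -- greek capital letter kappa, U+039A -->
<!ENTITY Lambda CDATA "&#923;" -- greek capital letter lambda, U+039B ISOgrk3 -->
<!ENTITY Mu CDATA "&#924;" -- greek capital letter mu, U+039C -->
<!ENTITY Nu CDATA "&#925;" -- greek capital letter nu, U+039D -->
<!ENTITY Xi CDATA "&#926;" -- greek capital letter xi, U+039E ISOgrk3 -->
<!ENTITY Omicron CDATA "&#927;" -- greek capital letter omicron, U+039F -->
<!ENTITY Pi CDATA "&#928;" -- greek capital letter pi, U+03A0 ISOgrk3 -->
<!ENTITY Rho CDATA "&#929;" -- greek capital letter rho, U+03A1 -->
<!-- there is no Sigmaf, and no U+03A2 character either -->
<!ENTITY Sigma CDATA "&#931;" -- greek capital letter sigma, U+03A3 ISOgrk3 -->
<!ENTITY Tau CDATA "&#932;" -- greek capital letter tau, U+03A4 -->
<!ENTITY Upsilon CDATA "&#933;" -- greek capital letter upsilon, U+03A5 ISOgrk3 -->
<!ENTITY Phi CDATA "&#934;" -- greek capital letter phi, U+03A6 ISOgrk3 -->
<!ENTITY Chi CDATA "&#935;" -- greek capital letter chi, U+03A7 -->
<!ENTITY Psi CDATA "&#936;" -- greek capital letter psi, U+03A8 ISOgrk3 -->
<!ENTITY Omega CDATA "&#937;" -- greek capital letter omega, U+03A9 ISOgrk3 -->
<!ENTITY alpha CDATA "&#945;" -- greek small letter alpha, U+03B1 ISOgrk3 -->
<!ENTITY beta CDATA "&#946;" -- greek small letter beta, U+03B2 ISOgrk3 -->
<!ENTITY gamma CDATA "&#947;" -- greek small letter gamma, U+03B3 ISOgrk3 -->
<!ENTITY delta CDATA "&#948;" -- greek small letter delta, U+03B4 ISOgrk3 -->
<!ENTITY epsilon CDATA "&#949;" -- greek small letter epsilon, U+03B5 ISOgrk3 -->
<!ENTITY zeta CDATA "&#950;" -- greek small letter zeta, U+03B6 ISOgrk3 -->
<!ENTITY eta CDATA "&#951;" -- greek small letter eta, U+03B7 ISOgrk3 -->
<!ENTITY theta CDATA "&#952;" -- greek small letter theta, U+03B8 ISOgrk3 -->
<!ENTITY iota CDATA "&#953;" -- greek small letter iota, U+03B9 ISOgrk3 -->
<!ENTITY kappa CDATA "&#954;" -- greek small letter kappa, U+03BA ISOgrk3 -->
<!ENTITY lambda CDATA "&#955;" -- greek small letter lambda, U+03BB ISOgrk3 -->
<!ENTITY mu CDATA "&#956;" -- greek small letter mu, U+03BC ISOgrk3 -->
<!ENTITY nu CDATA "&#957;" -- greek small letter nu, U+03BD ISOgrk3 -->
<!ENTITY xi CDATA "&#958;" -- greek small letter xi, U+03BE ISOgrk3 -->
<!ENTITY omicron CDATA "&#959;" -- greek small letter omicron, U+03BF NEW -->
<!ENTITY pi CDATA "&#960;" -- greek small letter pi, U+03C0 ISOgrk3 -->
<!ENTITY rho CDATA "&#961;" -- greek small letter rho, U+03C1 ISOgrk3 -->
<!ENTITY sigmaf CDATA "&#962;" -- greek small letter final sigma, U+03C2 ISOgrk3 -->
<!ENTITY sigma CDATA "&#963;" -- greek small letter sigma, U+03C3 ISOgrk3 -->
<!ENTITY tau CDATA "&#964;" -- greek small letter tau, U+03C4 ISOgrk3 -->
<!ENTITY upsilon CDATA "&#965;" -- greek small letter upsilon, U+03C5 ISOgrk3 -->
<!ENTITY phi CDATA "&#966;" -- greek small letter phi, U+03C6 ISOgrk3 -->
<!ENTITY chi CDATA "&#967;" -- greek small letter chi, U+03C7 ISOgrk3 -->
<!ENTITY psi CDATA "&#968;" -- greek small letter psi, U+03C8 ISOgrk3 -->
<!ENTITY omega CDATA "&#969;" -- greek small letter omega, U+03C9 ISOgrk3 -->
<!ENTITY thetasym CDATA "&#977;" -- greek small letter theta symbol, U+03D1 NEW -->
<!ENTITY upsih CDATA "&#978;" -- greek upsilon with hook symbol, U+03D2 NEW -->
<!ENTITY piv CDATA "&#982;" -- greek pi symbol, U+03D6 ISOgrk3 -->

<!-- General Punctuation -->
<!ENTITY bull CDATA "&#8226;" -- bullet = black small circle, U+2022 ISOpub -->
<!ENTITY hellip CDATA "&#8230;" -- horizontal ellipsis = three dot leader, U+2026 ISOpub -->
<!ENTITY prime CDATA "&#8242;" -- prime = minutes = feet, U+2032 ISOtech -->
<!ENTITY Prime CDATA "&#8243;" -- double prime = seconds = inches, U+2033 ISOtech -->
<!ENTITY oline CDATA "&#8254;" -- overline = spacing overscore, U+203E NEW -->
<!ENTITY frasl CDATA "&#8260;" -- fraction slash, U+2044 NEW -->

<!-- Letterlike Symbols -->
<!ENTITY weierp CDATA "&#8472;" -- script capital P = power set = Weierstrass p, U+2118 ISOamso -->
<!ENTITY image CDATA "&#8465;" -- blackletter capital I = imaginary part, U+2111 ISOamso -->
<!ENTITY real CDATA "&#8476;" -- blackletter capital R = real part symbol, U+211C ISOamso -->
<!ENTITY trade CDATA "&#8482;" -- trade mark sign, U+2122 ISOnum -->
<!ENTITY alefsym CDATA "&#8501;" -- alef symbol = first transfinite cardinal, U+2135 NEW -->

<!-- Arrows -->
<!ENTITY larr CDATA "&#8592;" -- leftwards arrow, U+2190 ISOnum -->
<!ENTITY uarr CDATA "&#8593;" -- upwards arrow, U+2191 ISOnum -->
<!ENTITY rarr CDATA "&#8594;" -- rightwards arrow, U+2192 ISOnum -->
<!ENTITY darr CDATA "&#8595;" -- downwards arrow, U+2193 ISOnum -->
<!ENTITY harr CDATA "&#8596;" -- left right arrow, U+2194 ISOamsa -->
<!ENTITY crarr CDATA "&#8629;" -- downwards arrow with corner leftwards = carriage return, U+21B5 NEW -->
<!ENTITY lArr CDATA "&#8656;" -- leftwards double arrow, U+21D0 ISOtech -->
<!ENTITY uArr CDATA "&#8657;" -- upwards double arrow, U+21D1 ISOamsa -->
<!ENTITY rArr CDATA "&#8658;" -- rightwards double arrow, U+21D2 ISOtech -->
<!ENTITY dArr CDATA "&#8659;" -- downwards double arrow, U+21D3 ISOamsa -->
<!ENTITY hArr CDATA "&#8660;" -- left right double arrow, U+21D4 ISOamsa -->

<!-- Mathematical Operators -->
<!ENTITY forall CDATA "&#8704;" -- for all, U+2200 ISOtech -->
<!ENTITY part CDATA "&#8706;" -- partial differential, U+2202 ISOtech -->
<!ENTITY exist CDATA "&#8707;" -- there exists, U+2203 ISOtech -->
<!ENTITY empty CDATA "&#8709;" -- empty set = null set = diameter, U+2205 ISOamso -->
<!ENTITY nabla CDATA "&#8711;" -- nabla = backward difference, U+2207 ISOtech -->
<!ENTITY isin CDATA "&#8712;" -- element of, U+2208 ISOtech -->
<!ENTITY notin CDATA "&#8713;" -- not an element of, U+2209 ISOtech -->
<!ENTITY ni CDATA "&#8715;" -- contains as member, U+220B ISOtech -->
<!ENTITY prod CDATA "&#8719;" -- n-ary product = product sign, U+220F ISOamsb -->
<!ENTITY sum CDATA "&#8721;" -- n-ary summation, U+2211 ISOamsb -->
<!ENTITY minus CDATA "&#8722;" -- minus sign, U+2212 ISOtech -->
<!ENTITY lowast CDATA "&#8727;" -- asterisk operator, U+2217 ISOtech -->
<!ENTITY radic CDATA "&#8730;" -- square root = radical sign, U+221A ISOtech -->
<!ENTITY prop CDATA "&#8733;" -- proportional to, U+221D ISOtech -->
<!ENTITY infin CDATA "&#8734;" -- infinity, U+221E ISOtech -->
<!ENTITY ang CDATA "&#8736;" -- angle, U+2220 ISOamso -->
<!ENTITY and CDATA "&#8743;" -- logical and = wedge, U+2227 ISOtech -->
<!ENTITY or CDATA "&#8744;" -- logical or = vee, U+2228 ISOtech -->
<!ENTITY cap CDATA "&#8745;" -- intersection = cap, U+2229 ISOtech -->
<!ENTITY cup CDATA "&#8746;" -- union = cup, U+222A ISOtech -->
<!ENTITY int CDATA "&#8747;" -- integral, U+222B ISOtech -->
<!ENTITY there4 CDATA "&#8756;" -- therefore, U+2234 ISOtech -->
<!ENTITY sim CDATA "&#8764;" -- tilde operator = varies with = similar to, U+223C ISOtech -->
<!ENTITY cong CDATA "&#8773;" -- approximately equal to, U+2245 ISOtech -->
<!ENTITY asymp CDATA "&#8776;" -- almost equal to = asymptotic to, U+2248 ISOamsr -->
<!ENTITY ne CDATA "&#8800;" -- not equal to, U+2260 ISOtech -->
<!ENTITY equiv CDATA "&#8801;" -- identical to, U+2261 ISOtech -->
<!ENTITY le CDATA "&#8804;" -- less-than or equal to, U+2264 ISOtech -->
<!ENTITY ge CDATA "&#8805;" -- greater-than or equal to, U+2265 ISOtech -->
<!ENTITY sub CDATA "&#8834;" -- subset of, U+2282 ISOtech -->
<!ENTITY sup CDATA "&#8835;" -- superset of, U+2283 ISOtech -->
<!ENTITY nsub CDATA "&#8836;" -- not a subset of, U+2284 ISOamsn -->
<!ENTITY sube CDATA "&#8838;" -- subset of or equal to, U+2286 ISOtech -->
<!ENTITY supe CDATA "&#8839;" -- superset of or equal to, U+2287 ISOtech -->
<!ENTITY oplus CDATA "&#8853;" -- circled plus = direct sum, U+2295 ISOamsb -->
<!ENTITY otimes CDATA "&#8855;" -- circled times = vector product, U+2297 ISOamsb -->
<!ENTITY perp CDATA "&#8869;" -- up tack = orthogonal to = perpendicular, U+22A5 ISOtech -->
<!ENTITY sdot CDATA "&#8901;" -- dot operator, U+22C5 ISOamsb -->

<!-- Miscellaneous Technical -->
<!ENTITY lceil CDATA "&#8968;" -- left ceiling = apl upstile, U+2308 ISOamsc -->
<!ENTITY rceil CDATA "&#8969;" -- right ceiling, U+2309 ISOamsc -->
<!ENTITY lfloor CDATA "&#8970;" -- left floor = apl downstile, U+230A ISOamsc -->
<!ENTITY rfloor CDATA "&#8971;" -- right floor, U+230B ISOamsc -->
<!ENTITY lang CDATA "&#9001;" -- left-pointing angle bracket = bra, U+2329 ISOtech -->
<!ENTITY rang CDATA "&#9002;" -- right-pointing angle bracket = ket, U+232A ISOtech -->

<!-- Geometric Shapes -->
<!ENTITY loz CDATA "&#9674;" -- lozenge, U+25CA ISOpub -->

<!-- Miscellaneous Symbols -->
<!ENTITY spades CDATA "&#9824;" -- black spade suit, U+2660 ISOpub -->
<!ENTITY clubs CDATA "&#9827;" -- black club suit = shamrock, U+2663 ISOpub -->
<!ENTITY hearts CDATA "&#9829;" -- black heart suit = valentine, U+2665 ISOpub -->
<!ENTITY diams CDATA "&#9830;" -- black diamond suit, U+2666 ISOpub -->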

24.4 Character entity references for markup-significant and internationalization characters

The character entity references in this section are for escaping markup-significant characters (the same as those in HTML 2.0 and 3.2) and for denoting spaces and dashes. Other characters in this section apply to internationalization issues such as the disambiguation of bidirectional text (see the section on bidirectional text for details).
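Escaping markup-significant characters might look like the following sketch, which assumes Python's standard `html` module rather than any particular user agent or authoring tool. The sample strings are made up for illustration.

```python
import html

# A string containing all four markup-significant characters: < > & "
raw = '<a href="x">Terms & conditions</a>'

# Replace them with the entity references &lt; &gt; &amp; &quot;
escaped = html.escape(raw, quote=True)

# Resolving the references restores the original text.
round_trip = html.unescape(escaped)

# A dash from the same entity set, resolved to its character (U+2013).
year_range = html.unescape("1990&ndash;1999")
```

Escaping is needed when these characters are meant as text rather than markup; resolving the references afterwards yields the original characters again.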

Entities have also been added for the remaining characters occurring in CP-1252 which do not occur in the HTMLlat1 or HTMLsymbol entity sets. These all occur in the 128 to 159 range within the CP-1252 charset. These entities permit the characters to be denoted in a platform-independent manner.
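The platform-independence point can be sketched as a lookup from a CP-1252 byte to its code point and entity name. This assumes that Python's `cp1252` codec and its `html.entities.codepoint2name` table are acceptable stand-ins for the code page and entity sets discussed here; `cp1252_entity` is a hypothetical helper.

```python
from html.entities import codepoint2name

def cp1252_entity(byte_value: int):
    """Map one CP-1252 byte to (code point, HTML 4 entity name or None)."""
    try:
        character = bytes([byte_value]).decode("cp1252")
    except UnicodeDecodeError:
        return None  # the codec leaves this byte unassigned
    code_point = ord(character)
    return code_point, codepoint2name.get(code_point)

euro = cp1252_entity(0x80)
left_quote = cp1252_entity(0x93)
en_dash = cp1252_entity(0x96)
unassigned = cp1252_entity(0x81)
```

A document that writes the entity name instead of the raw byte does not depend on the reader's platform interpreting bytes 128 to 159 as CP-1252.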

To support these entities, user agents may support full [ISO10646] or use other means. Display of glyphs for these characters may be obtained by being able to display the relevant [ISO10646] characters or by other means, such as internally mapping the listed entities, numeric character references, and characters to the appropriate position in some font that contains the requisite glyphs.

24.4.1 The list of characters










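<!-- C0 Controls and Basic Latin -->
<!ENTITY quot CDATA "&#34;" -- quotation mark = APL quote, U+0022 ISOnum -->
<!ENTITY amp CDATA "&#38;" -- ampersand, U+0026 ISOnum -->
<!ENTITY lt CDATA "&#60;" -- less-than sign, U+003C ISOnum -->
<!ENTITY gt CDATA "&#62;" -- greater-than sign, U+003E ISOnum -->

<!-- Latin Extended-A -->
<!ENTITY OElig CDATA "&#338;" -- latin capital ligature OE, U+0152 ISOlat2 -->
<!ENTITY oelig CDATA "&#339;" -- latin small ligature oe, U+0153 ISOlat2 -->
<!ENTITY Scaron CDATA "&#352;" -- latin capital letter S with caron, U+0160 ISOlat2 -->
<!ENTITY scaron CDATA "&#353;" -- latin small letter s with caron, U+0161 ISOlat2 -->
<!ENTITY Yuml CDATA "&#376;" -- latin capital letter Y with diaeresis, U+0178 ISOlat2 -->

<!-- Spacing Modifier Letters -->
<!ENTITY circ CDATA "&#710;" -- modifier letter circumflex accent, U+02C6 ISOpub -->
<!ENTITY tilde CDATA "&#732;" -- small tilde, U+02DC ISOdia -->

<!-- General Punctuation -->
<!ENTITY ensp CDATA "&#8194;" -- en space, U+2002 ISOpub -->
<!ENTITY emsp CDATA "&#8195;" -- em space, U+2003 ISOpub -->
<!ENTITY thinsp CDATA "&#8201;" -- thin space, U+2009 ISOpub -->
<!ENTITY zwnj CDATA "&#8204;" -- zero width non-joiner, U+200C NEW RFC 2070 -->
<!ENTITY zwj CDATA "&#8205;" -- zero width joiner, U+200D NEW RFC 2070 -->
<!ENTITY lrm CDATA "&#8206;" -- left-to-right mark, U+200E NEW RFC 2070 -->
<!ENTITY rlm CDATA "&#8207;" -- right-to-left mark, U+200F NEW RFC 2070 -->
<!ENTITY ndash CDATA "&#8211;" -- en dash, U+2013 ISOpub -->
<!ENTITY mdash CDATA "&#8212;" -- em dash, U+2014 ISOpub -->
<!ENTITY lsquo CDATA "&#8216;" -- left single quotation mark, U+2018 ISOnum -->
<!ENTITY rsquo CDATA "&#8217;" -- right single quotation mark, U+2019 ISOnum -->
<!ENTITY sbquo CDATA "&#8218;" -- single low-9 quotation mark, U+201A NEW -->
<!ENTITY ldquo CDATA "&#8220;" -- left double quotation mark, U+201C ISOnum -->
<!ENTITY rdquo CDATA "&#8221;" -- right double quotation mark, U+201D ISOnum -->
<!ENTITY bdquo CDATA "&#8222;" -- double low-9 quotation mark, U+201E NEW -->
<!ENTITY dagger CDATA "&#8224;" -- dagger, U+2020 ISOpub -->
<!ENTITY Dagger CDATA "&#8225;" -- double dagger, U+2021 ISOpub -->
<!ENTITY permil CDATA "&#8240;" -- per mille sign, U+2030 ISOtech -->
<!ENTITY lsaquo CDATA "&#8249;" -- single left-pointing angle quotation mark, U+2039 ISO proposed -->
<!ENTITY rsaquo CDATA "&#8250;" -- single right-pointing angle quotation mark, U+203A ISO proposed -->
<!ENTITY euro CDATA "&#8364;" -- euro sign, U+20AC NEW -->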

