Thursday, May 27, 2010

HTML ASCII Reference - The ASCII Character Set

The ASCII character-set is used to send information between computers on the Internet.


The ASCII Character Set

ASCII stands for the "American Standard Code for Information Interchange". It was designed in the early 1960s as a standard character-set for computers and hardware devices such as teleprinters and tape drives.

ASCII is a 7-bit character set containing 128 characters.

It contains the digits 0 to 9, the uppercase and lowercase English letters from A to Z, and some special characters.

The character-sets used in modern computers, in HTML, and on the Internet are all based on ASCII.
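As a minimal sketch of what "7-bit" and "based on ASCII" mean in practice, the Python snippet below checks whether a string stays within the 128 ASCII codes and shows that, for pure ASCII text, the UTF-8 bytes are the ASCII codes themselves. The helper name `is_ascii` and the sample strings are illustrative assumptions, not part of the original reference.

```python
def is_ascii(text: str) -> bool:
    """Return True if every character has a code below 128 (fits in 7 bits)."""
    return all(ord(ch) < 128 for ch in text)


sample = "Hello, ASCII!"

# For pure ASCII text, the UTF-8 bytes are identical to the ASCII codes.
utf8_bytes = list(sample.encode("utf-8"))
ascii_codes = [ord(ch) for ch in sample]

print(is_ascii(sample))            # True
print(utf8_bytes == ascii_codes)   # True
print(is_ascii("caf\u00e9"))       # False: U+00E9 is outside ASCII
```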

The following table lists the 128 ASCII characters and their equivalent HTML entity codes.


ASCII Printable Characters

ASCII Character  HTML Entity Code  Description
(space)          &#32;             space
!                &#33;             exclamation mark
"                &#34;             quotation mark
#                &#35;             number sign
$                &#36;             dollar sign
%                &#37;             percent sign
&                &#38;             ampersand
'                &#39;             apostrophe
(                &#40;             left parenthesis
)                &#41;             right parenthesis
*                &#42;             asterisk
+                &#43;             plus sign
,                &#44;             comma
-                &#45;             hyphen
.                &#46;             period
/                &#47;             slash
0                &#48;             digit 0
1                &#49;             digit 1
2                &#50;             digit 2
3                &#51;             digit 3
4                &#52;             digit 4
5                &#53;             digit 5
6                &#54;             digit 6
7                &#55;             digit 7
8                &#56;             digit 8
9                &#57;             digit 9
:                &#58;             colon
;                &#59;             semicolon
<                &#60;             less-than
=                &#61;             equals-to
>                &#62;             greater-than
?                &#63;             question mark
@                &#64;             at sign
A                &#65;             uppercase A
B                &#66;             uppercase B
C                &#67;             uppercase C
D                &#68;             uppercase D
E                &#69;             uppercase E
F                &#70;             uppercase F
G                &#71;             uppercase G
H                &#72;             uppercase H
I                &#73;             uppercase I
J                &#74;             uppercase J
K                &#75;             uppercase K
L                &#76;             uppercase L
M                &#77;             uppercase M
N                &#78;             uppercase N
O                &#79;             uppercase O
P                &#80;             uppercase P
Q                &#81;             uppercase Q
R                &#82;             uppercase R
S                &#83;             uppercase S
T                &#84;             uppercase T
U                &#85;             uppercase U
V                &#86;             uppercase V
W                &#87;             uppercase W
X                &#88;             uppercase X
Y                &#89;             uppercase Y
Z                &#90;             uppercase Z
[                &#91;             left square bracket
\                &#92;             backslash
]                &#93;             right square bracket
^                &#94;             caret
_                &#95;             underscore
`                &#96;             grave accent
a                &#97;             lowercase a
b                &#98;             lowercase b
c                &#99;             lowercase c
d                &#100;            lowercase d
e                &#101;            lowercase e
f                &#102;            lowercase f
g                &#103;            lowercase g
h                &#104;            lowercase h
i                &#105;            lowercase i
j                &#106;            lowercase j
k                &#107;            lowercase k
l                &#108;            lowercase l
m                &#109;            lowercase m
n                &#110;            lowercase n
o                &#111;            lowercase o
p                &#112;            lowercase p
q                &#113;            lowercase q
r                &#114;            lowercase r
s                &#115;            lowercase s
t                &#116;            lowercase t
u                &#117;            lowercase u
v                &#118;            lowercase v
w                &#119;            lowercase w
x                &#120;            lowercase x
y                &#121;            lowercase y
z                &#122;            lowercase z
{                &#123;            left curly brace
|                &#124;            vertical bar
}                &#125;            right curly brace
~                &#126;            tilde
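The entity code of a printable ASCII character is simply its decimal ASCII code written between `&#` and `;`. The Python sketch below rebuilds that mapping for codes 32 through 126. The function name `entity_code` and the `rows` list are illustrative assumptions; `html.unescape` is the standard-library call that converts an entity code back to its character.

```python
import html


def entity_code(ch: str) -> str:
    """Return the decimal HTML entity code for a single character."""
    return f"&#{ord(ch)};"


# Printable ASCII runs from code 32 (space) to code 126 (tilde).
rows = [(chr(code), entity_code(chr(code))) for code in range(32, 127)]

# Show the first few rows of the mapping.
for ch, code in rows[:3]:
    print(repr(ch), code)

# html.unescape reverses the mapping.
print(html.unescape("&#65;"))  # A
```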


ASCII Device Control Characters

The ASCII device control characters were originally designed to control hardware devices.

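Most control characters have no use inside an HTML document. The main exceptions are horizontal tab, line feed, and carriage return, which browsers treat as whitespace.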

ASCII Character  HTML Entity Code  Description
NUL              &#00;             null character
SOH              &#01;             start of header
STX              &#02;             start of text
ETX              &#03;             end of text
EOT              &#04;             end of transmission
ENQ              &#05;             enquiry
ACK              &#06;             acknowledge
BEL              &#07;             bell (ring)
BS               &#08;             backspace
HT               &#09;             horizontal tab
LF               &#10;             line feed
VT               &#11;             vertical tab
FF               &#12;             form feed
CR               &#13;             carriage return
SO               &#14;             shift out
SI               &#15;             shift in
DLE              &#16;             data link escape
DC1              &#17;             device control 1
DC2              &#18;             device control 2
DC3              &#19;             device control 3
DC4              &#20;             device control 4
NAK              &#21;             negative acknowledge
SYN              &#22;             synchronize
ETB              &#23;             end transmission block
CAN              &#24;             cancel
EM               &#25;             end of medium
SUB              &#26;             substitute
ESC              &#27;             escape
FS               &#28;             file separator
GS               &#29;             group separator
RS               &#30;             record separator
US               &#31;             unit separator
DEL              &#127;            delete (rubout)
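The control characters listed above occupy codes 0 to 31 plus code 127. A minimal Python sketch of how text might be cleaned before it is placed in an HTML page follows. The names `is_control`, `strip_controls`, and `KEEP` are hypothetical, and keeping tab, line feed, and carriage return is an assumption made for this sketch.

```python
# Assumption for this sketch: tab, line feed, and carriage return are kept
# because they act as ordinary whitespace.
KEEP = {"\t", "\n", "\r"}


def is_control(ch: str) -> bool:
    """True for ASCII control characters: codes 0-31 and 127 (DEL)."""
    code = ord(ch)
    return code < 32 or code == 127


def strip_controls(text: str) -> str:
    """Drop control characters, except the whitespace ones in KEEP."""
    return "".join(ch for ch in text if ch in KEEP or not is_control(ch))


print(repr(strip_controls("a\x07b\tc\x7f\n")))  # 'ab\tc\n'
```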

Typing Arabic and Farsi Numerals

1. While Microsoft Word is open, click on the Tools drop-down menu on the top of the screen. (Although these instructions are given for Windows XP, they should also work for Windows 2002.)

2. Next, click on Options....

3. Select the Complex Scripts tab.

4. Next, click on the Right-to-Left button in the lower General section of the Complex Scripts tab, so that a black dot appears next to it.

5. Next, click on the downward-pointing black triangle for the Numeral drop-down menu and select Context.

6. MS Word should now type Arabic numerals when typing Arabic text.
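Outside of Word, the same digits can be produced directly from their Unicode code points: the Arabic-Indic digits start at U+0660 and the Extended Arabic-Indic digits used for Farsi start at U+06F0. The Python sketch below only illustrates that mapping and does not reproduce Word's Context setting; the function names are hypothetical.

```python
WESTERN = "0123456789"

# Translation tables built from the first code point of each digit block.
ARABIC_INDIC = {ord(d): chr(0x0660 + i) for i, d in enumerate(WESTERN)}
EXTENDED_ARABIC_INDIC = {ord(d): chr(0x06F0 + i) for i, d in enumerate(WESTERN)}


def to_arabic_digits(text: str) -> str:
    """Replace Western digits with Arabic-Indic digits (U+0660-U+0669)."""
    return text.translate(ARABIC_INDIC)


def to_farsi_digits(text: str) -> str:
    """Replace Western digits with Extended Arabic-Indic digits (U+06F0-U+06F9)."""
    return text.translate(EXTENDED_ARABIC_INDIC)


print(to_arabic_digits("2010"))
print(to_farsi_digits("2010"))

# Python's int() accepts these digits, so the conversion is reversible.
print(int(to_arabic_digits("2010")))  # 2010
```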

Last Edit Date 11-18-2002


Smoothing Screen Fonts

Before

Example of Arabic text before smoothing is applied to Arabic screen fonts.

After

Example of the same Arabic text after smoothing has been applied to Arabic screen fonts.

Windows XP

Using font smoothing makes Arabic, Farsi, Urdu, and other fonts look more appealing on a computer screen.

1. First, click on the Control Panel icon.

2. Next, click on Performance and Maintenance.

3. Next, click on Adjust visual effects.

4. Next, check the Smooth Edges of screen fonts box.

Last edited 03-05-2002

Windows 2000

Using font smoothing on Windows 2000.

1. First, click on the Display icon in the Control Panel.

2. Next, click on the Effects tab of the Display Properties.

3. Finally, click on the Smooth edges of screen fonts box, so that a check mark appears within the box.

Outlook 2000

Font Selection: It is necessary to select a font that is able to display Unicode characters. Andalus, Arial Unicode MS, Arabic Simplified, Arabic Transparent, Lucida Sans Unicode, Microsoft Sans Serif, Tahoma, Times New Roman, and Traditional Arabic can each display some range of Unicode characters. Arial Unicode MS displays a very wide range of Unicode characters. For most Arabic text viewing, Times New Roman, Arabic Transparent, or Traditional Arabic should be suitable.

However, if you experience any difficulties, try Arial Unicode MS. The Times New Roman font that comes with Windows 2000 is a Unicode-capable font, and it displays Arabic, English, Greek, and Hebrew text nicely. I have not tested earlier versions of Times New Roman.

Outlook 2000: First, click the mouse on Tools and select Options....

Second, click the mouse on Mail Format and then select HTML for "Send in this message format:"

Third, click the mouse on International Options. Select Unicode (UTF-8) for outgoing and unmarked received messages. Check the Use English for message flags and headers.

International Font Selection: For International Font settings, select Unicode. Next, select a font that is Unicode compatible, such as Times New Roman. Choose a font size that is easily read. Finally, set the Encoding to Unicode (UTF-8). Click the Set as Default button to retain the Unicode configuration. Click the mouse on the OK button.

After Outlook 2000 has a Unicode encoding and a compatible font, Arabic, English, Greek, and Hebrew text should display properly in Outlook 2000.
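The two settings chosen above, HTML message format and Unicode (UTF-8) encoding, correspond to the content type and charset of the message that is sent. As a rough illustration, and not a description of how Outlook itself builds messages, the Python sketch below constructs such a message with the standard-library `email` package. The addresses, subject, and body are made-up placeholders.

```python
from email.message import EmailMessage

# Placeholder headers, for illustration only.
msg = EmailMessage()
msg["Subject"] = "Unicode test"
msg["From"] = "sender@example.com"
msg["To"] = "recipient@example.com"

# A short HTML body containing the Greek word "logos" (written with escapes).
body = "<p>\u03bb\u03bf\u03b3\u03bf\u03c2</p>"

# subtype="html" selects the HTML format; charset="utf-8" selects the encoding.
msg.set_content(body, subtype="html", charset="utf-8")

print(msg.get_content_type())
print(msg.get_content_charset())
```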

If there are difficulties with these browser and email configuration suggestions, please let me know. I have not been able to do many tests.


Last edited 06-11-2000



Outlook Express 5

Font Selection: First, it is necessary to select a font that is able to display Unicode characters. Andalus, Arial Unicode MS, Arabic Simplified, Arabic Transparent, Lucida Sans Unicode, Microsoft Sans Serif, Tahoma, Times New Roman, and Traditional Arabic can each display some range of Unicode characters. Arial Unicode MS displays a very wide range of Unicode characters. For most Arabic text viewing, Times New Roman, Arabic Transparent, or Traditional Arabic should be suitable. However, if you experience any difficulties, try Arial Unicode MS.

The Times New Roman font that comes with Windows 2000 is a Unicode font. It displays Arabic, English, Greek, and Hebrew text properly and is a good choice. I have not tested earlier versions of Times New Roman.

Outlook Express 5: First, click the mouse on Tools and select Options....

Read: Click the mouse on Read and then select Fonts.

Third, click the mouse on Font settings. Select Unicode and then a Proportional font that displays Unicode characters. After Unicode is highlighted, Unicode compatible fonts are displayed in the Proportional font box. Select Unicode (UTF-8) for Encoding. Finally click the mouse on the Set as Default button to make it your default setting.

Send: Click the mouse on Send.

Then, click the mouse on International Settings. Select Unicode (UTF-8) for the Default encoding.

Encoding Selection: Next, it is necessary to select an encoding scheme that includes the Arabic, English, Greek, and Hebrew Unicode number ranges. Unicode (UTF-8) should be suitable for most email.

First, click the mouse on View.

Second, click the mouse on Encoding. Third, click the mouse on Unicode (UTF-8).
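The reason Unicode (UTF-8) is the suggested encoding can be sketched in a few lines of Python: UTF-8 can represent Arabic, Greek, and Hebrew text, while a Western encoding such as ISO-8859-1 cannot. The helper `can_encode` and the short sample words are assumptions made for this illustration.

```python
def can_encode(text: str, encoding: str) -> bool:
    """Return True if the text can be represented in the given encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


# Short sample words, written with escapes so the code points are explicit.
samples = {
    "Arabic": "\u0628\u0633\u0645",
    "Greek": "\u03bb\u03bf\u03b3\u03bf\u03c2",
    "Hebrew": "\u05d1\u05e8\u05d0",
}

for language, text in samples.items():
    print(language, can_encode(text, "utf-8"), can_encode(text, "iso-8859-1"))
```

Each line of output shows `True` for UTF-8 and `False` for ISO-8859-1.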

After Outlook Express has a Unicode encoding and a compatible font, Arabic, English, Greek, and Hebrew fonts should display in Outlook Express email properly.


Last edited 06-11-2000



Netscape Messenger 4.7

Character Set & Font Selection: For Character Set and Font selection, see the Netscape browser information under Netscape.

Email Formatting: To format a Messenger email to send Unicode, it is necessary to use Messenger's HTML setting. First, click the mouse on Edit and select Preferences....

Second, if necessary, click the mouse on the (+) next to Mail & Newsgroups to expand the sub-choices, and then select Formatting. Set the Message formatting to Use the HTML editor to compose messages. For some email servers, it may be necessary to select "Send the message in HTML anyway" so that the message does not appear in both HTML and plain text.

Email HTML Formatting: First, click the mouse on Tools and select HTML Tools. Then click the mouse on Edit HTML Source... This will bring up a window with HTML code that can be edited.

HTML Email: After Netscape Messenger is set to the above default settings, the software is able to send email messages in different languages that are encoded in Unicode.

After Netscape Messenger has Unicode encoding and a Unicode-compatible font, Arabic, English, Greek, and Hebrew text should display properly in Netscape's email.

Of course, when your email is completely in the English language, no changes to the coding are necessary.

Last edited 06-11-2000

Netscape

Font Selection: It is necessary to select a font that is able to display Unicode characters. Andalus, Arial Unicode MS, Arabic Simplified, Arabic Transparent, Lucida Sans Unicode, Microsoft Sans Serif, Tahoma, Times New Roman, and Traditional Arabic can each display some range of Unicode characters. Arial Unicode MS displays a very wide range of Unicode characters. For most Arabic text viewing, Arabic Transparent or Traditional Arabic should be suitable.

However, if you experience any difficulties, try Arial Unicode MS. The Times New Roman font that comes with Windows 2000 is a Unicode-capable font, and it displays Arabic, English, Greek, and Hebrew text nicely.

Netscape Font Selection: First, click the mouse on Edit.

Second, click the mouse on Preferences.

Third, click the mouse on Fonts under Appearance. Then select Unicode Encoding and a Unicode compatible font. Select a font that displays the appropriate target language or languages.

Fourth, click the mouse on OK to finalize the selection.

Character Set: Next, it is necessary to select a Character Set that includes the Arabic Unicode number range. Often, Unicode (UTF-8) will be satisfactory.

First, click the mouse on View.

Second, click the mouse on Character Set. Third, click the mouse on Unicode (UTF-8). Fourth, click the mouse on Set Default Character Set. This assures that Unicode (UTF-8) will remain after the program is restarted.

Finally, click the mouse on Reload to activate the new settings.

After Netscape has a Unicode compatible font, and the Character Set is Unicode (UTF-8), the Arabic fonts should display properly.

Test settings: There are a couple of sites that have Unicode font pages. The Unicode decimal number range for Arabic is 1536-1791, 64336-65023, and 65136-65279. If Netscape has been properly configured, the Arabic fonts should display properly.
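The decimal ranges quoted above can be checked against any character. The Python sketch below is one way to do that; the names `ARABIC_RANGES` and `in_arabic_range` are hypothetical, and the ranges are the three stated in the text.

```python
# Decimal code point ranges for Arabic, as listed in the text above.
ARABIC_RANGES = [(1536, 1791), (64336, 65023), (65136, 65279)]


def in_arabic_range(ch: str) -> bool:
    """Return True if the character's code point falls in one of the ranges."""
    code = ord(ch)
    return any(low <= code <= high for low, high in ARABIC_RANGES)


beh = "\u0628"  # ARABIC LETTER BEH, decimal 1576
print(ord(beh), in_arabic_range(beh))   # 1576 True
print(ord("A"), in_arabic_range("A"))   # 65 False
```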

Another web site has an excellent chart of the decimal number ranges of Unicode. The charts for 1001-2000, 64001-64999, and 65001-65536 cover the Arabic ranges.

Hopefully, these comments and pictures will provide you with enough information so that you will be able to view Arabic Unicode text. A similar procedure is followed to display Greek and Hebrew Unicode characters. First select a Unicode font and then select the appropriate Encoding scheme. Finally, test the settings by viewing a page that has the appropriate decimal number for the Unicode language fonts.

Normal settings: To return to your normal settings, select the Western (ISO-8859-1) Character Set and the font of your choice. However, if you have chosen a Unicode font that can display all of the languages that you plan to view, you can leave your browser's configuration set to view Unicode fonts.

There is additional help for Netscape users at Allan Wood's site.


Last edited 06-11-2000



Internet Explorer

Font Selection: It is necessary to select a font that is able to display Unicode characters. Andalus, Arial Unicode MS, Arabic Simplified, Arabic Transparent, Lucida Sans Unicode, Microsoft Sans Serif, Tahoma, Times New Roman, and Traditional Arabic can each display some range of Unicode characters. Arial Unicode MS displays a very wide range of Unicode characters. For most Arabic text viewing, Arabic Transparent or Traditional Arabic should be suitable.

However, if you experience difficulties, try Arial Unicode MS. The Times New Roman font that comes with Windows 2000 is a Unicode-capable font, and it displays Arabic, English, Greek, and Hebrew text nicely.

Internet Explorer Font Selection: First, click the mouse on Tools.

Second, click the mouse on Internet Options.

Third, click the mouse on Fonts. In the beginning, it may be best to select Arial Unicode MS font because it can display a wide range of Unicode fonts. However, you may find that Times New Roman has a more pleasing font face.

Encoding Selection: Next, it is necessary to select an encoding scheme that includes the Arabic Unicode number range. Unicode (UTF-8) should be satisfactory for most purposes, including Arabic Unicode text.

First, click the mouse on View.

Second, click the mouse on Encoding. Third, click the mouse on Unicode (UTF-8) and deselect Auto-Select. Deselecting Auto-Select assures that Internet Explorer will use Unicode encoding each time the program is reloaded.

Finally, click the mouse on Refresh to activate the new settings.

After Internet Explorer has a Unicode compatible font, and the Encoding is set to Unicode, Arabic fonts should display properly.

Test settings: There are a couple of sites that have Unicode font pages. The Unicode decimal number range for Arabic is 1536-1791, 64336-65023, and 65136-65279. If Internet Explorer has been properly configured, the Arabic fonts should display properly.

Another web site has an excellent chart of the decimal number ranges of Unicode. The charts for 1001-2000, 64001-64999, and 65001-65536 cover the Arabic ranges.

Hopefully, these comments and pictures will provide you with enough information so that you will be able to view Arabic Unicode text. A similar procedure is followed to display Greek and Hebrew Unicode characters. First select a Unicode font and then select the appropriate Encoding scheme. Finally, test the settings by viewing a page that has the appropriate decimal number for the Unicode language fonts.

Normal settings: To return to your normal settings, select the Western European (Windows) encoding scheme, activate the Auto-Select for Encoding, and select the font of your choice. However, if you have chosen a Unicode font that can display all of the languages that you plan to view, you can leave your browser's configuration set to view Unicode fonts.

There is additional help for Internet Explorer users at Allan Wood's site.

Last edited 06-08-2000

Windows XP: Language Settings

It is necessary to configure Windows XP to view various languages with different programs, such as Internet Explorer.

Important Note: Before attempting to add a Language Group, make sure that you have the Windows XP CD available.

Add a Language Group: To add a Language Group, click on Start, Settings, and Control Panel. Click on Date, Time, Language, and Regional Options.

Next, click on Regional and Language Options.

Language Options: Click on the Language tab, and check the "Install files for complex script and right-to-left languages" if you have selected the Arabic, Farsi, or Urdu languages.

Regional Options: Click on the Regional Options tab, and then select the various languages from the drop-down menu to be installed on your computer.

Details...: Click on Details... to add the keyboard configurations for the different languages.

Second, make sure that your default language is correctly selected. For example, English (United States) is set to Default. Finally, click OK. This assures that English is the default language.

Language Bar... Click on the Language Bar tab and check "Show the Language bar on the desktop." This facilitates quickly changing from one language to another language.

If the default language is English, EN should appear at the bottom right-hand side of the taskbar. It will be to the left of the clock's time.

To change the keyboard's input selection, click on the EN to see the Locales selection pop-up menu. Click on the particular language to be able to use the keyboard to type in that particular language. After the selection is made, the taskbar icon will change to the chosen language. For example, if the selected language were Arabic, the taskbar would display an AR.

Important Note: These notes are merely offered as a help to our readers. The reader is solely responsible regarding the use and consequences of using these notes.

Last edited 12-03-2003

Windows 2000: Language Settings


It is necessary to configure Windows 2000 to view various languages with different programs, such as Internet Explorer. Before an individual language can be added, the language group needs to be added. It is important to remember that the name for an individual language may be different from the name of the language group. For example, Hindi is an individual language in the Indic language group. So, to add the Hindi language, it is first necessary to add the Indic language group.

Important Note: Before attempting to add a Language Group, make sure that you have the Windows 2000 CD available.

Add a Language Group: To add a Language Group, click on Start, Settings, and Control Panel.

Next, click on Regional Options in the Control Panel.

Regional Options: Click on the General tab, and then check the Language Groups to be added in the Language settings for the system section.

Second, make sure that Western Europe and United States is set to Default. Finally, click OK. This assures that English is the default language. After OK is clicked, a prompt appears asking for the Windows 2000 CD to be inserted. After the CD is inserted, click OK.

After the Language Groups are installed, a prompt appears requesting to Restart the Computer. Click OK to restart the computer.

After the computer has re-started, return to the Control Panel and click on Regional Options. Then click on the Input Locales tab.

Add an individual Language: The individual languages are selected in the Input Locales section of the Regional Options.

Input Locales: The selections under this tab heading configure the keyboard to be an input device for various languages. Under Installed input locales, click Add to add a particular language.

Near the bottom left-hand side, check Enable indicator on taskbar box. Finally, click OK. If the Default language is English, EN should appear at the bottom right-hand side of the taskbar. It will be to the left of the clock's time.

To change the keyboard's input selection, click on the EN to see the Locales selection pop-up menu. Click on the particular language to be able to use the keyboard to type in that particular language. After the selection is made, the taskbar icon will change to the chosen language. For example, if the selected language were Arabic, the taskbar would display an AR.

To verify that the selections are working, load Microsoft Word 2000 and select a particular language by clicking on the Input Locales icon in the taskbar and selecting the desired language. Make sure that the Input Locales icon has changed to reflect the new language choice. Finally, begin using Word 2000. The text should be in the selected language if the configuration was successful.

Important Note: These notes are merely offered as a help to our readers. The reader is solely responsible regarding the use and consequences of using these notes.

Last edited 03-03-2001

General Unicode Information

General: In the past, both the web page programmer and the casual reader had to use the same font. Typically, this meant that the reader had to find a site from which to download and install the foreign language font that was used by the programmer in order to view the foreign language web page.

Unicode allows the web page developer to design a foreign language web page using a common universal standard. The reader may then use any font system that meets the Unicode standard.

The Unicode standard is an evolving standard that is becoming more capable. Some fonts are Unicode aware but cover only a limited Unicode range, e.g., they may only display the characters of one particular language. Other fonts, like Arial Unicode MS, can display a very wide range of foreign language characters. However, some of its glyphs don't seem as visually pleasing as those of Times New Roman. Times New Roman comes with Windows 2000, and it displays well-formed Arabic, English, Greek, and Hebrew characters, so it would be a good choice.
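To judge which scripts a font would need to cover for a given piece of text, one rough approach is to read the script from each character's official Unicode name. The Python sketch below does this with the standard-library `unicodedata` module. Taking the first word of the character name as the script is a simplifying assumption of this sketch, and the function names are hypothetical.

```python
import unicodedata


def script_of(ch: str) -> str:
    """Guess a character's script from the first word of its Unicode name."""
    name = unicodedata.name(ch, "")
    return name.split(" ")[0] if name else "UNKNOWN"


def scripts_in(text: str) -> set:
    """Collect the scripts of all alphabetic characters in the text."""
    return {script_of(ch) for ch in text if ch.isalpha()}


# Greek, Hebrew, Arabic, and Latin sample words, written with escapes.
mixed = "\u03bb\u03bf\u03b3\u03bf\u03c2 \u05d1\u05e8\u05d0 \u0628\u0633\u0645 abc"
print(sorted(scripts_in(mixed)))  # ['ARABIC', 'GREEK', 'HEBREW', 'LATIN']
```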

Since Unicode is an encoding scheme that requires a Unicode aware font, the reader must configure his or her browser properly. The appropriate font must be selected for the web browser as well as the appropriate Encoding (Internet Explorer) or Character Set (Netscape).

Details on configuring both Internet Explorer and Netscape are given on the linked pages. Click on the appropriate web browser that is named on the side bar of this page.

After the web browser is properly configured, the Sample Text in the next column should display all three languages properly. The next column has the Sample Text in Unicode and in a Gif image. The Sample Text Gif image is correctly displayed. The Sample Text above the Gif image should look similar to the Gif image.

Microsoft Internet Explorer & Email: Microsoft

MS Fonts Used in the Style Sheets:

Arabic: Arabic Transparent, Traditional Arabic, Times New Roman, Simplified Arabic


Farsi: Farsi Simple Bold, Times New Roman, Arabic Transparent, Nesf


Greek: Times New Roman, Lucida Sans Unicode, Arial Unicode MS


Hebrew: Times New Roman, Lucida Sans Unicode, David

HTML Language Codes:

Language Codes: ISO 639, Microsoft and Macintosh by the Unicode Consortium


ISO 639-2 Codes for the Representation of Names of Languages by the Library of Congress

Useful Unicode Links:

Unicode Consortium


Alan Wood's Unicode Resources


FarsiWeb Project

Useful Unicode Charts:

Index for Unicode HTML Reference


Alan Wood's Unicode Charts

Useful Keyboard Layouts:

Microsoft Windows Keyboard Layouts


Windows 2000 Visual Keyboard is an on-screen pop-up keyboard representation that helps to locate keys for the various international languages in Microsoft Windows 2000 programs.


Windows XP Visual Keyboard is an on-screen pop-up keyboard representation that helps to locate keys for the various international languages in Microsoft Windows XP programs.


The Key Connection sells foreign language keyboards.


Fentek Industries Inc sells Arabic keyboards.


Arabic Keyboard layouts shows various Arabic keyboard layouts.

Useful Font Properties Tool:

Font properties extension, version 2.1



After you install ttfext.exe, if you right click on a font in Windows, the font's properties are displayed. It provides information about the font and its language capabilities. For example, the Gif image shows various languages that Times New Roman version 2.82 supports.

Last updated 12-03-2007

Miscellaneous Font links:

Dr. Shirley's Font Pages
McCreedy's Gallery of Fonts: Arabic
Luc Devroye, School of Computer Science, McGill University, Montreal, Canada
Soft.Vip600 Fonts

Useful Unicode Fonts:

Arabic True Type Open Fonts Pack from Microsoft has some useful Arabic and Farsi fonts. After the download, click on "arafonts.exe" to install the fonts.

Scheherazade and Lateef Fonts handle complex-script rendering much more smoothly than Traditional Arabic or Times New Roman. The fonts are available for free under the SIL International's Open Font License.

Code2000 font covers various languages including Syriac.
Christoph Singer Slavic Text Processing lists Unicode font download sites.
Athena Greek Unicode and Galatia SIL Greek Unicode fonts offer accent marks.
Borna Rayaneh has various Unicode Farsi fonts that may be downloaded. B Lotus and B Nazanin are excellent fonts.
The Shahedy web site has a number of Farsi fonts to download, such as B Compset, B Badr, B Lotus, and B Zar.
Ezra SIL Hebrew Unicode fonts offer accent marks.
Beth Mardutho: The Syriac Institute offers Syriac fonts.

Non-Unicode fonts:

Avesta - Zoroastrian Archives: Avesta
Iranian Chamber Society: Aramaic, Avesta, Cuneiform, Pahlavi, and Parthian fonts
Minnesota Iranian Font Family: Avesta & Pahlavi fonts.
St. Mary Coptic Orthodox Church: Coptic fonts.

Sample Text:

Arabic Text (Surah 1:1):

بِسْمِ اللّهِ الرَّحْمـَنِ الرَّحِيمِ

Greek Text (John 1:1):

εν αρχη ην ο λογος και ο λογος ην προς
τον θεον και θεος ην ο λογος

Hebrew Text (Genesis 1:1):

בראשית ברא אלהים את השמים ואת הארץ׃

The Browser Sample Text above should look similar to the gray Gif image of the Sample Text below.

Sample Text in Unicode Encoding: Unicode encoding as seen in View, Source (Internet Explorer) or View, Page Source (Netscape). This type of encoding in a page's source code is an indication that Unicode is being utilized to encode the language characters.

Arabic Text (Surah 1:1):

بِسْم
ِاللّ
هِالر
َّحْم
ـَنِا
لرَّح
ِيمِ

Greek Text (John 1:1):

εναρχη
ηνολογ
οςκαιο
λογοςη
νπροςτ
ονθεον
καιθεο
ςηνολο
γος

Hebrew Text (Genesis 1:1):

בראשי
תבראא
להיםא
תהשמי
םואתה
ארץ׃
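In a page's source, Unicode text of this kind typically appears as decimal numeric character references, one `&#number;` per character. The Python sketch below shows one way such references could be generated and decoded again. The function name `to_ncr` is hypothetical, and leaving plain ASCII characters (including spaces) unencoded is an assumption of this sketch.

```python
import html


def to_ncr(text: str) -> str:
    """Encode non-ASCII characters as decimal numeric character references."""
    return "".join(ch if ord(ch) < 128 else f"&#{ord(ch)};" for ch in text)


# The first two Greek words of the sample text, written with escapes.
greek = "\u03b5\u03bd \u03b1\u03c1\u03c7\u03b7"

encoded = to_ncr(greek)
print(encoded)  # &#949;&#957; &#945;&#961;&#967;&#951;

# html.unescape turns the references back into the original characters.
print(html.unescape(encoded) == greek)  # True
```

Python's built-in `"xmlcharrefreplace"` error handler produces the same decimal form: `greek.encode("ascii", "xmlcharrefreplace")`.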
