Page 65 - Cs_withBlue_J_C11_Flipbook
In ISCII, code points 128 to 255 represent the basic characters of different Indian languages. Composite characters are
formed by combining these basic characters. The ISCII character code covers languages such as Hindi, Bengali, Tamil,
Telugu, Kannada, Malayalam, Assamese, Odia (Oriya), Gujarati and Punjabi.
In addition to the code points representing characters, ISCII uses the ATR (attribute) code followed by a one-byte code
to change font attributes or to switch between ISCII scripts, as shown in the tables given below:
Change in formatting attributes
ATR + byte    Mnemonic    Font attribute
0x30          BLD         Makes text bold
0x31          ITA         Makes text italic
0x32          UL          Underlines text
0x34          HLT         Highlights text
0x35          OTL         Outlines text
Change in ISCII scripts
ATR + byte    Mnemonic    Script
0x42          DEV         Devanagari
0x43          BNG         Bengali
0x44          TML         Tamil
0x45          TLG         Telugu
0x46          ASM         Assamese
0x47          ORI         Odia
0x48          KND         Kannada
0x49          MLM         Malayalam
0x4A          GJR         Gujarati
0x4B          PNJ         Gurmukhi
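The two tables above can be read as a small lookup scheme. The following is a minimal sketch of how a program might split a byte stream into attribute changes and runs of text. It assumes the ATR byte has the value 0xEF, which comes from the ISCII-91 standard and is not stated in this text. The function name split_attributes and the sample byte 0xB3 are hypothetical and used only for illustration.

```python
# Assumption: the ATR byte is 0xEF (per ISCII-91); the text above does not give its value.
ATR = 0xEF

# Bytes that may follow ATR, copied from the two tables above.
FONT_ATTRIBUTES = {0x30: "BLD", 0x31: "ITA", 0x32: "UL", 0x34: "HLT", 0x35: "OTL"}
SCRIPTS = {
    0x42: "DEV", 0x43: "BNG", 0x44: "TML", 0x45: "TLG", 0x46: "ASM",
    0x47: "ORI", 0x48: "KND", 0x49: "MLM", 0x4A: "GJR", 0x4B: "PNJ",
}


def split_attributes(data: bytes) -> list:
    """Split data into ('attr', mnemonic) and ('text', bytes) tokens (hypothetical helper)."""
    tokens = []
    run = bytearray()  # character bytes collected since the last attribute change
    i = 0
    while i < len(data):
        if data[i] == ATR and i + 1 < len(data):
            code = data[i + 1]
            mnemonic = FONT_ATTRIBUTES.get(code) or SCRIPTS.get(code)
            if mnemonic is not None:
                if run:
                    tokens.append(("text", bytes(run)))
                    run.clear()
                tokens.append(("attr", mnemonic))
                i += 2  # skip ATR and the byte that follows it
                continue
        # Ordinary character byte (or an ATR sequence this sketch does not know).
        run.append(data[i])
        i += 1
    if run:
        tokens.append(("text", bytes(run)))
    return tokens


# Switch to Bengali, one character byte, switch bold on, one more character byte.
sample = bytes([ATR, 0x43, 0xB3, ATR, 0x30, 0xB3])
print(split_attributes(sample))
```

Here the same character byte appears twice, once after the script switch and once after the bold attribute. The character codes themselves do not change; only the ATR sequences around them do.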
Advantages of the ISCII code are as follows:
• Most of the Indian languages can be represented by this code.
• The character set is simple and easy to understand.
• Transliteration between different Indian languages is very easy.
Disadvantages of the ISCII code are as follows:
• A special keyboard containing ISCII character keys is required.
• After the development of Unicode, which incorporates the ISCII character repertoire, ISCII became obsolete.
2.3.4 Unicode Character Set
Numerous languages are spoken across the world, but early character encodings were limited to English and a few
European languages. Even then, several different encoding schemes were in use, so the same code point could refer to
different characters in different schemes, which led to conflicts. An information technology standard that enabled
consistent encoding, representation and handling of text in all recognised global languages was therefore essential.
For this purpose, a non-profit organisation called the Unicode Consortium was established in California in 1991. It
developed Unicode and its standard Unicode Transformation Format (UTF) schemes to replace the existing character
encoding schemes. Unicode 14.0, released in September 2021, includes 159 scripts with a total of 144,697 characters.
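The conflict between older encoding schemes can be seen with a single byte. The sketch below decodes the same byte value under three legacy encodings that ship with Python. The choice of these three encodings and of the byte 0xE9 is ours, made only for illustration; the text does not name specific schemes.

```python
# One byte with the value 0xE9 (233 in decimal).
raw = bytes([0xE9])

# Three legacy single-byte encodings, chosen here only as examples.
LEGACY_CODECS = ("latin-1", "cp437", "koi8_r")

# Decode the same byte under each encoding.
decoded = {codec: raw.decode(codec) for codec in LEGACY_CODECS}

for codec, char in decoded.items():
    # Unicode gives each resulting character its own code point.
    print(f"{codec:8} byte 0xE9 -> {char!r} (U+{ord(char):04X})")
```

Each encoding interprets the byte as a different character, so a file written under one scheme is misread under another. Unicode avoids this by assigning every character a single code point of its own.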
Unicode is an international character encoding standard that covers different languages, scripts and symbols.
Each letter, digit or symbol has its own unique Unicode value (code point). Unicode extends ASCII: its first 128 code
points are the same as ASCII, and it allows many more characters to be represented.
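As a small illustration of unique Unicode values, the sketch below uses Python's built-in ord() function and the UTF-8 encoding. The helper name describe and the sample characters are our own choices, not part of the text.

```python
def describe(ch: str) -> tuple:
    """Return the code point label and the UTF-8 bytes of one character (hypothetical helper)."""
    return (f"U+{ord(ch):04X}", ch.encode("utf-8"))


# A Latin letter, a digit, a Devanagari letter and a Tamil letter.
for ch in ("A", "9", "अ", "க"):
    label, utf8_bytes = describe(ch)
    print(ch, label, utf8_bytes)

# The first 128 Unicode code points match ASCII, so plain ASCII text
# is encoded to the same bytes in UTF-8 as in ASCII.
print("Hello".encode("ascii") == "Hello".encode("utf-8"))
```

The letter A keeps its ASCII value of 65 (U+0041) and takes one byte in UTF-8, while the Indian-script letters have larger code points and take several bytes each.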

