Page 66 - Cs_withBlue_J_C11_Flipbook
There are different Unicode encoding schemes, namely UTF-8, UTF-16 and UTF-32:
• UTF-8: It is a variable-width encoding scheme that uses 1 to 4 bytes per character and is the most popular Unicode
encoding scheme. It is backward compatible with ASCII. It uses 1 byte for standard English letters, 2 bytes for additional
Latin and Middle Eastern characters, 3 bytes for most other characters in common use (including Indian and East Asian
scripts) and 4 bytes for the remaining characters. This encoding scheme is widely used in HTML, XML, JSON and e-mail.
• UTF-16: It is a variable-width encoding scheme that uses 2 bytes for each of the first 65,536 characters and 4 bytes
for the additional characters. It is used in operating systems like Microsoft Windows.
• UTF-32: It is a fixed-width encoding scheme that represents each character using 4 bytes.
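The byte counts described above can be checked with a short Python sketch. The sample characters and the helper name encoded_length are chosen only for illustration. The "-le" variants of UTF-16 and UTF-32 are used so that Python does not add a byte order mark to the result.

```python
# Compare how many bytes one character needs in each encoding scheme.
# The sample characters are illustrative choices, not taken from the text.
SAMPLES = [
    "A",           # U+0041, English letter
    "\u00e9",      # U+00E9, Latin letter e with acute accent
    "\u0905",      # U+0905, Devanagari letter A
    "\U0001F600",  # U+1F600, an emoji outside the first 65,536 characters
]


def encoded_length(ch, encoding):
    """Return the number of bytes that ch occupies in the given encoding."""
    return len(ch.encode(encoding))


for ch in SAMPLES:
    print(
        f"U+{ord(ch):04X}",
        "UTF-8:", encoded_length(ch, "utf-8"),
        "UTF-16:", encoded_length(ch, "utf-16-le"),
        "UTF-32:", encoded_length(ch, "utf-32-le"),
    )
```

For these samples, UTF-8 needs 1, 2, 3 and 4 bytes respectively, UTF-16 needs 2, 2, 2 and 4 bytes, and UTF-32 always needs 4 bytes.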
Unicode 14.0 Character Code Charts of some scripts
[Chart: Devanagari, columns 090 to 097, rows 0 to F, code points 0900 to 097F]
Basic Latin (ASCII), code points 0000 to 007F. The code of a character is its column label followed by its row label, for example column 004, row 1 gives 0041 (A).
     000  001  002  003  004  005  006  007
0    NUL  DLE  SP   0    @    P    `    p
1    SOH  DC1  !    1    A    Q    a    q
2    STX  DC2  "    2    B    R    b    r
3    ETX  DC3  #    3    C    S    c    s
4    EOT  DC4  $    4    D    T    d    t
5    ENQ  NAK  %    5    E    U    e    u
6    ACK  SYN  &    6    F    V    f    v
7    BEL  ETB  '    7    G    W    g    w
8    BS   CAN  (    8    H    X    h    x
9    HT   EM   )    9    I    Y    i    y
A    LF   SUB  *    :    J    Z    j    z
B    VT   ESC  +    ;    K    [    k    {
C    FF   FS   ,    <    L    \    l    |
D    CR   GS   -    =    M    ]    m    }
E    SO   RS   .    >    N    ^    n    ~
F    SI   US   /    ?    O    _    o    DEL
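A code chart is read by joining a column label with a row label to get the hexadecimal code of a character. The sketch below shows this in Python, assuming a hypothetical helper named code_point_from_chart. The built-in chr() and the standard unicodedata module then give the character and its official Unicode name.

```python
import unicodedata


def code_point_from_chart(column, row):
    """Join a chart column label (e.g. '004') with a row label (e.g. '1')."""
    return int(column + row, 16)


# Column 004, row 1 of the Basic Latin chart.
latin = code_point_from_chart("004", "1")
# Column 091, row 5 of the Devanagari chart.
devanagari = code_point_from_chart("091", "5")

for cp in (latin, devanagari):
    ch = chr(cp)  # convert the code point to its character
    print(f"U+{cp:04X}", ch, unicodedata.name(ch))
```

The first line printed is for U+0041 (LATIN CAPITAL LETTER A) and the second is for U+0915 (DEVANAGARI LETTER KA). The reverse conversion, from a character to its code, is done with ord().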
Unicode Codes for Indian Scripts
Language    Script      Starting hex code   Ending hex code
Urdu        Arabic      0600                06ff
Hindi       Devanagari  0900                097f
Bengali     Bengali     0980                09ff
Punjabi     Gurmukhi    0a00                0a7f
Gujarati    Gujarati    0a80                0aff
Oriya       Oriya       0b00                0b7f
Tamil       Tamil       0b80                0bff
Telugu      Telugu      0c00                0c7f
Kannada     Kannada     0c80                0cff
Malayalam   Malayalam   0d00                0d7f
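Since each script occupies a continuous range of codes, a program can find the script of a character by comparing its code with the starting and ending values. The following is a minimal Python sketch of that idea. The names SCRIPT_RANGES and script_of are hypothetical, and the ranges used are the Unicode block ranges for these scripts.

```python
# (script name, starting code, ending code) using Unicode block ranges.
SCRIPT_RANGES = [
    ("Arabic", 0x0600, 0x06FF),
    ("Devanagari", 0x0900, 0x097F),
    ("Bengali", 0x0980, 0x09FF),
    ("Gurmukhi", 0x0A00, 0x0A7F),
    ("Gujarati", 0x0A80, 0x0AFF),
    ("Oriya", 0x0B00, 0x0B7F),
    ("Tamil", 0x0B80, 0x0BFF),
    ("Telugu", 0x0C00, 0x0C7F),
    ("Kannada", 0x0C80, 0x0CFF),
    ("Malayalam", 0x0D00, 0x0D7F),
]


def script_of(ch):
    """Return the script whose range contains ch, or None if none matches."""
    cp = ord(ch)
    for name, start, end in SCRIPT_RANGES:
        if start <= cp <= end:
            return name
    return None


print(script_of("\u0915"))  # U+0915 lies in the Devanagari range
print(script_of("\u0b85"))  # U+0B85 lies in the Tamil range
print(script_of("A"))       # U+0041 is outside every range listed here
```

A character such as "A" returns None because the list covers only the scripts in the table, not the Basic Latin range.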
Advantages of Unicode are as follows:
• Unicode 14.0 supports a total of 159 scripts having 144,697 characters, which covers almost all written languages in
current use in the world.
Touchpad Computer Science-XI

