Page 66 - Cs_withBlue_J_C11_Flipbook

There are three Unicode encoding schemes, namely UTF-8, UTF-16 and UTF-32:
              •  UTF-8: It is a multi-byte, variable-width encoding scheme and is the most popular Unicode encoding scheme. It is
                backwards compatible with ASCII. It uses 1 byte for ASCII characters such as standard English letters, 2 bytes for
                additional Latin and Middle Eastern characters, 3 bytes for most Asian characters and 4 bytes for the remaining
                characters. This encoding scheme is compatible with HTML, XML, JSON and e-mail.
              •  UTF-16: It uses 2 bytes for each of the first 65536 characters and 4 bytes for each additional character. It is used
                in operating systems like Microsoft Windows.
              •  UTF-32: It is a fixed-width encoding scheme that represents each character using 4 bytes (32 bits).
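A short Python sketch can measure how many bytes each scheme uses for a given character. The sample characters and the helper name `encoded_sizes` are illustrative choices, not part of the text. The little-endian codec names (`utf-16-le`, `utf-32-le`) are used on the assumption that the byte-order mark should not be counted.

```python
# Sample characters chosen for illustration (assumed, not from the text).
SAMPLES = {
    "A": "English letter",
    "\u00e9": "additional Latin letter (e with acute accent)",
    "\u0915": "Devanagari letter KA",
    "\U0001F600": "emoji outside the first 65536 code points",
}

def encoded_sizes(ch):
    """Return the byte count of ch in UTF-8, UTF-16 and UTF-32."""
    # The "-le" codecs are used so that no byte-order mark is added.
    return (
        len(ch.encode("utf-8")),
        len(ch.encode("utf-16-le")),
        len(ch.encode("utf-32-le")),
    )

for ch, note in SAMPLES.items():
    u8, u16, u32 = encoded_sizes(ch)
    print(f"U+{ord(ch):04X} {note}: UTF-8={u8}, UTF-16={u16}, UTF-32={u32}")
```

The UTF-8 and UTF-16 counts change from character to character, while the UTF-32 count stays at 4 bytes for every character.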
                                           Unicode 14.0 Character Code Charts of some scripts
              [Code charts: Devanagari (codes 0900 to 097F) and Basic Latin/ASCII (codes 0000 to 007F). Each column heading
              gives the first three hex digits of a code (090 to 097 and 000 to 007) and each row label gives the last hex
              digit (0 to F). For example, column 004 and row 1 give the code 0041, which is the letter A.]
                                                     Unicode Codes for Indian Scripts
                                Language          Script          Starting hex code    Ending hex code
                                Urdu              Arabic          0600                 06ff
                                Hindi             Devanagari      0900                 097f
                                Bengali           Bengali         0980                 09ff
                                Punjabi           Gurumukhi       0a00                 0a7f
                                Gujarati          Gujarati        0a80                 0aff
                                Oriya             Oriya           0b00                 0b7f
                                Tamil             Tamil           0b80                 0bff
                                Telugu            Telugu          0c00                 0c7f
                                Kannada           Kannada         0c80                 0cff
                                Malayalam         Malayalam       0d00                 0d7f
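Because each script occupies one continuous range of codes, the script of a character can be found by comparing its code with the start and end of each range. The sketch below is one way to do this in Python. The names `SCRIPT_RANGES` and `script_of` are hypothetical, and the ranges in the code are those of the corresponding Unicode blocks (Arabic 0600 to 06FF, Devanagari 0900 to 097F, Malayalam 0D00 to 0D7F, and so on).

```python
# Start and end codes of the Unicode block for each script in the table.
SCRIPT_RANGES = {
    "Arabic": (0x0600, 0x06FF),
    "Devanagari": (0x0900, 0x097F),
    "Bengali": (0x0980, 0x09FF),
    "Gurumukhi": (0x0A00, 0x0A7F),
    "Gujarati": (0x0A80, 0x0AFF),
    "Oriya": (0x0B00, 0x0B7F),
    "Tamil": (0x0B80, 0x0BFF),
    "Telugu": (0x0C00, 0x0C7F),
    "Kannada": (0x0C80, 0x0CFF),
    "Malayalam": (0x0D00, 0x0D7F),
}

def script_of(ch):
    """Return the script whose range contains ch, or None if no range matches."""
    code = ord(ch)
    for script, (start, end) in SCRIPT_RANGES.items():
        if start <= code <= end:
            return script
    return None

print(script_of("\u0915"))   # Devanagari letter KA -> Devanagari
print(script_of("\u0BA4"))   # Tamil letter TA -> Tamil
print(script_of("A"))        # not in any listed range -> None
```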
              The advantages of Unicode are as follows:
              •  Unicode 14.0 supports a total of 159 scripts with 144697 characters, which covers almost all the writing systems
                in current use around the world.


                Touchpad Computer Science-XI