Page 471 - AI Ver 3.0 class 10_Flipbook

NLTK


                 The Natural Language Toolkit (NLTK) is one of the most commonly used open-source NLP toolkits. It is made
                 up of Python libraries and is used for building programs that analyse and statistically process human
                 language. Its text processing libraries support tokenization, parsing, classification, stemming, tagging
                 and semantic reasoning.

                 Installing NLTK

                 To install NLTK, open the command prompt on your computer and type:

                                                           pip install nltk
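
                 To confirm that the installation succeeded, one quick check (a minimal sketch, not part of the
                 original steps) is to import the library in Python and print its version. The exact number shown
                 depends on the release that pip installed.

```python
import nltk

# __version__ holds the installed release as a string
print("NLTK version:", nltk.__version__)
```
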
                 Downloading NLTK data


                 After installing NLTK, import it in Python:

                 import nltk

                 The next step is to download the NLTK data packages:

                 nltk.download()

                 This opens the NLTK Downloader dialog box. Click the Download button to install the various packages
                 related to NLTK.
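
                 When a dialog box is not available, or only a few packages are needed, nltk.download() can also be
                 called with a package name. The sketch below assumes the tokenizer data is published as "punkt"
                 (older NLTK releases) and "punkt_tab" (newer releases); the helper name download_packages is
                 hypothetical and used only for illustration.

```python
import nltk

# Tokenizer data used by sent_tokenize and word_tokenize.
# Assumption: older NLTK releases look for "punkt", newer ones for "punkt_tab".
PACKAGES = ["punkt", "punkt_tab"]


def download_packages(packages=PACKAGES):
    """Download each named package and record whether the call succeeded."""
    results = {}
    for name in packages:
        # quiet=True suppresses progress messages; download() returns True on success
        results[name] = nltk.download(name, quiet=True)
    return results


if __name__ == "__main__":
    # Needs an internet connection the first time it is run
    print(download_packages())
```

                 A package that cannot be found or downloaded is reported as False instead of stopping the program.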

                 Some important commands of NLTK:

                 Tokenization is the process of converting large textual data into smaller parts called tokens. These tokens help
                 in NLP for finding patterns and are used for further processing such as stemming and lemmatization.
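
                 As an illustration of splitting text into word-level tokens, the sketch below uses
                 wordpunct_tokenize, a regular-expression tokenizer in NLTK that needs no downloaded data. It is one
                 of several tokenizers the library offers and is chosen here only because it runs straight after
                 installation.

```python
from nltk.tokenize import wordpunct_tokenize

data = "Hello friends. Hope you are enjoying doing NLP."

# Split the text into word and punctuation tokens using a regular expression
tokens = wordpunct_tokenize(data)
print(tokens)
# ['Hello', 'friends', '.', 'Hope', 'you', 'are', 'enjoying', 'doing', 'NLP', '.']
```

                 Each word and each punctuation mark becomes a separate token.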

                 Sentence tokenization splits a large piece of text into individual sentences.

                    [1]:  from nltk.tokenize import sent_tokenize   # needs the NLTK tokenizer data downloaded earlier

                          data = "Hello friends. Hope you are enjoying doing NLP. Wish you a wonderful experience"
                          sent_token = sent_tokenize(data)   # returns a list with one string per sentence
                          print(sent_token)

                          ['Hello friends.', 'Hope you are enjoying doing NLP.', 'Wish you a wonderful experience']



