Page 299 - AI Ver 1.0 Class 10

Installing NLTK

                 To install NLTK, open the command prompt on your computer and type:

                 pip install nltk
                 To install NLTK in an Anaconda environment, the command is:


                 conda install -c anaconda nltk
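After running either command, it can be useful to confirm that Python can actually find the package. The following is a minimal sketch that uses only the standard library; the helper name `nltk_status` is made up for this illustration and is not part of NLTK.

```python
from importlib import metadata, util


def nltk_status():
    """Return the installed NLTK version, or None if NLTK cannot be imported."""
    # find_spec looks the package up without importing it
    if util.find_spec("nltk") is None:
        return None
    try:
        return metadata.version("nltk")
    except metadata.PackageNotFoundError:
        # The package is importable but has no version metadata
        return "unknown"


status = nltk_status()
if status is None:
    print("NLTK is not installed in this environment")
else:
    print("NLTK version:", status)
```

If the message says NLTK is not installed, the install command was probably run for a different Python environment than the one executing this script.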

                 Downloading NLTK data

                 After the installation of NLTK, import nltk:

                 >>> import nltk
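                 The next step is to download the NLTK data packages. Calling nltk.download() with no arguments opens the
                 interactive downloader, where you can choose what to install: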


                 >>> nltk.download()
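Downloading everything can take a while. As an alternative, `nltk.download` also accepts the name of a single data package. The sketch below fetches only the packages the examples in this section are likely to need. The list `RESOURCES` and the helper `download_resources` are names chosen for this illustration, and which package the tokenizers expect (`punkt` or `punkt_tab`) is an assumption that depends on the installed NLTK version.

```python
# Assumed package names: older NLTK releases load "punkt" for tokenization,
# newer ones look for "punkt_tab"; "wordnet" is used for lemmatization.
RESOURCES = ["punkt", "punkt_tab", "wordnet"]


def download_resources(names=RESOURCES):
    """Download each named NLTK data package and report success per package."""
    import nltk  # imported here so that defining the function does not require NLTK

    # nltk.download returns True when the package is installed or already up to date
    return {name: nltk.download(name, quiet=True) for name in names}
```

Calling `download_resources()` once, with an internet connection available, should be enough; the data is stored on disk and reused in later sessions.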

                 Some important commands of NLTK:

                 Tokenization is the process of breaking large textual data into smaller parts called tokens. These tokens help in
                 NLP for finding patterns and are used for further processing through stemming and lemmatization.

                 Sentence tokenization splits a text into its individual sentences.

                 >>> data = ("Hello friends. Hope you are enjoying doing NLP. "
                 ...         "Wish you a wonderful learning experience")

                 >>> from nltk.tokenize import sent_tokenize
                 >>> sent_token = sent_tokenize(data)

                 >>> print(sent_token)

                 Word tokenization splits a text into individual words and punctuation marks.


                 >>> word_token = nltk.word_tokenize(data)
                 >>> print(word_token)
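To see what the two kinds of tokenization do, the idea can be imitated with regular expressions from the standard library. This is only a rough sketch for illustration and is not how NLTK's tokenizers work internally: the patterns assume that a sentence ends at '.', '!' or '?' followed by a space, which fails on abbreviations such as "Dr. Rao".

```python
import re

data = ("Hello friends. Hope you are enjoying doing NLP. "
        "Wish you a wonderful learning experience")

# Sentence tokens: split at whitespace that follows '.', '!' or '?'
sentences = re.split(r"(?<=[.!?])\s+", data)

# Word tokens: runs of letters/digits, or a single punctuation mark
words = re.findall(r"\w+|[^\w\s]", data)

print(sentences)
print(words)
```

The text yields three sentence tokens and sixteen word tokens, with each full stop counted as a token of its own.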

                 Stemming is the process of reducing a word to its base form, called the stem:

                 >>> from nltk.stem import PorterStemmer

                 >>> ps = PorterStemmer()
                 >>> ps.stem('enjoying')
                 'enjoy'

                 >>> ps.stem('learning')
                 'learn'
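The suffix-stripping idea behind stemming can be illustrated with a few lines of plain Python. This toy function is a deliberately simplified sketch and not the Porter algorithm, which applies a much longer set of rules; the suffix list and the name `naive_stem` are assumptions made for this example.

```python
SUFFIXES = ("ing", "ed", "es", "s")  # checked in this order


def naive_stem(word):
    """Strip the first matching suffix, keeping at least three letters."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


for w in ["enjoying", "learning", "studies", "better"]:
    print(w, "->", naive_stem(w))
```

Here "studies" becomes "studi", which is not a real word. A stemmer only chops off endings; it does not check the result against a dictionary.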

                 Lemmatization is the process of reducing a word to its dictionary base form, called the lemma. It is generally
                 considered more accurate than stemming, because stemming simply strips the suffix without considering the
                 actual meaning of the word, whereas lemmatization returns a valid word.
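In NLTK, lemmatization is available through the `WordNetLemmatizer` class in `nltk.stem`, which needs the `wordnet` data package. The underlying idea, looking a word up together with its part of speech instead of cutting off its ending, can be sketched without NLTK. The tiny table `LEMMAS` below is hand-written for this illustration only; a real lemmatizer consults a full dictionary.

```python
# Hypothetical lookup table: (word, part of speech) -> lemma
LEMMAS = {
    ("enjoying", "verb"): "enjoy",
    ("studies", "noun"): "study",
    ("was", "verb"): "be",
    ("better", "adjective"): "good",
}


def toy_lemmatize(word, pos):
    """Return the lemma from the table, or the lowercased word if it is unknown."""
    return LEMMAS.get((word.lower(), pos), word.lower())


print(toy_lemmatize("studies", "noun"))
print(toy_lemmatize("better", "adjective"))
print(toy_lemmatize("Friends", "noun"))
```

The lookup maps "better" to "good", something no suffix rule could produce, while a word missing from the table is returned unchanged.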



