Steps to install NLTK on Windows
Step 1: Install Python
Run the installer with the box for adding Python to the PATH variable checked. Otherwise you will need to add Python to the PATH variable manually.
After Python is successfully installed, its entries (such as IDLE) appear under the Programs menu.
You can also run the "python --version" command from the command prompt to verify that Python is successfully installed.
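As a minimal sketch, the same check can also be made from inside Python itself, using only the standard library. The helper name python_version_string is made up for this illustration.

```python
import sys


def python_version_string():
    """Return the running interpreter's version as 'major.minor.micro'."""
    info = sys.version_info
    return f"{info.major}.{info.minor}.{info.micro}"


# Show which Python is answering and where it is installed.
print("Python version:", python_version_string())
print("Interpreter path:", sys.executable)
```

If the interpreter path is not the one you just installed, the PATH variable is probably pointing at a different Python installation.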
Step 2: Install NLTK
Step 2.1: Install numpy
Command: "pip install numpy"
NumPy is the fundamental package for scientific computing in Python. It is a Python library that provides a multidimensional array object, various derived objects (such as masked arrays and matrices), and an assortment of routines for fast operations on arrays, including mathematical, logical, shape manipulation, sorting, selecting, I/O, discrete Fourier transforms, basic linear algebra, basic statistical operations, random simulation and much more. [Ref: https://docs.scipy.org/doc/numpy-1.13.0/user/whatisnumpy.html]
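To make the description above concrete, here is a small sketch that builds a multidimensional array and runs a few of the operations mentioned. The variable name grid and the sample values are illustrative only. The import is guarded so that a failed installation produces a readable message instead of a traceback.

```python
try:
    import numpy as np
except ImportError:
    np = None  # NumPy is missing; rerun "pip install numpy"

if np is None:
    print("NumPy is not installed")
else:
    grid = np.array([[1, 2], [3, 4]])      # a 2 x 2 multidimensional array
    print("NumPy version:", np.__version__)
    print("shape:", grid.shape)            # dimensions of the array
    print("sum:", grid.sum())              # arithmetic over all elements
    print("transpose:", grid.T.tolist())   # shape manipulation
```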
Step 2.2: Install NLTK
Command: "pip install nltk"
NLTK is a leading platform for building Python programs to work with human language data. It provides easy-to-use interfaces to over 50 corpora and lexical resources such as WordNet, along with a suite of text processing libraries for classification, tokenization, stemming, tagging, parsing, and semantic reasoning, wrappers for industrial-strength NLP libraries, etc. [Ref: https://www.nltk.org/]
Now NLTK is successfully installed.
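One possible way to confirm that both packages are visible to Python is sketched below. It assumes Python 3.8 or later (for importlib.metadata), and the helper name package_status is hypothetical, not part of NLTK or pip.

```python
import importlib.util
from importlib import metadata


def package_status(name):
    """Return the installed version of `name`, None if it cannot be
    imported, or "unknown" if it imports but has no version metadata."""
    if importlib.util.find_spec(name) is None:
        return None
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return "unknown"


# Report the two packages installed in Step 2.
for pkg in ("numpy", "nltk"):
    version = package_status(pkg)
    print(pkg, "->", version if version else "NOT installed")
```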
Step 2.3: Download NLTK data
2.3.1. Open a Python interpreter, for example by typing "python" at the command prompt.
Alternative: You can also use the Python shell in IDLE for this purpose.
2.3.2. Import nltk by typing - [Command: "import nltk"]
If the command runs without showing any errors, NLTK is installed correctly.
2.3.3. Download and install nltk data - [Command: "nltk.download()"]
This opens the NLTK Downloader popup window.
In this window, click the Download button and download all the data.
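If the popup is not convenient (for example on a machine without a display), the data can also be fetched from code. The following is a hedged sketch rather than the only way to do it: it downloads just the tokenizer models used in Step 3 instead of all the data. The helper name download_resources is hypothetical; "punkt" and "punkt_tab" are NLTK resource identifiers, and which of the two word_tokenize needs depends on the installed NLTK version.

```python
try:
    import nltk
except ImportError:
    nltk = None  # NLTK is missing; run "pip install nltk" first


def download_resources(names=("punkt", "punkt_tab")):
    """Download each named NLTK resource and return {name: success_flag}."""
    if nltk is None:
        raise RuntimeError("NLTK is not installed; run 'pip install nltk' first")
    # nltk.download returns True on success; quiet=True suppresses progress output.
    return {name: bool(nltk.download(name, quiet=True)) for name in names}
```

Calling download_resources() needs an internet connection. A False value in the returned dictionary means that the resource could not be downloaded.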
Step 3: Check whether NLTK data is installed successfully
Open Python IDLE and run the following commands.
>>> from nltk.tokenize import word_tokenize
>>> sentence = "Hello Mr. Atheesan, how are you?"
>>> print(word_tokenize(sentence))
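The same check can be wrapped in a small script that also reports why it failed. This is only a sketch: the function name check_tokenizer and the status strings are invented for illustration. It relies on word_tokenize raising LookupError when the tokenizer data has not been downloaded.

```python
def check_tokenizer(sentence):
    """Try to tokenize `sentence`; return (status, tokens)."""
    try:
        from nltk.tokenize import word_tokenize
    except ImportError:
        return "nltk-missing", []   # Step 2.2 did not succeed
    try:
        return "ok", word_tokenize(sentence)
    except LookupError:
        return "data-missing", []   # Step 2.3 did not succeed


status, tokens = check_tokenizer("Hello Mr. Atheesan, how are you?")
print(status, tokens)
```

A status of "ok" followed by a list of tokens means that both NLTK and its data are installed.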