FinUniversity Electronic Library

Details

Ghosh, Sohom. Natural Language Processing Fundamentals [electronic resource]: Build Intelligent Applications That Can Interpret the Human Language to Deliver Impactful Results. Birmingham: Packt Publishing Ltd, 2019. 1 online resource (375 p.). <URL:http://elib.fa.ru/ebsco/2094770.pdf>.

Record create date: 4/13/2019

Subject: Natural language processing (Computer science); Computational linguistics; Artificial intelligence; COMPUTERS / General

Collections: EBSCO

Annotation

Natural Language Processing Fundamentals starts with the basics and goes on to explain the NLP tools and techniques you need to solve common business problems that involve processing text.

Document access rights

Network                       User group   Action
Finuniversity Local Network   All          Read, Print, Download
Internet                      Readers      Read, Print
Internet                      Anonymous

Table of Contents

  • Cover
  • FM
  • Copyright
  • Table of Contents
  • Preface
  • Chapter 1: Introduction to Natural Language Processing
    • Introduction
    • History of NLP
    • Text Analytics and NLP
      • Exercise 1: Basic Text Analytics
    • Various Steps in NLP
      • Tokenization
      • Exercise 2: Tokenization of a Simple Sentence
      • PoS Tagging
      • Exercise 3: PoS Tagging
      • Stop Word Removal
      • Exercise 4: Stop Word Removal
      • Text Normalization
      • Exercise 5: Text Normalization
      • Spelling Correction
      • Exercise 6: Spelling Correction of a Word and a Sentence
      • Stemming
      • Exercise 7: Stemming
      • Lemmatization
      • Exercise 8: Extracting the base word using Lemmatization
      • NER
      • Exercise 9: Treating Named Entities
      • Word Sense Disambiguation
      • Exercise 10: Word Sense Disambiguation
      • Sentence Boundary Detection
      • Exercise 11: Sentence Boundary Detection
      • Activity 1: Preprocessing of Raw Text
    • Kick Starting an NLP Project
      • Data Collection
      • Data Preprocessing
      • Feature Extraction
      • Model Development
      • Model Assessment
      • Model Deployment
    • Summary
  • Chapter 2: Basic Feature Extraction Methods
    • Introduction
    • Types of Data
      • Categorizing Data Based on Structure
      • Categorization of Data Based on Content
    • Cleaning Text Data
      • Tokenization
      • Exercise 12: Text Cleaning and Tokenization
      • Exercise 13: Extracting n-grams
      • Exercise 14: Tokenizing Texts with Different Packages – Keras and TextBlob
      • Types of Tokenizers
      • Exercise 15: Tokenizing Text Using Various Tokenizers
      • Issues with Tokenization
      • Stemming
      • RegexpStemmer
      • Exercise 16: Converting words in gerund form into base words using RegexpStemmer
      • The Porter Stemmer
      • Exercise 17: The Porter Stemmer
      • Lemmatization
      • Exercise 18: Lemmatization
      • Exercise 19: Singularizing and Pluralizing Words
      • Language Translation
      • Exercise 20: Language Translation
      • Stop-Word Removal
      • Exercise 21: Stop-Word Removal
    • Feature Extraction from Texts
      • Extracting General Features from Raw Text
      • Exercise 22: Extracting General Features from Raw Text
      • Activity 2: Extracting General Features from Text
      • Bag of Words
      • Exercise 23: Creating a BoW
      • Zipf's Law
      • Exercise 24: Zipf's Law
      • TF-IDF
      • Exercise 25: TF-IDF Representation
      • Activity 3: Extracting Specific Features from Texts
    • Feature Engineering
      • Exercise 26: Feature Engineering (Text Similarity)
      • Word Clouds
      • Exercise 27: Word Clouds
      • Other Visualizations
      • Exercise 28: Other Visualizations (Dependency Parse Trees and Named Entities)
      • Activity 4: Text Visualization
    • Summary
  • Chapter 3: Developing a Text Classifier
    • Introduction
    • Machine Learning
      • Unsupervised Learning
      • Hierarchical Clustering
      • Exercise 29: Hierarchical Clustering
      • K-Means Clustering
      • Exercise 30: K-Means Clustering
      • Supervised Learning
      • Classification
      • Logistic Regression
      • Naive Bayes Classifiers
      • K-Nearest Neighbors
      • Exercise 31: Text Classification (Logistic regression, Naive Bayes, and KNN)
      • Regression
      • Linear Regression
      • Exercise 32: Regression Analysis Using Textual Data
      • Tree Methods
      • Random Forest
      • GBM and XGBoost
      • Exercise 33: Tree-Based Methods (Decision Tree, Random Forest, GBM, and XGBoost)
      • Sampling
      • Exercise 34: Sampling (Simple Random, Stratified, Multi-Stage)
    • Developing a Text Classifier
      • Feature Extraction
      • Feature Engineering
      • Removing Correlated Features
      • Exercise 35: Removing Highly Correlated Features (Tokens)
      • Dimensionality Reduction
      • Exercise 36: Dimensionality Reduction (PCA)
      • Deciding on a Model Type
      • Evaluating the Performance of a Model
      • Exercise 37: Calculate the RMSE and MAPE
      • Activity 5: Developing End-to-End Text Classifiers
    • Building Pipelines for NLP Projects
      • Exercise 38: Building Pipelines for NLP Projects
    • Saving and Loading Models
      • Exercise 39: Saving and Loading Models
    • Summary
  • Chapter 4: Collecting Text Data from the Web
    • Introduction
    • Collecting Data by Scraping Web Pages
      • Exercise 40: Extraction of Tag-Based Information from HTML Files
    • Requesting Content from Web Pages
      • Exercise 41: Collecting Online Text Data
      • Exercise 42: Analyzing the Content of Jupyter Notebooks (in HTML Format)
      • Activity 6: Extracting Information from an Online HTML Page
      • Activity 7: Extracting and Analyzing Data Using Regular Expressions
    • Dealing with Semi-Structured Data
      • JSON
      • Exercise 43: Dealing with JSON Files
      • Activity 8: Dealing with Online JSON Files
      • XML
      • Exercise 44: Dealing with a Local XML File
      • Using APIs to Retrieve Real-Time Data
      • Exercise 45: Collecting Data Using APIs
      • API Creation
      • Activity 9: Extracting Data from Twitter
      • Extracting Data from Local Files
      • Exercise 46: Extracting Data from Local Files
      • Exercise 47: Performing Various Operations on Local Files
    • Summary
  • Chapter 5: Topic Modeling
    • Introduction
    • Topic Discovery
      • Discovering Themes
      • Exploratory Data Analysis
      • Document Clustering
      • Dimensionality Reduction
      • Historical Analysis
      • Bag of Words
    • Topic Modeling Algorithms
      • Latent Semantic Analysis
      • LSA – How It Works
      • Exercise 48: Analyzing Reuters News Articles with Latent Semantic Analysis
      • Latent Dirichlet Allocation
      • LDA – How It Works
      • Exercise 49: Topics in Airline Tweets
      • Topic Fingerprinting
      • Exercise 50: Visualizing Documents Using Topic Vectors
      • Activity 10: Topic Modeling Jeopardy Questions
    • Summary
  • Chapter 6: Text Summarization and Text Generation
    • Introduction
    • What is Automated Text Summarization?
      • Benefits of Automated Text Summarization
    • High-Level View of Text Summarization
      • Purpose
      • Input
      • Output
      • Extractive Text Summarization
      • Abstractive Text Summarization
      • Sequence to Sequence
      • Encoder Decoder
    • TextRank
      • Exercise 51: TextRank from Scratch
    • Summarizing Text Using Gensim
      • Activity 11: Summarizing a Downloaded Page Using the Gensim Text Summarizer
    • Summarizing Text Using Word Frequency
      • Exercise 52: Word Frequency Text Summarization
    • Generating Text with Markov Chains
      • Markov Chains
      • Exercise 53: Generating Text Using Markov Chains
    • Summary
  • Chapter 7: Vector Representation
    • Introduction
    • Vector Definition
    • Why Vector Representations?
      • Encoding
      • Character-Level Encoding
      • Exercise 54: Character Encoding Using ASCII Values
      • Exercise 55: Character Encoding with the Help of NumPy Arrays
      • Positional Character-Level Encoding
      • Exercise 56: Character-Level Encoding Using Positions
      • One-Hot Encoding
      • Key Steps in One-Hot Encoding
      • Exercise 57: Character One-Hot Encoding – Manual
      • Exercise 58: Character-Level One-Hot Encoding with Keras
      • Word-Level One-Hot Encoding
      • Exercise 59: Word-Level One-Hot Encoding
      • Word Embeddings
      • Word2Vec
      • Exercise 60: Training Word Vectors
      • Using Pre-Trained Word Vectors
      • Exercise 61: Loading Pre-Trained Word Vectors
      • Document Vectors
      • Uses of Document Vectors
      • Exercise 62: From Movie Dialogue to Document Vectors
      • Activity 12: Finding Similar Movie Lines Using Document Vectors
    • Summary
  • Chapter 8: Sentiment Analysis
    • Introduction
    • Why is Sentiment Analysis Required?
    • Growth of Sentiment Analysis
      • Monetization of Emotion
      • Types of Sentiments
      • Key Ideas and Terms
      • Applications of Sentiment Analysis
    • Tools Used for Sentiment Analysis
      • NLP Services from Major Cloud Providers
      • Online Marketplaces
      • Python NLP Libraries
      • Deep Learning Libraries
    • TextBlob
      • Exercise 63: Basic Sentiment Analysis Using the TextBlob Library
      • Activity 13: Tweet Sentiment Analysis Using the TextBlob library
    • Understanding Data for Sentiment Analysis
      • Exercise 64: Loading Data for Sentiment Analysis
    • Training Sentiment Models
      • Exercise 65: Training a Sentiment Model Using TFIDF and Logistic Regression
    • Summary
  • Appendix
  • Index