FinUniversity Electronic Library

Details

Kedia, Aman. Hands-on Python natural language processing: explore tools and techniques to analyze and process text with a view to building real-world NLP applications / Aman Kedia, Mayank Rasu. — 1 online resource (1 volume) : illustrations — <URL:http://elib.fa.ru/ebsco/2512691.pdf>.

Record creation date: 10/27/2020

Subject: Natural language processing (Computer science); Python (Computer program language); Mathematical theory of computation; Natural language & machine translation; Machine learning; Data capture & analysis; Computers — Machine Theory; Computers — Natural Language Processing; Computers — Data Processing

Collections: EBSCO

Allowed Actions:

The 'Read' and 'Download' actions will become available if you log in or access the site from another network.

Group: Anonymous

Network: Internet

Document access rights

Network                       User group   Action
Finuniversity Local Network   All          Read, Print, Download
Internet                      Readers      Read, Print
-> Internet                   Anonymous    (none)

Table of Contents

  • Cover
  • Title Page
  • Copyright and Credits
  • About Packt
  • Contributors
  • Table of Contents
  • Preface
  • Section 1: Introduction
  • Chapter 1: Understanding the Basics of NLP
    • Programming languages versus natural languages
      • Understanding NLP
    • Why should I learn NLP?
    • Current applications of NLP
      • Chatbots
      • Sentiment analysis
      • Machine translation
      • Named-entity recognition
      • Future applications of NLP
    • Summary
  • Chapter 2: NLP Using Python
    • Technical requirements
    • Understanding Python with NLP 
      • Python's utility in NLP
    • Important Python libraries
      • NLTK
        • NLTK corpora
          • Text processing
          • Part of speech tagging
      • TextBlob
        • Sentiment analysis
        • Machine translation
        • Part of speech tagging
      • VADER
    • Web scraping libraries and methodology
    • Overview of Jupyter Notebook
    • Summary
  • Section 2: Natural Language Representation and Mathematics
  • Chapter 3: Building Your NLP Vocabulary
    • Technical requirements
    • Lexicons
    • Phonemes, graphemes, and morphemes
    • Tokenization 
      • Issues with tokenization
      • Different types of tokenizers
        • Regular expressions 
        • Regular expressions-based tokenizers
        • Treebank tokenizer
        • TweetTokenizer 
    • Understanding word normalization
      • Stemming
        • Over-stemming and under-stemming
      • Lemmatization 
        • WordNet lemmatizer
        • spaCy lemmatizer
      • Stopword removal
      • Case folding
      • N-grams
      • Taking care of HTML tags
      • How does all this fit into my NLP pipeline?
    • Summary
  • Chapter 4: Transforming Text into Data Structures
    • Technical requirements
    • Understanding vectors and matrices
      • Vectors
      • Matrices
    • Exploring the Bag-of-Words architecture
      • Understanding a basic CountVectorizer
      • Out-of-the-box features offered by CountVectorizer
        • Prebuilt dictionary and support for n-grams
        • max_features
        • min_df and max_df thresholds
      • Limitations of the BoW representation
    • TF-IDF vectors
      • Building a basic TF-IDF vectorizer
      • N-grams and maximum features in the TF-IDF vectorizer 
      • Limitations of the TF-IDF vectorizer's representation
    • Distance/similarity calculation between document vectors
      • Cosine similarity
        • Solving Cosine math
        • Cosine similarity on vectors developed using CountVectorizer
        • Cosine similarity on vectors developed using TfidfVectorizer
    • One-hot vectorization
    • Building a basic chatbot
    • Summary 
  • Chapter 5: Word Embeddings and Distance Measurements for Text
    • Technical requirements
    • Understanding word embeddings
    • Demystifying Word2vec
      • Supervised and unsupervised learning
      • Word2vec – supervised or unsupervised?
      • Pretrained Word2vec 
      • Exploring the pretrained Word2vec model using gensim
      • The Word2vec architecture
        • The Skip-gram method
          • How do you define target and context words?
        • Exploring the components of a Skip-gram model
          • Input vector
          • Embedding matrix
          • Context matrix
          • Output vector
          • Softmax
          • Loss calculation and backpropagation
          • Inference
        • The CBOW method
        • Computational limitations of the methods discussed and how to overcome them
          • Subsampling
          • Negative sampling
          • How to select negative samples
    • Training a Word2vec model 
      • Building a basic Word2vec model
      • Modifying the min_count parameter 
      • Playing with the vector size
      • Other important configurable parameters
      • Limitations of Word2vec
      • Applications of the Word2vec model 
    • Word mover’s distance
    • Summary
  • Chapter 6: Exploring Sentence-, Document-, and Character-Level Embeddings
    • Technical requirements
    • Venturing into Doc2Vec
      • Building a Doc2Vec model
        • Changing vector size and min_count 
        • The dm parameter for switching between modeling approaches
        • The dm_concat parameter
        • The dm_mean parameter
        • Window size
        • Learning rate
    • Exploring fastText 
      • Building a fastText model
      • Building a spelling corrector/word suggestion module using fastText
      • fastText and document distances
    • Understanding Sent2Vec and the Universal Sentence Encoder
      • Sent2Vec
      • The Universal Sentence Encoder
    • Summary 
  • Section 3: NLP and Learning
  • Chapter 7: Identifying Patterns in Text Using Machine Learning
    • Technical requirements
    • Introduction to ML
    • Data preprocessing
      • NaN values
      • Label encoding and one-hot encoding
      • Data standardization
        • Min-max standardization
        • Z-score standardization
    • The Naive Bayes algorithm
      • Building a sentiment analyzer using the Naive Bayes algorithm
    • The SVM algorithm
      • Building a sentiment analyzer using SVM
    • Productionizing a trained sentiment analyzer
    • Summary 
  • Chapter 8: From Human Neurons to Artificial Neurons for Understanding Text
    • Technical requirements
    • Exploring the biology behind neural networks
      • Neurons
      • Activation functions
        • Sigmoid
        • Tanh activation
        • Rectified linear unit
      • Layers in an ANN
    • How does a neural network learn?
      • How does the network get better at making predictions?
    • Understanding regularization
      • Dropout
    • Let's talk Keras
    • Building a question classifier using neural networks
    • Summary
  • Chapter 9: Applying Convolutions to Text
    • Technical requirements
    • What is a CNN?
      • Understanding convolutions
        • Let's pad our data
        • Understanding strides in a CNN
      • What is pooling?
      • The fully connected layer
    • Detecting sarcasm in text using CNNs
      • Loading the libraries and the dataset
      • Performing basic data analysis and preprocessing our data
      • Loading the Word2Vec model and vectorizing our data
      • Splitting our dataset into train and test sets
      • Building the model
      • Evaluating and saving our model
    • Summary
  • Chapter 10: Capturing Temporal Relationships in Text
    • Technical requirements
    • Baby steps toward understanding RNNs
      • Forward propagation in an RNN
      • Backpropagation through time in an RNN
    • Vanishing and exploding gradients
    • Architectural forms of RNNs
      • Different flavors of RNN
      • Carrying relationships both ways using bidirectional RNNs
      • Going deep with RNNs
    • Giving memory to our networks – LSTMs
      • Understanding an LSTM cell
        • Forget gate
        • Input gate
        • Output gate
      • Backpropagation through time in LSTMs
    • Building a text generator using LSTMs
    • Exploring memory-based variants of the RNN architecture
      • GRUs
      • Stacked LSTMs
    • Summary
  • Chapter 11: State of the Art in NLP
    • Technical requirements
    • Seq2Seq modeling
      • Encoders
      • Decoders
        • The training phase
        • The inference phase
    • Translating between languages using Seq2Seq modeling 
    • Let's pay some attention
    • Transformers 
      • Understanding the architecture of Transformers
        • Encoders 
        • Decoders
        • Self-attention
          • How does self-attention work mathematically?
          • A small note on masked self-attention
        • Feedforward neural networks
        • Residuals and layer normalization
        • Positional embeddings
        • How the decoder works
        • The linear layer and the softmax function
        • Transformer model summary
    • BERT 
      • The BERT architecture
      • The BERT model input and output
      • How did the BERT pre-training happen?
        • The masked language model
        • Next-sentence prediction 
      • BERT fine-tuning
    • Summary
  • Other Books You May Enjoy
  • Index

Usage statistics

Access count: 0
Last 30 days: 0