Kedia, Aman. Hands-on Python natural language processing: explore tools and techniques to analyze and process text with a view to building real-world NLP applications / Aman Kedia, Mayank Rasu. — 1 online resource (1 volume) : illustrations. — <URL:http://elib.fa.ru/ebsco/2512691.pdf>
Record creation date: 10/27/2020
Subjects: Natural language processing (Computer science); Python (Computer program language); Mathematical theory of computation; Natural language & machine translation; Machine learning; Data capture & analysis; Computers — Machine Theory; Computers — Natural Language Processing; Computers — Data Processing
Collections: EBSCO
Table of Contents
- Cover
- Title Page
- Copyright and Credits
- About Packt
- Contributors
- Table of Contents
- Preface
- Section 1: Introduction
- Chapter 1: Understanding the Basics of NLP
- Programming languages versus natural languages
- Understanding NLP
- Why should I learn NLP?
- Current applications of NLP
- Chatbots
- Sentiment analysis
- Machine translation
- Named-entity recognition
- Future applications of NLP
- Summary
- Chapter 2: NLP Using Python
- Technical requirements
- Understanding Python with NLP
- Python's utility in NLP
- Important Python libraries
- NLTK
- NLTK corpora
- Text processing
- Part of speech tagging
- TextBlob
- Sentiment analysis
- Machine translation
- Part of speech tagging
- VADER
- Web scraping libraries and methodology
- Overview of Jupyter Notebook
- Summary
- Section 2: Natural Language Representation and Mathematics
- Chapter 3: Building Your NLP Vocabulary
- Technical requirements
- Lexicons
- Phonemes, graphemes, and morphemes
- Tokenization
- Issues with tokenization
- Different types of tokenizers
- Regular expressions
- Regular expressions-based tokenizers
- Treebank tokenizer
- TweetTokenizer
- Understanding word normalization
- Stemming
- Over-stemming and under-stemming
- Lemmatization
- WordNet lemmatizer
- spaCy lemmatizer
- Stopword removal
- Case folding
- N-grams
- Taking care of HTML tags
- How does all this fit into my NLP pipeline?
- Summary
- Chapter 4: Transforming Text into Data Structures
- Technical requirements
- Understanding vectors and matrices
- Vectors
- Matrices
- Exploring the Bag-of-Words architecture
- Understanding a basic CountVectorizer
- Out-of-the-box features offered by CountVectorizer
- Prebuilt dictionary and support for n-grams
- max_features
- min_df and max_df thresholds
- Limitations of the BoW representation
- TF-IDF vectors
- Building a basic TF-IDF vectorizer
- N-grams and maximum features in the TF-IDF vectorizer
- Limitations of the TF-IDF vectorizer's representation
- Distance/similarity calculation between document vectors
- Cosine similarity
- Solving Cosine math
- Cosine similarity on vectors developed using CountVectorizer
- Cosine similarity on vectors developed using TfidfVectorizer
- One-hot vectorization
- Building a basic chatbot
- Summary
- Chapter 5: Word Embeddings and Distance Measurements for Text
- Technical requirements
- Understanding word embeddings
- Demystifying Word2vec
- Supervised and unsupervised learning
- Word2vec – supervised or unsupervised?
- Pretrained Word2vec
- Exploring the pretrained Word2vec model using gensim
- The Word2vec architecture
- The Skip-gram method
- How do you define target and context words?
- Exploring the components of a Skip-gram model
- Input vector
- Embedding matrix
- Context matrix
- Output vector
- Softmax
- Loss calculation and backpropagation
- Inference
- The CBOW method
- Computational limitations of the methods discussed and how to overcome them
- Subsampling
- Negative sampling
- How to select negative samples
- Training a Word2vec model
- Building a basic Word2vec model
- Modifying the min_count parameter
- Playing with the vector size
- Other important configurable parameters
- Limitations of Word2vec
- Applications of the Word2vec model
- Word mover’s distance
- Summary
- Chapter 6: Exploring Sentence-, Document-, and Character-Level Embeddings
- Technical requirements
- Venturing into Doc2Vec
- Building a Doc2Vec model
- Changing vector size and min_count
- The dm parameter for switching between modeling approaches
- The dm_concat parameter
- The dm_mean parameter
- Window size
- Learning rate
- Exploring fastText
- Building a fastText model
- Building a spelling corrector/word suggestion module using fastText
- fastText and document distances
- Understanding Sent2Vec and the Universal Sentence Encoder
- Sent2Vec
- The Universal Sentence Encoder
- Summary
- Section 3: NLP and Learning
- Chapter 7: Identifying Patterns in Text Using Machine Learning
- Technical requirements
- Introduction to ML
- Data preprocessing
- NaN values
- Label encoding and one-hot encoding
- Data standardization
- Min-max standardization
- Z-score standardization
- The Naive Bayes algorithm
- Building a sentiment analyzer using the Naive Bayes algorithm
- The SVM algorithm
- Building a sentiment analyzer using SVM
- Productionizing a trained sentiment analyzer
- Summary
- Chapter 8: From Human Neurons to Artificial Neurons for Understanding Text
- Technical requirements
- Exploring the biology behind neural networks
- Neurons
- Activation functions
- Sigmoid
- Tanh activation
- Rectified linear unit
- Layers in an ANN
- How does a neural network learn?
- How does the network get better at making predictions?
- Understanding regularization
- Dropout
- Let's talk Keras
- Building a question classifier using neural networks
- Summary
- Chapter 9: Applying Convolutions to Text
- Technical requirements
- What is a CNN?
- Understanding convolutions
- Let's pad our data
- Understanding strides in a CNN
- What is pooling?
- The fully connected layer
- Detecting sarcasm in text using CNNs
- Loading the libraries and the dataset
- Performing basic data analysis and preprocessing our data
- Loading the Word2vec model and vectorizing our data
- Splitting our dataset into train and test sets
- Building the model
- Evaluating and saving our model
- Summary
- Chapter 10: Capturing Temporal Relationships in Text
- Technical requirements
- Baby steps toward understanding RNNs
- Forward propagation in an RNN
- Backpropagation through time in an RNN
- Vanishing and exploding gradients
- Architectural forms of RNNs
- Different flavors of RNN
- Carrying relationships both ways using bidirectional RNNs
- Going deep with RNNs
- Giving memory to our networks – LSTMs
- Understanding an LSTM cell
- Forget gate
- Input gate
- Output gate
- Backpropagation through time in LSTMs
- Building a text generator using LSTMs
- Exploring memory-based variants of the RNN architecture
- GRUs
- Stacked LSTMs
- Summary
- Chapter 11: State of the Art in NLP
- Technical requirements
- Seq2Seq modeling
- Encoders
- Decoders
- The training phase
- The inference phase
- Translating between languages using Seq2Seq modeling
- Let's pay some attention
- Transformers
- Understanding the architecture of Transformers
- Encoders
- Decoders
- Self-attention
- How does self-attention work mathematically?
- A small note on masked self-attention
- Feedforward neural networks
- Residuals and layer normalization
- Positional embeddings
- How the decoder works
- The linear layer and the softmax function
- Transformer model summary
- BERT
- The BERT architecture
- The BERT model input and output
- How did the BERT pre-training happen?
- The masked language model
- Next-sentence prediction
- BERT fine-tuning
- Summary
- Other Books You May Enjoy
- Index