Financial University Electronic Library

Detailed Information

Rothman, Denis. Transformers for natural language processing: build, train, and fine-tune deep neural network architectures for NLP with Python, PyTorch, TensorFlow, BERT, and GPT-3 / Denis Rothman ; foreword by Antonio Gulli. — Second edition. — 1 online resource. — Includes index. — <URL:http://elib.fa.ru/ebsco/3197830.pdf>.

Record creation date: 29.06.2022

Subjects: Artificial intelligence — Data processing; Python (Computer program language); Cloud computing.

Collections: EBSCO


Usage rights for the stored object

Access location                      User group        Actions
Financial University local network   All               Read, Print, Download
Internet                             Readers           Read, Print
Internet                             Anonymous users   (none)

Table of Contents

  • Copyright
  • Foreword
  • Contributors
  • Table of Contents
  • Preface
  • Chapter 1: What are Transformers?
    • The ecosystem of transformers
      • Industry 4.0
      • Foundation models
        • Is programming becoming a sub-domain of NLP?
        • The future of artificial intelligence specialists
    • Optimizing NLP models with transformers
      • The background of transformers
    • What resources should we use?
      • The rise of Transformer 4.0 seamless APIs
      • Choosing ready-to-use API-driven libraries
      • Choosing a Transformer Model
      • The role of Industry 4.0 artificial intelligence specialists
    • Summary
    • Questions
    • References
  • Chapter 2: Getting Started with the Architecture of the Transformer Model
    • The rise of the Transformer: Attention is All You Need
      • The encoder stack
        • Input embedding
        • Positional encoding
        • Sublayer 1: Multi-head attention
        • Sublayer 2: Feedforward network
      • The decoder stack
        • Output embedding and position encoding
        • The attention layers
        • The FFN sublayer, the post-LN, and the linear layer
    • Training and performance
    • Transformer models in Hugging Face
    • Summary
    • Questions
    • References
  • Chapter 3: Fine-Tuning BERT Models
    • The architecture of BERT
      • The encoder stack
        • Preparing the pretraining input environment
        • Pretraining and fine-tuning a BERT model
    • Fine-tuning BERT
      • Hardware constraints
      • Installing the Hugging Face PyTorch interface for BERT
      • Importing the modules
      • Specifying CUDA as the device for torch
      • Loading the dataset
      • Creating sentences, label lists, and adding BERT tokens
      • Activating the BERT tokenizer
      • Processing the data
      • Creating attention masks
      • Splitting the data into training and validation sets
      • Converting all the data into torch tensors
      • Selecting a batch size and creating an iterator
      • BERT model configuration
      • Loading the Hugging Face BERT uncased base model
      • Optimizer grouped parameters
      • The hyperparameters for the training loop
      • The training loop
      • Training evaluation
      • Predicting and evaluating using the holdout dataset
      • Evaluating using the Matthews Correlation Coefficient
      • The scores of individual batches
      • Matthews evaluation for the whole dataset
    • Summary
    • Questions
    • References
  • Chapter 4: Pretraining a RoBERTa Model from Scratch
    • Training a tokenizer and pretraining a transformer
    • Building KantaiBERT from scratch
      • Step 1: Loading the dataset
      • Step 2: Installing Hugging Face transformers
      • Step 3: Training a tokenizer
      • Step 4: Saving the files to disk
      • Step 5: Loading the trained tokenizer files
      • Step 6: Checking resource constraints: GPU and CUDA
      • Step 7: Defining the configuration of the model
      • Step 8: Reloading the tokenizer in transformers
      • Step 9: Initializing a model from scratch
        • Exploring the parameters
      • Step 10: Building the dataset
      • Step 11: Defining a data collator
      • Step 12: Initializing the trainer
      • Step 13: Pretraining the model
      • Step 14: Saving the final model (+tokenizer + config) to disk
      • Step 15: Language modeling with FillMaskPipeline
    • Next steps
    • Summary
    • Questions
    • References
  • Chapter 5: Downstream NLP Tasks with Transformers
    • Transduction and the inductive inheritance of transformers
      • The human intelligence stack
      • The machine intelligence stack
    • Transformer performances versus Human Baselines
      • Evaluating models with metrics
        • Accuracy score
        • F1-score
        • Matthews Correlation Coefficient (MCC)
      • Benchmark tasks and datasets
        • From GLUE to SuperGLUE
        • Introducing higher Human Baselines standards
        • The SuperGLUE evaluation process
      • Defining the SuperGLUE benchmark tasks
        • BoolQ
        • Commitment Bank (CB)
        • Multi-Sentence Reading Comprehension (MultiRC)
        • Reading Comprehension with Commonsense Reasoning Dataset (ReCoRD)
        • Recognizing Textual Entailment (RTE)
        • Words in Context (WiC)
        • The Winograd schema challenge (WSC)
    • Running downstream tasks
      • The Corpus of Linguistic Acceptability (CoLA)
      • Stanford Sentiment TreeBank (SST-2)
      • Microsoft Research Paraphrase Corpus (MRPC)
      • Winograd schemas
    • Summary
    • Questions
    • References
  • Chapter 6: Machine Translation with the Transformer
    • Defining machine translation
      • Human transductions and translations
      • Machine transductions and translations
    • Preprocessing a WMT dataset
      • Preprocessing the raw data
      • Finalizing the preprocessing of the datasets
    • Evaluating machine translation with BLEU
      • Geometric evaluations
      • Applying a smoothing technique
        • Chencherry smoothing
    • Translation with Google Translate
    • Translations with Trax
      • Installing Trax
      • Creating the original Transformer model
      • Initializing the model using pretrained weights
      • Tokenizing a sentence
      • Decoding from the Transformer
      • De-tokenizing and displaying the translation
    • Summary
    • Questions
    • References
  • Chapter 7: The Rise of Suprahuman Transformers with GPT-3 Engines
    • Suprahuman NLP with GPT-3 transformer models
    • The architecture of OpenAI GPT transformer models
      • The rise of billion-parameter transformer models
      • The increasing size of transformer models
        • Context size and maximum path length
      • From fine-tuning to zero-shot models
      • Stacking decoder layers
      • GPT-3 engines
    • Generic text completion with GPT-2
      • Step 9: Interacting with GPT-2
    • Training a custom GPT-2 language model
      • Step 12: Interactive context and completion examples
    • Running OpenAI GPT-3 tasks
      • Running NLP tasks online
      • Getting started with GPT-3 engines
        • Running our first NLP task with GPT-3
        • NLP tasks and examples
    • Comparing the output of GPT-2 and GPT-3
    • Fine-tuning GPT-3
      • Preparing the data
        • Step 1: Installing OpenAI
        • Step 2: Entering the API key
        • Step 3: Activating OpenAI’s data preparation module
      • Fine-tuning GPT-3
        • Step 4: Creating an OS environment
        • Step 5: Fine-tuning OpenAI’s Ada engine
        • Step 6: Interacting with the fine-tuned model
    • The role of an Industry 4.0 AI specialist
      • Initial conclusions
    • Summary
    • Questions
    • References
  • Chapter 8: Applying Transformers to Legal and Financial Documents for AI Text Summarization
    • Designing a universal text-to-text model
      • The rise of text-to-text transformer models
      • A prefix instead of task-specific formats
      • The T5 model
    • Text summarization with T5
      • Hugging Face
        • Hugging Face transformer resources
      • Initializing the T5-large transformer model
        • Getting started with T5
        • Exploring the architecture of the T5 model
      • Summarizing documents with T5-large
        • Creating a summarization function
        • A general topic sample
        • The Bill of Rights sample
        • A corporate law sample
    • Summarization with GPT-3
    • Summary
    • Questions
    • References
  • Chapter 9: Matching Tokenizers and Datasets
    • Matching datasets and tokenizers
      • Best practices
        • Step 1: Preprocessing
        • Step 2: Quality control
        • Continuous human quality control
      • Word2Vec tokenization
        • Case 0: Words in the dataset and the dictionary
        • Case 1: Words not in the dataset or the dictionary
        • Case 2: Noisy relationships
        • Case 3: Words in the text but not in the dictionary
        • Case 4: Rare words
        • Case 5: Replacing rare words
        • Case 6: Entailment
    • Standard NLP tasks with specific vocabulary
      • Generating unconditional samples with GPT-2
      • Generating trained conditional samples
      • Controlling tokenized data
    • Exploring the scope of GPT-3
    • Summary
    • Questions
    • References
  • Chapter 10: Semantic Role Labeling with BERT-Based Transformers
    • Getting started with SRL
      • Defining semantic role labeling
        • Visualizing SRL
      • Running a pretrained BERT-based model
        • The architecture of the BERT-based model
        • Setting up the BERT SRL environment
    • SRL experiments with the BERT-based model
    • Basic samples
      • Sample 1
      • Sample 2
      • Sample 3
    • Difficult samples
      • Sample 4
      • Sample 5
      • Sample 6
    • Questioning the scope of SRL
      • The limit of predicate analysis
      • Redefining SRL
    • Summary
    • Questions
    • References
  • Chapter 11: Let Your Data Do the Talking: Story, Questions, and Answers
    • Methodology
      • Transformers and methods
    • Method 0: Trial and error
    • Method 1: NER first
      • Using NER to find questions
        • Location entity questions
        • Person entity questions
    • Method 2: SRL first
      • Question-answering with ELECTRA
      • Project management constraints
      • Using SRL to find questions
    • Next steps
      • Exploring Haystack with a RoBERTa model
      • Exploring Q&A with a GPT-3 engine
    • Summary
    • Questions
    • References
  • Chapter 12: Detecting Customer Emotions to Make Predictions
    • Getting started: Sentiment analysis transformers
    • The Stanford Sentiment Treebank (SST)
      • Sentiment analysis with RoBERTa-large
    • Predicting customer behavior with sentiment analysis
      • Sentiment analysis with DistilBERT
      • Sentiment analysis with Hugging Face’s models’ list
        • DistilBERT for SST
        • MiniLM-L12-H384-uncased
        • RoBERTa-large-mnli
        • BERT-base multilingual model
    • Sentiment analysis with GPT-3
    • Some Pragmatic I4.0 thinking before we leave
      • Investigating with SRL
      • Investigating with Hugging Face
      • Investigating with the GPT-3 playground
        • GPT-3 code
    • Summary
    • Questions
    • References
  • Chapter 13: Analyzing Fake News with Transformers
    • Emotional reactions to fake news
      • Cognitive dissonance triggers emotional reactions
        • Analyzing a conflictual Tweet
        • Behavioral representation of fake news
    • A rational approach to fake news
      • Defining a fake news resolution roadmap
      • The gun control debate
        • Sentiment analysis
        • Named entity recognition (NER)
        • Semantic Role Labeling (SRL)
        • Gun control SRL
        • Reference sites
      • COVID-19 and former President Trump’s Tweets
        • Semantic Role Labeling (SRL)
    • Before we go
    • Summary
    • Questions
    • References
  • Chapter 14: Interpreting Black Box Transformer Models
    • Transformer visualization with BertViz
      • Running BertViz
        • Step 1: Installing BertViz and importing the modules
        • Step 2: Load the models and retrieve attention
        • Step 3: Head view
        • Step 4: Processing and displaying attention heads
        • Step 5: Model view
    • LIT
      • PCA
      • Running LIT
    • Transformer visualization via dictionary learning
      • Transformer factors
      • Introducing LIME
      • The visualization interface
    • Exploring models we cannot access
    • Summary
    • Questions
    • References
  • Chapter 15: From NLP to Task-Agnostic Transformer Models
    • Choosing a model and an ecosystem
    • The Reformer
      • Running an example
    • DeBERTa
      • Running an example
    • From Task-Agnostic Models to Vision Transformers
      • ViT – Vision Transformers
        • The Basic Architecture of ViT
        • Vision transformers in code
      • CLIP
        • The Basic Architecture of CLIP
        • CLIP in code
      • DALL-E
        • The Basic Architecture of DALL-E
        • DALL-E in code
    • An expanding universe of models
    • Summary
    • Questions
    • References
  • Chapter 16: The Emergence of Transformer-Driven Copilots
    • Prompt engineering
      • Casual English with a meaningful context
      • Casual English with a metonymy
      • Casual English with an ellipsis
      • Casual English with vague context
      • Casual English with sensors
      • Casual English with sensors but no visible context
      • Formal English conversation with no context
      • Prompt engineering training
    • Copilots
      • GitHub Copilot
      • Codex
    • Domain-specific GPT-3 engines
      • Embedding2ML
        • Step 1: Installing and importing OpenAI
        • Step 2: Loading the dataset
        • Step 3: Combining the columns
        • Step 4: Running the GPT-3 embedding
        • Step 5: Clustering (k-means clustering) with the embeddings
        • Step 6: Visualizing the clusters (t-SNE)
      • Instruct series
      • Content filter
    • Transformer-based recommender systems
      • General-purpose sequences
      • Dataset pipeline simulation with RL using an MDP
        • Training customer behaviors with an MDP
        • Simulating consumer behavior with an MDP
        • Making recommendations
    • Computer vision
    • Humans and AI copilots in metaverses
      • From looking at to being in
    • Summary
    • Questions
    • References
  • Appendix I — Terminology of Transformer Models
    • Stack
    • Sublayer
    • Attention heads
  • Appendix II — Hardware Constraints for Transformer Models
    • The Architecture and Scale of Transformers
    • Why GPUs are so special
    • GPUs are designed for parallel computing
    • GPUs are also designed for matrix multiplication
    • Implementing GPUs in code
    • Testing GPUs with Google Colab
    • Google Colab Free with a CPU
      • Google Colab Free with a GPU
    • Google Colab Pro with a GPU
  • Appendix III — Generic Text Completion with GPT-2
    • Step 1: Activating the GPU
    • Step 2: Cloning the OpenAI GPT-2 repository
    • Step 3: Installing the requirements
    • Step 4: Checking the version of TensorFlow
    • Step 5: Downloading the 345M-parameter GPT-2 model
    • Steps 6-7: Intermediate instructions
    • Steps 7b-8: Importing and defining the model
    • Step 9: Interacting with GPT-2
    • References
  • Appendix IV — Custom Text Completion with GPT-2
    • Training a GPT-2 language model
      • Step 1: Prerequisites
      • Steps 2 to 6: Initial steps of the training process
      • Step 7: The N Shepperd training files
      • Step 8: Encoding the dataset
      • Step 9: Training a GPT-2 model
      • Step 10: Creating a training model directory
      • Step 11: Generating unconditional samples
      • Step 12: Interactive context and completion examples
    • References
  • Appendix V — Answers to the Questions
    • Chapter 1, What are Transformers?
    • Chapter 2, Getting Started with the Architecture of the Transformer Model
    • Chapter 3, Fine-Tuning BERT Models
    • Chapter 4, Pretraining a RoBERTa Model from Scratch
    • Chapter 5, Downstream NLP Tasks with Transformers
    • Chapter 6, Machine Translation with the Transformer
    • Chapter 7, The Rise of Suprahuman Transformers with GPT-3 Engines
    • Chapter 8, Applying Transformers to Legal and Financial Documents for AI Text Summarization
    • Chapter 9, Matching Tokenizers and Datasets
    • Chapter 10, Semantic Role Labeling with BERT-Based Transformers
    • Chapter 11, Let Your Data Do the Talking: Story, Questions, and Answers
    • Chapter 12, Detecting Customer Emotions to Make Predictions
    • Chapter 13, Analyzing Fake News with Transformers
    • Chapter 14, Interpreting Black Box Transformer Models
    • Chapter 15, From NLP to Task-Agnostic Transformer Models
    • Chapter 16, The Emergence of Transformer-Driven Copilots
  • Other Books You May Enjoy
  • Index
