Electronic Library of the Financial University

Detailed information

Liu, Yuxi (Hayden). Python machine learning by example: easy-to-follow examples that get you up and running with machine learning / Yuxi (Hayden) Liu. Second edition. Birmingham, UK: Packt Publishing, 2019. 1 online resource. <URL:http://elib.fa.ru/ebsco/2037541.pdf>.

Record created: 14.03.2019

Subjects: Python (Computer program language); Machine learning; COMPUTERS / Programming Languages / Python; COMPUTERS / Data Processing; COMPUTERS / Databases / Data Mining.

Collections: EBSCO

Permitted actions: –

The 'Read' and 'Download' actions will be available if you log in or access the site from a computer on another network.

Group: Anonymous users

Network: Internet

Usage rights for the stored item

	Access location		User group		Action
	Financial University local network		All
	Internet		Readers
	Internet		Anonymous users

Cover
Title Page
Copyright and Credits
About Packt
Dedication
Foreword
Contributors
Table of Contents
Preface
Section 1: Fundamentals of Machine Learning
Getting Started with Machine Learning and Python
- A very high-level overview of machine learning technology
  - Types of machine learning tasks
  - A brief history of the development of machine learning algorithms
- Core of machine learning – generalizing with data
  - Overfitting, underfitting, and the bias-variance trade-off
  - Avoiding overfitting with cross-validation
  - Avoiding overfitting with regularization
  - Avoiding overfitting with feature selection and dimensionality reduction
- Preprocessing, exploration, and feature engineering
  - Missing values
  - Label encoding
  - One hot encoding
  - Scaling
  - Polynomial features
  - Power transform
  - Binning
- Combining models
  - Voting and averaging
  - Bagging
  - Boosting
  - Stacking
- Installing software and setting up
  - Setting up Python and environments
  - Installing the various packages
    - NumPy
    - SciPy
    - Pandas
    - Scikit-learn
    - TensorFlow
- Summary
- Exercises
Section 2: Practical Python Machine Learning By Example
Exploring the 20 Newsgroups Dataset with Text Analysis Techniques
- How computers understand language - NLP
- Picking up NLP basics while touring popular NLP libraries
  - Corpus
  - Tokenization
  - PoS tagging
  - Named-entity recognition
  - Stemming and lemmatization
  - Semantics and topic modeling
- Getting the newsgroups data
- Exploring the newsgroups data
- Thinking about features for text data
  - Counting the occurrence of each word token
  - Text preprocessing
  - Dropping stop words
  - Stemming and lemmatizing words
- Visualizing the newsgroups data with t-SNE
  - What is dimensionality reduction?
  - t-SNE for dimensionality reduction
- Summary
- Exercises
Mining the 20 Newsgroups Dataset with Clustering and Topic Modeling Algorithms
- Learning without guidance – unsupervised learning
- Clustering newsgroups data using k-means
  - How does k-means clustering work?
  - Implementing k-means from scratch
  - Implementing k-means with scikit-learn
  - Choosing the value of k
  - Clustering newsgroups data using k-means
- Discovering underlying topics in newsgroups
- Topic modeling using NMF
- Topic modeling using LDA
- Summary
- Exercises
Detecting Spam Email with Naive Bayes
- Getting started with classification
  - Types of classification
  - Applications of text classification
- Exploring Naïve Bayes
  - Learning Bayes' theorem by examples
  - The mechanics of Naïve Bayes
  - Implementing Naïve Bayes from scratch
  - Implementing Naïve Bayes with scikit-learn
- Classification performance evaluation
- Model tuning and cross-validation
- Summary
- Exercise
Classifying Newsgroup Topics with Support Vector Machines
- Finding separating boundary with support vector machines
  - Understanding how SVM works through different use cases
    - Case 1 – identifying a separating hyperplane
    - Case 2 – determining the optimal hyperplane
    - Case 3 – handling outliers
  - Implementing SVM
    - Case 4 – dealing with more than two classes
  - The kernels of SVM
    - Case 5 – solving linearly non-separable problems
  - Choosing between linear and RBF kernels
- Classifying newsgroup topics with SVMs
- More example – fetal state classification on cardiotocography
- A further example – breast cancer classification using SVM with TensorFlow
- Summary
- Exercise
Predicting Online Ad Click-Through with Tree-Based Algorithms
- Brief overview of advertising click-through prediction
- Getting started with two types of data – numerical and categorical
- Exploring decision tree from root to leaves
  - Constructing a decision tree
  - The metrics for measuring a split
- Implementing a decision tree from scratch
- Predicting ad click-through with decision tree
- Ensembling decision trees – random forest
  - Implementing random forest using TensorFlow
- Summary
- Exercise
Predicting Online Ad Click-Through with Logistic Regression
- Converting categorical features to numerical – one-hot encoding and ordinal encoding
- Classifying data with logistic regression
  - Getting started with the logistic function
  - Jumping from the logistic function to logistic regression
- Training a logistic regression model
  - Training a logistic regression model using gradient descent
  - Predicting ad click-through with logistic regression using gradient descent
  - Training a logistic regression model using stochastic gradient descent
  - Training a logistic regression model with regularization
- Training on large datasets with online learning
- Handling multiclass classification
- Implementing logistic regression using TensorFlow
- Feature selection using random forest
- Summary
- Exercises
Scaling Up Prediction to Terabyte Click Logs
- Learning the essentials of Apache Spark
  - Breaking down Spark
  - Installing Spark
  - Launching and deploying Spark programs
- Programming in PySpark
- Learning on massive click logs with Spark
  - Loading click logs
  - Splitting and caching the data
  - One-hot encoding categorical features
  - Training and testing a logistic regression model
- Feature engineering on categorical variables with Spark
  - Hashing categorical features
  - Combining multiple variables – feature interaction
- Summary
- Exercises
Stock Price Prediction with Regression Algorithms
- Brief overview of the stock market and stock prices
- What is regression?
- Mining stock price data
  - Getting started with feature engineering
  - Acquiring data and generating features
- Estimating with linear regression
  - How does linear regression work?
  - Implementing linear regression
- Estimating with decision tree regression
  - Transitioning from classification trees to regression trees
  - Implementing decision tree regression
  - Implementing regression forest
- Estimating with support vector regression
  - Implementing SVR
- Estimating with neural networks
  - Demystifying neural networks
  - Implementing neural networks
- Evaluating regression performance
- Predicting stock price with four regression algorithms
- Summary
- Exercise
Section 3: Python Machine Learning Best Practices
Machine Learning Best Practices
- Machine learning solution workflow
- Best practices in the data preparation stage
  - Best practice 1 – completely understanding the project goal
  - Best practice 2 – collecting all fields that are relevant
  - Best practice 3 – maintaining the consistency of field values
  - Best practice 4 – dealing with missing data
  - Best practice 5 – storing large-scale data
- Best practices in the training sets generation stage
  - Best practice 6 – identifying categorical features with numerical values
  - Best practice 7 – deciding on whether or not to encode categorical features
  - Best practice 8 – deciding on whether or not to select features, and if so, how to do so
  - Best practice 9 – deciding on whether or not to reduce dimensionality, and if so, how to do so
  - Best practice 10 – deciding on whether or not to rescale features
  - Best practice 11 – performing feature engineering with domain expertise
  - Best practice 12 – performing feature engineering without domain expertise
  - Best practice 13 – documenting how each feature is generated
  - Best practice 14 – extracting features from text data
- Best practices in the model training, evaluation, and selection stage
  - Best practice 15 – choosing the right algorithm(s) to start with
    - Naïve Bayes
    - Logistic regression
    - SVM
    - Random forest (or decision tree)
    - Neural networks
  - Best practice 16 – reducing overfitting
  - Best practice 17 – diagnosing overfitting and underfitting
  - Best practice 18 – modeling on large-scale datasets
- Best practices in the deployment and monitoring stage
  - Best practice 19 – saving, loading, and reusing models
  - Best practice 20 – monitoring model performance
  - Best practice 21 – updating models regularly
- Summary
- Exercises
Other Books You May Enjoy
Index
