FinUniversity Electronic Library

Details

Johnston, Benjamin. Applied unsupervised learning with Python: discover hidden patterns and relationships in unstructured data with Python / Benjamin Johnston, Aaron Jones and Christopher Kruger. — 1 online resource (483 pages) — <URL:http://elib.fa.ru/ebsco/2148643.pdf>.

Record creation date: 7/20/2019

Subject: Python (Computer program language)

Collections: EBSCO

Annotation

Starting with the basics, Applied Unsupervised Learning with Python explains various techniques that you can apply to your data using powerful Python libraries, so that your unlabeled data reveals answers to your business questions.
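
To give a concrete sense of the techniques listed in the table of contents (for example, k-means clustering and the silhouette score from Chapter 1), the short sketch below clusters synthetic unlabeled data with scikit-learn. It is an illustration only, not an excerpt from the book, and the dataset and parameter values are assumptions.

    # Minimal sketch (not from the book): k-means clustering of unlabeled data,
    # evaluated with the silhouette score, as introduced in Chapter 1.
    from sklearn.cluster import KMeans
    from sklearn.datasets import make_blobs
    from sklearn.metrics import silhouette_score

    # Generate synthetic two-dimensional data and discard the labels,
    # so the points are treated as unlabeled.
    X, _ = make_blobs(n_samples=300, centers=3, random_state=42)

    # Fit k-means with an assumed k = 3 and assign each point to a cluster.
    model = KMeans(n_clusters=3, n_init=10, random_state=42)
    labels = model.fit_predict(X)

    # The silhouette score ranges from -1 to 1; higher values indicate
    # better-separated clusters.
    print("Silhouette score:", silhouette_score(X, labels))

Later chapters apply the same kind of workflow with other methods, including hierarchical clustering, DBSCAN, PCA, autoencoders, t-SNE, topic modeling, market basket analysis, and kernel density estimation.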

Document access rights

Network                      User group  Actions
FinUniversity Local Network  All         Read, Print, Download
Internet                     Readers     Read, Print
Internet                     Anonymous   (none)

Table of Contents

  • Cover
  • FM
  • Table of Contents
  • Preface
  • Chapter 1: Introduction to Clustering
    • Introduction
    • Unsupervised Learning versus Supervised Learning
    • Clustering
      • Identifying Clusters
      • Two-Dimensional Data
      • Exercise 1: Identifying Clusters in Data
    • Introduction to k-means Clustering
      • No-Math k-means Walkthrough
      • k-means Clustering In-Depth Walkthrough
      • Alternative Distance Metric – Manhattan Distance
      • Deeper Dimensions
      • Exercise 2: Calculating Euclidean Distance in Python
      • Exercise 3: Forming Clusters with the Notion of Distance
      • Exercise 4: Implementing k-means from Scratch
      • Exercise 5: Implementing k-means with Optimization
      • Clustering Performance: Silhouette Score
      • Exercise 6: Calculating the Silhouette Score
      • Activity 1: Implementing k-means Clustering
    • Summary
  • Chapter 2: Hierarchical Clustering
    • Introduction
    • Clustering Refresher
      • k-means Refresher
    • The Organization of Hierarchy
    • Introduction to Hierarchical Clustering
      • Steps to Perform Hierarchical Clustering
      • An Example Walk-Through of Hierarchical Clustering
      • Exercise 7: Building a Hierarchy
    • Linkage
      • Activity 2: Applying Linkage Criteria
    • Agglomerative versus Divisive Clustering
      • Exercise 8: Implementing Agglomerative Clustering with scikit-learn
      • Activity 3: Comparing k-means with Hierarchical Clustering
    • k-means versus Hierarchical Clustering
    • Summary
  • Chapter 3: Neighborhood Approaches and DBSCAN
    • Introduction
      • Clusters as Neighborhoods
    • Introduction to DBSCAN
      • DBSCAN In-Depth
      • Walkthrough of the DBSCAN Algorithm
      • Exercise 9: Evaluating the Impact of Neighborhood Radius Size
      • DBSCAN Attributes – Neighborhood Radius
      • Activity 4: Implement DBSCAN from Scratch
      • DBSCAN Attributes – Minimum Points
      • Exercise 10: Evaluating the Impact of Minimum Points Threshold
      • Activity 5: Comparing DBSCAN with k-means and Hierarchical Clustering
    • DBSCAN Versus k-means and Hierarchical Clustering
    • Summary
  • Chapter 4: Dimension Reduction and PCA
    • Introduction
      • What Is Dimensionality Reduction?
      • Applications of Dimensionality Reduction
      • The Curse of Dimensionality
    • Overview of Dimensionality Reduction Techniques
      • Dimensionality Reduction and Unsupervised Learning
    • PCA
      • Mean
      • Standard Deviation
      • Covariance
      • Covariance Matrix
      • Exercise 11: Understanding the Foundational Concepts of Statistics
      • Eigenvalues and Eigenvectors
      • Exercise 12: Computing Eigenvalues and Eigenvectors
      • The Process of PCA
      • Exercise 13: Manually Executing PCA
      • Exercise 14: Scikit-Learn PCA
      • Activity 6: Manual PCA versus scikit-learn
      • Restoring the Compressed Dataset
      • Exercise 15: Visualizing Variance Reduction with Manual PCA
      • Exercise 16: Visualizing Variance Reduction with scikit-learn
      • Exercise 17: Plotting 3D Plots in Matplotlib
      • Activity 7: PCA Using the Expanded Iris Dataset
    • Summary
  • Chapter 5: Autoencoders
    • Introduction
    • Fundamentals of Artificial Neural Networks
      • The Neuron
      • Sigmoid Function
      • Rectified Linear Unit (ReLU)
      • Exercise 18: Modeling the Neurons of an Artificial Neural Network
      • Activity 8: Modeling Neurons with a ReLU Activation Function
      • Neural Networks: Architecture Definition
      • Exercise 19: Defining a Keras Model
      • Neural Networks: Training
      • Exercise 20: Training a Keras Neural Network Model
      • Activity 9: MNIST Neural Network
    • Autoencoders
      • Exercise 21: Simple Autoencoder
      • Activity 10: Simple MNIST Autoencoder
      • Exercise 22: Multi-Layer Autoencoder
      • Convolutional Neural Networks
      • Exercise 23: Convolutional Autoencoder
      • Activity 11: MNIST Convolutional Autoencoder
    • Summary
  • Chapter 6: t-Distributed Stochastic Neighbor Embedding (t-SNE)
    • Introduction
    • Stochastic Neighbor Embedding (SNE)
    • t-Distributed SNE
      • Exercise 24: t-SNE MNIST
      • Activity 12: Wine t-SNE
    • Interpreting t-SNE Plots
      • Perplexity
      • Exercise 25: t-SNE MNIST and Perplexity
      • Activity 13: t-SNE Wine and Perplexity
      • Iterations
      • Exercise 26: t-SNE MNIST and Iterations
      • Activity 14: t-SNE Wine and Iterations
      • Final Thoughts on Visualizations
    • Summary
  • Chapter 7: Topic Modeling
    • Introduction
      • Topic Models
      • Exercise 27: Setting Up the Environment
      • A High-Level Overview of Topic Models
      • Business Applications
      • Exercise 28: Data Loading
    • Cleaning Text Data
      • Data Cleaning Techniques
      • Exercise 29: Cleaning Data Step by Step
      • Exercise 30: Complete Data Cleaning
      • Activity 15: Loading and Cleaning Twitter Data
    • Latent Dirichlet Allocation
      • Variational Inference
      • Bag of Words
      • Exercise 31: Creating a Bag-of-Words Model Using the Count Vectorizer
      • Perplexity
      • Exercise 32: Selecting the Number of Topics
      • Exercise 33: Running Latent Dirichlet Allocation
      • Exercise 34: Visualize LDA
      • Exercise 35: Trying Four Topics
      • Activity 16: Latent Dirichlet Allocation and Health Tweets
      • Bag-of-Words Follow-Up
      • Exercise 36: Creating a Bag-of-Words Using TF-IDF
    • Non-Negative Matrix Factorization
      • Frobenius Norm
      • Multiplicative Update
      • Exercise 37: Non-negative Matrix Factorization
      • Exercise 38: Visualizing NMF
      • Activity 17: Non-Negative Matrix Factorization
    • Summary
  • Chapter 8: Market Basket Analysis
    • Introduction
    • Market Basket Analysis
      • Use Cases
      • Important Probabilistic Metrics
      • Exercise 39: Creating Sample Transaction Data
      • Support
      • Confidence
      • Lift and Leverage
      • Conviction
      • Exercise 40: Computing Metrics
    • Characteristics of Transaction Data
      • Exercise 41: Loading Data
      • Data Cleaning and Formatting
      • Exercise 42: Data Cleaning and Formatting
      • Data Encoding
      • Exercise 43: Data Encoding
      • Activity 18: Loading and Preparing Full Online Retail Data
    • Apriori Algorithm
      • Computational Fixes
      • Exercise 44: Executing the Apriori Algorithm
      • Activity 19: Apriori on the Complete Online Retail Dataset
    • Association Rules
      • Exercise 45: Deriving Association Rules
      • Activity 20: Finding the Association Rules on the Complete Online Retail Dataset
    • Summary
  • Chapter 9: Hotspot Analysis
    • Introduction
      • Spatial Statistics
      • Probability Density Functions
      • Using Hotspot Analysis in Business
    • Kernel Density Estimation
      • The Bandwidth Value
      • Exercise 46: The Effect of the Bandwidth Value
      • Selecting the Optimal Bandwidth
      • Exercise 47: Selecting the Optimal Bandwidth Using Grid Search
      • Kernel Functions
      • Exercise 48: The Effect of the Kernel Function
      • Kernel Density Estimation Derivation
      • Exercise 49: Simulating the Derivation of Kernel Density Estimation
      • Activity 21: Estimating Density in One Dimension
    • Hotspot Analysis
      • Exercise 50: Loading Data and Modeling with Seaborn
      • Exercise 51: Working with Basemaps
      • Activity 22: Analyzing Crime in London
    • Summary
  • Appendix
  • Index

Usage statistics

Access count: 0
Last 30 days: 0