Data Science II
Table of Contents
The contents of the lecture are as follows:
1.1 Introduction
1.2 Linear Support Vector Machine Classification
1.2.1 Soft Margin Classification
1.3 Nonlinear Support Vector Machine Classification
1.3.1 Polynomial Kernel
1.3.2 The Kernel Trick
1.3.3 Similarity Features
1.3.4 Gaussian RBF Kernel
1.4 Regression
2 Decision Trees
2.1 Introduction
2.1.1 Advantages and Disadvantages
2.2 Training and Visualising Decision Trees
2.3 Making Predictions
2.3.1 Gini Impurity
2.3.2 Estimating Class Probabilities
2.4 The CART Training Algorithm
2.4.1 Gini Impurity v. Information Entropy
2.4.2 Regularisation Hyperparameters
2.5 Regression
2.6 Sensitivity to Axis Orientation
3 Ensemble Learning and Random Forests
3.1 Introduction
3.1.1 Voting Classifiers
3.2 Bagging and Pasting
3.2.1 Implementation
3.2.2 Out-of-Bag Evaluation
3.2.3 Random Patches and Random Subspaces
3.3 Random Forests
3.3.1 Extra-Trees
3.3.2 Feature Importance
3.4 Boosting
3.4.1 AdaBoost
3.4.2 Gradient Boosting
3.4.3 Histogram-Based Gradient Boosting
3.5 Bagging v. Boosting
3.6 Stacking
4 Dimensionality Reduction
4.1 Introduction
4.1.1 The Problems of Dimensions
4.2 Main Approaches to Dimensionality Reduction
4.2.1 Projection
4.2.2 Manifold Learning
4.3 Principal Component Analysis (PCA)
4.3.1 Preserving the Variance
4.3.2 Principal Components
4.3.3 Downgrading Dimensions
4.3.4 The Right Number of Dimensions
4.3.5 PCA for Compression
4.3.6 Randomized PCA
4.3.7 Incremental PCA
4.4 Random Projection
4.5 Locally Linear Embedding
5 Unsupervised Learning
5.1 Introduction
5.2 Clustering Algorithms
5.2.1 K-Means
5.2.2 Limits of K-Means
5.2.3 Using Clustering for Image Segmentation
5.2.4 Using Clustering for Semi-Supervised Learning
5.2.5 DBSCAN
5.3 Gaussian Mixtures
5.3.1 Using Gaussian Mixtures for Anomaly Detection
5.3.2 Selecting the Number of Clusters
5.3.3 Bayesian Gaussian Mixture Models
5.3.4 Other Algorithms for Anomaly and Novelty Detection
6 Introduction to Artificial Neural Networks
6.1 Introduction
6.2 From Biology to Silicon: Artificial Neurons
6.2.1 Biological Neurons
6.2.2 Logical Computations with Neurons
6.2.3 The Perceptron
6.2.4 Multilayer Perceptron and Backpropagation
6.2.5 Regression MLPs
6.2.6 Classification MLPs
6.3 Implementing MLPs with Keras
6.3.1 Building an Image Classifier Using the Sequential API
6.3.2 Creating the Model Using the Sequential API
6.3.3 Building a Regression MLP Using the Sequential API
7 Computer Vision Using Convolutional Neural Networks
7.1 Introduction
7.2 Visual Cortex Architecture
7.3 Convolutional Layers
7.3.1 Filters
7.3.2 Stacking Multiple Feature Maps
7.3.3 Implementing Convolutional Layers with Keras
7.3.4 Memory Requirements
7.4 Pooling Layer
7.5 Implementing Pooling Layers with Keras
7.6 CNN Architectures
7.6.1 LeNet-5
7.6.2 AlexNet
7.6.3 GoogLeNet
7.6.4 VGGNet
7.6.5 ResNet
7.7 Implementing a ResNet-34 CNN Using Keras
7.8 Using Pre-Trained Models from Keras
7.9 Pre-Trained Models for Transfer Learning
List of Acronyms
Bibliography
Copyright Information
(2025, D. T. McGuines, Ph.D.)
This document contains the lecture material for Data Science II,
officially named Machine Learning and Data Science 2, taught at MCI in
the Mechatronik Design Innovation programme. It is part of the module
MECH-B-5-MLDS-MLDS2-ILV of the B.Sc. degree.
All code in this document is written in Python v3.13.7, or in SageMath
where stated.
This document was compiled with LuaTeX v1.22.0, and all editing was
done in GNU Emacs v30.1 with the AUCTeX and org-mode packages.
This document is based on the following books and resources, listed in
no particular order:
Neural Networks: Methodology and Applications by Gérard Dreyfus, Springer
Python for Data Analysis: Data Wrangling with Pandas, Numpy, and iPython by Wes McKinney, Springer
Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow by Aurélien Géron, O’Reilly
TensorFlow for Deep Learning: From Linear Regression To Reinforcement Learning by B. Ramsundar and R. B. Zadeh, O’Reilly
AI and Machine Learning for Coders by Moroney L., O’Reilly
Neural Networks and Deep Learning by Aggarwal S., Springer
Python Machine Learning by Raschka et al., Packt
Machine Learning with Python Cookbook by Albon C., O’Reilly
CS229 Lecture Notes by Ng A. et al.
Lecture Notes on Machine Learning by Migel A. et al.
This document is not intended for publication and has been prepared
solely for educational purposes.
The current maintainer of this work, and its primary lecturer,
is D. T. McGuines, Ph.D. (dtm@mci4me.at).