Data Science II

The contents of the lecture are as follows:

1 Support Vector Machines
1.1 Introduction
1.2 Linear Support Vector Machine Classification
1.2.1 Soft Margin Classification
1.3 Nonlinear Support Vector Machine Classification
1.3.1 Polynomial Kernel
1.3.2 The Kernel Trick
1.3.3 Similarity Features
1.3.4 Gaussian RBF Kernel
1.4 Regression
2 Decision Trees
2.1 Introduction
2.1.1 Advantages and Disadvantages
2.2 Training and Visualising Decision Trees
2.3 Making Predictions
2.3.1 Gini Impurity
2.3.2 Estimating Class Probabilities
2.4 The CART Training Algorithm
2.4.1 Gini Impurity v. Information Entropy
2.4.2 Regularisation Hyperparameters
2.5 Regression
2.6 Sensitivity to Axis Orientation
3 Ensemble Learning and Random Forests
3.1 Introduction
3.1.1 Voting Classifiers
3.2 Bagging and Pasting
3.2.1 Implementation
3.2.2 Out-of-Bag Evaluation
3.2.3 Random Patches and Random Subspaces
3.3 Random Forests
3.3.1 Extra-Trees
3.3.2 Feature Importance
3.4 Boosting
3.4.1 AdaBoost
3.4.2 Gradient Boosting
3.4.3 Histogram-Based Gradient Boosting
3.5 Bagging v. Boosting
3.6 Stacking
4 Dimensionality Reduction
4.1 Introduction
4.1.1 The Problems of Dimensions
4.2 Main Approaches to Dimensionality Reduction
4.2.1 Projection
4.2.2 Manifold Learning
4.3 Principal Component Analysis (PCA)
4.3.1 Preserving the Variance
4.3.2 Principal Components
4.3.3 Downgrading Dimensions
4.3.4 The Right Number of Dimensions
4.3.5 PCA for Compression
4.3.6 Randomized PCA
4.3.7 Incremental PCA
4.4 Random Projection
4.5 Locally Linear Embedding
5 Unsupervised Learning
5.1 Introduction
5.2 Clustering Algorithms
5.2.1 K-Means
5.2.2 Limits of K-Means
5.2.3 Using Clustering for Image Segmentation
5.2.4 Using Clustering for Semi-Supervised Learning
5.2.5 DBSCAN
5.3 Gaussian Mixtures
5.3.1 Using Gaussian Mixtures for Anomaly Detection
5.3.2 Selecting the Number of Clusters
5.3.3 Bayesian Gaussian Mixture Models
5.3.4 Other Algorithms for Anomaly and Novelty Detection
6 Introduction to Artificial Neural Networks
6.1 Introduction
6.2 From Biology to Silicon: Artificial Neurons
6.2.1 Biological Neurons
6.2.2 Logical Computations with Neurons
6.2.3 The Perceptron
6.2.4 Multilayer Perceptron and Backpropagation
6.2.5 Regression MLPs
6.2.6 Classification MLPs
6.3 Implementing MLPs with Keras
6.3.1 Building an Image Classifier Using the Sequential API
6.3.2 Creating the Model Using the Sequential API
6.3.3 Building a Regression MLP Using the Sequential API
7 Computer Vision Using Convolutional Neural Networks
7.1 Introduction
7.2 Visual Cortex Architecture
7.3 Convolutional Layers
7.3.1 Filters
7.3.2 Stacking Multiple Feature Maps
7.3.3 Implementing Convolutional Layers with Keras
7.3.4 Memory Requirements
7.4 Pooling Layer
7.5 Implementing Pooling Layers with Keras
7.6 CNN Architectures
7.6.1 LeNet-5
7.6.2 AlexNet
7.6.3 GoogLeNet
7.6.4 VGGNet
7.6.5 ResNet
7.7 Implementing a ResNet-34 CNN Using Keras
7.8 Using Pre-Trained Models from Keras
7.9 Pre-Trained Models for Transfer Learning
List of Acronyms
Bibliography

Copyright Information

(2025, D. T. McGuines, Ph.D.)

This document contains the contents of Data Science II, officially named Machine Learning and Data Science 2, taught at MCI in the Mechatronik Design Innovation programme. It is part of the module MECH-B-5-MLDS-MLDS2-ILV, taught in the B.Sc. degree.

All relevant code in this document is written in Python v3.13.7, or in SageMath where stated.
This document was compiled with LuaTeX v1.22.0, and all editing was done in GNU Emacs v30.1 with the AUCTeX and org-mode packages.

This document is based on the following books and resources, listed in no particular order:

Neural Networks: Methodology and Applications by Gérard Dreyfus, Springer
Python for Data Analysis: Data Wrangling with pandas, NumPy, and IPython by Wes McKinney, O'Reilly
Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow by Aurélien Géron, O'Reilly
TensorFlow for Deep Learning: From Linear Regression to Reinforcement Learning by B. Ramsundar and R. B. Zadeh, O'Reilly
AI and Machine Learning for Coders by Moroney L., O'Reilly
Neural Networks and Deep Learning by Aggarwal C. C., Springer
Python Machine Learning by Raschka et al., Packt
Machine Learning with Python Cookbook by Albon C., O'Reilly
CS229 Lecture Notes by Ng A. et al.
Lecture Notes on Machine Learning by Migel A. et al.

This document is not intended for publication and has been prepared for educational purposes only.

The current maintainer of this work, and its primary lecturer,
is D. T. McGuines, Ph.D. (dtm@mci4me.at).
