### Machine Learning

#### Lessons

## Introduction

## What is the history of machine learning?

## What is the difference between a machine learning solution and a programmatic solution?

## What is a formal definition of machine learning?

## What are some domains and examples of machine learning?

## How can we create a (machine) learner?

## Different types of Machine Learning

## What are the broad types of machine learning?

## What is Unsupervised / Supervised / Semi-Supervised and Reinforcement Learning?

## What is supervised learning? (In detail)

## What are some examples of Classification and Regression problems?

## What are features, what are some sample training examples of features, and can we draw some schematic diagrams (for supervised learning)?

## What is classification learning, and what are some of its tasks and performance metrics?

## How do we get data for the learning problems? How are representations of functions used in machine learning? What is the hypothesis space?

## Hypothesis Space and Inductive Bias

## What is inductive learning?

## What are features and feature vectors?

## What is the start of a classification problem? What are the feature space and hypothesis space for classification problems?

## 5 types of representations of a function

## Hypothesis space

## Terminology (example, training data, instance space, concept, target function)

## What is the size of the hypothesis space (for n boolean features) and what is a hypothesis language?

## What is inductive learning hypothesis?

## What is inductive learning and a consistent hypothesis? Why is inductive learning an ill-posed problem?

## What are the various types of bias? (Occam's Razor, MDL, MM) What are the important issues in machine learning? What is generalization? (Bias and Variance)

## Evaluation and Cross-Validation

## What is experimental evaluation of learning algorithms?

## How do we evaluate predictions, and what is absolute error? (Evaluate predictions)

## What is sum of squares error and number of misclassifications? (Evaluate predictions)

## What is confusion matrix?

## What is accuracy, precision and recall? (evaluate predictions)

## What is sample error and true error?

## What are the sources of errors?

## What are the difficulties in evaluating hypothesis with limited data and possible solutions?

## How can we evaluate with limited training data?

## What is the K-fold cross-validation trade-off in machine learning?

## Tutorial I

## Introduction to Tutorial I

## Types of learning : supervised vs unsupervised learning

## Example of supervised vs unsupervised learning

## Types of features : categorical vs continuous features

## Types of supervised learning: regression vs classification

## Bias vs Variance

## Generalization performance of a learning algorithm

## Linear Regression

## What is regression? (Linear functions and other functions) and What are various Types of regression models?

## What is linear regression?

## Looking at an example of a training set for regression

## What is multiple linear regression?

## What assumption are we making for errors?

## The least square regression line

## How do we learn the parameters (for single regression and for multiple linear regression)

## What is the delta or LMS method and how do we use gradient descent?

## What is the LMS update or delta rule, batch gradient descent and stochastic gradient descent?

## Introduction to Decision Trees

## What is a decision tree?

## How to draw a sample decision tree for discrete data?

## How to draw a sample decision tree for continuous data?

## Generate a decision tree from training examples

## Decision tree for playing tennis

## Introduction to ID3 (searching for a good tree)

## Learning Decision Tree

## How do we select attributes for decision tree? (information gain, entropy)

## Example of creating a decision tree (using ID3 algorithm)

## What is GINI Index?

## How do we split continuous attributes and what are the practical issues in classification?

## Practical issues in classification

## Overfitting

## What is overfitting?

## An example of underfitting and overfitting

## Overfitting due to noise or insufficient examples

## How to avoid overfitting?

## What is MDL?

## What are the conditions for pre pruning?

## How do we use reduced error pruning for post pruning?

## What are the triple tradeoffs in model selection and generalization?

## What is regularization?

## Python Exercise on Decision Tree and Linear Regression

## Python exercise on linear regression

## Python exercise on logistic regression

## Python exercise on decision tree regression

## Tutorial II

## How to solve a sample problem in linear regression?

## How to solve problems related to decision trees?

## How to find the entropy of a set and use in decision trees?

## What is information gain?

## K-Nearest Neighbour

## What is instance based learning and K-Nearest Neighbour algorithm?

## What is the standard distance function (euclidean distance) and the 3 issues related to it?

## What are some examples of K-Nearest Neighbour and what is the impact of k?

## How can we use weighted distance functions?

## Why do we need to remove extra features?

## What are the various approaches to giving weights?

## Feature Selection

## Why do we need feature reduction?

## What is the curse of dimensionality?

## How can we do feature reduction? (selection and extraction)

## How can we evaluate feature subset? (wrapper / supervised and filter / unsupervised)

## How can we use the feature selection algorithm? (forward and backward selection algorithm)

## What are univariate feature selection methods?

## What are multivariate feature selection methods?

## Feature Extraction

## What is feature extraction and what kind of features do we want?

## What are principal components (PCs) and how do we choose features?

## How do we choose the direction of the principal components (PCs) and how do we use PCA?

## How do we choose a feature (axis) for classification and how is Linear discriminant Analysis useful?

## Collaborative Filtering

## What is a recommender system?

## How can we formally define recommendation problem?

## What are the two types of recommendation systems? (content, collaborative filtering)

## What are the two types of collaborative filtering? (user-based nearest neighbour, item-based nearest neighbour)

## What are the two phases of algorithms for collaborative filtering? (neighbourhood formation, recommendation)

## What are the issues with user based KNN CF?

## What is item based collaborative filtering?

## Python Exercise on KNN and PCA

## What will we cover?

## How do we use KNeighborsClassifier in python?

## How do we use randomized PCA in Python?

## How can we do Face recognition using PCA and KNN?

## Tutorial III

## What is the curse of dimensionality?

## What is feature selection?

## What is feature reduction and PCA? (principal component analysis)

## How do you calculate the eigen values and eigen vector of a matrix?

## What is K-NN (K Nearest Neighbour) Classification?

## Bayesian Learning

## How is probability used for modelling concepts?

## What is Bayes theorem?

## Can we look at an example of Bayes theorem?

## How can Bayes theorem be applied to find the hypothesis in machine learning? (MAP hypothesis)

## What is Bayes optimal classifier?

## Gibbs sampling

## Naive Bayes

## Naive bayes algorithm

## Naive bayes algorithm for discrete x

## What is smoothing and why is it required?

## Can we look at an example of naive bayes algorithm for discrete x?

## How do we use smoothing when estimating parameters?

## What is the assumption that we made in naive bayes and what happens if it is invalid?

## What is gaussian naive bayes? (for continuous X, but discrete Y)

## What are bayesian networks?

## Bayesian Network

## Why do we need bayes network?

## Can we look at an example of bayes network?

## What does a bayesian network represent?

## What can we do with a Bayesian network (Inference)?

## Where can we apply bayesian network?

## How do we define a bayesian network?

## What is the graphical representation of naive bayes model?

## What is the hidden markov model?

## How is learning helped by bayesian belief networks?

## Python Exercise on Naive Bayes

## How to use the naive bayes classifier?

## What is naive bayes classifier?

## How is naive bayes classifier relevant in the context of email spam classification?

## Tutorial IV

## How do we estimate the probabilities using the frequency distribution of probability?

## How do we use bayes rule?

## What is MAP inference?

## What is naive bayes assumption?

## What are Bayesian networks (the structures), inference and marginalization?

## Logistic Regression

## What is Logistic Regression (for Classification problems) and sigmoid function?

## What are some of the interesting properties of the sigmoid function?

## How can we use stochastic gradient descent with logistic regression?

## Introduction to Support Vector Machine

## Support vector machine

## Functional margin

## Functional margin of a set of points

## Solving the optimization problem

## SVM The Dual Formulation

## Lagrangian duality in brief

## The KKT conditions

## Implication of Lagrangian

## The dual problem

## SVM Maximum Margin with Noise

## Linear SVM formulation

## Limitation of previous SVM formulation

## What objective is to be minimized?

## Lagrangian

## Dual formulation

## Nonlinear SVM and Kernel Function

## Non-linear SVM, feature space and kernel function

## Kernel trick

## Commonly used kernel functions

## Performance

## SVM Solution to the Dual Problem

## SMO algorithm (sequential minimal optimization)

## Coordinate ascent

## SMO (for dual problem)

## Python Exercise on SVM

## Support vector classification

## Visualize the decision boundaries

## Load data

## Introduction to NN

## Neural network and neuron

## Perceptron - basic unit in NN

## Gradient descent

## Stochastic gradient descent

## Multi-layer networks - by stacking many NN

## Multilayer Neural Network

## Limitation of perceptrons

## Multi-layer NN

## Power/ Expressiveness of multilayer networks

## Two-layer back-propagation neural network

## Learning for BP nets

## Derivation

## Neural Network and Backpropagation Algorithm

## Single layer perceptron and boolean functions (OR, XOR)

## Representation capability of NNs

## Learning in multi-layer NN using back propagation

## Derivation

## Back propagation algorithm

## Training practices: batch vs stochastic and learning in epoch

## Overfitting in ANNs and local minima

## Deep Neural Network

## Deep learning

## Hierarchical representation & unsupervised pre-training

## Architecture & Training

## Pooling

## CNN properties

## Python Exercise on Neural Network

## How can we create an artificial neural network using TensorFlow and TFLearn to recognize handwritten digits?

## How do we load dependencies (to recognize handwritten digits)?

## How do we load the data (to recognize handwritten digits)?

## How do we make the model (to recognize handwritten digits)?

## How do we train the model (to recognize handwritten digits)?

## What is our takeaway from this exercise (to recognize handwritten digits)?

## Tutorial VI

## What is a perceptron?

## What is perceptron learning rule?

## How do we represent a boolean function using a perceptron?

## What is forward and backward pass algorithm or backpropagation algorithm?

## Stochastic gradient descent and batch gradient descent

## Quick overview of some deep learning algorithms

## Introduction to Computational Learning Theory

## Goal of learning theory & Core aspect of machine learning

## PAC

## Prototypical concept learning task

## Sample Complexity Finite Hypothesis Space

## What is Sample Complexity?

## Can we look at an example of consistent case?

## What is Find-S algorithm and what can it do?

## VC Dimension

## What kind of theorems do we have when the hypothesis space is infinite?

## What is shattering?

## What is the definition of VC dimension?

## What are the upper bound and lower bound on sample complexity with VC?

## Introduction to Ensembles

## What is ensemble learning?

## How can we use weak learners?

## How can we combine learners in Bayesian classifiers?

## Why are ensembles successful and what are the main challenges with them?

## Bagging and Boosting

## What is Bagging?

## What is Boosting and what is AdaBoost?

## Why does ensembling work?

## Introduction to Clustering

## What is unsupervised learning and clustering?

## What are some applications of clustering, and what are various aspects of clustering?

## Major clustering approaches

## How can we measure the quality of clustering?

## Kmeans Clustering

## What is K-means algorithm?

## How can we describe K-means Algorithm, and can we look at an illustration of it?

## What are the similarity and distance measures?

## What is the proof of convergence of K-means, time complexity, advantages and disadvantages?

## What is model based clustering?

## How can we apply K-means on an RGB image?

## What is EM algorithm?

## Agglomerative Hierarchical Clustering

## What is hierarchical clustering, bottom up and top down clustering?

## What is a Dendrogram?

## What is the algorithm for Agglomerative Hierarchical Clustering?

## What is the complete link method?

## What is average link clustering?

## Python Exercise on kmeans clustering

## Can we look at python code for K means algorithm?

## Can we look at python code for gaussian mixture model?

## Hierarchical agglomerative clustering

## Tutorial VIII

## What is K-means clustering?

## Solving a sample problem on K-means clustering

## What is agglomerative hierarchical clustering?

## What is a Gaussian mixture model?