Machine Learning
Lessons
Introduction
What is the history of machine learning?
What is the difference between a machine learning solution and a programmatic solution?
What is a formal definition of machine learning?
What are some domains and examples of machine learning?
How can we create a (machine) learner?
Different types of Machine Learning
What are the broad types of machine learning?
What are unsupervised, supervised, semi-supervised and reinforcement learning?
What is supervised learning? (In detail)
What are some examples of Classification and Regression problems?
What are features, what are some sample training examples of features, and can we draw some schematic diagrams (for supervised learning)?
What is classification learning, and what are some of its tasks and performance metrics?
How do we get data for the learning problems? How are representations of functions used in machine learning? What is the hypothesis space?
Hypothesis Space and Inductive Bias
What is inductive learning?
What are features and feature vectors?
What is the start of a classification problem? What are the feature space and hypothesis space for classification problems?
5 types of representations of a function
Hypothesis space
Terminology (example, training data, instance space, concept, target function)
What is the size of the hypothesis space (for n boolean features) and what is the hypothesis language?
What is inductive learning hypothesis?
What are inductive learning and a consistent hypothesis? Why is inductive learning an ill-posed problem?
What are the various types of bias (Occam's Razor, MDL, MM), and what are the important issues in machine learning? What is generalization? (Bias and Variance)
Evaluation and Cross-Validation
What is experimental evaluation of learning algorithms?
How do we evaluate predictions, and what is absolute error? (Evaluate predictions)
What are the sum of squares error and the number of misclassifications? (Evaluate predictions)
What is confusion matrix?
What is accuracy, precision and recall? (evaluate predictions)
What is sample error and true error?
What are the sources of errors?
What are the difficulties in evaluating hypotheses with limited data and possible solutions?
How can we evaluate with limited training data?
What is the K-fold cross-validation trade-off in machine learning?
Tutorial I
Introduction to Tutorial I
Types of learning : supervised vs unsupervised learning
Example of supervised vs unsupervised learning
Types of features : categorical vs continuous features
Types of supervised learning: regression vs classification
Bias vs Variance
Generalization performance of a learning algorithm
Linear Regression
What is regression (linear functions and other functions), and what are the various types of regression models?
What is linear regression?
Looking at an example of a training set for regression
What is multiple linear regression?
What assumption are we making for errors?
The least squares regression line
How do we learn the parameters? (For single regression and for multiple linear regression)
What is the delta or LMS method and how do we use gradient descent?
What are the LMS update or delta rule, batch descent and stochastic gradient descent?
Introduction to Decision Trees
What is a decision tree?
How to draw a sample decision tree for discrete data?
How to draw a sample decision tree for continuous data?
Generate a decision tree from training examples
Decision tree for playing tennis
Introduction to ID3 (searching for a good tree )
Learning Decision Tree
How do we select attributes for decision tree? (information gain, entropy)
Example of creating a decision tree (using ID3 algorithm)
What is GINI Index?
How do we split continuous attributes and what are the practical issues in classification?
Practical issues in classification
Overfitting
What is overfitting?
An example of underfitting and overfitting
Overfitting due to noise or insufficient examples
How to avoid overfitting?
What is MDL?
What are the conditions for pre pruning?
How do we use reduced error pruning for post pruning?
What are the triple tradeoffs in model selection and generalization?
What is regularization?
Python Exercise on Decision Tree and Linear Regression
Python exercise on linear regression
Python exercise on logistic regression
Python exercise on decision tree regression
Tutorial II
How to solve a sample problem in linear regression?
How to solve problems related to decision trees?
How to find the entropy of a set and use in decision trees?
What is information gain?
K-Nearest Neighbour
What is instance based learning and K-Nearest Neighbour algorithm?
What is the standard distance function (euclidean distance) and the 3 issues related to it?
What are some examples of K-Nearest Neighbour and what is the impact of k?
How can we use weighted distance functions?
Why do we need to remove extra features?
What are the various approaches to giving weights?
Feature Selection
Why do we need feature reduction?
What is the curse of dimensionality?
How can we do feature reduction? (selection and extraction)
How can we evaluate feature subset? (wrapper / supervised and filter / unsupervised)
How can we use the feature selection algorithm? (forward and backward selection algorithm)
What are univariate feature selection methods?
What are multivariate feature selection methods?
Feature Extraction
What is feature extraction and what kind of features do we want?
What are principal components (PCs) and how do we choose features?
How do we choose the direction of the principal components (PCs) and how do we use PCA?
How do we choose a feature (axis) for classification and how is Linear discriminant Analysis useful?
Collaborative Filtering
What is a recommender system?
How can we formally define recommendation problem?
What are the two types of recommendation systems? (content, collaborative filtering)
What are the two types of collaborative filtering? (user-based nearest neighbour, item-based nearest neighbour)
What are the two phases of algorithms for collaborative filtering? (neighbour formation, recommendation)
What are the issues with user based KNN CF?
What is item based collaborative filtering?
Python Exercise on KNN and PCA
What will we cover?
How do we use KNeighborsClassifier in python?
How do we use randomized PCA in Python?
How can we do Face recognition using PCA and KNN?
Tutorial III
What is the curse of dimensionality?
What is feature selection?
What is feature reduction and PCA? (principal component analysis)
How do you calculate the eigenvalues and eigenvectors of a matrix?
What is K-NN (K Nearest Neighbour) Classification?
Bayesian Learning
How is probability used for modelling concepts?
What is Bayes theorem?
Can we look at an example of Bayes theorem?
How can Bayes theorem be applied to find the hypothesis in machine learning? (MAP hypothesis)
What is Bayes optimal classifier?
Gibbs sampling
Naive Bayes
Naive bayes algorithm
Naive bayes algorithm for discrete x
What is smoothing and why is it required?
Can we look at an example of naive bayes algorithm for discrete x?
How do we use smoothing when estimating parameters?
What is the assumption that we made in naive bayes and what happens if it is invalid?
What is gaussian naive bayes? (for continuous X, but discrete Y)
What are bayesian networks?
Bayesian Network
Why do we need bayes network?
Can we look at an example of bayes network?
What does a bayesian network represent?
What can we do with a bayesian network (Inference)?
Where can we apply bayesian network?
How do we define a bayesian network?
What is the graphical representation of naive bayes model?
What is the hidden markov model?
How is learning helped by bayesian belief networks?
Python Exercise on Naive Bayes
How to use the naive bayes classifier?
What is naive bayes classifier?
How is naive bayes classifier relevant in the context of email spam classification?
Tutorial IV
How do we estimate the probabilities using the frequency distribution of probability?
How do we use bayes rule?
What is MAP inference?
What is naive bayes assumption?
What are bayesian networks (the structures), inference and marginalization?
Logistic Regression
What is Logistic Regression (for Classification problems) and sigmoid function?
What are some of the interesting properties of the sigmoid function?
How can we use stochastic gradient descent with logistic regression?
Introduction to Support Vector Machine
Support vector machine
Functional margin
Functional margin of a set of points
Solving the optimization problem
SVM The Dual Formulation
Lagrangian duality in brief
The KKT conditions
Implication of Lagrangian
The dual problem
SVM Maximum Margin with Noise
Linear SVM formulation
Limitation of previous SVM formulation
What objective is to be minimized?
Lagrangian
Dual formulation
Nonlinear SVM and Kernel Function
Non-linear SVM, feature space and kernel function
Kernel trick
Commonly used kernel function
Performance
SVM Solution to the Dual Problem
SMO algorithm (sequential minimal optimization)
Coordinate ascent
SMO (for dual problem)
Python Exercise on SVM
Support vector classification
Visualize the decision boundaries
Load data
Introduction to NN
Neural network and neuron
Perceptron - basic unit in NN
Gradient descent
Stochastic gradient descent
Multi-layer networks - by stacking many NNs
Multilayer Neural Network
Limitation of perceptrons
Multi-layer NN
Power/ Expressiveness of multilayer networks
Two-layer back-propagation neural network
Learning for BP nets
Derivation
Neural Network and Backpropagation Algorithm
Single layer perceptron and boolean functions (OR, XOR)
Representation capability of NNs
Learning in multi-layer NN using back propagation
Derivation
Back propagation algorithm
Training practices: batch vs stochastic and learning in epoch
Overfitting in ANNs and local minima
Deep Neural Network
Deep learning
Hierarchical representation & unsupervised pre-training
Architecture & Training
Pooling
CNN properties
Python Exercise on Neural Network
How can we create an artificial neural network using TensorFlow and TFLearn to recognize handwritten digits?
How do we load dependencies (to recognize handwritten digits)?
How do we load the data (to recognize handwritten digits)?
How do we make the model (to recognize handwritten digits)?
How do we train the model (to recognize handwritten digits)?
What is our takeaway from this exercise (to recognize handwritten digits)?
Tutorial VI
What is a perceptron?
What is the perceptron learning rule?
How do we represent a boolean function using a perceptron?
What is the forward and backward pass algorithm or backpropagation algorithm?
Stochastic gradient descent and batch gradient descent
Quick overview of some deep learning algorithms
Introduction to Computational Learning Theory
Goal of learning theory & Core aspect of machine learning
PAC
Prototypical concept learning task
Sample Complexity Finite Hypothesis Space
What is Sample Complexity?
Can we look at an example of consistent case?
What is Find-S algorithm and what can it do?
VC Dimension
What kind of theorems do we have when the hypothesis space is infinite?
What is shattering?
What is the definition of VC dimension?
What are the upper bound and lower bound on sample complexity with VC?
Introduction to Ensembles
What is ensemble learning?
How can we use weak learners?
How can we combine learners in Bayesian classifiers?
Why are ensembles successful and what are the main challenges with them?
Bagging and Boosting
What is Bagging?
What is Boosting and what is AdaBoost?
Why does ensembling work?
Introduction to Clustering
What is unsupervised learning and clustering?
What are some applications of clustering, and what are various aspects of clustering?
Major clustering approaches
How can we measure the quality of clustering?
Kmeans Clustering
What is K-means algorithm?
How can we describe K-means Algorithm, and can we look at an illustration of it?
What are the similarity and distance measures?
What is the proof of convergence of K-means, time complexity, advantages and disadvantages?
What is model based clustering?
How can we apply K-means on an RGB image?
What is EM algorithm?
Agglomerative Hierarchical Clustering
What is hierarchical clustering, bottom up and top down clustering?
What is a Dendrogram?
What is the algorithm for Agglomerative Hierarchical Clustering?
What is the complete link method?
What is average link clustering?
Python Exercise on kmeans clustering
Can we look at python code for K means algorithm?
Can we look at python code for gaussian mixture model?
Hierarchical agglomerative clustering
Tutorial VIII
What is K-means clustering?
Solving a sample problem in K-means clustering
What is agglomerative hierarchical clustering?
What is gaussian mixture model?