## MACHINE LEARNING: COURSE OUTLINE

Duration: 5 Days

Day 1

1. Introduction

Definition of learning systems. Goals and applications of machine learning. Aspects of developing a learning system: training data, concept representation, function approximation.

2. Inductive Classification

The concept learning task. Concept learning as search through a hypothesis space. General-to-specific ordering of hypotheses.

Day 2

3. Decision Tree Learning

Representing concepts as decision trees. Recursive induction of decision trees. Picking the best splitting attribute: entropy and information gain. Searching for simple trees and computational complexity. Occam’s razor. Overfitting, noisy data, and pruning.
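As a minimal sketch of how entropy and information gain can rank candidate splitting attributes, the Python snippet below computes both for a tiny, made-up dataset. The attribute names (`outlook`, `windy`), the labels, and the helper function names are illustrative assumptions, not course material.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(rows, labels, attribute):
    """Entropy reduction obtained by splitting the examples on one attribute.

    rows is a list of dicts mapping attribute names to values.
    """
    total = len(labels)
    # Group the class labels by the value each example has for the attribute.
    partitions = {}
    for row, label in zip(rows, labels):
        partitions.setdefault(row[attribute], []).append(label)
    # Weighted average entropy of the partitions after the split.
    remainder = sum(len(part) / total * entropy(part) for part in partitions.values())
    return entropy(labels) - remainder


# Hypothetical toy data: "outlook" determines the label, "windy" does not.
rows = [
    {"outlook": "sunny", "windy": False},
    {"outlook": "sunny", "windy": True},
    {"outlook": "rain", "windy": False},
    {"outlook": "rain", "windy": True},
]
labels = ["no", "no", "yes", "yes"]

gain_outlook = information_gain(rows, labels, "outlook")
gain_windy = information_gain(rows, labels, "windy")
print(gain_outlook, gain_windy)
```

In this toy case a greedy tree learner would split on `outlook` first, because it removes all uncertainty about the label while `windy` removes none.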

4. Ensemble Learning

Using committees of multiple hypotheses. Bagging, boosting, and DECORATE. Active learning with ensembles.

5. Experimental Evaluation of Learning Algorithms

Measuring the accuracy of learned hypotheses. Comparing learning algorithms: cross-validation, learning curves, and statistical hypothesis testing.

Day 3

6. Computational Learning Theory

Models of learnability: learning in the limit; probably approximately correct (PAC) learning. Sample complexity: quantifying the number of examples needed to PAC learn.
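To make sample complexity concrete, the sketch below evaluates the standard textbook bound for a consistent learner over a finite hypothesis space, m >= (1/epsilon) * (ln|H| + ln(1/delta)). The function name and the example numbers are assumptions chosen for illustration.

```python
import math


def pac_sample_bound(hypothesis_count, epsilon, delta):
    """Number of examples sufficient for a consistent learner over a finite
    hypothesis space to be probably (1 - delta) approximately (error <= epsilon)
    correct: m >= (ln|H| + ln(1/delta)) / epsilon.
    """
    return math.ceil((math.log(hypothesis_count) + math.log(1 / delta)) / epsilon)


# Illustrative case: conjunctions over 10 boolean features, where each feature
# is required true, required false, or ignored, giving |H| = 3**10.
examples_needed = pac_sample_bound(3 ** 10, epsilon=0.1, delta=0.05)
print(examples_needed)
```

The bound grows only logarithmically in the size of the hypothesis space and in 1/delta, but linearly in 1/epsilon.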

7. Rule Learning: Propositional and First-Order

Translating decision trees into rules. Heuristic rule induction using separate and conquer and information gain.

8. Artificial Neural Networks

Neurons and biological motivation. Linear threshold units. Perceptrons: representational limitation and gradient descent training.
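The following sketch illustrates a single linear threshold unit trained with the classic perceptron training rule (an error-driven update, shown here in place of a full gradient descent derivation). It learns AND, which is linearly separable, and cannot fit XOR, which shows the representational limitation. The function names, learning rate, and epoch count are illustrative assumptions.

```python
def predict(weights, bias, x):
    """Linear threshold unit: output 1 if the weighted sum exceeds 0, else 0."""
    activation = bias + sum(w * xi for w, xi in zip(weights, x))
    return 1 if activation > 0 else 0


def train_perceptron(examples, learning_rate=1.0, epochs=50):
    """Perceptron training rule: nudge weights by learning_rate * error * input."""
    weights = [0.0] * len(examples[0][0])
    bias = 0.0
    for _ in range(epochs):
        mistakes = 0
        for x, target in examples:
            error = target - predict(weights, bias, x)
            if error != 0:
                mistakes += 1
                weights = [w + learning_rate * error * xi for w, xi in zip(weights, x)]
                bias += learning_rate * error
        if mistakes == 0:  # every example classified correctly: converged
            break
    return weights, bias


inputs = [(0, 0), (0, 1), (1, 0), (1, 1)]
and_targets = [0, 0, 0, 1]
xor_targets = [0, 1, 1, 0]

w_and, b_and = train_perceptron(list(zip(inputs, and_targets)))
and_predictions = [predict(w_and, b_and, x) for x in inputs]

w_xor, b_xor = train_perceptron(list(zip(inputs, xor_targets)))
xor_predictions = [predict(w_xor, b_xor, x) for x in inputs]

print(and_predictions)  # matches and_targets
print(xor_predictions)  # cannot match xor_targets with a single unit
```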

Day 4

9. Support Vector Machines

Maximum margin linear separators. Quadratic programming solution to finding maximum margin separators. Kernels for learning non-linear functions.

10. Bayesian Learning

Probability theory and Bayes rule. Naive Bayes learning algorithm.

Day 5

11. Instance-Based Learning

Constructing explicit generalizations versus comparing to past specific examples. k-nearest-neighbor algorithm. Case-based learning.
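A minimal sketch of k-nearest-neighbor classification, assuming Euclidean distance and a simple majority vote: no explicit generalization is built, and each query is compared directly to stored examples. The training points, labels, and function name are made up for illustration.

```python
import math
from collections import Counter


def knn_classify(training, query, k=3):
    """Label a query point by majority vote among its k closest training examples."""
    # Sort stored examples by Euclidean distance to the query and keep the nearest k.
    neighbors = sorted(training, key=lambda example: math.dist(example[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


# Hypothetical two-cluster training set of (point, label) pairs.
training = [
    ((1.0, 1.0), "a"), ((1.2, 0.8), "a"), ((0.9, 1.1), "a"),
    ((5.0, 5.0), "b"), ((5.2, 4.9), "b"), ((4.8, 5.1), "b"),
]

near_a = knn_classify(training, (1.1, 1.0))
near_b = knn_classify(training, (5.0, 4.8))
print(near_a, near_b)
```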

12. Text Classification

Bag of words representation. Vector space model and cosine similarity. Relevance feedback and Rocchio algorithm.

Clustering and Unsupervised Learning: learning from unclassified data. Clustering. Hierarchical agglomerative clustering. k-means partitional clustering.
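As a small illustration of the bag of words representation and cosine similarity listed under Text Classification, the sketch below turns short texts into term-count vectors and compares them. Whitespace tokenization, raw counts (no TF-IDF weighting), and the sample sentences are simplifying assumptions.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Represent a text as term counts, ignoring word order."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse term-count vectors."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(count * count for count in a.values()))
    norm_b = math.sqrt(sum(count * count for count in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty document is treated as similar to nothing
    return dot / (norm_a * norm_b)


doc1 = bag_of_words("machine learning learns from data")
doc2 = bag_of_words("deep learning learns from data")
doc3 = bag_of_words("the weather is sunny today")

sim_related = cosine_similarity(doc1, doc2)
sim_unrelated = cosine_similarity(doc1, doc3)
print(sim_related, sim_unrelated)
```

Documents that share most of their vocabulary score close to 1, while documents with no terms in common score 0.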

13. Language Learning

Classification problems in language: word-sense disambiguation, sequence labeling. Hidden Markov models (HMMs).

LAB:

1. We will use RStudio and R packages for the practice sessions.

2. We will work through a real-time project using machine learning.

Projects:

1. Project based on a real-time dataset.