# DATA SCIENCE AND ML INTERMEDIATE WORKSHOP: COURSE OUTLINE

Duration: 3 Days

Introduction, Statistics, Probability, Python, Data Manipulation, Visualization (Day 1)

● Introduction to Data Science and Understanding the Problem Statement
● Basic Statistics – Measures of Central Tendency and Variance
● Building Blocks – Probability Distributions, Normal Distribution, Central Limit Theorem
● Inferential Statistics – Sampling, Concept of Hypothesis Testing
● Statistical Methods – Z/t-tests (one-sample, independent, paired), Analysis of Variance, Correlation, and Chi-square
● Important modules for statistical methods: NumPy, SciPy, Pandas
● Using statistical methods with visualization to understand the concepts
● Treatment of Data
● Data Manipulation
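
As a rough illustration of the hypothesis-testing topics listed above, the one-sample and paired t statistics can be computed by hand. This is a minimal sketch that uses only the Python standard library so it runs without extra installs; the function names `one_sample_t` and `paired_t` and the sample values are hypothetical, and the libraries named in the outline (NumPy, SciPy, Pandas) would normally replace the hand-written arithmetic.

```python
import math
import statistics


def one_sample_t(sample, mu0):
    """Return (t statistic, degrees of freedom) for H0: population mean == mu0."""
    n = len(sample)
    if n < 2:
        raise ValueError("need at least two observations")
    mean = statistics.mean(sample)
    sd = statistics.stdev(sample)  # sample standard deviation (n - 1 denominator)
    if sd == 0:
        raise ValueError("zero variance: the t statistic is undefined")
    t = (mean - mu0) / (sd / math.sqrt(n))
    return t, n - 1


def paired_t(before, after):
    """Paired t-test statistic: a one-sample test on the pairwise differences."""
    if len(before) != len(after):
        raise ValueError("paired samples must have equal length")
    diffs = [a - b for b, a in zip(before, after)]
    return one_sample_t(diffs, 0.0)


# Hypothetical measurements, made up for illustration only.
weights = [5.1, 4.9, 5.6, 5.8, 6.0, 5.3, 5.5, 5.7]
t_stat, dof = one_sample_t(weights, mu0=5.0)
print(f"one-sample t = {t_stat:.3f} with {dof} degrees of freedom")
```

The statistic still has to be compared against a t distribution with the returned degrees of freedom to obtain a p-value; that step is left out here because the standard library has no t distribution.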

Supervised Learning (Day 2)

● What is Linear & Non-Linear
● Different types of Data – Numerical & Categorical
● Vector Spaces, Mathematical Functions, Dimensions, and Their Graphical/Vector Representation
● Introduction to Machine Learning & What is a Model
● Types of ML Problems
● Model and Curve
● Linear Regression & Equation
● LR Solvers – OLS method
● LR Solvers – Gradient Descent
● Assumptions of LR
● Evaluation metrics of LR
● Advanced LR concepts – Non-linear, L1 & L2 regularization
● Case-study
● Basics of probability and Odds
● Classification using Linear Regression
● Logit Equation and Logit Function to Solve Classification Problems
● Classification Evaluation – Accuracy, Confusion Matrix, Precision, Recall & F1
● ROC Curve and AUC
● Feature Engineering & Feature Selection in ML Algorithms
● Model Interpretability using SHAP
● Use-cases
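
The two linear-regression solvers listed above (the OLS method and gradient descent) can be compared on a toy problem. The following is a minimal sketch for a single feature, written in plain Python; the function names, the learning rate, the epoch count, and the data are all assumptions chosen for illustration, not material from the workshop.

```python
def fit_ols(xs, ys):
    """Closed-form ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


def fit_gradient_descent(xs, ys, lr=0.05, epochs=5000):
    """Minimise mean squared error with batch gradient descent."""
    n = len(xs)
    intercept, slope = 0.0, 0.0
    for _ in range(epochs):
        # Residuals of the current line on every training point.
        errors = [intercept + slope * x - y for x, y in zip(xs, ys)]
        grad_intercept = 2 * sum(errors) / n
        grad_slope = 2 * sum(e * x for e, x in zip(errors, xs)) / n
        intercept -= lr * grad_intercept
        slope -= lr * grad_slope
    return intercept, slope


# Toy data generated from y = 1 + 2x with no noise (illustrative only).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

ols_intercept, ols_slope = fit_ols(xs, ys)
gd_intercept, gd_slope = fit_gradient_descent(xs, ys)
print("OLS:             ", ols_intercept, ols_slope)
print("Gradient descent:", gd_intercept, gd_slope)
```

On data like this both solvers should land on essentially the same line. The gradient-descent version depends on the learning rate being small enough for the feature scale, which is one reason feature scaling usually comes up alongside it.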

Unsupervised Learning (Day 3)

● Introduction to Unsupervised Learning
● Concepts behind unsupervised techniques and how they apply to business use-cases
● Clustering & Segmentation in ML
● K-means Clustering technique
● Spectral Clustering, DBSCAN & OPTICS algorithms for clustering
● Multi-Cluster Algorithm Analysis of Unsupervised Problems
● Evaluation Metrics for Clustering
● Use-Cases
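
To give a feel for the K-means technique and the clustering evaluation metrics listed above, here is a minimal sketch of Lloyd's algorithm in plain Python, with inertia (within-cluster sum of squared distances) as a simple quality measure. The function names, the random seed, and the six sample points are hypothetical; a library implementation would normally be preferred for real data.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def k_means(points, k, max_iter=100, seed=0):
    """Lloyd's algorithm: alternate assignment and centroid update steps."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # pick k distinct points as a simple initialisation
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the algorithm has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels, centroids


def inertia(points, labels, centroids):
    """Within-cluster sum of squared distances (lower means tighter clusters)."""
    return sum(squared_distance(p, centroids[lab]) for p, lab in zip(points, labels))


# Two well-separated groups of made-up 2-D points.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
labels, centroids = k_means(points, k=2)
print("labels:   ", labels)
print("centroids:", centroids)
print("inertia:  ", inertia(points, labels, centroids))
```

Inertia always decreases as k grows, so it is usually read across several values of k (the "elbow" heuristic) or paired with a measure such as the silhouette score rather than interpreted on its own.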