June 4, 2018 by Artificial Intelligence Training, Training

3-Day Training Course: Machine Learning

Day 1


Topic Description
Core Concepts and Techniques. Theory
  • An introduction to machine learning tasks and definitions
  • Core principles of building machine learning algorithms
  • A range of machine learning algorithms, from linear regression to random forests
  • Core Python packages for machine learning
Core Concepts and Techniques. Labs

*Packages of choice are Pandas/NumPy/scikit-learn

  • Linear and logistic regressions
  • k-nearest neighbors and k-means
  • Decision trees and random forest
  • Handling classification, regression, and clustering tasks
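
As a rough illustration of the three task types listed above, the sketch below fits one scikit-learn model per lab topic on synthetic data. The datasets, the `run_day1_sketch` helper, and all parameter values are assumptions made for this example and are not taken from the course materials.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs, make_classification, make_regression
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LinearRegression, LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier


def run_day1_sketch(seed=0):
    """Fit one model per task type on synthetic data and return test scores."""
    scores = {}

    # Classification: three classifiers evaluated on the same held-out split.
    X, y = make_classification(n_samples=400, n_features=8, n_informative=4,
                               n_clusters_per_class=1, class_sep=2.0,
                               random_state=seed)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25,
                                              random_state=seed)
    classifiers = {
        "logistic_regression": LogisticRegression(max_iter=1000),
        "knn": KNeighborsClassifier(n_neighbors=5),
        "random_forest": RandomForestClassifier(n_estimators=50,
                                                random_state=seed),
    }
    for name, model in classifiers.items():
        scores[name] = model.fit(X_tr, y_tr).score(X_te, y_te)  # accuracy

    # Regression: ordinary least squares, scored with R^2 on held-out data.
    Xr, yr = make_regression(n_samples=400, n_features=8, noise=5.0,
                             random_state=seed)
    Xr_tr, Xr_te, yr_tr, yr_te = train_test_split(Xr, yr, test_size=0.25,
                                                  random_state=seed)
    scores["linear_regression_r2"] = (
        LinearRegression().fit(Xr_tr, yr_tr).score(Xr_te, yr_te)
    )

    # Clustering: k-means uses no labels, so we only count the clusters found.
    Xb, _ = make_blobs(n_samples=300, centers=3, random_state=seed)
    labels = KMeans(n_clusters=3, n_init=10, random_state=seed).fit_predict(Xb)
    scores["kmeans_clusters"] = len(set(labels))
    return scores


scores = run_day1_sketch()
for name, value in scores.items():
    print(name, value)
```

Scoring every supervised model on a held-out split, rather than on the training data, keeps the comparison between algorithms fair.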


Day 2

Topic Description
Advanced Algorithms. Theory
Day 2 covers the theoretical concepts underlying more complex models and techniques, such as:

  • LASSO/Ridge (regularization)
  • PCA/SVD (dimensionality reduction)
  • Advanced clustering algorithms, such as DBSCAN and expectation-maximization (different notions of similarity in data)
  • Naive Bayes (Bayes' theorem)
  • Complex ensembling schemes, gradient boosting, stacking (iterative refinement)
  • Algorithmic hyperparameter tuning
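
To make the regularization item above concrete, here is a minimal sketch comparing LASSO and Ridge on synthetic data where only two of ten features matter. The data, the `alpha` values, and the variable names are assumptions chosen for illustration.

```python
import numpy as np
from sklearn.linear_model import Lasso, Ridge

# Synthetic data: the target depends on features 0 and 1 only;
# the remaining eight features are pure noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + 0.1 * rng.normal(size=200)

lasso = Lasso(alpha=0.1).fit(X, y)   # L1 penalty
ridge = Ridge(alpha=1.0).fit(X, y)   # L2 penalty

# The L1 penalty can set coefficients exactly to zero (implicit feature
# selection), while the L2 penalty only shrinks them toward zero.
n_zero_lasso = int(np.sum(lasso.coef_ == 0.0))
n_zero_ridge = int(np.sum(ridge.coef_ == 0.0))
print("zero coefficients, LASSO:", n_zero_lasso)
print("zero coefficients, Ridge:", n_zero_ridge)
```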
Advanced Algorithms. Labs

*Packages of choice are Pandas/NumPy/scikit-learn/HyperOpt/XGBoost

  • PCA
  • DBSCAN, expectation-maximization, agglomerative clustering, mean shift
  • Naive Bayes
  • Gradient boosting machine, stacking
  • Tree-structured Parzen estimator
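
A possible shape for part of these labs is sketched below: PCA for dimensionality reduction, followed by two clustering approaches on the reduced data. It uses only scikit-learn; the synthetic dataset and the `eps`, `min_samples`, and component counts are assumptions, and `GaussianMixture` stands in for expectation-maximization because it is fitted with the EM algorithm.

```python
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.datasets import make_blobs
from sklearn.decomposition import PCA
from sklearn.mixture import GaussianMixture
from sklearn.preprocessing import StandardScaler

# Three synthetic clusters in five dimensions, standardized before PCA.
X, _ = make_blobs(n_samples=300, centers=3, n_features=5,
                  cluster_std=1.0, random_state=0)
X_scaled = StandardScaler().fit_transform(X)

# PCA: project onto the two directions of largest variance.
pca = PCA(n_components=2)
X_2d = pca.fit_transform(X_scaled)
explained = float(pca.explained_variance_ratio_.sum())

# Gaussian mixture model, fitted with expectation-maximization.
gmm = GaussianMixture(n_components=3, random_state=0).fit(X_2d)
gmm_labels = gmm.predict(X_2d)

# DBSCAN: density-based clustering; the label -1 marks noise points.
db_labels = DBSCAN(eps=0.5, min_samples=5).fit_predict(X_2d)
n_noise = int(np.sum(db_labels == -1))

print("variance kept by 2 components:", explained)
print("GMM clusters used:", sorted(set(gmm_labels)))
print("DBSCAN noise points:", n_noise)
```

Unlike the mixture model, DBSCAN does not take the number of clusters as input; it derives the clusters from the density parameters, which is one way the two methods embody different notions of similarity.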

Day 3

Topic Description
Feature Engineering and Development Methodology. Theory
A wide range of topics related to building ML models will be covered:

  • Feature engineering
  • Dealing with missing data and outliers
  • Dealing with imbalanced classification
  • Advanced validation schemes
  • Handling of model versioning
  • CRISP-DM as a major machine learning development methodology
Feature Engineering and Development Methodology. Labs

*Packages of choice are Pandas/NumPy/scikit-learn

Feature engineering:

  • Polynomial and logarithmic features, combinations of features
  • Periodic feature encoding
  • Target encodings
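
The feature engineering items above could look roughly like this in pandas and NumPy. The toy table, the column names, and the helpers `encode_periodic` and `mean_target_encode` are hypothetical and exist only for this sketch.

```python
import numpy as np
import pandas as pd


def encode_periodic(values, period):
    """Map a cyclic quantity (hour, weekday, month) onto the unit circle."""
    angle = 2.0 * np.pi * np.asarray(values, dtype=float) / period
    return np.column_stack([np.sin(angle), np.cos(angle)])


def mean_target_encode(train, column, target):
    """Replace each category with the mean target observed for it in `train`."""
    means = train.groupby(column)[target].mean()
    return train[column].map(means)


df = pd.DataFrame({
    "hour": [0, 6, 12, 23],
    "income": [0.0, 10.0, 100.0, 1000.0],
    "city": ["a", "a", "b", "b"],
    "bought": [1, 0, 1, 1],
})

# Logarithmic and polynomial features, plus a combination of two features.
df["log_income"] = np.log1p(df["income"])
df["income_sq"] = df["income"] ** 2
df["income_x_hour"] = df["income"] * df["hour"]

# Periodic encoding: hour 23 ends up next to hour 0, unlike the raw integers.
hour_xy = encode_periodic(df["hour"], period=24)
df["hour_sin"], df["hour_cos"] = hour_xy[:, 0], hour_xy[:, 1]

# Naive target encoding, computed on the same rows it is applied to.
df["city_te"] = mean_target_encode(df, "city", "bought")
print(df)
```

Computing a target encoding on the same rows that are later used for training leaks the target into the features; in practice the category means are usually estimated on separate folds.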

Imbalanced classification:

  • Advanced metrics for classification
  • Threshold tuning
  • Oversampling (e.g., SMOTE) and undersampling
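
Threshold tuning can be sketched with NumPy alone: sweep candidate thresholds over the predicted scores and keep the one with the best F1. The helper names and the toy labels and scores are assumptions for illustration; SMOTE is not shown because it lives in the separate imbalanced-learn package rather than in the packages listed for this lab.

```python
import numpy as np


def f1_at_threshold(y_true, scores, threshold):
    """F1 score when every sample with score >= threshold is predicted positive."""
    pred = scores >= threshold
    tp = np.sum(pred & (y_true == 1))
    fp = np.sum(pred & (y_true == 0))
    fn = np.sum(~pred & (y_true == 1))
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def best_f1_threshold(y_true, scores):
    """Try each distinct score as a threshold and return (threshold, F1)."""
    y_true = np.asarray(y_true)
    scores = np.asarray(scores, dtype=float)
    candidates = np.unique(scores)
    f1s = [f1_at_threshold(y_true, scores, t) for t in candidates]
    best = int(np.argmax(f1s))
    return float(candidates[best]), float(f1s[best])


# Imbalanced toy data: two positives among six samples.
y_true = [0, 0, 0, 0, 1, 1]
scores = [0.10, 0.20, 0.30, 0.35, 0.40, 0.90]
threshold, f1 = best_f1_threshold(y_true, scores)
print(threshold, f1)
```

On this toy data the default cutoff of 0.5 would miss the positive scored 0.40, which is why the threshold is tuned instead of fixed. The threshold should be chosen on validation data, not on the test set.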

Missing data handling:

  • Imputation of missing values using k-nearest neighbors or decision trees
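
A minimal sketch of the k-nearest-neighbors variant, using scikit-learn's `KNNImputer`; the small matrix and `n_neighbors=2` are assumptions for this example. Decision-tree-based imputation is not shown here.

```python
import numpy as np
from sklearn.impute import KNNImputer

# Small matrix with two missing entries (np.nan).
X = np.array([
    [1.0, 2.0, np.nan],
    [3.0, 4.0, 3.0],
    [np.nan, 6.0, 5.0],
    [8.0, 8.0, 7.0],
])
observed = ~np.isnan(X)

# Each missing value is filled with the mean of that feature over the
# two nearest rows, with distances computed on the observed features.
imputer = KNNImputer(n_neighbors=2)
X_filled = imputer.fit_transform(X)
print(X_filled)
```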

Advanced validation:

  • Cross-validation for time series
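
For time series, shuffled k-fold splits would let a model train on the future and test on the past. One common alternative, sketched here with scikit-learn's `TimeSeriesSplit`, uses expanding training windows that always precede the test window. The eight-sample series and `n_splits=3` are assumptions for illustration.

```python
import numpy as np
from sklearn.model_selection import TimeSeriesSplit

# Eight observations in chronological order.
X = np.arange(8).reshape(-1, 1)

splitter = TimeSeriesSplit(n_splits=3)
splits = list(splitter.split(X))

for train_idx, test_idx in splits:
    # Every training index precedes every test index, so no future
    # information leaks into the model being validated.
    print("train:", train_idx, "test:", test_idx)
```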


© 2001–2019 Altoros