Machine learning is often presented as a collection of algorithms.
Linear regression.
Logistic regression.
Decision trees.
Neural networks.
But applied machine learning is not a menu of techniques.
It is a structured workflow:
Design → Data → Model → Evaluation → Interpretation
In production settings, this workflow extends further to deployment and monitoring. In this free track, we focus on building the foundation correctly.
Machine learning is a system, not a model.
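As a rough illustration of how the five stages map onto code, here is a minimal sketch using scikit-learn. It rests on assumptions not taken from this guide: the data is synthetic (`make_classification`), the model is a plain logistic regression, and accuracy is the chosen metric. Later lessons may make different choices.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Design: binary classification, judged by accuracy on held-out rows.
# Data: a synthetic stand-in (assumption), not the guide's dataset.
X, y = make_classification(
    n_samples=500, n_features=5, n_informative=3, n_redundant=0, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Model: preprocessing and estimator are fitted together as one pipeline.
model = make_pipeline(StandardScaler(), LogisticRegression())
model.fit(X_train, y_train)

# Evaluation: score only on rows the model has not seen.
test_accuracy = accuracy_score(y_test, model.predict(X_test))

# Interpretation: one fitted coefficient per (scaled) feature.
coefficients = model[-1].coef_[0]

print(f"held-out accuracy: {test_accuracy:.3f}")
print("coefficients:", coefficients.round(2))
```

Every stage leaves a trace in the script, and the model itself is only two lines of it.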
Machine learning is the practice of building systems that learn patterns from data to make predictions or decisions on unseen inputs.
In applied settings, success depends less on algorithm choice and more on:
Unlike traditional programming, rules are learned from data rather than hard-coded.
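To make the contrast between hard-coded and learned rules concrete, the sketch below compares a hand-written threshold with one estimated from examples. The feature name `usage`, the toy data, and the threshold values are all hypothetical; a depth-1 decision tree is used only because its single learned rule is easy to read.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

HARD_CODED_THRESHOLD = 10.0  # hypothetical value chosen by a person


def hard_coded_rule(usage_hours):
    """Traditional programming: the rule is written by hand."""
    return int(usage_hours < HARD_CODED_THRESHOLD)


# Toy data (assumption): customers with usage below 15 hours leave.
rng = np.random.default_rng(0)
usage = rng.uniform(0, 40, size=200)
left = (usage < 15).astype(int)

# Machine learning: the threshold is estimated from labeled examples.
stump = DecisionTreeClassifier(max_depth=1, random_state=0)
stump.fit(usage.reshape(-1, 1), left)
learned_threshold = stump.tree_.threshold[0]  # split value at the root node

print(f"hard-coded threshold: {HARD_CODED_THRESHOLD:.1f}")
print(f"learned threshold:    {learned_threshold:.1f}")
```

The hand-written rule stays at 10 whatever the data says, while the fitted rule moves toward the boundary that the examples actually show.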
Machine learning supports decision-making across domains:
The domain changes.
The workflow remains.
This guide focuses on supervised learning using structured tabular data.
You will learn how to:
This is a complete and scalable foundation.
Advanced topics such as hyperparameter tuning, ensemble methods, deployment, and monitoring build on this structure but are beyond the scope of this free track.
You should be comfortable with:
Basic familiarity with statistics is helpful but not required.
This project uses a virtual environment to isolate dependencies.
#| label: create-environment
bash scripts/setup-env.sh
source .venv/bin/activate
This installs the core dependencies:
#| label: generate-dataset
python scripts/make-cdi-customer-churn.py
The dataset will be saved to:
data/ml-ready/cdi-customer-churn.csv
This dataset will be used consistently throughout the guide.
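Once the file exists, a quick load confirms that it is readable. The snippet below is a minimal sketch: `load_dataset` is a hypothetical helper, not part of the project's scripts, and it makes no assumptions about column names, which are not shown here.

```python
from pathlib import Path

import pandas as pd

DATA_PATH = Path("data/ml-ready/cdi-customer-churn.csv")


def load_dataset(path=DATA_PATH):
    """Read the CSV, failing early with a clear message if it is missing."""
    path = Path(path)
    if not path.exists():
        raise FileNotFoundError(
            f"{path} not found; run the dataset generation script first."
        )
    return pd.read_csv(path)


# Only inspect the file when it has already been generated.
if DATA_PATH.exists():
    df = load_dataset()
    print(df.shape)   # (rows, columns)
    print(df.dtypes)  # one dtype per column
```

Run it from the project root so the relative path resolves.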
import sklearn
import pandas as pd
import numpy as np
sklearn.__version__
'1.8.0'
If a version number prints, the environment is correctly configured.
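To check all three packages at once, a short script can report each installed version. This is a sketch under one assumption: the package list is inferred from the imports in the check above, not from the project's dependency file. `installed_versions` is a hypothetical helper name.

```python
from importlib.metadata import PackageNotFoundError, version

# Distribution names as published on PyPI (scikit-learn is imported as `sklearn`).
REQUIRED = ["scikit-learn", "pandas", "numpy"]


def installed_versions(packages=REQUIRED):
    """Map each package name to its installed version, or None if absent."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


for name, ver in installed_versions().items():
    print(f"{name}: {ver or 'NOT INSTALLED'}")
```

A `NOT INSTALLED` line usually means the virtual environment is not activated.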
To render the Quarto book locally:
#| label: render-book
bash scripts/build-all.sh
The rendered site will appear in:
docs/
Navigation is handled automatically by Quarto.
Do not rush into modeling.
Most machine learning errors originate in:
The next lesson begins with the most important skill:
Framing the prediction problem correctly.