I wanted to learn MLOps properly — not just the concepts, but the actual implementation patterns. Most tutorials stop at model.fit(), but production ML needs reproducible experiments, versioned artifacts, configurable pipelines, and clean separation between training and serving. None of that shows up in a Kaggle notebook.
So I built a reference implementation that covers all of it.
I chose housing prices as the domain because it’s a classic ML dataset and the problem is well-understood. That let me focus on the infrastructure patterns rather than the modeling. This project isn’t really about predicting house prices — it’s about building ML systems the way you’d want them built in the real world.
What it covers
Pipeline Modularity
Feature engineering, training, evaluation, and prediction are separate CLI commands. Each stage reads config, logs to MLflow, and produces artifacts the next stage can consume.
poetry run feature-engineering # Fit and persist preprocessors
poetry run train-model # Train and log to MLflow
poetry run evaluate-model # Score on holdout data
poetry run predict-model # Run single predictions
MLflow Tracking
Every run logs parameters, metrics, and artifacts. Models are versioned. Runs can be compared in the UI. Nothing gets lost.
poetry run mlflow ui --backend-store-uri file:./mlruns
Streaming Inference
Beyond batch, there’s a streaming workflow that continuously reads data, runs inference, and writes predictions. Configurable batch size, checkpointing, optional metric logging. It’s designed as a stepping stone toward real streaming systems (Kafka, Flink) without all the infrastructure overhead.
poetry run stream-infer
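The core read-predict-checkpoint loop can be sketched in plain Python. The function shape, the JSON checkpoint file, and the offset-based resume logic are illustrative assumptions about how such a loop might look, not the repo's actual implementation:

```python
import csv
import json
from pathlib import Path

def stream_infer(source: Path, checkpoint: Path, predict, batch_size: int = 100):
    """Read rows in batches, run inference, and checkpoint the last offset
    so a restart resumes where the previous run stopped."""
    offset = 0
    if checkpoint.exists():
        offset = json.loads(checkpoint.read_text())["offset"]

    with source.open() as f:
        reader = csv.DictReader(f)
        rows = list(reader)[offset:]  # skip rows already processed

    predictions = []
    for start in range(0, len(rows), batch_size):
        batch = rows[start:start + batch_size]
        predictions.extend(predict(batch))
        offset += len(batch)
        # Checkpoint after every batch, not every row, to bound I/O.
        checkpoint.write_text(json.dumps({"offset": offset}))
    return predictions
```

Swapping in Kafka would mean replacing the CSV read with a consumer poll and moving offset storage to the broker; the batch-and-checkpoint shape stays the same.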
FastAPI Serving
A /predict endpoint that loads the latest MLflow model. Partial inputs are auto-imputed from training statistics. You can pin to a specific run ID or let it grab the most recent.
curl -X POST http://localhost:8000/predict \
-H "Content-Type: application/json" \
-d '{"features": {"area": 7500, "bedrooms": 3, "stories": 2}}'
Configuration Management
YAML defaults, overridden by environment variables, overridden in turn by CLI flags. Flexible enough to run in different environments without hardcoding anything.
The patterns
This is what the repo is really about:
- Modular CLIs — each pipeline stage is independently runnable and testable
- Artifact tracking — models, preprocessors, configs all versioned in MLflow
- Config layering — YAML → env vars → CLI flags, documented in docs/CONFIG_REFERENCE.md
- Training/serving separation — the serving layer pulls artifacts from MLflow, not from training code
- Imputation at inference — training statistics are logged alongside the model so inference handles missing fields gracefully
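The training-statistics half of that last pattern can be sketched with the stdlib. The choice of medians, the function names, and the `training_stats.json` filename are illustrative assumptions:

```python
import json
import statistics
from pathlib import Path

def compute_training_stats(rows: list[dict]) -> dict:
    """Per-feature medians computed on the training set (medians are an
    illustrative choice; means or modes would follow the same pattern)."""
    features = rows[0].keys()
    return {f: statistics.median(r[f] for r in rows) for f in features}

def log_stats_with_model(stats: dict, artifact_dir: Path) -> Path:
    """Persist the statistics next to the model artifacts, so the serving
    layer can fetch them from the tracking store instead of training code."""
    path = artifact_dir / "training_stats.json"
    path.write_text(json.dumps(stats))
    return path
```

Logging the statistics as a run artifact at training time is what lets the serving layer impute missing fields without ever importing the training code.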
Current state
Complete. It does what it set out to do. The pipeline runs end-to-end, artifacts are tracked, the API serves predictions, streaming works.
If I extend it, the natural next step is swapping the CSV reader for a Kafka consumer. The streaming code is structured with that migration in mind.
This repo is meant to be forked and adapted. The housing prices are just a vehicle — the value is in the patterns.