ML Feature Store
Problem
Design a Feature Store — a centralised repository for storing, sharing, and serving ML features for training and inference.
Why It Matters for You
Direct relevance: Crest Data ML thresholding engine — KPI features needed consistent computation across training and real-time inference. A feature store solves exactly this.
Functional Requirements
- Store + version features computed from raw data
- Serve features for real-time inference (low latency)
- Serve batch features for model training
- Feature discovery — catalog with metadata
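These requirements map onto a small client API. A minimal sketch, assuming in-memory dicts as stand-ins for the real stores; the class and method names (`FeatureStore`, `get_online`, `get_training_rows`) are illustrative, loosely modeled on tools like Feast, not a real library API:

```python
from datetime import datetime

class FeatureStore:
    """Hypothetical minimal client covering the four requirements above."""

    def __init__(self):
        self.registry = {}   # feature name -> metadata (discovery, versioning)
        self.online = {}     # (feature, entity_id) -> latest value
        self.offline = []    # append-only log of (feature, entity_id, ts, value)

    def register(self, name, version, owner, description):
        # Feature discovery: catalog entry with metadata
        self.registry[name] = {"version": version, "owner": owner,
                               "description": description}

    def write(self, name, entity_id, value, ts=None):
        ts = ts or datetime.now()
        self.offline.append((name, entity_id, ts, value))  # history for training
        self.online[(name, entity_id)] = value             # latest for inference

    def get_online(self, name, entity_id):
        # Low-latency point lookup used at inference time
        return self.online[(name, entity_id)]

    def get_training_rows(self, name):
        # Batch read of full history for model training
        return [(e, ts, v) for (n, e, ts, v) in self.offline if n == name]

fs = FeatureStore()
fs.register("avg_kpi_5m", "v1", "ml-team", "5-minute rolling KPI mean")
fs.write("avg_kpi_5m", "host-1", 0.42)
fs.get_online("avg_kpi_5m", "host-1")   # → 0.42
```

The key design point: writes land in both stores, but reads split by use case — point lookups go online, historical scans go offline.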
Non-Functional Requirements
- Low latency online serving (< 10ms p99)
- High throughput batch reads
- Consistency between training and serving (avoiding training-serving skew)
High-Level Design
Data Sources → Feature Pipeline (Spark/Flink)
                      ├→ Offline Store (S3/Data Warehouse) → Model Training
                      └→ Online Store (Redis/DynamoDB) → Real-time Inference
                      ↑
      Feature Registry (metadata, versioning)
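The fan-out in the design above — one pipeline run feeding both stores — can be sketched as follows. This is a toy micro-batch with dicts standing in for S3/Parquet and Redis/DynamoDB; `feature_pipeline` and the rolling-mean feature are illustrative assumptions, not from the source:

```python
from collections import defaultdict

offline_store = []   # append-only history, stand-in for S3/Parquet
online_store = {}    # entity -> latest value, stand-in for Redis/DynamoDB

def feature_pipeline(events):
    """One batch/micro-batch run: aggregate raw events per entity,
    then fan the result out to both stores."""
    sums, counts = defaultdict(float), defaultdict(int)
    for entity, value in events:
        sums[entity] += value
        counts[entity] += 1
    for entity in sums:
        mean = sums[entity] / counts[entity]
        offline_store.append((entity, mean))   # offline: full history
        online_store[entity] = mean            # online: latest only

feature_pipeline([("host-1", 2.0), ("host-1", 4.0), ("host-2", 1.0)])
# online_store["host-1"] → 3.0
```

The offline store accumulates every run's output for training; the online store keeps only the latest value per entity for fast inference lookups.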
Key Concepts
- Online Store — low-latency KV store (Redis, DynamoDB) for inference
- Offline Store — historical data lake (S3 + Parquet) for training
- Feature Pipeline — batch (Spark) or streaming (Flink/Kafka) computation
- Feature Registry — catalog of all features, versions, ownership
- Training-serving skew — the biggest problem: the same feature must be computed identically in the batch (training) path and the streaming (serving) path
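The standard mitigation for training-serving skew is to define each transformation exactly once and import that one definition from both pipelines. A minimal sketch, where `compute_kpi_zscore` is an illustrative feature (not from the source):

```python
def compute_kpi_zscore(value, mean, std):
    """Single definition of the feature, imported by BOTH the batch
    pipeline (e.g. as a Spark UDF) and the streaming pipeline (e.g. in
    a Flink job), so the computation cannot drift between the two."""
    if std == 0:
        return 0.0
    return (value - mean) / std

# Batch path (training): applied over historical rows
train_rows = [(10.0, 8.0, 2.0), (6.0, 8.0, 2.0)]
train_features = [compute_kpi_zscore(v, m, s) for v, m, s in train_rows]

# Streaming path (inference): applied to one live event
live_feature = compute_kpi_zscore(10.0, 8.0, 2.0)

assert live_feature == train_features[0]  # identical by construction
```

If the batch and streaming teams each hand-roll their own version of this formula, even a small difference (say, how `std == 0` is handled) silently degrades the model in production.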
Interview Angle (Your War Story)
At Crest, the ML thresholding engine needed KPI features at training time and at real-time inference time. Without a feature store, we risked skew. Use this to explain WHY feature stores exist.