Morning: Review monitoring dashboards—model accuracy dropped for a segment of users (data drift?). Help a data scientist debug a training pipeline that keeps crashing (out of memory; increase the instance size). Meeting about a new model deployment (they want blue-green deployment but don't know what that means). Update Kubernetes configs for better resource utilization (costs were getting silly).
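That morning drift question usually starts with a quick statistical check before anyone touches the model. Here's a minimal sketch of one common approach, the Population Stability Index (PSI), comparing a feature's live distribution against its training baseline. The bucket count and the "alert above 0.2" threshold are illustrative assumptions, not universal rules.

```python
import math

# Minimal Population Stability Index (PSI) sketch for data-drift checks.
# Assumptions: a numeric feature, baseline data defining the bucket edges,
# and PSI > ~0.2 treated as "worth investigating" (a common rule of thumb).

def psi(baseline, live, buckets=10):
    """Compare two samples of a numeric feature; higher PSI = more drift."""
    lo, hi = min(baseline), max(baseline)
    edges = [lo + (hi - lo) * i / buckets for i in range(1, buckets)]

    def bucket_fractions(values):
        counts = [0] * buckets
        for v in values:
            i = sum(v > e for e in edges)  # index of the bucket v falls into
            counts[i] += 1
        # Floor each fraction at a tiny epsilon to avoid log(0).
        return [max(c / len(values), 1e-6) for c in counts]

    b, l = bucket_fractions(baseline), bucket_fractions(live)
    return sum((lv - bv) * math.log(lv / bv) for bv, lv in zip(b, l))

baseline = [x / 100 for x in range(1000)]     # training-time distribution
shifted = [x / 100 + 3 for x in range(1000)]  # live traffic, shifted upward

print(psi(baseline, baseline))  # ~0: no drift
print(psi(baseline, shifted))   # well above 0.2: flag this segment
```

In practice you'd run a check like this per feature, per segment, on a schedule, and wire the result into the same dashboards you reviewed that morning.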
Midday: Build an automated retraining pipeline, because manual retraining every week is unsustainable. Set up an A/B testing framework for model experiments. Optimize model serving because latency is too high (switch to batching, reduce model size). Meeting with SRE about the oncall rotation (yes, you'll be oncall). Debug why models work in staging but fail in prod (classic infra issue).
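The "switch to batching" fix deserves a sketch, since it's the single most common serving optimization: hold incoming requests briefly, then run one batched model call instead of many single-item calls. Everything here is illustrative: `predict_batch` stands in for a real model, and the 32-item / 10 ms limits are tuning knobs, not recommendations.

```python
import threading
import queue
import time

def predict_batch(inputs):
    """Placeholder model; one vectorized call per batch is the whole win."""
    return [x * 2 for x in inputs]

class Batcher:
    """Dynamic micro-batching: collect requests until the batch is full
    or a short deadline passes, then serve them in one model call."""

    def __init__(self, max_batch=32, max_wait_s=0.01):
        self.q = queue.Queue()
        self.max_batch = max_batch
        self.max_wait_s = max_wait_s
        threading.Thread(target=self._loop, daemon=True).start()

    def submit(self, x):
        """Called per request; blocks until the batched result arrives."""
        slot = {"input": x, "done": threading.Event()}
        self.q.put(slot)
        slot["done"].wait()
        return slot["output"]

    def _loop(self):
        while True:
            batch = [self.q.get()]  # block until at least one request
            deadline = time.monotonic() + self.max_wait_s
            while len(batch) < self.max_batch:
                timeout = deadline - time.monotonic()
                if timeout <= 0:
                    break
                try:
                    batch.append(self.q.get(timeout=timeout))
                except queue.Empty:
                    break
            outputs = predict_batch([s["input"] for s in batch])
            for slot, out in zip(batch, outputs):
                slot["output"] = out
                slot["done"].set()

batcher = Batcher()
print(batcher.submit(21))  # → 42, served as part of a batch
```

The trade-off is explicit: you add up to `max_wait_s` of latency per request in exchange for much higher throughput per model call, which is usually the right deal for GPU-backed inference.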
Afternoon: Code review for ML pipeline changes. Implement a feature store so teams stop recreating the same features. Work on cost optimization (inference costs are 40% of the cloud bill). Document the deployment process (actually important for MLOps). Oncall alert—model endpoint is 503ing (scale up replicas, investigate root cause later). Help the hiring team interview an MLOps candidate (there aren't many; hope they're good).
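To make the feature-store idea concrete, here's a toy sketch of the core contract: one shared, timestamped place for feature values, with both "latest" and point-in-time reads. The class and method names are hypothetical, not any specific product's API, but the `get_as_of` read is the part that matters, since point-in-time correctness is what prevents training/serving skew.

```python
import time

# Toy feature-store sketch (illustrative names, not a real product's API).
# Values are stored per (entity, feature) with a timestamp so reads can be
# either "latest" (serving) or "as of time T" (building training sets).

class FeatureStore:
    def __init__(self):
        self._data = {}  # (entity_id, feature_name) -> list of (ts, value)

    def put(self, entity_id, feature_name, value, ts=None):
        key = (entity_id, feature_name)
        self._data.setdefault(key, []).append((ts or time.time(), value))

    def get_latest(self, entity_id, feature_name):
        history = self._data.get((entity_id, feature_name), [])
        return max(history, key=lambda tv: tv[0])[1] if history else None

    def get_as_of(self, entity_id, feature_name, ts):
        """Point-in-time read: only values known at ts, to avoid leakage."""
        history = [(t, v)
                   for t, v in self._data.get((entity_id, feature_name), [])
                   if t <= ts]
        return max(history, key=lambda tv: tv[0])[1] if history else None

store = FeatureStore()
store.put("user_42", "7d_purchase_count", 3, ts=100)
store.put("user_42", "7d_purchase_count", 5, ts=200)
print(store.get_latest("user_42", "7d_purchase_count"))      # 5
print(store.get_as_of("user_42", "7d_purchase_count", 150))  # 3
```

A real deployment adds an offline store for training data and an online store for low-latency serving, but the read semantics above are the part teams get wrong when they hand-roll features.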
Breakdown: 40% infrastructure/automation work, 25% debugging/oncall, 20% building new MLOps capabilities, 15% meetings/collaboration. It's more operational than an ML Engineer role, and more ML-focused than regular DevOps. If you love automation, reliability, and seeing systems work smoothly, you'll love this. If you want to build models, you'll be bored.