Bloomberg ML System Design: Real-time Fraud Detection

Topics: Fraud Detection, Online Inference, Feature Engineering
Roles: Machine Learning Engineer, Data Scientist, Software Engineer
Experience: Mid Level, Senior, Staff

Question Description

You are asked to design a low-latency, production-ready machine learning system that detects fraudulent e-commerce transactions in real time.

The core task is to ingest streams of transaction events (user profile, payment method, device, location, purchase history), compute real-time features, run an online inference model, and output a fraud probability and binary decision within strict latency bounds (<=100 ms). You must also provide data storage for auditing and retraining, a safe deployment strategy for model updates, and monitoring for model quality and data drift.
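
To make the output contract concrete, here is a minimal Python sketch of a scoring call that returns a fraud probability, a binary decision, and the measured latency against the 100 ms budget. The names `Decision` and `score_transaction`, the 0.5 default threshold, and the fail-open fallback (probability 0.0 when the model raises) are illustrative assumptions, not part of the question.

```python
import time
from dataclasses import dataclass
from typing import Callable, Mapping


@dataclass(frozen=True)
class Decision:
    fraud_probability: float
    is_fraud: bool
    latency_ms: float
    within_budget: bool
    used_fallback: bool


def score_transaction(
    features: Mapping[str, float],
    model: Callable[[Mapping[str, float]], float],
    threshold: float = 0.5,             # assumed operating point
    budget_ms: float = 100.0,           # latency bound from the question
    fallback_probability: float = 0.0,  # assumed fail-open policy
) -> Decision:
    """Score one transaction and report whether the latency budget was met."""
    start = time.perf_counter()
    used_fallback = False
    try:
        # Clamp to [0, 1] so a misbehaving model cannot emit an invalid probability.
        probability = min(1.0, max(0.0, float(model(features))))
    except Exception:
        # If the model fails, fall back to a fixed score instead of blocking checkout.
        probability = fallback_probability
        used_fallback = True
    latency_ms = (time.perf_counter() - start) * 1000.0
    return Decision(
        fraud_probability=probability,
        is_fraud=probability >= threshold,
        latency_ms=latency_ms,
        within_budget=latency_ms <= budget_ms,
        used_fallback=used_fallback,
    )
```

This sketch only measures latency after the fact. A production service would also need a hard timeout around the model call, and whether to fail open or fail closed is a business decision worth raising in the interview.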

High-level flow you should discuss:

  • Data ingestion and validation (streaming platforms, gateways)
  • Real-time feature computation (feature store, windowing, sessionization)
  • Low-latency model serving (lightweight ensemble or neural network optimized for inference)
  • Decisioning and alerting (rules + score thresholds, manual review queue)
  • Storage and offline pipelines (cold store for full transactions, warm store for recent context)
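
The real-time feature computation step in the flow above can be illustrated with a sliding-window "velocity" feature, such as how many transactions a user made in the last 60 seconds. The sketch below is a single-process, in-memory stand-in for what a stream processor or feature store would do. The class name `SlidingWindowFeatures`, the feature names, and the assumption that events arrive in timestamp order are all illustrative.

```python
from collections import defaultdict, deque
from typing import Deque, Dict, Tuple


class SlidingWindowFeatures:
    """Per-user transaction count and amount sum over a trailing time window."""

    def __init__(self, window_seconds: float) -> None:
        self.window_seconds = window_seconds
        # user_id -> deque of (timestamp, amount), oldest first
        self._events: Dict[str, Deque[Tuple[float, float]]] = defaultdict(deque)

    def add(self, user_id: str, timestamp: float, amount: float) -> None:
        # Assumes events for a given user arrive in non-decreasing timestamp order.
        self._events[user_id].append((timestamp, amount))
        self._evict(user_id, timestamp)

    def _evict(self, user_id: str, now: float) -> None:
        # Drop events at or before the cutoff, so the window is (now - window, now].
        queue = self._events[user_id]
        cutoff = now - self.window_seconds
        while queue and queue[0][0] <= cutoff:
            queue.popleft()

    def features(self, user_id: str, now: float) -> Dict[str, float]:
        self._evict(user_id, now)
        queue = self._events[user_id]
        return {
            "txn_count": len(queue),
            "txn_amount_sum": sum(amount for _, amount in queue),
        }
```

A brief usage example:

```python
window = SlidingWindowFeatures(window_seconds=60)
window.add("u1", timestamp=0, amount=10.0)
window.add("u1", timestamp=30, amount=20.0)
window.add("u1", timestamp=70, amount=5.0)
print(window.features("u1", now=70))  # the event at t=0 has aged out of the window
```

A user with no recorded events gets zero-valued features, which is one simple way to handle the cold-start case. Out-of-order events and state shared across serving replicas are left out of this sketch.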

Skill signals interviewers expect:

  • Knowledge of online inference and feature stores (latency, consistency, cold starts)
  • Trade-offs between precision/recall and user experience (thresholding, cost of false positives)
  • Scalability and reliability design (sharding, autoscaling, failover)
  • Monitoring and observability (latency SLOs, data/model drift detection, alerting)
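
The precision/recall trade-off in the list above is often framed as cost minimization: a false positive blocks a legitimate customer, while a false negative lets fraud through. The sketch below picks a score threshold that minimizes total cost on a labeled validation set. The function names, the cost values used in the test data, and the 0.01-step threshold grid are assumptions for illustration.

```python
from typing import Optional, Sequence


def expected_cost(
    labels: Sequence[int],
    scores: Sequence[float],
    threshold: float,
    cost_fp: float,
    cost_fn: float,
) -> float:
    """Total cost of flagging every transaction whose score is >= threshold."""
    false_positives = sum(
        1 for y, s in zip(labels, scores) if s >= threshold and y == 0
    )
    false_negatives = sum(
        1 for y, s in zip(labels, scores) if s < threshold and y == 1
    )
    return false_positives * cost_fp + false_negatives * cost_fn


def pick_threshold(
    labels: Sequence[int],
    scores: Sequence[float],
    cost_fp: float,
    cost_fn: float,
    candidates: Optional[Sequence[float]] = None,
) -> float:
    """Return the candidate threshold with the lowest total cost (lowest wins ties)."""
    if candidates is None:
        candidates = [i / 100 for i in range(101)]  # assumed grid: 0.00 .. 1.00
    return min(
        candidates,
        key=lambda t: expected_cost(labels, scores, t, cost_fp, cost_fn),
    )
```

When missed fraud is costlier than a blocked customer, the chosen threshold moves lower (more flags); when false positives are costlier, it moves higher. In practice a middle band of scores is often routed to the manual review queue instead of being auto-declined.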

Be ready to justify design trade-offs, propose A/B or shadow testing strategies, and describe responses to evolving fraud patterns.
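
As one way to picture shadow testing, the sketch below scores each transaction with both the production model and a candidate model, serves only the production decision, and logs the pair for offline comparison. The function name `shadow_score`, the log record fields, and the in-memory list used as a log sink are hypothetical. A real deployment would most likely run the candidate asynchronously so it cannot add latency to the request path.

```python
from typing import Callable, Dict, List, Mapping

Model = Callable[[Mapping[str, float]], float]


def shadow_score(
    features: Mapping[str, float],
    production_model: Model,
    candidate_model: Model,
    threshold: float,
    log: List[Dict[str, object]],
) -> bool:
    """Return the production decision; record the candidate's output for comparison."""
    prod_probability = production_model(features)
    decision = prod_probability >= threshold
    try:
        cand_probability = candidate_model(features)
        log.append({
            "prod_probability": prod_probability,
            "cand_probability": cand_probability,
            "disagree": (cand_probability >= threshold) != decision,
        })
    except Exception as exc:
        # A failing candidate must never affect the decision served to the user.
        log.append({
            "prod_probability": prod_probability,
            "candidate_error": repr(exc),
        })
    return decision
```

The disagreement rate and the labels that eventually arrive for the disagreeing transactions give evidence for or against promoting the candidate, without exposing users to its decisions.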

Common Follow-up Questions

  • How would you detect and handle concept drift in fraud patterns without degrading user experience?
  • Describe the architecture of a feature store that supports sub-100 ms lookups for recent and historical features.
  • What are the trade-offs between using a lightweight GBDT versus a deep neural network for online inference in this low-latency setting?
  • How would you design canary/rolling model updates and shadow testing to avoid downtime or sudden drops in precision?
  • Explain how you would instrument monitoring and alerting to catch model degradation, data pipeline failures, and latency SLO breaches.
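
For the drift and monitoring questions above, one commonly used data-drift signal is the Population Stability Index (PSI), which compares the binned distribution of a feature in live traffic against a reference sample such as the training data. The sketch below is a plain-Python version. The function name, the caller-supplied bin edges, and the epsilon floor for empty bins are assumptions. PSI only detects shifts in inputs, so concept drift still needs label-based metrics once chargebacks or review outcomes arrive.

```python
import math
from bisect import bisect_right
from typing import List, Sequence


def population_stability_index(
    expected: Sequence[float],
    actual: Sequence[float],
    bin_edges: Sequence[float],
    eps: float = 1e-6,
) -> float:
    """PSI between a reference sample and a live sample of one numeric feature.

    bin_edges must be sorted ascending; n edges define n + 1 bins.
    """
    if not expected or not actual:
        raise ValueError("both samples must be non-empty")

    def bin_fractions(values: Sequence[float]) -> List[float]:
        counts = [0] * (len(bin_edges) + 1)
        for value in values:
            counts[bisect_right(bin_edges, value)] += 1
        # Floor each fraction at eps so empty bins do not produce log(0).
        return [max(count / len(values), eps) for count in counts]

    expected_fractions = bin_fractions(expected)
    actual_fractions = bin_fractions(actual)
    return sum(
        (a - e) * math.log(a / e)
        for e, a in zip(expected_fractions, actual_fractions)
    )
```

A frequently cited rule of thumb treats PSI above roughly 0.25 as a significant shift, but alert thresholds should be tuned per feature against historical variation rather than taken as fixed.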

Related Questions

1. Design a scalable online feature pipeline for session-based user behavior
2. Design a real-time anomaly detection system for payment streams
3. How to build an A/B testing and continuous evaluation pipeline for production ML models
4. Design an online learning system that updates models incrementally from streaming labels
5. Architect a high-throughput, low-latency inference platform for fraud scoring

