PayPal ML System Design: Real-Time Fraud Detection Engine
Question Description
Design a real-time fraud detection system that inspects millions of e-commerce or financial transactions and responds within stringent constraints. You’ll be asked to design an end-to-end streaming ML pipeline that ingests transaction events (user ID, amount, device, location, history), computes features, scores risk, and enforces decisions (allow, flag, block) with minimal user friction.
Start by describing the high-level flow: event ingestion (Kafka/Kinesis), lightweight rule-based pre-checks, stateful stream processing for feature computation (Flink/Spark Streaming), a low-latency online feature cache (Redis/Memcached), model scoring via a serving layer (TF-Serving/Triton), and a decision engine with configurable thresholds and manual-review hooks. Include offline components: a feature store, batch training, labeling pipelines, automated retraining, and canary deployments to roll out models without downtime.
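The online path above can be sketched end to end with in-memory stand-ins. Everything here is illustrative: the feature cache is a plain dict in place of Redis, `score` is a toy formula in place of a model-serving RPC, and the thresholds are assumptions, not values from the question.

```python
from dataclasses import dataclass
from typing import Optional

# In-memory stand-in for a low-latency feature cache (e.g. Redis).
FEATURE_CACHE = {"user_42": {"txn_count_1h": 3, "avg_amount_30d": 55.0}}

@dataclass
class Transaction:
    user_id: str
    amount: float
    country: str

def rule_precheck(txn: Transaction) -> Optional[str]:
    # Cheap deterministic rules run before any model call.
    if txn.amount > 10_000:
        return "block"
    return None

def fetch_features(txn: Transaction) -> dict:
    # Online features come from the cache keyed by user; request-time
    # fields like the amount are merged in directly.
    return {"amount": txn.amount, **FEATURE_CACHE.get(txn.user_id, {})}

def score(features: dict) -> float:
    # Placeholder for a model-serving call; returns risk in [0, 1].
    velocity = features.get("txn_count_1h", 0)
    ratio = features["amount"] / max(features.get("avg_amount_30d", 1.0), 1.0)
    return min(1.0, 0.1 * velocity + 0.05 * ratio)

def decide(txn: Transaction, block_at: float = 0.9, flag_at: float = 0.5) -> str:
    # Rule pre-check short-circuits; otherwise threshold the model risk.
    verdict = rule_precheck(txn)
    if verdict is not None:
        return verdict
    risk = score(fetch_features(txn))
    if risk >= block_at:
        return "block"
    if risk >= flag_at:
        return "flag"  # routed to manual review
    return "allow"
```

In an interview, walking through one transaction's journey across these stages (ingest, pre-check, feature fetch, score, decide) is a compact way to anchor the rest of the discussion.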
You must show how you meet non-functional requirements: keep total processing latency under 100 ms through compact features and local caches; scale horizontally to 10k TPS with partitioning and autoscaling; maintain 99.99% availability with redundant services and graceful degradation (fall back to rule-based checks); and reduce false positives below ~1% by combining supervised models, ensemble scoring, and human-in-the-loop review for borderline cases.
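Graceful degradation under the latency budget can be approximated by enforcing a deadline on the model call and falling back to rule-based checks on timeout or error. `model_score`, `rule_score`, and the budget value below are illustrative stand-ins; real deployments would propagate the deadline to the serving RPC rather than time out client-side.

```python
import concurrent.futures as cf

def model_score(features: dict) -> float:
    # Stand-in for a remote model-serving call.
    return 0.2

def rule_score(features: dict) -> float:
    # Conservative rule-based fallback used when the model is unavailable.
    return 1.0 if features.get("amount", 0) > 5_000 else 0.0

def score_with_fallback(features: dict, budget_s: float = 0.05):
    """Enforce the latency budget; degrade to rules rather than fail open.

    Returns (risk, source) where source records which path produced
    the score, useful for SLO dashboards and audit logs.
    """
    with cf.ThreadPoolExecutor(max_workers=1) as pool:
        future = pool.submit(model_score, features)
        try:
            return future.result(timeout=budget_s), "model"
        except Exception:
            # Covers timeout and RPC errors alike; a production system
            # would cancel via the RPC deadline, not just here.
            future.cancel()
            return rule_score(features), "rules"
```

Recording the `source` of each decision also gives a direct availability signal: a rising share of "rules" decisions indicates the model path is degrading before users notice.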
Signal the skills you’ll demonstrate: streaming architectures, feature engineering for velocity and behavioral features, online/offline model lifecycle, latency/throughput trade-offs, fault tolerance and observability (SLOs, logging, audit trails), concept-drift handling, and compliance-aware logging for audits. Use concrete choices and trade-offs rather than abstract descriptions to show practical engineering judgement.
Common Follow-up Questions
- How would you detect and respond to concept drift in the fraud model? Describe monitoring, automatic retraining triggers, and feature validation.
- Design a feature-store and online feature serving strategy to keep scoring within 100 ms. Which features are computed online vs offline and why?
- Explain a rollout strategy (canary/blue-green) and offline A/B evaluation to ensure new models don't increase false positives above the 1% threshold.
- How would you architect the system to ensure idempotency and exactly-once scoring semantics when events can be duplicated in the stream?
- Describe approaches to explainability and compliance: how do you provide human reviewers and auditors with reasons for a block decision?
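For the concept-drift follow-up, a common monitoring signal is the Population Stability Index (PSI) between a training-time baseline and live feature or score distributions. The binning scheme and alert thresholds below are conventional rules of thumb, not values from the question.

```python
import math

def psi(expected, actual, bins: int = 10) -> float:
    """Population Stability Index between a baseline and a live sample.

    Rule of thumb (tune per feature): PSI < 0.1 stable, 0.1-0.25
    moderate drift, > 0.25 significant drift -> trigger retraining.
    """
    lo = min(min(expected), min(actual))
    hi = max(max(expected), max(actual))
    width = (hi - lo) / bins or 1.0  # guard against a degenerate range

    def histogram(xs):
        counts = [0] * bins
        for x in xs:
            idx = min(int((x - lo) / width), bins - 1)
            counts[idx] += 1
        # Smooth empty bins so the log term is always defined.
        return [max(c / len(xs), 1e-6) for c in counts]

    e, a = histogram(expected), histogram(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))
```

In practice a scheduled job computes PSI per feature and per model score over a sliding window; crossing the high threshold can open an alert and enqueue a retraining run, with the retrained model still gated by the canary rollout discussed above.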
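For the idempotency follow-up, one standard approach is a deduplication key per event (e.g. a producer-assigned unique ID) claimed atomically before scoring, so at-least-once stream delivery still yields exactly-once scoring effects. This in-memory sketch stands in for a TTL'd atomic store such as Redis `SET NX`; all names are hypothetical.

```python
from typing import Optional

class DedupStore:
    """In-memory stand-in for an atomic, TTL'd dedup store (e.g. Redis)."""

    def __init__(self):
        self._seen = set()

    def claim(self, event_id: str) -> bool:
        # Record the id; False signals a duplicate delivery. In Redis this
        # would be a single SET key NX EX <ttl> call, which is atomic.
        if event_id in self._seen:
            return False
        self._seen.add(event_id)
        return True

def score_once(event: dict, store: DedupStore, score_fn) -> Optional[str]:
    # Duplicates are dropped before scoring, so retries and replayed
    # partitions cannot emit a second, possibly different, decision.
    if not store.claim(event["event_id"]):
        return None  # duplicate: decision already emitted
    return "block" if score_fn(event) >= 0.9 else "allow"
```

The TTL matters: keys only need to outlive the stream's maximum redelivery window, which keeps the dedup store's memory bounded at high TPS.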