
Meta ML System Design: Real-Time Personalized Feed Ranking

Topics: Feed Ranking, Recommender System, Real-Time Personalization
Roles: Machine Learning Engineer
Experience: Mid Level, Senior

Question Description

Overview

You are asked to design a real-time machine learning ranking system that personalizes the order of articles and comments in Meta's news feed. The system must serve personalized ranked lists per user with sub-100ms latency, incorporate real-time interactions (clicks, likes, dwell time), handle millions of users and content items, and support A/B testing, monitoring, and explainability.

Core requirements and scope

Explain how you would perform candidate generation, feature engineering, and final ranking. Describe offline model training and an online serving stack that ingests streaming signals, updates user state, and recomputes scores or re-ranks candidates quickly. Address filtering (inappropriate or already-seen items), cold-start for new users/content, diversity and fairness constraints, and a feedback loop that logs interactions for continual improvement.
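To make the candidate-generation stage concrete, here is a minimal sketch of embedding-based retrieval with already-seen filtering. It uses brute-force dot-product scoring for clarity; a production system would use an approximate-nearest-neighbor index instead. All names (`retrieve_candidates`, the embedding shapes) are illustrative assumptions, not a specific Meta API.

```python
import numpy as np

def retrieve_candidates(user_emb, item_embs, seen_ids, k=200):
    """High-recall retrieval: score every item by dot product with the
    user embedding, drop already-seen items, and return the top-k item
    indices. Brute force here; swap in an ANN index (HNSW, IVF) at scale."""
    scores = item_embs @ user_emb          # one score per item
    order = np.argsort(-scores)            # best-scoring items first
    candidates = [int(i) for i in order if int(i) not in seen_ids]
    return candidates[:k]

# Toy usage: 10k random item embeddings, one user, two items already seen.
rng = np.random.default_rng(0)
item_embs = rng.normal(size=(10_000, 64))
user_emb = rng.normal(size=64)
cands = retrieve_candidates(user_emb, item_embs, seen_ids={3, 7}, k=200)
```

The seen-item filter is applied after scoring so the retrieval index itself stays user-agnostic and cacheable.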

High-level flow / stages

  1. Candidate generation (recall) — fast, high-recall retrieval (e.g., embeddings, inverted indices, graph-based or heuristic filters).
  2. Feature enrichment — join user profile, session/context signals, content metadata, and recent events from a feature store or stream layer.
  3. Scoring & ranking — low-latency model (tree-based, neural pointwise/pairwise, or learned-to-rank) with diversity/fairness post-processing.
  4. Logging & feedback — collect impressions, clicks, dwell time for training and online evaluation.
  5. Experimentation — support A/B tests, shadow runs, and canary deployments.
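Stage 3's diversity post-processing can be sketched as a greedy MMR-style re-rank: each slot trades off the model score against similarity to items already selected. The function and parameter names below are assumptions for illustration.

```python
def rerank_with_diversity(candidates, scores, sim, lambda_=0.7, k=10):
    """Greedy maximal-marginal-relevance re-rank: pick, at each slot, the
    candidate maximizing lambda * model_score - (1 - lambda) * max
    similarity to already-chosen items, enforcing feed diversity."""
    selected = []                      # indices into candidates, in rank order
    pool = set(range(len(candidates)))
    while pool and len(selected) < k:
        def mmr(i):
            max_sim = max((sim(candidates[i], candidates[j])
                           for j in selected), default=0.0)
            return lambda_ * scores[i] - (1 - lambda_) * max_sim
        best = max(pool, key=mmr)
        selected.append(best)
        pool.remove(best)
    return [candidates[i] for i in selected]

# Toy usage: topic overlap as the similarity signal.
cands = ["a1", "a2", "b1", "c1"]
topics = {"a1": "sports", "a2": "sports", "b1": "news", "c1": "tech"}
scores = [0.9, 0.85, 0.8, 0.7]
sim = lambda x, y: 1.0 if topics[x] == topics[y] else 0.0
feed = rerank_with_diversity(cands, scores, sim, lambda_=0.5, k=3)
# "a2" is penalized for sharing a topic with the top pick "a1"
```

Lowering `lambda_` trades engagement for diversity, which is exactly the knob the fairness/diversity follow-up asks you to reason about.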

Skill signals interviewers look for

You should demonstrate understanding of scalable retrieval, latency/throughput trade-offs, feature stores and real-time feature pipelines, online vs. offline learning (including bandits), evaluation metrics (CTR, dwell time, novelty, fairness), monitoring/SLAs, and practical deployment strategies (caching, sharding, graceful degradation). The ability to discuss explainability, cost optimization, and filter-bubble mitigation is a plus.
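Graceful degradation under the latency SLO is worth being able to sketch. One common pattern, shown below under assumed names, is to run the full ranking model against a latency budget and fall back to a cheap model (or cached scores) on timeout.

```python
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError

# Long-lived pool so per-request executor setup cost is not on the hot path.
_pool = ThreadPoolExecutor(max_workers=4)

def score_with_fallback(heavy_model, cheap_model, features, budget_s=0.08):
    """Run the full model under a latency budget; on timeout, serve the
    lightweight fallback and tag the response so degradation is monitorable."""
    future = _pool.submit(heavy_model, features)
    try:
        return future.result(timeout=budget_s), "full"
    except TimeoutError:
        return cheap_model(features), "degraded"

# Toy usage: a heavy model that blows the 80ms budget vs. one that fits.
slow = lambda f: time.sleep(0.3) or [0.9]
fast = lambda f: [0.5]
scores, mode = score_with_fallback(slow, fast, features={})
```

Emitting the `"degraded"` tag matters: the fraction of degraded responses is exactly the SLO-health metric your monitoring and alerting should track.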

Common Follow-up Questions

  • How would you handle cold-start for new users and newly posted articles/comments without retraining the main model?
  • Design an approach to enforce diversity and fairness in the ranked feed while minimizing loss in engagement (describe algorithms and trade-offs).
  • Explain how you'd architect online learning or contextual bandits for real-time personalization and what safeguards you'd add to prevent feedback loops.
  • If your 100ms latency SLO is violated during peak traffic, what mitigation strategies would you implement (caching, approximation, degraded models)?
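For the online-learning/bandits follow-up, a minimal Beta-Bernoulli Thompson sampling ranker is a useful talking point: it ranks candidates by a sampled CTR estimate, so unproven (cold-start) items still get exploration while proven items dominate. The class below is an illustrative sketch, not a production design.

```python
import random

class ThompsonRanker:
    """Per-item Beta-Bernoulli Thompson sampling. Each item keeps a
    Beta(alpha, beta) posterior over its click-through rate; ranking by a
    fresh posterior sample balances exploration and exploitation."""
    def __init__(self):
        self.alpha = {}  # 1 + observed clicks per item
        self.beta = {}   # 1 + observed skips per item

    def rank(self, candidates):
        draws = {c: random.betavariate(self.alpha.get(c, 1),
                                       self.beta.get(c, 1))
                 for c in candidates}
        return sorted(candidates, key=draws.get, reverse=True)

    def update(self, item, clicked):
        if clicked:
            self.alpha[item] = self.alpha.get(item, 1) + 1
        else:
            self.beta[item] = self.beta.get(item, 1) + 1

# Toy usage: feed back 50 clicks on one item and 50 skips on another.
random.seed(0)
ranker = ThompsonRanker()
for _ in range(50):
    ranker.update("good_item", clicked=True)
    ranker.update("bad_item", clicked=False)
```

A key safeguard against feedback loops, as the follow-up hints, is to train only on logged impressions with known propensities rather than on whatever the current policy happened to show.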

Related Questions

1. Design a candidate generation and ranking pipeline for a large-scale recommender system
2. How to build a low-latency feature store and streaming feature pipeline for online ML serving
3. Design an online learning system for news recommendations with exploration-exploitation (bandits)
4. How would you evaluate and A/B test ranking models for engagement and long-term satisfaction?

