ml system design
Roblox
Unity
Epic Games

Roblox ML System Design: Real-time Game Recommendations

Topics:
Learning to Rank
Recommender Systems
Candidate Generation
Roles:
Machine Learning Engineer
Recommender Engineer
Data Scientist
Experience:
Mid Level
Senior
Staff

Question Description

Design a real-time game recommendation system for a large gaming platform that matches users to games using lifetime average playtime, tracked per user and per game. The core task is to ingest session events (user, game, start/end, duration), compute lifetime averages (cumulative playtime / session count) for both users and games, and use those aggregates in a low-latency pipeline that serves personalized recommendations within 100–200 ms.
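
The lifetime average described above reduces to a running sum and a running count per key. The following is a minimal in-memory sketch of that idea, not a production design: `SessionEvent` and `LifetimeAverages` are hypothetical names, and the dictionaries stand in for whatever keyed state (stream-processor state, Redis, a wide-column table) the real system would use.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SessionEvent:
    user_id: str
    game_id: str
    start_ts: float  # epoch seconds (assumed unit)
    end_ts: float    # epoch seconds (assumed unit)

    @property
    def duration(self) -> float:
        # Clamp at zero so a malformed event cannot subtract playtime.
        return max(0.0, self.end_ts - self.start_ts)


class LifetimeAverages:
    """Keeps a running (sum, count) per user and per game."""

    def __init__(self) -> None:
        self._sum = defaultdict(float)
        self._count = defaultdict(int)

    def update(self, event: SessionEvent) -> None:
        # One session contributes to both the user's and the game's aggregate.
        for key in (("user", event.user_id), ("game", event.game_id)):
            self._sum[key] += event.duration
            self._count[key] += 1

    def average(self, kind: str, entity_id: str) -> Optional[float]:
        key = (kind, entity_id)
        n = self._count.get(key, 0)
        # None signals "no sessions yet", which the caller treats as cold start.
        return self._sum[key] / n if n else None


agg = LifetimeAverages()
for ev in [
    SessionEvent("u1", "g1", 0.0, 600.0),
    SessionEvent("u1", "g2", 0.0, 1200.0),
    SessionEvent("u2", "g1", 0.0, 1800.0),
]:
    agg.update(ev)
```

Storing the sum and count separately, rather than the average itself, keeps each update O(1) and lets partial aggregates from different workers be merged by simple addition.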

Start by defining the high-level flow: event ingestion → streaming aggregator (incremental updates to per-user and per-game sums/counts) → feature store / materialized view → candidate generation (popularity, playtime-similarity, content-based or approximate k-NN) → learning-to-rank reranker that uses lifetime average playtime plus contextual features → online cache and recommendation API. Include a daily batch job to recompute long-running aggregates and apply long-tail corrections while the stream keeps recent totals fresh.
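
The serving half of that flow (candidate generation followed by reranking) can be sketched in a few lines. This is an illustration under stated assumptions: `generate_candidates` and `rerank` are hypothetical helpers, the feature dictionaries stand in for a feature store lookup, and the hand-weighted linear score is only a placeholder for a trained learning-to-rank model.

```python
from typing import Dict, List


def generate_candidates(user_avg: float,
                        game_avg: Dict[str, float],
                        game_plays: Dict[str, int],
                        k: int = 2) -> List[str]:
    """Union of two cheap sources: most-played games and playtime-similar games."""
    popular = sorted(game_plays, key=game_plays.get, reverse=True)[:k]
    similar = sorted(game_avg, key=lambda g: abs(game_avg[g] - user_avg))[:k]
    # dict.fromkeys removes duplicates while preserving first-seen order.
    return list(dict.fromkeys(popular + similar))


def rerank(candidates: List[str],
           user_avg: float,
           game_avg: Dict[str, float],
           game_plays: Dict[str, int],
           w_sim: float = 1.0,
           w_pop: float = 0.5) -> List[str]:
    """Placeholder scorer; a real system would call a trained LTR model here."""
    max_plays = max(game_plays.values())

    def score(game_id: str) -> float:
        # Similarity decays with the gap (in minutes) between average playtimes.
        gap_minutes = abs(game_avg[game_id] - user_avg) / 60.0
        similarity = 1.0 / (1.0 + gap_minutes)
        popularity = game_plays[game_id] / max_plays
        return w_sim * similarity + w_pop * popularity

    return sorted(candidates, key=score, reverse=True)


# Toy feature values: average session length in seconds, lifetime play counts.
game_avg = {"a": 600.0, "b": 1800.0, "c": 900.0, "d": 3000.0}
game_plays = {"a": 100, "b": 1000, "c": 10, "d": 500}

candidates = generate_candidates(900.0, game_avg, game_plays, k=2)
ranked = rerank(candidates, 900.0, game_avg, game_plays)
```

Keeping candidate generation cheap and recall-oriented, and spending the model budget only on the short candidate list, is what makes the 100–200 ms target plausible; the final ranked list is what the online cache would hold.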

You should demonstrate knowledge of streaming frameworks (Kafka/Kinesis + Flink/Spark Streaming), stateful exactly-once or idempotent updates, in-memory stores for low-latency reads (Redis, Memcached), wide-column stores for scale (Cassandra/Bigtable), feature store design, and LTR model integration. Discuss trade-offs: eventual consistency vs. strict accuracy, incremental vs. batch recompute, cold-start handling for games and users, skew and hot keys for popular titles, freshness windows, and metrics (playtime uplift, CTR, latency).
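
One way to make the "idempotent updates" point concrete: because a lifetime average is built from a sum and a count, and addition is commutative, event order does not affect the result; only duplicates do. The sketch below assumes each session carries a unique `session_id` (an assumption, not something the question specifies) and uses a hypothetical `IdempotentAverage` class. The unbounded in-memory set is a simplification; a real deployment would need bounded or persistent deduplication state.

```python
from typing import Iterable, Optional, Tuple


class IdempotentAverage:
    """Sum/count aggregate that ignores replays of an already-applied session."""

    def __init__(self) -> None:
        self.total = 0.0
        self.count = 0
        self._seen = set()  # session ids already applied (unbounded in this sketch)

    def apply(self, session_id: str, duration: float) -> bool:
        """Returns True if the event changed state, False if it was a duplicate."""
        if session_id in self._seen:
            return False
        self._seen.add(session_id)
        self.total += duration
        self.count += 1
        return True

    @property
    def average(self) -> Optional[float]:
        return self.total / self.count if self.count else None


def replay(events: Iterable[Tuple[str, float]]) -> IdempotentAverage:
    state = IdempotentAverage()
    for session_id, duration in events:
        state.apply(session_id, duration)
    return state


# "s1" is delivered twice, as can happen with at-least-once delivery.
events = [("s1", 600.0), ("s2", 1200.0), ("s1", 600.0), ("s3", 300.0)]
in_order = replay(events)
out_of_order = replay(list(reversed(events)))
```

Both replays end in the same state, which is the property that lets at-least-once delivery plus deduplication give the same totals as exactly-once processing.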

Common Follow-up Questions

  • How would you compute and maintain an exact lifetime average playtime from streaming data while tolerating out-of-order and duplicate events (which delivery semantics and frameworks would you choose)?
  • How do you handle cold-start users and games so lifetime averages don't bias recommendations toward only established titles?
  • Design the candidate generation step for a catalog of millions of games: what offline indexes or approximate nearest neighbor strategies do you use to meet 100–200 ms latency?
  • How would you mitigate hotspotting for extremely popular games (skew) in both the aggregator and the serving layer?
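
Two of the follow-ups above, cold start and hot keys, have commonly used mitigations that fit in a short sketch. The first function shrinks a sparse average toward a global prior so that a game with a single long session does not outrank established titles; the second pair spreads writes for one popular game across several sub-keys and merges them at read time. All names (`smoothed_average`, `salted_key`, `merge_shards`), the prior weight, and the shard count are illustrative assumptions rather than values from the question.

```python
import zlib
from typing import Dict, Tuple


def smoothed_average(total: float, count: int,
                     prior_mean: float, prior_weight: float = 20.0) -> float:
    """Shrinks toward prior_mean; acts like prior_weight extra 'virtual' sessions."""
    if prior_weight <= 0:
        raise ValueError("prior_weight must be positive")
    return (total + prior_weight * prior_mean) / (count + prior_weight)


def salted_key(game_id: str, event_id: str, num_shards: int = 8) -> str:
    """Routes events for one hot game to one of num_shards sub-keys."""
    # crc32 is deterministic across processes, unlike Python's built-in hash().
    shard = zlib.crc32(event_id.encode("utf-8")) % num_shards
    return f"{game_id}#{shard}"


def merge_shards(partials: Dict[str, Tuple[float, int]]) -> Tuple[float, int]:
    """Combines per-shard (sum, count) pairs into one lifetime aggregate."""
    total = sum(s for s, _ in partials.values())
    count = sum(c for _, c in partials.values())
    return total, count


# Simulate 100 one-minute sessions on a single hot game, spread across shards.
partials: Dict[str, Tuple[float, int]] = {}
for i in range(100):
    key = salted_key("g_hot", f"session-{i}")
    s, c = partials.get(key, (0.0, 0))
    partials[key] = (s + 60.0, c + 1)

hot_total, hot_count = merge_shards(partials)
```

With a prior weight of 20, a brand-new game scores exactly at the global mean, and its own data dominates only after it has accumulated well over 20 sessions. Salting trades a slightly more expensive read (merging a handful of shards) for evenly distributed writes.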

Related Questions

1. Design a low-latency recommender for a large video or music platform using lifetime consumption signals
2. How to build a feature store and online store for real-time recommendation serving
3. Design a scalable candidate generation pipeline for millions of items using ANN and inverted indexes
4. How to evaluate and A/B test a playtime-optimized ranking model (metrics and experiment design)
