Palantir ML System Design: Scalable Music Recommender
Question Description
You are asked to design a scalable, low-latency music recommendation service for a large streaming platform (think Spotify or SoundCloud) that serves personalized playlists and song suggestions to millions of users.
Focus on a high-level architecture that separates candidate generation and ranking, supports real-time updates from listening events, and provides RESTful APIs such as GET /recommendations?user_id=123&limit=10. Describe storage for user profiles, song metadata, listening history, feature stores, and a fast embedding/nearest-neighbor store for real-time retrieval.
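The split between candidate generation and ranking can be sketched in a few lines. This is a minimal illustration, not a reference design: the in-memory dictionaries, the `generate_candidates`, `rank`, and `get_recommendations` names, and the 0.7/0.3 score weights are all assumptions made for the example. The exact cosine scan stands in for the ANN store a real system would query.

```python
import math

# Hypothetical in-memory stand-ins for the embedding store and song metadata.
SONG_EMBEDDINGS = {
    "song_a": [1.0, 0.0],
    "song_b": [0.8, 0.6],
    "song_c": [0.0, 1.0],
    "song_d": [-1.0, 0.0],
}
USER_EMBEDDINGS = {123: [1.0, 0.0]}
SONG_POPULARITY = {"song_a": 0.2, "song_b": 0.9, "song_c": 0.5, "song_d": 0.1}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def generate_candidates(user_id, k):
    """Stage 1: cheap retrieval of the k most similar songs.

    An exact scan is used here for clarity; at scale this call would go
    to an ANN index instead.
    """
    user_vec = USER_EMBEDDINGS[user_id]
    scored = [(cosine(user_vec, vec), song) for song, vec in SONG_EMBEDDINGS.items()]
    scored.sort(reverse=True)
    return [song for _, song in scored[:k]]


def rank(user_id, candidates):
    """Stage 2: re-score the small candidate set with a richer (assumed) score."""
    user_vec = USER_EMBEDDINGS[user_id]

    def score(song):
        similarity = cosine(user_vec, SONG_EMBEDDINGS[song])
        return 0.7 * similarity + 0.3 * SONG_POPULARITY[song]

    return sorted(candidates, key=score, reverse=True)


def get_recommendations(user_id, limit=10):
    """Handler body for GET /recommendations?user_id=...&limit=..."""
    candidates = generate_candidates(user_id, k=limit * 3)  # over-fetch, then rank
    return rank(user_id, candidates)[:limit]
```

Over-fetching in stage 1 (here `limit * 3`, an arbitrary choice) gives the ranker room to reorder results without scoring the whole catalog.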
A typical interview flow: clarify constraints and SLAs (latency <100ms, 99.9% availability); propose an end-to-end architecture (event ingestion, feature pipeline, offline training, online feature service, ANN store, ranking service, caching, and API gateway); sketch data models and schema choices; and discuss operational concerns (scaling, monitoring, A/B testing, and cost trade-offs).
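When clarifying SLAs, it can help to turn the numbers into budgets. The sketch below converts an availability target into allowed downtime and checks a per-stage latency split against the 100 ms target. The stage names and millisecond values in `LATENCY_BUDGET_MS` are illustrative assumptions, not figures from the question.

```python
def downtime_budget_minutes(availability, period_days):
    """Minutes of allowed downtime over a period for a given availability target."""
    return (1.0 - availability) * period_days * 24 * 60


# Hypothetical split of the end-to-end latency target across stages (milliseconds).
LATENCY_BUDGET_MS = {
    "api_gateway": 5,
    "online_feature_lookup": 15,
    "ann_retrieval": 20,
    "ranking": 40,
    "serialization_and_network": 15,
}


def within_latency_sla(budget_ms, sla_ms=100):
    """True if the summed per-stage budgets stay under the SLA."""
    return sum(budget_ms.values()) < sla_ms


monthly = downtime_budget_minutes(0.999, period_days=30)   # about 43.2 minutes
yearly = downtime_budget_minutes(0.999, period_days=365)   # about 525.6 minutes
```

At 99.9% availability the service can be down for roughly 43 minutes in a 30-day month, which is a useful number to have in hand when discussing deployment and failover strategy.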
Skills and signals you should show: system design thinking for distributed services, knowledge of recommender architectures (candidate generation vs ranking), experience with streaming data (Kafka), feature stores and online features, approximate nearest neighbor (ANN) techniques (Faiss/HNSW), caching strategies (Redis/CDN), model deployment/versioning, and evaluation (offline metrics + online A/B tests). Also explain how you would handle cold-start, privacy, and evolving models without downtime.
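One simple way to talk through cold-start is a fallback that fills the response with popular songs when a user has little listening history. The sketch below is one possible heuristic under stated assumptions: `POPULAR_SONGS`, the `recommend_with_cold_start` function, and the `min_events` threshold are hypothetical names and values chosen for illustration.

```python
# Hypothetical global popularity ordering, e.g. refreshed by an offline job.
POPULAR_SONGS = ["song_b", "song_c", "song_a"]


def recommend_with_cold_start(personalized, history_len, limit, min_events=5):
    """Blend personalized and popular songs based on how much history exists.

    Users with at least `min_events` listening events get fully personalized
    results; newer users get a proportional share, padded with popular songs.
    """
    if history_len >= min_events:
        n_personal = limit
    else:
        n_personal = limit * history_len // min_events

    result = list(personalized[:n_personal])
    for song in POPULAR_SONGS:
        if len(result) >= limit:
            break
        if song not in result:  # avoid duplicates across the two sources
            result.append(song)
    return result[:limit]
```

A brand-new user (`history_len=0`) receives only popular songs, and the personalized share grows linearly until the threshold is reached.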
Common Follow-up Questions
- How would you design the candidate generation pipeline to scale and reduce latency? Which components would run offline and which online?
- How do you handle cold-start for new users and new songs while keeping recommendations relevant in real time?
- What metrics and A/B test setup would you use to evaluate ranking model changes and measure engagement improvements?
- Describe the trade-offs between exact nearest-neighbor search and approximate nearest-neighbor (ANN) search. When would you prioritize precision vs. latency?
- How would you design the feature store and online feature lookup to keep user features consistent and up-to-date across regions?