LinkedIn ML System Design: Real-Time Nearby Recommendations
Question Description
Design a backend recommender that returns personalized nearby locations (restaurants, cafes, attractions) to mobile users in real time. You must accept streaming GPS coordinates and optional preferences via a REST API, apply efficient geospatial filtering, generate a candidate set, rank results by personalized relevance, and return a top-N list (name, distance, rating) within a tight latency SLA.
Start by defining the high-level flow: ingestion (mobile -> API -> gateway), geospatial filtering (H3/R-tree or geo-index query), candidate generation (popularity + collaborative or content-based candidates), online ranking (learned model or lightweight scoring), caching and edge serving, and feedback logging (clicks, visits, ratings) back into your training and feature pipelines. Consider how streaming frameworks (Kafka, Flink) and a feature store enable real-time features for moving users.
Key trade-offs and constraints you should discuss: meeting 100–200 ms latency (caching, pre-computed candidates, model size), scaling to millions of DAU (horizontal sharding, regional replicas, CDNs), accuracy vs. freshness (how often to refresh candidate pools), and privacy/GDPR compliance for location data (minimization, TTLs, consent).
Skills you’ll show: geospatial indexing, low-latency serving, candidate generation strategies, real-time feature engineering, ranking/model deployment, A/B testing, and secure data handling. Be prepared to justify architecture choices and describe monitoring, fallbacks, and cold-start solutions.
Common Follow-up Questions
- •How would you design the candidate generation stage to guarantee sub-200ms end-to-end latency for a moving user?
- •How do you handle cold-start users and new locations so personalization remains useful and accurate?
- •Describe how you would incorporate streaming contextual features (time of day, weather) into online ranking without violating latency constraints.
- •What privacy-preserving techniques would you apply to location logs to satisfy GDPR while maintaining personalization quality?
- •How would you monitor and evaluate relevance in production? Propose online metrics, offline experiments, and an A/B testing strategy.
Related Questions
Explore More Questions
Practice This Question with AI
Get real-time hints, detailed requirements, and insightful analysis of the question.