eBay System Design: Concert Ticketing with Real-Time
Question Description
Overview
You are asked to redesign a high-demand concert ticketing backend (e.g., for Taylor Swift–level launches) so users see accurate seat availability in real time and tickets cannot be oversold. The system must support millions of concurrent users, short reservation windows, fast read responses (<100ms), and social features (showing friends' purchases) without adding significant latency.
What you'll be asked to design
You should propose a read/write-separated architecture in which a durable write store is the source of truth and low-latency read stores serve availability checks. Explain how reservations with TTLs, idempotency keys for checkout, and concurrency control (optimistic version checks or strongly consistent transactions) prevent oversells. Discuss cache consistency strategies (write-through, cache-aside with versioning, or invalidate-on-write) and how to minimize replication lag, for example by streaming change data capture (CDC) events through a log such as Kafka to update read replicas or materialized views.
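The optimistic version check mentioned above can be illustrated with a minimal in-memory sketch. `SeatStore`, `SeatRow`, and `compare_and_set` are hypothetical names invented for this example, and the `threading.Lock` only stands in for whatever row-level atomicity the real write store provides. In a SQL database the same idea is usually expressed as an `UPDATE ... WHERE version = ?` whose affected-row count tells the caller whether it won.

```python
import threading
from dataclasses import dataclass


@dataclass(frozen=True)
class SeatRow:
    status: str   # "available", "held", or "sold"
    version: int  # incremented on every successful write


class SeatStore:
    """Hypothetical stand-in for the durable write store."""

    def __init__(self, seat_ids):
        # The lock models the store's own atomic row update.
        self._lock = threading.Lock()
        self._rows = {seat_id: SeatRow("available", 0) for seat_id in seat_ids}

    def read(self, seat_id):
        with self._lock:
            return self._rows[seat_id]

    def compare_and_set(self, seat_id, expected_version, new_status):
        """Apply the write only if the row is unchanged since it was read."""
        with self._lock:
            row = self._rows[seat_id]
            if row.version != expected_version:
                return False  # someone else wrote first; caller must re-read
            self._rows[seat_id] = SeatRow(new_status, row.version + 1)
            return True


# Two buyers read the same seat at the same version, then race to hold it.
store = SeatStore(["A1"])
alice_view = store.read("A1")
bob_view = store.read("A1")
alice_ok = store.compare_and_set("A1", alice_view.version, "held")
bob_ok = store.compare_and_set("A1", bob_view.version, "held")
print(alice_ok, bob_ok, store.read("A1"))
```

Only the first writer succeeds. The loser sees a failed write rather than a silent overwrite, and can re-read the seat map and pick another seat.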
High-level flow and stages
- Client requests availability -> edge cache / regional read replica returns seat map (<100ms).
- User reserves seat -> short-TTL hold entry in Redis plus a transactional write to the primary store.
- Checkout -> payment processed under an idempotency key; final write marks seat sold; CDC updates read projections and invalidates caches.
- Notifications sent asynchronously (email/SMS).
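The reserve and checkout stages above can be sketched in a single process to show how a TTL hold and an idempotency key interact. Everything here is an assumption made for illustration: `TicketingService`, its method names, the 120-second hold, and the plain dictionaries that stand in for Redis and the primary store. The payment call is left as a comment.

```python
import time


class TicketingService:
    """In-memory sketch; dicts stand in for Redis holds and the primary store."""

    def __init__(self, seat_ids, hold_ttl_seconds=120.0, clock=time.monotonic):
        self._seats = set(seat_ids)
        self._ttl = hold_ttl_seconds
        self._clock = clock    # injectable so expiry can be tested
        self._holds = {}       # seat_id -> (user_id, expires_at)
        self._sold = {}        # seat_id -> user_id
        self._checkouts = {}   # idempotency_key -> recorded result

    def _active_hold(self, seat_id):
        hold = self._holds.get(seat_id)
        if hold is not None and hold[1] <= self._clock():
            del self._holds[seat_id]  # lazily release an expired hold
            return None
        return hold

    def reserve(self, seat_id, user_id):
        if seat_id not in self._seats or seat_id in self._sold:
            return False
        hold = self._active_hold(seat_id)
        if hold is not None and hold[0] != user_id:
            return False  # another user holds the seat
        # A repeat reserve by the same user simply refreshes the hold.
        self._holds[seat_id] = (user_id, self._clock() + self._ttl)
        return True

    def checkout(self, seat_id, user_id, idempotency_key):
        # A retried request with the same key gets the recorded result back
        # instead of being processed (and charged) a second time.
        if idempotency_key in self._checkouts:
            return self._checkouts[idempotency_key]
        hold = self._active_hold(seat_id)
        if hold is None or hold[0] != user_id:
            result = "rejected"
        else:
            # A real system would call the payment provider here.
            self._sold[seat_id] = user_id
            del self._holds[seat_id]
            result = "sold"
        self._checkouts[idempotency_key] = result
        return result


now = [0.0]  # fake clock so the example does not need to sleep
svc = TicketingService(["A1", "A2"], hold_ttl_seconds=120.0, clock=lambda: now[0])
first = svc.reserve("A1", "alice")
blocked = svc.reserve("A1", "bob")
sale = svc.checkout("A1", "alice", "key-1")
retry = svc.checkout("A1", "alice", "key-1")
svc.reserve("A2", "bob")
now[0] = 121.0  # bob's hold on A2 has now expired
after_expiry = svc.reserve("A2", "carol")
print(first, blocked, sale, retry, after_expiry)
```

In this sketch expired holds are released lazily when the seat is next touched. A production design would also need the hold and the sale to be atomic across processes, which is where the transactional write to the primary store comes in.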
Key skills & signals to demonstrate
Show knowledge of distributed systems (replication, CDC), caching and cache-consistency techniques, concurrency control (optimistic locking, and distributed locks such as Redis-based ones, used with care), rate limiting and anti-bot strategies, capacity planning for spikes, fault tolerance, and secure payment handling. Describe trade-offs (latency vs. consistency) and deployment patterns (multi-region read replicas, autoscaling, observability).
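As one concrete instance of the rate-limiting skill listed above, a per-user token bucket is a common starting point. The sketch below is single-process and uses hypothetical names (`TokenBucket`, `PerUserLimiter`) and arbitrary limits. A real deployment would keep the buckets in a shared store and combine them with per-IP and global limits.

```python
import time


class TokenBucket:
    """Allows bursts up to `capacity`, refilled at `refill_per_second`."""

    def __init__(self, capacity, refill_per_second, clock=time.monotonic):
        self.capacity = capacity
        self.refill_per_second = refill_per_second
        self._clock = clock
        self._tokens = float(capacity)
        self._last = clock()

    def allow(self, cost=1.0):
        now = self._clock()
        # Credit tokens for the time elapsed since the last call, capped at capacity.
        elapsed = now - self._last
        self._tokens = min(self.capacity, self._tokens + elapsed * self.refill_per_second)
        self._last = now
        if self._tokens >= cost:
            self._tokens -= cost
            return True
        return False


class PerUserLimiter:
    """Keeps one bucket per user so one client cannot starve the others."""

    def __init__(self, capacity, refill_per_second, clock=time.monotonic):
        self._args = (capacity, refill_per_second, clock)
        self._buckets = {}

    def allow(self, user_id):
        bucket = self._buckets.get(user_id)
        if bucket is None:
            bucket = self._buckets[user_id] = TokenBucket(*self._args)
        return bucket.allow()


now = [0.0]  # fake clock for a deterministic example
limiter = PerUserLimiter(capacity=3, refill_per_second=1.0, clock=lambda: now[0])
burst = [limiter.allow("u1") for _ in range(5)]
other_user = limiter.allow("u2")
now[0] = 2.0  # two seconds later, u1 has earned two tokens back
later = limiter.allow("u1")
print(burst, other_user, later)
```

The capacity controls how large a burst a single user can make at sale open, while the refill rate bounds the sustained request rate. Both values trade fairness against how often legitimate users get throttled.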
Common Follow-up Questions
- How would you implement the reservation TTL and stale-reservation cleanup so expired holds get released quickly without creating contention?
- Explain a CDC-based approach (e.g., Kafka + materialized views) to keep read stores in sync and how you would guarantee propagation under 100ms during peak load.
- How do you design distributed locking/seat allocation at scale to avoid a Redis hot-spot and still prevent oversells? Compare optimistic retries vs. distributed locks.
- Describe rate-limiting and anti-bot strategies you would add for flash sales (global vs. per-user vs. per-IP) and how they affect fairness and latency.
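For the CDC follow-up, one part of an answer is making the read projection safe against duplicated or out-of-order change events. The sketch below assumes each event carries a per-seat version assigned by the write store. `SeatChange` and `SeatMapProjection` are hypothetical names, and no real Kafka client is involved. It addresses correctness of the projection only, not the latency target.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SeatChange:
    seat_id: str
    status: str
    version: int  # assumed to be assigned by the write store, increasing per seat


class SeatMapProjection:
    """Read-side view built by applying change events idempotently."""

    def __init__(self):
        self._rows = {}  # seat_id -> (status, version)

    def apply(self, change):
        current = self._rows.get(change.seat_id)
        if current is not None and current[1] >= change.version:
            return False  # duplicate or late event; keep the newer state
        self._rows[change.seat_id] = (change.status, change.version)
        return True

    def status(self, seat_id):
        row = self._rows.get(seat_id)
        return row[0] if row is not None else "unknown"


projection = SeatMapProjection()
events = [
    SeatChange("A1", "held", 1),
    SeatChange("A1", "sold", 2),
    SeatChange("A1", "held", 1),  # redelivered after the newer event
]
applied = [projection.apply(event) for event in events]
print(applied, projection.status("A1"))
```

Because stale events are dropped rather than applied, the consumer can tolerate at-least-once delivery and retries without a sold seat ever reverting to held in the read store.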