Pinterest System Design: Scalable Catalog Update System
Question Description
You must design a backend service that lets merchants update product catalogs both in real time and via very large batch uploads. The system should accept single-product edits (price, stock, description) synchronously with low latency and handle asynchronous bulk uploads (CSV/JSON) up to 500,000 items with robust validation and reporting.
Start by describing the end-to-end flow: upload/submit → validation → staging → queued processing → apply changes → notify and audit. For single updates, show how you achieve strong consistency and low latency (roughly under 100 ms): API design, optimistic locking or versioning, direct writes to a primary store, and synchronous index updates or change propagation. For bulk updates, focus on scalability and availability: chunking, distributed job queues, horizontally scalable workers, backpressure, resumable processing, and idempotent apply semantics.
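To make the optimistic-locking option for single updates concrete, here is a minimal sketch. It assumes an in-memory dictionary standing in for the primary store; `Product`, `ProductStore`, and `VersionConflict` are hypothetical names chosen for illustration, and a real store would enforce the version check with a conditional write rather than a process-local lock.

```python
from dataclasses import dataclass, replace
from threading import Lock


class VersionConflict(Exception):
    """Raised when the caller's expected version is stale."""


@dataclass(frozen=True)
class Product:
    product_id: str
    price_cents: int
    stock: int
    version: int = 1


class ProductStore:
    """In-memory stand-in for a primary store with compare-and-set writes."""

    def __init__(self):
        self._rows = {}
        self._lock = Lock()

    def put_new(self, product):
        with self._lock:
            self._rows[product.product_id] = product

    def get(self, product_id):
        with self._lock:
            return self._rows[product_id]

    def update(self, product_id, expected_version, **changes):
        """Apply changes only if nobody else has written since the caller's read."""
        with self._lock:
            current = self._rows[product_id]
            if current.version != expected_version:
                raise VersionConflict(
                    f"expected v{expected_version}, found v{current.version}"
                )
            # Bump the version in the same step as the write.
            updated = replace(current, version=current.version + 1, **changes)
            self._rows[product_id] = updated
            return updated
```

A caller reads the product, sends its `version` back with the edit, and retries (or surfaces a conflict to the merchant) when `VersionConflict` is raised. The trade-off is that conflicting writers pay with a retry instead of waiting on a lock.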
You’ll be asked to explain data models, storage choices (relational vs. key-value vs. append log), sharding/partitioning strategies, and how to keep downstream systems (search, ordering) in sync (CDC, events, reindexing). Discuss validation, per-item error reporting, metrics, merchant-facing status tracking (job IDs, dashboards), notifications, audit logs, and optional rollback strategies. Be prepared to justify trade-offs between strong and eventual consistency, throughput and latency, and cost. Interviewers want to see systems thinking around job queues, data consistency, migration patterns, and operational concerns (monitoring, retries, and disaster recovery).
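The chunking, validation, and per-item reporting points can be sketched together. This is an illustrative example only: the field names (`product_id`, `price_cents`), the validation rules, the status strings, and the `JobReport` shape are assumptions, and a real system would hand each chunk to a job queue instead of looping in one process.

```python
from dataclasses import dataclass, field
from itertools import islice


@dataclass
class JobReport:
    """Merchant-facing summary for one bulk upload (hypothetical shape)."""
    job_id: str
    total: int = 0
    accepted: int = 0
    errors: list = field(default_factory=list)  # (row_number, message) pairs

    @property
    def status(self):
        if not self.errors:
            return "succeeded"
        return "partially_failed" if self.accepted else "failed"


def chunked(items, size):
    """Yield lists of at most `size` items without loading everything at once."""
    it = iter(items)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def validate_item(item):
    """Return an error message, or None when the item is acceptable."""
    if not item.get("product_id"):
        return "missing product_id"
    price = item.get("price_cents")
    if isinstance(price, bool) or not isinstance(price, int) or price < 0:
        return "price_cents must be a non-negative integer"
    return None


def process_upload(job_id, items, chunk_size=1000):
    """Validate every row, stage the valid ones, and report errors per row."""
    report = JobReport(job_id)
    staged = []
    row = 0
    for chunk in chunked(items, chunk_size):
        for item in chunk:
            row += 1
            report.total += 1
            error = validate_item(item)
            if error:
                report.errors.append((row, error))
            else:
                staged.append(item)
                report.accepted += 1
    return report, staged
```

Recording the row number with each error lets the merchant fix and re-upload only the failed rows, and the derived `status` is the kind of value a job-status endpoint or dashboard could expose.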
Common Follow-up Questions
- How would you design idempotent bulk processing so retries (or duplicate uploads) never create duplicate or corrupt product data?
- Describe a safe schema migration strategy for adding new product fields when millions of products and in-flight bulk jobs exist.
- How would you handle concurrent updates to the same product coming from a bulk job and a real-time API call? Explain versioning or optimistic locking choices.
- Design the monitoring, alerting, and SLA reporting for bulk job throughput and per-merchant latency spikes during a Black Friday traffic event.
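One possible answer to the idempotency and concurrency follow-ups is sketched below. It assumes each write carries an operation ID (for example `job_id:row` for bulk rows) and a monotonically increasing `source_version`; both are assumptions for illustration, as is the `IdempotentApplier` class. In a real store, the operation record and the product write would need to commit in one transaction.

```python
class IdempotentApplier:
    """Applies updates at most once and never lets an older write win."""

    def __init__(self):
        self.products = {}        # product_id -> (source_version, fields)
        self.applied_ops = set()  # operation IDs already processed

    def apply(self, op_id, product_id, source_version, fields):
        # A retried or duplicated operation is skipped outright.
        if op_id in self.applied_ops:
            return "duplicate"
        self.applied_ops.add(op_id)

        # A write based on older data than what is stored is rejected,
        # so a slow bulk row cannot overwrite a newer real-time edit.
        current = self.products.get(product_id)
        if current is not None and current[0] >= source_version:
            return "stale"

        self.products[product_id] = (source_version, dict(fields))
        return "applied"
```

For example, applying `("job-1:1", "sku-1", 5, ...)` twice changes the product once, and a later bulk row at version 6 is reported as stale if a real-time call has already written version 7. Whether stale rows are dropped silently or reported back to the merchant is a product decision worth raising in the interview.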