
ByteDance System Design: Reward Backend for High QPS

Question Description

Overview

You are asked to design a scalable, resilient backend for a mobile app's reward mechanism, in which users earn and redeem points. The system must sustain high QPS (e.g., 10,000 requests per second) during promotions while preventing double-spend, preserving auditability, and keeping latency low.

Core problem & constraints

You should cover accurate crediting and deduction of points, near-real-time balance updates, audit logging, user notifications, and graceful behavior under load spikes. Non-functional constraints include 99.9% availability, sub-200ms latency for most requests, fault tolerance, and cost-efficiency. Expect to discuss where strong consistency is required (for example, deductions) and where eventual consistency is an acceptable price for throughput (for example, displayed balances).
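It can help to turn the stated targets into rough numbers before sketching components. The short calculation below is only a back-of-envelope estimate: it assumes the 10k req/s example is the peak rate, treats 200 ms as an upper bound on request latency, and uses a 30-day month. The constant names are illustrative.

```python
# Back-of-envelope numbers implied by the stated targets (all assumptions).
PEAK_QPS = 10_000          # example peak rate from the overview
LATENCY_BOUND_S = 0.2      # sub-200ms target, used as an upper bound
AVAILABILITY = 0.999       # 99.9% availability target

SECONDS_PER_DAY = 24 * 60 * 60
MINUTES_PER_30_DAYS = 30 * 24 * 60

# Upper bound on daily volume if the peak rate were sustained all day.
requests_per_day_at_peak = PEAK_QPS * SECONDS_PER_DAY

# Little's law: in-flight requests = arrival rate * time in system.
max_in_flight = PEAK_QPS * LATENCY_BOUND_S

# Downtime allowed by the availability target over a 30-day month.
downtime_budget_minutes = (1 - AVAILABILITY) * MINUTES_PER_30_DAYS

print(requests_per_day_at_peak, max_in_flight, round(downtime_budget_minutes, 1))
```

Under these assumptions the service handles at most 864 million requests per day, holds about 2,000 requests in flight at peak, and may be unavailable for roughly 43 minutes per month.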

High-level flow / interview stages

  • Ingest: API gateway + rate limiter + authentication
  • Validation: verify action (ad view, purchase, task) possibly via event callbacks
  • Accounting: update user balance (fast path via cache/sharded counter, durable write to ledger)
  • Redemption: reserve points, atomically complete deduction, trigger fulfillment
  • Audit & monitoring: append immutable transaction logs (event stream) and write to an OLTP database for reconciliation
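The accounting and redemption stages above can be sketched as a small in-memory ledger. This is a minimal single-process illustration, not a production design: the class and method names (RewardLedger, credit, reserve, commit, release) are hypothetical, the dictionaries stand in for a durable store, and there is no locking, so a real service would need transactions or compare-and-swap around each step.

```python
from dataclasses import dataclass, field


class InsufficientPoints(Exception):
    """Raised when a redemption asks for more points than are available."""


@dataclass
class Account:
    balance: int = 0    # points available to spend
    reserved: int = 0   # points held by in-flight redemptions


@dataclass
class RewardLedger:
    accounts: dict = field(default_factory=dict)
    entries: list = field(default_factory=list)   # append-only audit log
    seen: dict = field(default_factory=dict)      # idempotency key -> entry
    open_reservations: dict = field(default_factory=dict)

    def _account(self, user_id):
        return self.accounts.setdefault(user_id, Account())

    def _record(self, key, kind, user_id, points):
        entry = (key, kind, user_id, points)
        self.entries.append(entry)
        self.seen[key] = entry
        return entry

    def credit(self, key, user_id, points):
        if key in self.seen:               # retry of a request already applied
            return self.seen[key]
        self._account(user_id).balance += points
        return self._record(key, "credit", user_id, points)

    def reserve(self, key, user_id, points):
        if key in self.seen:
            return self.seen[key]
        acct = self._account(user_id)
        if acct.balance < points:
            raise InsufficientPoints(user_id)
        acct.balance -= points             # reserved points cannot be spent twice
        acct.reserved += points
        self.open_reservations[key] = (user_id, points)
        return self._record(key, "reserve", user_id, points)

    def commit(self, key, reservation_key):
        if key in self.seen:
            return self.seen[key]
        user_id, points = self.open_reservations.pop(reservation_key)
        self._account(user_id).reserved -= points   # deduction becomes final
        return self._record(key, "commit", user_id, points)

    def release(self, key, reservation_key):
        if key in self.seen:
            return self.seen[key]
        user_id, points = self.open_reservations.pop(reservation_key)
        acct = self._account(user_id)
        acct.reserved -= points
        acct.balance += points             # fulfillment failed: give points back
        return self._record(key, "release", user_id, points)
```

Every mutating call takes a client-supplied idempotency key, so a retried request returns the original ledger entry instead of applying the change twice. Reserving before fulfillment means a second redemption sees the reduced balance immediately, and a failed fulfillment can return the points with release.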

In an interview you'll be expected to sketch components, explain data flow, and justify consistency and failure-handling choices.

Skill signals interviewers look for

  • Designing stateless services, load balancing, and autoscaling
  • Caching strategies (read-through/write-through, sharded counters) and cache invalidation
  • Rate limiting (token bucket, per-user/global tiers) and backpressure
  • Concurrency control and idempotency to avoid double-spend (optimistic locks, distributed locks, compare-and-swap)
  • Durable event logging (Kafka/event sourcing) and reconcilers for eventual consistency
  • Monitoring, SLA-aware retries, circuit breakers, and cost/performance trade-offs

Prepare to discuss concrete technologies (Redis, Kafka, RDBMS), sharding, and fallback behavior during outages.
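As one illustration of the rate-limiting signal above, the sketch below combines a per-user token bucket with a global one. It is a single-process approximation under stated assumptions: the class names and parameters are hypothetical, the clock is injectable only to make the behavior testable, and a distributed deployment would keep bucket state in a shared store such as Redis rather than in local memory.

```python
import time


class TokenBucket:
    """Refills at `rate` tokens per second up to `capacity`."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock
        self.tokens = float(capacity)   # start full so bursts are allowed
        self.updated = clock()

    def allow(self, cost=1):
        now = self.clock()
        elapsed = now - self.updated
        # Add the tokens earned since the last call, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


class TieredLimiter:
    """Checks a per-user bucket first, then a shared global bucket."""

    def __init__(self, user_rate, user_burst, global_rate, global_burst,
                 clock=time.monotonic):
        self.user_rate = user_rate
        self.user_burst = user_burst
        self.clock = clock
        self.users = {}
        self.global_bucket = TokenBucket(global_rate, global_burst, clock)

    def allow(self, user_id):
        bucket = self.users.get(user_id)
        if bucket is None:
            bucket = TokenBucket(self.user_rate, self.user_burst, self.clock)
            self.users[user_id] = bucket
        # The user check runs first, so a throttled user never consumes
        # global tokens. If the global check then fails, the user token
        # is still spent; this sketch accepts that small inaccuracy.
        return bucket.allow() and self.global_bucket.allow()
```

Checking the per-user tier before the global tier is one way to keep a single noisy client from draining capacity that other users need during a promotion. A rejected request would typically be answered with a retry hint so that clients back off instead of retrying immediately.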

Common Follow-up Questions

  • How would you design the system to guarantee idempotent reward operations and prevent double-spending under retries and network partitions?
  • Describe a sharding strategy for user balances and how you'd rebalance shards without downtime during growth or promotions.
  • How would you implement rate limiting and backpressure for global promotions while preserving a fair per-user experience?
  • If Redis (cache) fails during a peak event, what fallback path ensures correctness and acceptable latency for earning and redeeming points?
  • Explain how you'd design reconciliation and audit pipelines (event sourcing vs ledger writes) to detect and fix inconsistencies in balances.
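For the reconciliation question above, one common approach is to treat the append-only ledger as the source of truth, replay it, and compare the result with the balances the fast path is serving. The sketch below assumes ledger entries are (entry_id, user_id, delta) tuples delivered at least once; that shape and the function names are hypothetical.

```python
from collections import defaultdict


def replay(entries):
    """Rebuild balances from ledger entries, skipping duplicate deliveries."""
    balances = defaultdict(int)
    seen = set()
    for entry_id, user_id, delta in entries:
        if entry_id in seen:    # at-least-once streams may redeliver an entry
            continue
        seen.add(entry_id)
        balances[user_id] += delta
    return dict(balances)


def find_drift(entries, served_balances):
    """Return {user_id: correction} for users whose served balance is wrong.

    The correction is the amount to add to the served balance so that it
    matches the ledger; users with matching balances are omitted.
    """
    expected = replay(entries)
    drift = {}
    for user_id in expected.keys() | served_balances.keys():
        want = expected.get(user_id, 0)
        got = served_balances.get(user_id, 0)
        if want != got:
            drift[user_id] = want - got
    return drift


entries = [
    ("e1", "u1", 100),
    ("e2", "u1", -30),
    ("e2", "u1", -30),   # duplicate delivery of e2
    ("e3", "u2", 50),
]
served = {"u1": 70, "u2": 40, "u3": 5}
print(find_drift(entries, served))
```

In this example u1 matches the ledger, u2 is 10 points short, and u3 holds 5 points that no ledger entry explains. A reconciler might apply such corrections automatically when they are small and flag larger ones for manual review.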

Related Questions

1. eBay System Design: Concert Ticketing with Real-Time
2. Netflix System Design: Real-Time Ad Impression Limiter
3. NVIDIA System Design Interview: Distributed Rate Limiter
4. Rate Limiter Dropped Requests (Coding) - Snowflake
5. Roblox Backend System Design: Like/Unlike (1M QPS Scale)
6. Design a distributed rate limiter for API traffic and promotional events
7. How to implement a global counter service for high-throughput reward systems
8. Design a real-time user balance service with low latency and strong correctness
9. Implement idempotent transaction patterns for reward credit/debit operations
10. Architect a promotion event platform that scales to millions of concurrent users
