Tags: backend system design, ByteDance, TikTok, Kuaishou

ByteDance System Design: Reward Backend for High QPS

Topics: Caching, Rate Limiting, Load Balancing
Roles: Software Engineer, Backend Engineer, Platform Engineer
Experience: Mid Level, Senior, Staff

Question Description

Overview

You are asked to design a scalable, resilient backend for a mobile app's reward mechanism, in which users earn and redeem points. The system must sustain high QPS (e.g., 10k req/s) during promotions while preventing double-spend, preserving auditability, and keeping latency low.

Core problem & constraints

You should cover accurate crediting and deduction of points, near-real-time balance updates, audit logging, user notifications, and graceful behavior under load spikes. Non-functional constraints include 99.9% availability, sub-200ms latency for most requests, fault tolerance, and cost-efficiency. Expect to trade off strong consistency against eventual consistency to gain throughput.

High-level flow / interview stages

  • Ingest: API gateway + rate limiter + authentication
  • Validation: verify action (ad view, purchase, task) possibly via event callbacks
  • Accounting: update user balance (fast path via cache/sharded counter, durable write to ledger)
  • Redemption: reserve points, atomically complete deduction, trigger fulfillment
  • Audit & monitoring: append immutable transaction logs (event stream) and write to OLTP for reconciliation

In an interview you'll be expected to sketch components, explain data flow, and justify consistency and failure-handling choices.
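
The accounting and redemption stages above can be sketched in a few lines. This is a minimal in-memory illustration, not a production design: `RewardLedger`, its method names, and the idempotency-key scheme are all hypothetical, and a single process lock stands in for whatever a real deployment would use (for example a database transaction or an atomic script in the cache). It shows two ideas from the flow: a retried credit with the same key is applied once, and a redemption first reserves points and only later commits or cancels them.

```python
import threading


class RewardLedger:
    """Hypothetical in-memory stand-in for a balance store plus append-only ledger."""

    def __init__(self):
        self._lock = threading.Lock()
        self._balances = {}      # user_id -> available points
        self._credits = {}       # idempotency key -> balance returned for that credit
        self._reservations = {}  # reservation_id -> [user_id, points, state]
        self.entries = []        # append-only audit log of (kind, user_id, points, ref)

    def balance(self, user_id):
        with self._lock:
            return self._balances.get(user_id, 0)

    def credit(self, key, user_id, points):
        """Add points once per idempotency key; a retry returns the first result."""
        with self._lock:
            if key in self._credits:
                return self._credits[key]
            new_balance = self._balances.get(user_id, 0) + points
            self._balances[user_id] = new_balance
            self.entries.append(("credit", user_id, points, key))
            self._credits[key] = new_balance
            return new_balance

    def reserve(self, reservation_id, user_id, points):
        """Hold points for a redemption; fails if the available balance is too low."""
        with self._lock:
            existing = self._reservations.get(reservation_id)
            if existing is not None:
                # Retried request: report the earlier outcome instead of holding twice.
                return existing[2] != "cancelled"
            if self._balances.get(user_id, 0) < points:
                return False
            self._balances[user_id] -= points
            self._reservations[reservation_id] = [user_id, points, "held"]
            self.entries.append(("reserve", user_id, points, reservation_id))
            return True

    def commit(self, reservation_id):
        """Finalize the deduction after fulfillment succeeds."""
        return self._finish(reservation_id, "committed", refund=False)

    def cancel(self, reservation_id):
        """Release held points back to the user if fulfillment fails."""
        return self._finish(reservation_id, "cancelled", refund=True)

    def _finish(self, reservation_id, new_state, refund):
        with self._lock:
            record = self._reservations.get(reservation_id)
            if record is None:
                return False
            user_id, points, state = record
            if state != "held":
                # Already finished: True only if it ended in the requested state.
                return state == new_state
            if refund:
                self._balances[user_id] += points
            record[2] = new_state
            self.entries.append((new_state, user_id, points, reservation_id))
            return True
```

With this sketch, crediting 100 points twice under the same key leaves the balance at 100, and a second reservation that exceeds the remaining balance is rejected rather than overdrawing the account. In an interview, the point to draw out is which store makes each of these steps atomic and durable.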

Skill signals interviewers look for

  • Designing stateless services, load balancing, and autoscaling
  • Caching strategies (read-through/write-through, sharded counters) and cache invalidation
  • Rate limiting (token bucket, per-user/global tiers) and backpressure
  • Concurrency control and idempotency to avoid double-spend (optimistic locks, distributed locks, compare-and-swap)
  • Durable event logging (Kafka/event sourcing) and reconcilers for eventual consistency
  • Monitoring, SLA-aware retries, circuit breakers, and cost/performance trade-offs

Prepare to discuss concrete technologies (Redis, Kafka, RDBMS), sharding, and fallback behavior during outages.
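
As one concrete talking point for the rate-limiting item, here is a minimal single-process sketch of a token bucket with a per-user tier in front of a global tier. The class names `TokenBucket` and `TieredLimiter`, the injectable `clock` parameter, and the refund-on-global-reject policy are assumptions made for illustration; a distributed version would keep the bucket state in a shared store rather than in process memory.

```python
import time


class TokenBucket:
    """Token bucket: refills at `rate` tokens per second up to `capacity`."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.tokens = float(capacity)  # start full so short bursts are allowed
        self.clock = clock
        self.updated = clock()

    def allow(self, cost=1.0):
        now = self.clock()
        # Refill lazily based on elapsed time, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False

    def refund(self, cost=1.0):
        self.tokens = min(self.capacity, self.tokens + cost)


class TieredLimiter:
    """Per-user bucket checked first, then a shared global bucket (hypothetical design)."""

    def __init__(self, global_bucket, user_rate, user_capacity, clock=time.monotonic):
        self.global_bucket = global_bucket
        self.user_rate = user_rate
        self.user_capacity = user_capacity
        self.clock = clock
        self.user_buckets = {}

    def allow(self, user_id):
        bucket = self.user_buckets.get(user_id)
        if bucket is None:
            bucket = TokenBucket(self.user_rate, self.user_capacity, self.clock)
            self.user_buckets[user_id] = bucket
        # Reject heavy users before they can drain the global budget.
        if not bucket.allow():
            return False
        if not self.global_bucket.allow():
            # Global tier is saturated: give the user's token back so they
            # are not penalized for system-wide load.
            bucket.refund()
            return False
        return True
```

Checking the per-user tier first is one way to keep a single noisy client from consuming the shared budget during a promotion; the rejected request is where backpressure (for example a retry-after hint) would be signaled to the caller.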

Common Follow-up Questions

  • How would you design the system to guarantee idempotent reward operations and prevent double-spending under retries and network partitions?
  • Describe a sharding strategy for user balances and how you'd rebalance shards without downtime during growth or promotions.
  • How would you implement rate limiting and backpressure for global promotions while preserving a fair per-user experience?
  • If Redis (cache) fails during a peak event, what fallback path ensures correctness and acceptable latency for earning and redeeming points?
  • Explain how you'd design reconciliation and audit pipelines (event sourcing vs ledger writes) to detect and fix inconsistencies in balances.
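
For the reconciliation follow-up, one simple approach is to replay the ledger and compare the result with the balances currently being served. The sketch below assumes each ledger entry is a dict with hypothetical fields `txn_id`, `user_id`, and `delta`, and that the event stream may deliver duplicates; none of these names come from a specific system.

```python
from collections import defaultdict


def replay(entries):
    """Recompute balances from ledger entries, skipping duplicate txn_ids."""
    balances = defaultdict(int)
    seen = set()
    for entry in entries:
        if entry["txn_id"] in seen:
            continue  # at-least-once delivery can repeat an entry
        seen.add(entry["txn_id"])
        balances[entry["user_id"]] += entry["delta"]
    return dict(balances)


def find_drift(entries, served_balances):
    """Return {user_id: (served, expected)} for every user whose balance disagrees."""
    expected = replay(entries)
    drift = {}
    for user_id in expected.keys() | served_balances.keys():
        want = expected.get(user_id, 0)
        have = served_balances.get(user_id, 0)
        if want != have:
            drift[user_id] = (have, want)
    return drift
```

A reconciler built along these lines treats the ledger as the source of truth and the cached balance as a derived view, so any drift it reports can be repaired by overwriting the served value with the replayed one.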

Related Questions

1. Design a distributed rate limiter for API traffic and promotional events
2. How to implement a global counter service for high-throughput reward systems
3. Design a real-time user balance service with low latency and strong correctness
4. Implement idempotent transaction patterns for reward credit/debit operations
5. Architect a promotion event platform that scales to millions of concurrent users
