ByteDance ML System Design: Live Stream Violation Penalty
Question Description
You must design the post-detection flow for a live streaming violation penalty system that receives real-time alerts from an ML detector (hate speech, nudity, copyright, etc.). The focus is on processing violation alerts, applying rule-based penalties, notifying stakeholders, and storing auditable records, not on how the ML model itself works.
Core tasks you’ll address:
- Ingest and validate violation alerts (stream ID, timestamp, violation type, confidence). Ensure deduplication and idempotency when alerts arrive from multiple detectors or retries.
- Decide penalty logic: map violation types + confidence + user history to actions (warning, temporary suspension, stream stop, permanent ban). Include escalation and decay of strikes.
- Real-time enforcement: stop or restrict a live stream within seconds, update user state, and ensure distributed enforcement consistency.
- Notifications & appeals: push in-app alerts and emails to streamers and moderators with appeal links and human review handoff.
- Data & audit: define schemas for violations, penalties, user state, and immutable audit logs for compliance and debugging.
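As an illustration of the first task, here is a minimal sketch of alert validation and deduplication. The `Alert` fields follow the ones listed above; the set of valid types, the 10-second bucketing window, and the in-memory `Deduplicator` are assumptions made for the example, not part of the question.

```python
import hashlib
from dataclasses import dataclass

# Assumed set of violation types, taken from the examples in the question.
VALID_TYPES = {"hate_speech", "nudity", "copyright"}


@dataclass(frozen=True)
class Alert:
    stream_id: str
    timestamp: float       # detection time, seconds since epoch
    violation_type: str
    confidence: float      # expected range: 0.0 to 1.0


def validate(alert: Alert) -> None:
    """Raise ValueError if the alert fails basic schema checks."""
    if not alert.stream_id:
        raise ValueError("missing stream_id")
    if alert.violation_type not in VALID_TYPES:
        raise ValueError(f"unknown violation type: {alert.violation_type}")
    if not 0.0 <= alert.confidence <= 1.0:
        raise ValueError("confidence out of range")


def idempotency_key(alert: Alert, window_s: int = 10) -> str:
    """Derive a key shared by alerts for the same stream, type, and time bucket.

    The bucket width (window_s) is a hypothetical tuning parameter.
    """
    bucket = int(alert.timestamp // window_s)
    raw = f"{alert.stream_id}|{alert.violation_type}|{bucket}"
    return hashlib.sha256(raw.encode()).hexdigest()


class Deduplicator:
    """In-memory stand-in for a shared key store with expiry."""

    def __init__(self) -> None:
        self._seen: set[str] = set()

    def accept(self, alert: Alert) -> bool:
        """Return True the first time a key is seen, False for duplicates."""
        validate(alert)
        key = idempotency_key(alert)
        if key in self._seen:
            return False
        self._seen.add(key)
        return True
```

With this keying, a retry or a second detector reporting the same violation on the same stream within one bucket is dropped, while a later bucket is treated as a new event. Alerts that straddle a bucket boundary are not collapsed, which is a known limitation of fixed windows. In a real deployment the seen-set would likely live in a shared store with a TTL rather than in process memory.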
High-level flow/stages you should walk through in an interview:
- Ingestion & validation (queue, schema check)
- Rule engine & decision (apply penalty rules, check history)
- Enforcement & state update (execute action, write user/stream state)
- Notification & human review (alerts, appeals pipeline)
- Logging & monitoring (audit trails, metrics, retries)
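The rule engine and decision stage could be sketched as a strike ladder with time-based decay. Every constant below (the 30-day strike lifetime, the 0.8 confidence floor, the severity weights, and the ladder thresholds) is an assumed value chosen for illustration; a real policy would set these.

```python
from dataclasses import dataclass, field

STRIKE_TTL_S = 30 * 24 * 3600   # assumed: a strike stops counting after 30 days
MIN_CONFIDENCE = 0.8            # assumed: below this, defer to human review
SEVERITY = {"hate_speech": 2, "nudity": 2, "copyright": 1}  # assumed weights

# Assumed escalation ladder: (minimum active strike points, action),
# ordered from most to least severe so the first match wins.
LADDER = [
    (6, "permanent_ban"),
    (4, "temporary_suspension"),
    (2, "stream_stop"),
    (0, "warning"),
]


@dataclass
class UserHistory:
    # Each strike is a (timestamp_seconds, points) pair.
    strikes: list = field(default_factory=list)


def active_points(history: UserHistory, now: float) -> int:
    """Sum points of strikes that have not yet decayed."""
    return sum(p for ts, p in history.strikes if now - ts < STRIKE_TTL_S)


def decide(history: UserHistory, violation_type: str,
           confidence: float, now: float) -> str:
    """Record a strike and return the action for the user's active total."""
    if confidence < MIN_CONFIDENCE:
        # Low-confidence alerts add no strike and go to a moderator.
        return "human_review"
    history.strikes.append((now, SEVERITY[violation_type]))
    total = active_points(history, now)
    for threshold, action in LADDER:
        if total >= threshold:
            return action
    return "warning"
```

Keeping the raw strike list and computing the active total at decision time means decay needs no background job, and the same history can be replayed later when auditing a decision or processing an appeal.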
Skill signals interviewers look for: knowledge of low-latency streaming pipelines, distributed consistency (idempotency, exactly-once considerations), schema design for auditability, scale strategies (partitioning, sharding, backpressure), and extensible rule engines. You should also discuss reliability, monitoring, and how to handle false positives and appeal workflows.
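For the auditability point, one option worth mentioning is a hash-chained, append-only log, where each entry commits to the one before it so that later edits are detectable. The sketch below is an in-memory illustration only; the `AuditLog` class and its entry layout are hypothetical, and a production system would persist entries to durable, write-once storage.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


class AuditLog:
    """Append-only log in which each entry hashes the previous entry's hash."""

    def __init__(self) -> None:
        self._entries: list[dict] = []

    def append(self, event: dict) -> str:
        """Serialize the event, chain it to the previous entry, return its hash."""
        prev = self._entries[-1]["hash"] if self._entries else GENESIS
        payload = json.dumps(event, sort_keys=True)  # stable serialization
        digest = hashlib.sha256((prev + payload).encode()).hexdigest()
        self._entries.append({"event": payload, "prev": prev, "hash": digest})
        return digest

    def verify(self) -> bool:
        """Recompute the chain and report whether every entry is intact."""
        prev = GENESIS
        for entry in self._entries:
            expected = hashlib.sha256((prev + entry["event"]).encode()).hexdigest()
            if entry["prev"] != prev or entry["hash"] != expected:
                return False
            prev = entry["hash"]
        return True
```

Modifying any stored event changes its expected hash and breaks verification for that entry onward, which gives compliance reviewers a cheap integrity check over the penalty history.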
Common Follow-up Questions
- How would you design the system to minimize impact from false positives (appeals, human-in-the-loop, confidence thresholds)?
- Explain how you'd ensure exactly-once penalty enforcement across a distributed fleet of stream servers and retrying ML alerts.
- How can you design a scalable, extensible rule engine to add new violation types and complex escalation logic without redeploying services?
- Describe metrics, monitoring, and alerting you'd add to detect outages, backlog growth, or incorrect penalty application in real time.
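For the rule-engine follow-up, one common approach is to treat rules as data that the service reloads at runtime, so adding a violation type becomes a config change rather than a deployment. The sketch below assumes a hypothetical JSON rule format (`violation_type`, `min_confidence`, `min_strikes`, `action`) with first-match-wins ordering; none of these names come from the question.

```python
import json

REQUIRED_FIELDS = {"violation_type", "min_confidence", "min_strikes", "action"}

# Hypothetical rule set; "*" matches any violation type.
RULES_JSON = """
[
  {"violation_type": "nudity", "min_confidence": 0.9, "min_strikes": 0, "action": "stream_stop"},
  {"violation_type": "copyright", "min_confidence": 0.8, "min_strikes": 2, "action": "temporary_suspension"},
  {"violation_type": "*", "min_confidence": 0.8, "min_strikes": 0, "action": "warning"}
]
"""


class RuleEngine:
    def __init__(self, rules_json: str) -> None:
        self._rules: list[dict] = []
        self.reload(rules_json)

    def reload(self, rules_json: str) -> None:
        """Parse and validate a new rule set, then swap it in.

        If validation fails, the previous rules stay active.
        """
        rules = json.loads(rules_json)
        for rule in rules:
            missing = REQUIRED_FIELDS - rule.keys()
            if missing:
                raise ValueError(f"rule missing fields: {sorted(missing)}")
        self._rules = rules

    def evaluate(self, violation_type: str, confidence: float, strikes: int) -> str:
        """Return the action of the first matching rule, else defer to a human."""
        for rule in self._rules:
            if (rule["violation_type"] in (violation_type, "*")
                    and confidence >= rule["min_confidence"]
                    and strikes >= rule["min_strikes"]):
                return rule["action"]
        return "human_review"
```

Validating before swapping keeps a malformed config push from taking down enforcement, and falling back to `human_review` when nothing matches keeps unknown cases out of automated penalties.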