Databricks: Real-Time Harmful Content Detection (ML)
Question Description
Design a scalable, low-latency machine learning system to detect harmful user-generated text (hate speech, harassment, misinformation, spam) on a large social media platform. You will be asked to propose an architecture that ingests billions of posts daily, runs real-time inference, and surfaces high-confidence flags for automated action or human review while minimizing false positives that can unfairly censor users.
Start by outlining the high-level flow: content ingestion (streaming APIs/Kafka), lightweight preprocessing (normalization, emoji handling, tokenization), model inference (multi-label classifier or hierarchical taxonomy producing severity scores), post-processing (thresholding, rule-based overrides, confidence calibration), and human-in-the-loop feedback for continuous retraining. Describe storage choices for raw posts, predictions and moderator decisions for auditing and model fine-tuning.
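The lightweight preprocessing stage above can be sketched as a small normalization function. This is a hypothetical sketch, not a prescribed implementation: the placeholder tokens (`<url>`, `<user>`) and the character-run collapsing rule are illustrative choices, and a production system would also map emoji to text tokens before tokenization.

```python
import re
import unicodedata


def preprocess(text: str) -> str:
    """Lightweight normalization before tokenization (illustrative sketch).

    Steps: Unicode NFKC normalization, lowercasing, replacing URLs and
    mentions with placeholder tokens so the classifier sees a stable
    vocabulary, and collapsing elongated character runs to reduce sparsity.
    """
    text = unicodedata.normalize("NFKC", text)
    text = text.lower()
    text = re.sub(r"https?://\S+", "<url>", text)   # URLs -> placeholder
    text = re.sub(r"@\w+", "<user>", text)          # mentions -> placeholder
    text = re.sub(r"(.)\1{2,}", r"\1\1", text)      # "soooo" -> "soo"
    return text.strip()
```

Keeping this stage cheap and deterministic matters because it runs on every post before any model is invoked, so its latency is paid at full ingestion volume.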
Be explicit about trade-offs: latency vs. model complexity (use distilled/quantized models or two-stage pipelines with a fast filter + heavier contextual model), cost vs. coverage (sampling vs. full inference), and precision vs. recall (tuning thresholds, class weights, and business rules). Discuss robustness: multilingual/code-mixed support, adversarial obfuscation defenses, and bias mitigation across demographics. Cover the monitoring and metrics you would track (precision, recall, ROC/AUC, per-class confusion, P99 latency, drift alerts) and how you would automate retraining and safe model rollouts (canary, shadow mode, human review). This question tests system design, online inference, safety classification, and practical MLOps considerations.
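The two-stage pipeline trade-off described above can be made concrete with a small cascade sketch. The model callables, thresholds, and label names here are assumptions for illustration; in practice the thresholds would be tuned per harm category against precision/recall targets.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class CascadeDecision:
    label: str    # "allow", "flag" (automated action), or "review" (human queue)
    score: float  # harmfulness probability from the deciding stage
    stage: str    # which model produced the final score


def classify(
    text: str,
    fast_model: Callable[[str], float],
    heavy_model: Callable[[str], float],
    pass_threshold: float = 0.1,
    flag_threshold: float = 0.9,
) -> CascadeDecision:
    """Two-stage cascade: a cheap filter clears the bulk of benign traffic,
    so only ambiguous posts pay the latency of the heavier contextual model."""
    fast_score = fast_model(text)
    if fast_score < pass_threshold:          # confidently benign: skip stage 2
        return CascadeDecision("allow", fast_score, "fast")
    heavy_score = heavy_model(text)
    if heavy_score >= flag_threshold:        # high confidence: automated action
        return CascadeDecision("flag", heavy_score, "heavy")
    return CascadeDecision("review", heavy_score, "heavy")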
Common Follow-up Questions
- How would you extend the design to support multilingual and code-mixed text while keeping latency low?
- What defenses would you add to mitigate adversarial evasion (obfuscation, spacing, Unicode tricks), and how would you detect new evasion patterns?
- How do you calibrate thresholds and choose precision vs. recall targets for different harm categories (hate speech vs. misinformation)?
- Describe an online learning / continuous training pipeline that updates models from moderator feedback with safe deployment (canary, shadow testing).
- How would you measure and mitigate demographic bias and unfair false positive rates in your moderation pipeline?
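For the adversarial-evasion follow-up, a common first line of defense is canonicalizing obfuscated text before classification. The sketch below is a hypothetical, deliberately small example: the leetspeak table is tiny (production systems maintain much larger homoglyph maps, e.g. based on Unicode confusables data), and the spacing-removal rule is aggressive, so this canonical form would typically feed a secondary detector rather than replace the original text.

```python
import re
import unicodedata

# Hypothetical leetspeak/symbol substitutions; real tables are far larger.
_SUBSTITUTIONS = str.maketrans(
    {"0": "o", "1": "i", "3": "e", "4": "a", "5": "s", "@": "a", "$": "s"}
)
# Zero-width characters commonly inserted to break token matching.
_ZERO_WIDTH = re.compile(r"[\u200b\u200c\u200d\u2060\ufeff]")


def deobfuscate(text: str) -> str:
    """Canonicalize common evasion tricks for a secondary detection pass."""
    text = unicodedata.normalize("NFKC", text)  # fold fullwidth/compat forms
    text = _ZERO_WIDTH.sub("", text)            # strip zero-width characters
    text = text.translate(_SUBSTITUTIONS)       # undo simple leetspeak
    # Collapse single-character spacing tricks like "h a t e" or "h.a.t.e".
    # Aggressive: also joins legitimate words, hence secondary-channel only.
    text = re.sub(r"(?<=\w)[\s._-](?=\w)", "", text)
    return text.lower()
```

Detecting *new* evasion patterns then becomes a drift problem: posts whose canonical form scores harmful while the raw form scores benign are strong candidates for moderator review and for expanding the substitution tables.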