Google ML System Design: Fuzzy Video Deduplication

Question Description

Design a fuzzy deduplication system that detects near-duplicate short videos in real time at Google scale. The system should ingest millions of uploads per day, reach a deduplication decision within seconds of each upload, and let creators appeal false positives.

Core requirements: use learned embeddings (fused video-frame, audio, and text features) to detect fuzzy matches rather than relying on exact hashes; sustain very high concurrency (thousands of requests per second); be fault tolerant and cost-aware; and include a human-in-the-loop appeal and calibration workflow.

High-level flow: (1) a lightweight pre-filter (perceptual hashing, uploader metadata, and text/audio heuristics) that screens out obviously unique uploads before the expensive stages; (2) frame sampling and feature extraction with a compact embedding model; (3) two-stage retrieval: a low-dimensional ANN search for recall (e.g., HNSW or IVF+PQ, using a library or service such as FAISS, ScaNN, or Milvus), then a high-dimensional rerank for precision; (4) decision logic (block/soft-flag/warn) with immediate notification and an appeal link; (5) a human review queue that feeds labeled cases back into offline retraining and threshold calibration.
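
Steps (3) and (4) of the flow can be sketched roughly as follows. This is a minimal pure-Python illustration rather than a production design: brute-force cosine search stands in for the ANN index, and the function names (`two_stage_search`, `decide`), the catalog layout, the extra "allow" outcome, and the thresholds 0.97/0.90/0.80 are all hypothetical choices made for the example.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def two_stage_search(query_low, query_high, catalog, recall_k=2):
    """Return (video_id, score) of the best match, or (None, 0.0) for an empty catalog."""
    if not catalog:
        return None, 0.0
    # Stage 1 (recall): top-k by low-dim similarity. Brute force here;
    # a real system would query an ANN index instead.
    candidates = sorted(
        catalog, key=lambda v: cosine(query_low, v["low"]), reverse=True
    )[:recall_k]
    # Stage 2 (precision): rerank the shortlist with the high-dim embedding.
    best = max(candidates, key=lambda v: cosine(query_high, v["high"]))
    return best["id"], cosine(query_high, best["high"])


def decide(score, block_at=0.97, flag_at=0.90, warn_at=0.80):
    """Map a rerank score to an action. Thresholds are illustrative assumptions."""
    if score >= block_at:
        return "block"
    if score >= flag_at:
        return "soft_flag"
    if score >= warn_at:
        return "warn"
    return "allow"


# Toy catalog: "a" is closest in the low-dim space, "b" in the high-dim space.
catalog = [
    {"id": "a", "low": [1.0, 0.0], "high": [1.0, 0.0, 0.0, 0.0]},
    {"id": "b", "low": [0.9, 0.1], "high": [0.0, 1.0, 0.0, 0.0]},
    {"id": "c", "low": [0.0, 1.0], "high": [0.0, 0.0, 1.0, 0.0]},
]
video_id, score = two_stage_search([1.0, 0.05], [0.0, 1.0, 0.0, 0.0], catalog)
print(video_id, decide(score))
```

In this toy data the low-dimensional stage ranks "a" first, but the rerank picks "b", which is the point of the second stage: the coarse index only needs to get the true match into the shortlist.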

Skill signals you should show: designing scalable ANN indices and sharding, embedding model tradeoffs (128 vs 512 dims), latency vs accuracy modeling (memory, network, search cost, P95 latency), metrics (precision/recall/F1, A/B and offline evaluation), operational concerns (monitoring, rollback, cost, consistency), and designing a robust human-in-the-loop feedback loop to reduce false positives over time.
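
One way to ground the 128 vs 512 dimension tradeoff is a back-of-envelope memory estimate. The sketch below assumes a hypothetical corpus of one billion vectors, float32 storage for uncompressed embeddings, and product quantization with 32 one-byte codes per vector; the helper names are illustrative, and codebooks and graph or list overhead are deliberately ignored.

```python
def raw_index_bytes(num_vectors, dims, bytes_per_value=4):
    """Bytes needed to store uncompressed vectors (float32 by default)."""
    return num_vectors * dims * bytes_per_value


def pq_index_bytes(num_vectors, num_subquantizers, bits_per_code=8):
    """Bytes for product-quantized codes only (codebooks and index overhead excluded)."""
    return num_vectors * num_subquantizers * bits_per_code // 8


N = 1_000_000_000  # hypothetical corpus size, not a figure from the question
for dims in (128, 512):
    print(f"{dims}-dim float32: {raw_index_bytes(N, dims) / 1e9:.0f} GB")
print(f"PQ, 32 x 8-bit codes: {pq_index_bytes(N, 32) / 1e9:.0f} GB")
```

Under these assumptions the uncompressed 512-dimensional vectors take four times the memory of the 128-dimensional ones (about 2048 GB versus 512 GB), while the PQ codes fit in about 32 GB. Numbers like these are what motivate keeping a compact or quantized representation in the sharded ANN tier and fetching full-size embeddings only for the rerank shortlist.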
