Bloomberg System Design: Global News Aggregation & Trending

Topics: Stream Processing, Search Engine, Distributed Systems
Roles: Software Engineer, Backend Engineer, Site Reliability Engineer
Experience: Mid Level, Senior, Staff

Question Description

You are asked to design the backend for a global news platform that ingests articles from publishers and the Newser API (which only returns one region per call), processes real-time user interactions, computes trending scores, and serves a low-latency, personalized news feed and search to users worldwide.

Key challenges you must address include aggregating across regions despite Newser's per-region call limit, respecting API rate limits and quotas, supporting publisher push ingestion, and calculating real-time trending with time-decay while keeping results consistent and fast.
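Because Newser returns one region per call, the same story can come back from several regional fetches and has to be merged into one record. The following is a minimal sketch of that merge step, not a definitive implementation. The names `dedup_key` and `merge_regions`, the article dictionary shape (`url`, `title`), and the choice to key on host plus path are all assumptions made for illustration; a real system would likely add fuzzy title or content matching.

```python
import hashlib
from urllib.parse import urlsplit


def dedup_key(url: str, title: str) -> str:
    """Hypothetical dedup key: host + path when a URL is present,
    otherwise a hash of the whitespace-normalized, lowercased title.

    Assumption: query strings carry only tracking parameters. Publishers
    that encode the article ID in the query string would need different rules.
    """
    parts = urlsplit(url.strip())
    if parts.netloc:
        return f"{parts.netloc.lower()}{parts.path.rstrip('/')}"
    normalized = " ".join(title.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def merge_regions(results_by_region: dict) -> dict:
    """Merge per-region fetch results into one record per article,
    remembering every region in which the article was seen."""
    merged = {}
    for region, articles in results_by_region.items():
        for article in articles:
            key = dedup_key(article["url"], article["title"])
            entry = merged.setdefault(
                key,
                {"url": article["url"], "title": article["title"], "regions": set()},
            )
            entry["regions"].add(region)
    return merged


# Illustrative input: the same article returned by two regional calls.
results = {
    "us": [{"url": "https://Example.com/a/?utm=1", "title": "A"}],
    "uk": [
        {"url": "https://example.com/a", "title": "A"},
        {"url": "https://example.com/b", "title": "B"},
    ],
}
merged = merge_regions(results)
```

Keeping the set of regions on the merged record lets the serving layer filter by region later without storing the article once per region.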

High-level flow/stages you should cover:

  • Regional ingestion: parallel region workers, regional fetch scheduler, and a deduplication layer to merge duplicate articles.
  • Stream processing: durable message bus (e.g., Kafka) and stream processors (Flink/Beam) to compute time-decayed trending scores and materialize views.
  • Storage & indexing: a write-optimized store for raw articles, a search index (Elasticsearch/OpenSearch) for keyword queries, and a fast cache (Redis + CDN) for feeds.
  • Serving layer: feed and search APIs, personalization filters (language, interest, region), and push updates (WebSockets/SSE) for real-time UI.
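One common way to get time-decayed trending scores in the stream-processing stage is exponential decay with a fixed half-life, which needs only one number and one timestamp of state per article. The sketch below shows the idea in plain Python rather than in Flink or Beam. The class name `TrendingScores`, the half-life value, and the interaction weights are assumptions, not something the question prescribes.

```python
import math
from dataclasses import dataclass


@dataclass
class DecayedScore:
    value: float        # score as of `updated_at`
    updated_at: float   # event time (seconds) of the newest event folded in


class TrendingScores:
    """Exponentially decayed counters keyed by article ID (illustrative)."""

    def __init__(self, half_life_s: float = 3600.0):
        # After `half_life_s` seconds an interaction counts half as much.
        self.decay_rate = math.log(2) / half_life_s
        self.scores = {}

    def record(self, article_id: str, weight: float, event_time: float) -> None:
        s = self.scores.get(article_id)
        if s is None:
            self.scores[article_id] = DecayedScore(weight, event_time)
        elif event_time >= s.updated_at:
            # Decay the stored score forward to the event time, then add.
            dt = event_time - s.updated_at
            s.value = s.value * math.exp(-self.decay_rate * dt) + weight
            s.updated_at = event_time
        else:
            # Late (out-of-order) event: decay the new weight instead.
            dt = s.updated_at - event_time
            s.value += weight * math.exp(-self.decay_rate * dt)

    def top_k(self, k: int, now: float) -> list:
        """Return up to k (article_id, score) pairs, scores decayed to `now`."""
        ranked = sorted(
            (
                (s.value * math.exp(-self.decay_rate * max(0.0, now - s.updated_at)), aid)
                for aid, s in self.scores.items()
            ),
            reverse=True,
        )
        return [(aid, score) for score, aid in ranked[:k]]


# Illustrative usage with a 60-second half-life.
trending = TrendingScores(half_life_s=60.0)
trending.record("a", 1.0, event_time=0.0)
trending.record("a", 1.0, event_time=60.0)   # 0.5 (decayed) + 1.0
trending.record("b", 2.0, event_time=60.0)
trending.record("b", 2.0, event_time=0.0)    # late event, adds 2.0 * 0.5
top = trending.top_k(2, now=120.0)
```

Because decay is applied lazily at read and update time, no background job has to touch every key, which helps when the number of articles is large.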

Skill signals interviewers look for: distributed systems and scaling patterns, stream processing and windowing, search indexing and real-time index updates, API design and rate-limit strategies, data modeling for time-decay metrics, caching and cache invalidation, and operational concerns (monitoring, retries, fault tolerance).

In your design, justify the trade-offs between consistency and latency, explain your strategies for reducing Newser API cost (batching, caching, region prioritization), and describe how you would test correctness and recover from failures.
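Region prioritization under a quota can be sketched as a token bucket guarding the API plus a priority of "region weight times staleness" for choosing which region to fetch next. This is one possible approach under stated assumptions, not the expected answer. `TokenBucket`, `RegionScheduler`, the weights, and the quota numbers are hypothetical, and the real Newser limits are not given in the question.

```python
from dataclasses import dataclass


@dataclass
class TokenBucket:
    capacity: float       # maximum burst size, in calls
    refill_per_s: float   # sustained calls per second allowed by the quota
    tokens: float
    last_refill: float = 0.0

    def try_acquire(self, now: float) -> bool:
        """Refill based on elapsed time, then spend one token if available."""
        elapsed = max(0.0, now - self.last_refill)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_s)
        self.last_refill = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


class RegionScheduler:
    """Picks the region with the highest weight * staleness (illustrative)."""

    def __init__(self, region_weights: dict, bucket: TokenBucket, start: float = 0.0):
        self.weights = dict(region_weights)   # e.g. share of traffic per region
        self.last_fetch = {region: start for region in region_weights}
        self.bucket = bucket

    def next_region(self, now: float):
        """Return the region to fetch now, or None if the quota is exhausted."""
        if not self.bucket.try_acquire(now):
            return None
        region = max(
            self.weights,
            key=lambda r: self.weights[r] * (now - self.last_fetch[r]),
        )
        self.last_fetch[region] = now
        return region


# Illustrative usage: a burst of 2 calls, refilling at 1 call per second.
bucket = TokenBucket(capacity=2.0, refill_per_s=1.0, tokens=2.0)
scheduler = RegionScheduler({"us": 3.0, "in": 2.0, "nz": 0.5}, bucket)
picks = [scheduler.next_region(now) for now in (10.0, 10.0, 10.0, 11.0)]
```

Multiplying by staleness means a low-weight region is still fetched eventually, so freshness degrades gradually instead of some regions starving.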

Common Follow-up Questions

  • How would you design the ingestion scheduler to minimize Newser API calls and handle regional rate limits while ensuring freshness?
  • Describe the time-decay trending algorithm and how you would implement it in a stream processor to support sliding windows and high cardinality.
  • How would you ensure search indices reflect near-real-time updates (new articles and interaction-based boosts) without overloading the indexer?
  • Explain your approach to personalization: how do you combine global trending scores with user preferences (interests, language, region) at low latency?
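For the personalization follow-up, one low-latency option is to keep the trending scores global and apply a cheap per-user re-rank at request time: filter by language, then boost articles that match the user's interests or region. The sketch below assumes a small candidate set has already been pulled from the cache. `Candidate`, `UserProfile`, `rank_feed`, and the boost values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    article_id: str
    language: str
    region: str
    topics: frozenset
    trending: float   # global time-decayed score computed upstream


@dataclass(frozen=True)
class UserProfile:
    languages: frozenset
    region: str
    interests: frozenset


def rank_feed(candidates, user, k, interest_boost=0.5, region_boost=0.25):
    """Filter by language, then order by trending score times a boost.

    The boost values are illustrative; in practice they would be tuned
    or learned.
    """
    scored = []
    for c in candidates:
        if c.language not in user.languages:
            continue  # hard filter: never show unreadable articles
        boost = 1.0
        if c.topics & user.interests:
            boost += interest_boost
        if c.region == user.region:
            boost += region_boost
        scored.append((c.trending * boost, c.article_id))
    scored.sort(reverse=True)
    return [article_id for _, article_id in scored[:k]]


# Illustrative usage.
candidates = [
    Candidate("a", "en", "us", frozenset({"tech"}), 10.0),
    Candidate("b", "en", "uk", frozenset({"sports"}), 12.0),
    Candidate("c", "fr", "fr", frozenset({"tech"}), 100.0),
]
user = UserProfile(frozenset({"en"}), "us", frozenset({"tech"}))
feed = rank_feed(candidates, user, k=2)
```

Since the expensive part (trending) is shared across users and only the re-rank is per user, the global top-N can be cached per region and language while the boost runs in memory on each request.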

Related Questions

1. Design a scalable webhook ingestion system for publisher pushes and deduplication
2. Build a real-time analytics pipeline to compute top-K events with time-decay
3. Design a low-latency search index that supports real-time updates and filtering for a news site
