
Anthropic System Design: Stateless Prompt Playground

Topics:
  • Message Queues
  • Chat Service
  • Stream Processing

Roles:
  • Software Engineer
  • Backend Engineer
  • Platform Engineer

Experience:
  • Mid Level
  • Senior

Question Description

You are asked to design a stateless Prompt Engineering Playground backend (similar to an LLM playground) where each request to the model is independent unless a user explicitly saves context. The system must accept very large prompts (10MB+), support multiple browser windows/tabs per user, and let users share or export full conversations via unique links or JSON.

Core content

  • Ingest: accept large prompt uploads via presigned object-storage URLs or chunked streaming to avoid overloading the app server. Use content-addressable storage (hash + dedupe) for large files.
  • Preprocessing: chunk and optionally compress or summarize very large prompts before sending to the LLM. Consider client-side or edge pre-processing to reduce bandwidth and cost.
  • LLM access: treat the LLM as an external API. Use a queuing layer (message queue or stream processor) to batch, rate-limit, and retry calls. Stream tokens back to the user (WebSocket/SSE) for low-latency UX.
  • Persistence & sharing: store minimal metadata in a primary DB (User, Conversation, PromptRef, ResponseRef) and keep large payloads in object storage. Generate shareable links with randomized IDs or signed short-lived tokens; allow JSON export.
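The content-addressable storage idea above can be sketched as follows. This is a minimal in-memory stand-in for an object store (a real system would back it with S3-style storage); the class and method names are illustrative, not from any particular library.

```python
import hashlib

class ContentAddressableStore:
    """Toy in-memory stand-in for object storage keyed by content hash."""

    def __init__(self):
        self._blobs = {}

    def put(self, data: bytes) -> str:
        # The SHA-256 digest serves as the object key, so identical
        # prompts are stored exactly once -- deduplication for free.
        key = hashlib.sha256(data).hexdigest()
        self._blobs.setdefault(key, data)
        return key

    def get(self, key: str) -> bytes:
        return self._blobs[key]

store = ContentAddressableStore()
a = store.put(b"large prompt body")
b = store.put(b"large prompt body")  # duplicate upload from another tab
assert a == b  # same content -> same key, single stored copy
```

Because the key is derived from the content, the same hash can also serve as a cache key for LLM responses to repeated prompts.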

Flow / stages you should explain in an interview

  1. Client upload (presigned PUT / multipart / chunked).
  2. Server validates, stores metadata, and returns a prompt reference.
  3. Preprocessor (sync/async) chunks or summarizes large prompts.
  4. Queue/worker sends the request to the LLM API and streams the response back via WebSocket or SSE.
  5. Persist response references and make the conversation shareable.
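Stage 4 is where LLM timeouts surface, so the worker's retry policy is worth sketching. Below is one common pattern, exponential backoff with full jitter, shown against a stub call; the function names are hypothetical and the `sleep` parameter is injected only so the example runs instantly.

```python
import random
import time

def call_with_retries(fn, max_attempts=4, base_delay=0.5, sleep=time.sleep):
    """Retry a flaky external call (e.g. an LLM API) with exponential
    backoff and full jitter; re-raise after the final attempt."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception:
            if attempt == max_attempts - 1:
                raise
            # Full jitter: sleep a random amount up to the capped backoff.
            sleep(random.uniform(0, base_delay * 2 ** attempt))

# Stub LLM call that times out twice, then succeeds:
calls = {"n": 0}
def flaky_llm_call():
    calls["n"] += 1
    if calls["n"] < 3:
        raise TimeoutError("LLM timeout")
    return "ok"

result = call_with_retries(flaky_llm_call, sleep=lambda _: None)
assert result == "ok" and calls["n"] == 3
```

In a real deployment the retry budget should be bounded by the queue's visibility timeout so a retried message is not also redelivered to another worker.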

Skill signals to demonstrate

  • Distributed systems: queues, workers, backpressure, and horizontal scaling.
  • Streaming & performance: SSE/WebSockets, partial response delivery, chunking strategies.
  • Cost optimization: deduplication, summarization, batching, and caching of frequent prompts.
  • Data modeling & indexing: how you model User, Conversation, Prompt, Response, and your indexing strategy for fast retrieval.
  • Security & reliability: signed URLs, access control on shared links, retries, idempotency, and monitoring.
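The idempotency signal in the last bullet is easy to demonstrate concretely. A minimal sketch, assuming the client sends a unique idempotency key with each logical request: the server caches the result by key, so a network-level retry does not re-execute the (expensive) LLM call. All names here are illustrative.

```python
import uuid

class IdempotentHandler:
    """Caches results by client-supplied idempotency key so retried
    requests (e.g. after a dropped connection) are not re-executed."""

    def __init__(self):
        self._results = {}

    def handle(self, idempotency_key: str, work):
        if idempotency_key not in self._results:
            self._results[idempotency_key] = work()
        return self._results[idempotency_key]

counter = {"runs": 0}
def expensive_llm_call():
    counter["runs"] += 1
    return "response-ref"

handler = IdempotentHandler()
key = str(uuid.uuid4())  # generated client-side, reused on retry
first = handler.handle(key, expensive_llm_call)
second = handler.handle(key, expensive_llm_call)  # retry hits the cache
assert first == second and counter["runs"] == 1
```

In production the cache would live in a shared store (e.g. Redis) with a TTL, since any worker may receive the retry.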

When answering, walk through trade-offs (latency vs. cost, client vs. server preprocessing, strict statelessness vs. saved context) and be explicit about failure modes and mitigation (partial uploads, LLM timeouts, replay protection).

Common Follow-up Questions

  • How would you optimize for cost when multiple users submit similar large prompts repeatedly? (hint: caching, content-addressable storage, deduplication)
  • Design the sharing and access-control model: how do you implement expiring share links, permission scopes, and anonymous access safely?
  • Explain how you'd stream partial LLM responses to the client while ensuring resume/reconnect semantics across multiple browser tabs.
  • How do you guarantee true statelessness server-side while letting users 'save' context optionally? Describe idempotency, session tokens, and persisted context references.
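For the expiring-share-link follow-up, one stateless approach is an HMAC-signed token that embeds the conversation id and an expiry timestamp, so no server-side session is needed to validate it. A minimal sketch (the secret and field layout are assumptions for illustration; a real system would load the secret from configuration and likely add a permission scope field):

```python
import hashlib
import hmac
import time

SECRET = b"server-side-secret"  # hypothetical; load from config in practice

def make_share_token(conversation_id: str, ttl_s: int, now=None) -> str:
    """Return 'id:expiry:signature' where the signature covers id+expiry."""
    expires = int((now or time.time()) + ttl_s)
    msg = f"{conversation_id}:{expires}".encode()
    sig = hmac.new(SECRET, msg, hashlib.sha256).hexdigest()
    return f"{conversation_id}:{expires}:{sig}"

def verify_share_token(token: str, now=None) -> bool:
    conversation_id, expires, sig = token.rsplit(":", 2)
    msg = f"{conversation_id}:{int(expires)}".encode()
    expected = hmac.new(SECRET, msg, hashlib.sha256).hexdigest()
    # Constant-time compare, and reject once the expiry has passed.
    return hmac.compare_digest(sig, expected) and (now or time.time()) < int(expires)

token = make_share_token("conv-42", ttl_s=3600, now=1000.0)
assert verify_share_token(token, now=2000.0)       # within the hour
assert not verify_share_token(token, now=10000.0)  # expired
```

Signed tokens cannot be revoked individually without extra state; if revocation matters, pair them with a server-side denylist or use randomized link IDs stored in the database instead.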

Related Questions

1. Design a stateful chat service with conversation memory and retrieval-augmented generation (RAG)
2. Design a cost-efficient LLM proxy that batches and caches requests for multiple clients
3. Design a large-file ingestion and chunking pipeline for ML inference serving
4. Design a collaborative prompt editor with live sharing and merge/conflict resolution
