Adobe ML System Design: Personalized Q&A Assistant

Topics:
Knowledge Retrieval
Personalization
Question Answering
Roles:
Machine Learning Engineer
ML Engineer
Data Scientist
Experience:
Mid Level
Senior
Staff

Question Description

Overview

You are asked to design a natural-language Q&A assistant for a marketing platform (think Adobe Experience Platform). The assistant must answer two classes of questions: general platform questions (pulled from public documentation) and personalized questions about a user's account data (via secure APIs). The assistant must also enforce scope limits: for unrelated queries, it should respond with "I am unable to answer this type of question."

High-level flow / stages

  1. Ingest and index public documentation (crawl, normalize, create embeddings) and maintain a separate secure index for per-customer metadata or precomputed summaries.
  2. Front-end receives a query and performs authentication/authorization checks.
  3. Perform query classification: general vs personalized vs out-of-scope.
  4. For general queries, run retrieval over public docs and synthesize a concise answer. For personalized queries, call authenticated APIs, fetch/aggregate relevant account data, then summarize.
  5. Apply safety filters, provenance/attribution, caching, and return the answer within latency targets.
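Stage 3 above (routing a query to the general, personalized, or out-of-scope path) can be sketched as follows. This is a minimal illustration: the keyword rules in `classify` are a hypothetical stand-in for a trained intent classifier, and the `route` handlers are placeholders for the real RAG and API paths.

```python
from enum import Enum

class QueryClass(Enum):
    GENERAL = "general"          # answerable from public docs
    PERSONALIZED = "personalized"  # needs authenticated account data
    OUT_OF_SCOPE = "out_of_scope"

OUT_OF_SCOPE_REPLY = "I am unable to answer this type of question."

def classify(query: str) -> QueryClass:
    # Hypothetical keyword rules standing in for a trained classifier.
    q = query.lower()
    if any(w in q for w in ("my account", "my campaign", "my data")):
        return QueryClass.PERSONALIZED
    if any(w in q for w in ("segment", "audience", "schema", "platform")):
        return QueryClass.GENERAL
    return QueryClass.OUT_OF_SCOPE

def route(query: str) -> str:
    cls = classify(query)
    if cls is QueryClass.OUT_OF_SCOPE:
        # Refuse early, before any retrieval or API call is made.
        return OUT_OF_SCOPE_REPLY
    # Placeholder: dispatch to the doc-retrieval path or the
    # authenticated personalized path.
    return f"handled as {cls.value}"
```

Classifying before retrieval keeps out-of-scope queries from ever reaching the LLM or customer APIs, which helps both latency and liability.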

Skill signals you should demonstrate

  • Retrieval-augmented generation (RAG) and vector search strategies
  • Embeddings, semantic search, and metadata filtering for tenant isolation
  • Authentication/authorization patterns (OAuth, role-based ACLs) and data governance
  • Caching, batching, and async pipelines to meet 2–3s latency and scale to thousands of users
  • Instrumentation, monitoring, and techniques to reduce hallucinations (source attribution, confidence thresholds)
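The tenant-isolation point above can be made concrete with a toy in-memory store. In production this would be an ANN index (e.g. FAISS or an HNSW-based store) with a hard pre-filter on a `tenant_id` payload field; the class below is only a sketch of the filtering contract, with brute-force cosine similarity standing in for approximate search.

```python
import numpy as np

class TenantVectorStore:
    """Toy vector store that hard-filters retrieval by tenant."""

    def __init__(self) -> None:
        self.vectors: list[np.ndarray] = []
        self.meta: list[dict] = []  # parallel metadata per vector

    def add(self, vec, tenant_id: str, text: str) -> None:
        v = np.asarray(vec, dtype=float)
        self.vectors.append(v / np.linalg.norm(v))  # store unit vectors
        self.meta.append({"tenant_id": tenant_id, "text": text})

    def search(self, query_vec, tenant_id: str, k: int = 3):
        q = np.asarray(query_vec, dtype=float)
        q = q / np.linalg.norm(q)
        # Hard filter: only this tenant's documents are candidates,
        # so results can never contain another customer's data.
        scored = [
            (float(q @ v), m["text"])
            for v, m in zip(self.vectors, self.meta)
            if m["tenant_id"] == tenant_id
        ]
        return sorted(scored, reverse=True)[:k]
```

Filtering before (not after) scoring matters: post-filtering the top-k of a shared index can silently return fewer results, while pre-filtering guarantees isolation regardless of k.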

Design choices should balance accuracy, security, latency, scalability, and operational cost, while providing clear provenance and scope limitation for liability management.

Common Follow-up Questions

  • How would you design tenant-isolated vector stores and metadata filtering so personalized retrieval never leaks other customers' data?
  • What caching and precomputation strategies would you use to keep personalized query latency under 2–3 seconds at thousands of concurrent users?
  • How do you handle hallucinations and ensure factual accuracy when synthesizing answers from documentation and live account data (provenance, confidence scores)?
  • Which embedding models and vector index (e.g., approximate nearest neighbor) architectures would you pick and why? Discuss cost and accuracy trade-offs.
  • Describe authentication and authorization flows (OAuth, RBAC) you’d implement to securely fetch customer data and audit access.
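For the caching follow-up, one common pattern is a per-tenant answer cache with a TTL, so repeated questions skip retrieval and synthesis entirely. The sketch below assumes an in-process dict; a real deployment would likely use Redis or a similar shared cache. The key point is that the cache key includes the tenant, so cached personalized answers can never cross accounts.

```python
import hashlib
import time

class TTLCache:
    """Per-tenant answer cache with time-based expiry."""

    def __init__(self, ttl_seconds: float = 300.0) -> None:
        self.ttl = ttl_seconds
        self._store: dict[str, tuple[float, str]] = {}

    def _key(self, tenant_id: str, query: str) -> str:
        # Tenant is baked into the key: tenant A's cached answer
        # is unreachable from tenant B's lookups.
        return hashlib.sha256(f"{tenant_id}:{query}".encode()).hexdigest()

    def get(self, tenant_id: str, query: str):
        entry = self._store.get(self._key(tenant_id, query))
        if entry is not None:
            stored_at, answer = entry
            if time.monotonic() - stored_at < self.ttl:
                return answer
        return None  # miss or expired

    def put(self, tenant_id: str, query: str, answer: str) -> None:
        self._store[self._key(tenant_id, query)] = (time.monotonic(), answer)
```

The TTL also bounds staleness for personalized answers, which matters when the underlying account data (spend, segment sizes) changes frequently.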

Related Questions

1. Design a retrieval-augmented generation (RAG) system for product documentation search
2. How to scale vector search for multi-tenant SaaS platforms
3. Best practices for secure data access and masking in ML-driven assistants
4. Prompt engineering and grounding techniques to reduce hallucinations in a Q&A bot
5. Caching, batching, and async patterns to meet strict latency SLAs for NLP services

Adobe ML System Design: Personalized Q&A Assistant | Voker