Coupang Infra Interview: Kubernetes Storage Management

Topics: Persistent Volumes, Storage Classes, Stateful Applications
Roles: Software Engineer, Site Reliability Engineer, Platform Engineer
Experience: Entry Level, Mid Level, Senior

Question Description

This question tests your practical understanding of Kubernetes persistent storage and how you manage stateful workloads in production.

You will be asked to explain the key components (PersistentVolume (PV), PersistentVolumeClaim (PVC), and StorageClass) and the lifecycle that connects them: provisioning, binding, using, releasing, and reclaiming. Expect to walk through both static and dynamic provisioning, that is, when you define PV objects yourself versus when the cluster provisions storage on demand through a StorageClass and a CSI driver.
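As a rough illustration of dynamic provisioning, the sketch below builds a StorageClass and a PVC that references it, then prints them as a JSON list that kubectl can apply. The names fast-ssd, csi.example.com, and db-data are hypothetical placeholders, and a real CSI driver would usually need its own parameters block.

```python
import json

# Hypothetical names: "fast-ssd", "csi.example.com", and "db-data" are
# placeholders for illustration, not values from any real cluster.
storage_class = {
    "apiVersion": "storage.k8s.io/v1",
    "kind": "StorageClass",
    "metadata": {"name": "fast-ssd"},
    "provisioner": "csi.example.com",           # the CSI driver that creates volumes
    "reclaimPolicy": "Delete",                  # or "Retain" to keep data after the claim is deleted
    "volumeBindingMode": "WaitForFirstConsumer",  # delay provisioning until a Pod is scheduled
    "allowVolumeExpansion": True,
}

pvc = {
    "apiVersion": "v1",
    "kind": "PersistentVolumeClaim",
    "metadata": {"name": "db-data", "namespace": "default"},
    "spec": {
        "accessModes": ["ReadWriteOnce"],
        # Referencing the class is what triggers dynamic provisioning.
        "storageClassName": storage_class["metadata"]["name"],
        "resources": {"requests": {"storage": "20Gi"}},
    },
}


def to_manifest_list(*objects):
    """Wrap objects in a v1 List so they can be applied as one JSON file."""
    return json.dumps(
        {"apiVersion": "v1", "kind": "List", "items": list(objects)}, indent=2
    )


if __name__ == "__main__":
    print(to_manifest_list(storage_class, pvc))
```

Saving the output to a file and running kubectl apply -f on it would create both objects. With static provisioning, you would instead write the PV object yourself and let the claim bind to it.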

The interview typically progresses in stages: first you define concepts and trade-offs (e.g., reclaim policies, access modes, ReadWriteOnce vs ReadWriteMany), then you describe a hands-on setup (YAML for PV/PVC, associating a claim with a Pod or StatefulSet), and finally you cover operational concerns (resizing PVCs, snapshots, backups, access control, multi-tenant storage isolation). You may be asked to sketch a storage class that favors performance (local SSD, IOPS) or durability (replicated block storage) and to explain how CSI fits into dynamic provisioning.
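For the hands-on stage, a minimal sketch of associating a claim with a Pod might look like the following. The helper pod_with_claim, the volume name data, the claim name db-data, and the postgres:16 image are all assumptions chosen for illustration.

```python
import json


def pod_with_claim(pod_name, claim_name, mount_path, image="postgres:16"):
    """Build a Pod manifest that mounts an existing PVC (hypothetical helper)."""
    volume_name = "data"  # links the volume definition to the container mount
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": pod_name},
        "spec": {
            "containers": [
                {
                    "name": "app",
                    "image": image,
                    "volumeMounts": [
                        {"name": volume_name, "mountPath": mount_path}
                    ],
                }
            ],
            "volumes": [
                {
                    "name": volume_name,
                    # The claim must exist in the same namespace as the Pod.
                    "persistentVolumeClaim": {"claimName": claim_name},
                }
            ],
        },
    }


if __name__ == "__main__":
    pod = pod_with_claim("db-0", "db-data", "/var/lib/postgresql/data")
    print(json.dumps(pod, indent=2))
```

A StatefulSet would normally use volumeClaimTemplates instead, so that each replica gets its own claim rather than sharing one.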

To demonstrate readiness, show familiarity with common use cases (databases, file shares, logs), explain trade-offs (performance vs durability, shared vs exclusive mounts), and mention tooling/commands you’d use (kubectl, CSI snapshotter, storage provisioner logs). Practical examples and failure-recovery steps (reclaim policy handling, restoring from volume snapshots) strengthen your answer.

Common Follow-up Questions

  • How would you design dynamic provisioning and StorageClass strategy for a multi-tenant Kubernetes cluster with mixed performance/storage-cost requirements?
  • Explain the PVC resizing process and the CSI requirements for online vs. offline volume expansion. How would you handle a resize failure in production?
  • Describe how you’d implement consistent backups and restores for stateful applications using VolumeSnapshots and CSI snapshotter. What are the failure modes?
  • If a PV is Released under a Retain reclaim policy, walk through the safe steps to reclaim and reuse that storage while preventing data loss or tenant leakage.
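For the resizing follow-up, one way to reason about the preconditions is a small pre-flight check before patching the claim. This is a simplified sketch: parse_quantity and resize_patch are hypothetical helpers, and the parser handles only whole numbers with binary suffixes (Ki, Mi, Gi, Ti), not the full Kubernetes quantity syntax.

```python
# Binary suffixes only; the real Kubernetes quantity format supports more.
_BINARY_SUFFIXES = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40}


def parse_quantity(quantity):
    """Convert a string such as '20Gi' to bytes (simplified parser)."""
    for suffix, factor in _BINARY_SUFFIXES.items():
        if quantity.endswith(suffix):
            return int(quantity[: -len(suffix)]) * factor
    return int(quantity)  # no suffix: plain bytes


def resize_patch(storage_class, current, requested):
    """Return a merge-patch body for a PVC resize, or raise if it cannot work."""
    if not storage_class.get("allowVolumeExpansion", False):
        raise ValueError("StorageClass does not allow volume expansion")
    if parse_quantity(requested) <= parse_quantity(current):
        # PVCs can grow but cannot be shrunk.
        raise ValueError("requested size must be larger than the current size")
    return {"spec": {"resources": {"requests": {"storage": requested}}}}


if __name__ == "__main__":
    sc = {"allowVolumeExpansion": True}
    print(resize_patch(sc, "20Gi", "50Gi"))
```

The returned body could be passed to kubectl patch pvc with the -p flag. After patching, the PVC's status conditions and events are the usual place to look for whether the expansion finished or is waiting on a filesystem resize.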

Related Questions

1. Design persistent storage for a distributed database running on Kubernetes (StatefulSet + PV strategy)
2. How to choose a StorageClass: IOPS, throughput, durability trade-offs for stateful workloads
3. Kubernetes volume types vs hostPath/emptyDir: when to use each for stateful applications
4. StatefulSet vs Deployment: storage implications and best practices for application upgrades