Kubernetes Storage: Persistent Volumes and Claims
Learn how to manage persistent storage in Kubernetes using Persistent Volumes (PVs), Persistent Volume Claims (PVCs), and Storage Classes.
Lab: Working with Persistent Storage
Prerequisites
- Completed "Introduction to Kubernetes" module
- Access to a Kubernetes cluster with a storage provisioner
- kubectl CLI configured
Step 1: Create a Namespace
Create a namespace for this lab:
kubectl create namespace storage-lab
Step 2: Create a Persistent Volume
Create a file named pv.yaml:
apiVersion: v1
kind: PersistentVolume
metadata:
  name: my-pv
spec:
  capacity:
    storage: 1Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: manual
  hostPath:
    path: /mnt/data
Apply the PersistentVolume:
kubectl apply -f pv.yaml
Verify the PV was created:
kubectl get pv
The STATUS should show "Available".
Step 3: Create a Persistent Volume Claim
Create a file named pvc.yaml:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-pvc
  namespace: storage-lab
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 500Mi
  storageClassName: manual
Apply the PVC:
kubectl apply -f pvc.yaml
Check the PVC status:
kubectl get pvc -n storage-lab
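The PVC should be "Bound" to my-pv. Its CAPACITY column shows 1Gi rather than the requested 500Mi, because a claim binds to an entire volume.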
Step 4: Use the PVC in a Pod
Create a file named pod-with-pvc.yaml:
apiVersion: v1
kind: Pod
metadata:
  name: storage-pod
  namespace: storage-lab
spec:
  containers:
    - name: app
      image: nginx:latest
      volumeMounts:
        - mountPath: /usr/share/nginx/html
          name: storage
  volumes:
    - name: storage
      persistentVolumeClaim:
        claimName: my-pvc
Create the pod:
kubectl apply -f pod-with-pvc.yaml
Wait for the pod to be running:
kubectl wait --for=condition=Ready pod/storage-pod -n storage-lab --timeout=60s
Step 5: Write Data to Persistent Storage
Write some content to the persistent volume:
kubectl exec storage-pod -n storage-lab -- /bin/bash -c 'echo "Hello from persistent storage!" > /usr/share/nginx/html/index.html'
Verify the content:
kubectl exec storage-pod -n storage-lab -- cat /usr/share/nginx/html/index.html
Step 6: Test Persistence
Delete the pod:
kubectl delete pod storage-pod -n storage-lab
Recreate the pod using the same YAML:
kubectl apply -f pod-with-pvc.yaml
Wait for the pod to be ready:
kubectl wait --for=condition=Ready pod/storage-pod -n storage-lab --timeout=60s
Verify the data persisted:
kubectl exec storage-pod -n storage-lab -- cat /usr/share/nginx/html/index.html
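You should see the same content! Note that a hostPath volume lives on one node's filesystem, so on a multi-node cluster the data reappears only if the new pod is scheduled onto the same node.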
Step 7: Explore Storage Classes
List available storage classes in your cluster:
kubectl get storageclass
Get details on a storage class:
kubectl describe storageclass <storage-class-name>
Step 8: Create a Dynamic PVC
Create a file named dynamic-pvc.yaml:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: dynamic-pvc
  namespace: storage-lab
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 2Gi
  storageClassName: <default-storage-class>
Replace <default-storage-class> with the name of your cluster's default storage class (marked "(default)" in the output of kubectl get storageclass), then apply:
kubectl apply -f dynamic-pvc.yaml
Watch the PV be automatically provisioned:
kubectl get pv,pvc -n storage-lab
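If the claim stays "Pending" here, one likely cause is that the storage class uses volumeBindingMode: WaitForFirstConsumer, which delays provisioning until a Pod that uses the claim is scheduled. The following is a minimal sketch of such a Pod. The file name dynamic-pod.yaml, the Pod name dynamic-pod, and the mount path /data are illustrative assumptions and not part of the original lab.

```yaml
# dynamic-pod.yaml (hypothetical file name)
apiVersion: v1
kind: Pod
metadata:
  name: dynamic-pod            # illustrative name
  namespace: storage-lab
spec:
  containers:
    - name: app
      image: nginx:latest
      volumeMounts:
        - name: data
          mountPath: /data     # illustrative mount path
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: dynamic-pvc # the claim created in this step
```

After applying it with kubectl apply -f dynamic-pod.yaml, rerunning the kubectl get pv,pvc command above should show the claim as "Bound", assuming the provisioner is working.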
Step 9: Create a StatefulSet with Storage
Create a file named statefulset.yaml:
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: web
  namespace: storage-lab
spec:
  serviceName: "web"
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
        - name: nginx
          image: nginx:latest
          ports:
            - containerPort: 80
              name: web
          volumeMounts:
            - name: www
              mountPath: /usr/share/nginx/html
  volumeClaimTemplates:
    - metadata:
        name: www
      spec:
        accessModes: [ "ReadWriteOnce" ]
        resources:
          requests:
            storage: 1Gi
Create the StatefulSet:
kubectl apply -f statefulset.yaml
Watch the pods and PVCs being created:
kubectl get pods,pvc -n storage-lab -w
Each pod gets its own PVC automatically, named www-web-0, www-web-1, and www-web-2!
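The StatefulSet names a governing Service called "web" in its serviceName field, but this lab does not create that Service. The pods and PVCs are created without it, although the per-pod DNS names that StatefulSets are known for depend on it. A minimal sketch of a matching headless Service follows. The file name web-service.yaml is an assumption, and the selector and port are taken from the StatefulSet manifest.

```yaml
# web-service.yaml (hypothetical file name)
apiVersion: v1
kind: Service
metadata:
  name: web                # must match serviceName in the StatefulSet
  namespace: storage-lab
spec:
  clusterIP: None          # headless: no virtual IP, one DNS record per pod
  selector:
    app: nginx             # same label as the StatefulSet's pod template
  ports:
    - name: web
      port: 80
```

With this Service in place, each pod should be resolvable inside the cluster under a name such as web-0.web.storage-lab.svc.cluster.local, assuming the default cluster domain.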
Step 10: Clean Up
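Delete the namespace, which removes the pods, the StatefulSet, and the PVCs inside it, then delete the manually created PV: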
kubectl delete namespace storage-lab
kubectl delete pv my-pv
Concepts: Understanding Kubernetes Storage
The Storage Problem in Kubernetes
By default, a container's filesystem is ephemeral: files written inside it are lost when the container restarts or its Pod is deleted. For stateful applications like databases, this is unacceptable. Kubernetes provides several abstractions to handle persistent storage.
Storage Architecture
Kubernetes storage architecture consists of several key components:
- Persistent Volumes (PV): Cluster-wide storage resources
- Persistent Volume Claims (PVC): Requests for storage by users
- Storage Classes: Dynamic provisioning of storage
- Volume Plugins: Interfaces to storage systems
Persistent Volumes (PV)
A PersistentVolume is a piece of storage in the cluster that has been provisioned by an administrator or dynamically provisioned using Storage Classes. It's a cluster resource, just like a node.
Key Properties:
- Capacity: Size of the storage
- Access Modes: How the volume can be mounted (ReadWriteOnce, ReadOnlyMany, ReadWriteMany)
- Reclaim Policy: What happens when the PVC is deleted (Retain, Recycle, Delete)
- Storage Class: For dynamic provisioning
- Volume Type: The underlying storage system (hostPath, NFS, AWS EBS, etc.)
Persistent Volume Claims (PVC)
A PersistentVolumeClaim is a request for storage by a user. Just as a Pod consumes node resources, a PVC consumes PV resources.
Binding Process:
- User creates a PVC with desired storage size and access mode
- Kubernetes finds a PV with the same storage class, compatible access modes, and at least the requested capacity (or dynamically provisions one)
- The PVC is bound to the PV
- Pods can now use the PVC
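By default, Kubernetes chooses which PV satisfies a claim. When a claim must land on one particular volume, the PVC can name it through the volumeName field. The sketch below assumes the my-pv volume from the lab is still "Available"; the claim name pinned-pvc is hypothetical.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pinned-pvc            # hypothetical name
  namespace: storage-lab
spec:
  storageClassName: manual    # must match the PV's storage class
  volumeName: my-pv           # bind to this PV only instead of searching
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 500Mi          # must not exceed the PV's capacity (1Gi)
```

Binding is one-to-one, so this claim would stay "Pending" if my-pv were already bound to another claim. Storage class, access modes, and requested size are still checked even when volumeName is set.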
Access Modes
Storage can be mounted in different modes:
- ReadWriteOnce (RWO): Volume can be mounted read-write by a single node; multiple Pods on that node can share it
- ReadOnlyMany (ROX): Volume can be mounted read-only by many nodes
- ReadWriteMany (RWX): Volume can be mounted read-write by many nodes
Not all storage types support all access modes. For example, AWS EBS only supports RWO.
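To illustrate how an access mode is requested, here is a sketch of a claim asking for ReadWriteMany. The claim name shared-pvc and the storage class nfs-shared are hypothetical; the class is assumed to be backed by a shared filesystem such as NFS, which your cluster may not provide.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: shared-pvc               # hypothetical name
  namespace: storage-lab
spec:
  accessModes:
    - ReadWriteMany              # read-write from Pods on many nodes
  storageClassName: nfs-shared   # hypothetical class with an RWX-capable backend
  resources:
    requests:
      storage: 1Gi
```

If the chosen storage class cannot provide the requested access mode, the claim would typically remain "Pending" rather than bind.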
Storage Classes
Storage Classes provide a way to describe different "classes" of storage. They enable dynamic provisioning of PVs.
Benefits:
- No manual PV creation by administrators
- Automatic provisioning when PVC is created
- Different tiers of storage (SSD vs HDD, fast vs slow)
- Cloud provider integration (AWS EBS, Google Persistent Disk, Azure Disk)
Example Storage Class:
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast-ssd
provisioner: kubernetes.io/aws-ebs
parameters:
  type: gp3
  iops: "3000"
Reclaim Policies
When a PVC is deleted, the PV can be handled in three ways:
- Retain: The PV and its data are kept; the PV moves to the "Released" state and must be reclaimed manually
- Delete: Automatically delete the PV and underlying storage
- Recycle: (Deprecated) Basic scrub and make available for new claim
For production workloads with important data, "Retain" is typically the safest choice.
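Dynamically provisioned PVs inherit their reclaim policy from the Storage Class, and that policy defaults to Delete when the class does not set one. A sketch of a class that opts into Retain follows. The name fast-ssd-retain is hypothetical, and the provisioner and parameters are simply reused from the example Storage Class in this document.

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast-ssd-retain              # hypothetical name
provisioner: kubernetes.io/aws-ebs   # reused from the example Storage Class
reclaimPolicy: Retain                # keep the PV and its data after the PVC is deleted
parameters:
  type: gp3
```

With Retain, volumes created through this class are not cleaned up automatically, so someone has to delete or reuse released volumes by hand.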
StatefulSets and Storage
StatefulSets are designed for stateful applications. They provide:
- Stable network identifiers: Each pod gets a predictable name
- Stable storage: PVCs persist across pod rescheduling
- Ordered deployment and scaling: Pods are created/deleted in order
The volumeClaimTemplates field in a StatefulSet automatically creates a PVC for each pod replica, named <template-name>-<statefulset-name>-<ordinal> (for example, www-web-0), ensuring each has dedicated storage.
Best Practices
- Use Storage Classes: Enable dynamic provisioning for flexibility
- Size appropriately: Request storage sizes based on actual needs
- Choose correct access mode: Use RWO unless Pods on several nodes must share the volume; RWX requires a backend that supports it, such as NFS
- Set resource limits: Cap storage requests per namespace (for example, with a ResourceQuota) to prevent storage exhaustion
- Back up data: Implement backup strategies for critical data
- Monitor usage: Track storage consumption and performance
- Test persistence: Verify data survives pod restarts
- Use StatefulSets for stateful apps: Databases, message queues, etc.
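As one way to act on the advice about limits, a ResourceQuota can cap how much storage the claims in a namespace may request and how many claims may exist. This is a minimal sketch; the quota name storage-quota and the limit values are illustrative assumptions, not recommendations.

```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: storage-quota               # hypothetical name
  namespace: storage-lab
spec:
  hard:
    requests.storage: 10Gi          # total storage all PVCs in the namespace may request
    persistentvolumeclaims: "5"     # maximum number of PVCs in the namespace
```

Once the quota is in place, a new PVC that would push the namespace past either limit is rejected at creation time.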
Common Use Cases
- Databases: PostgreSQL, MySQL, MongoDB with persistent data
- Message Queues: RabbitMQ, Kafka with durable messages
- Shared Files: Content shared by several Pods through an RWX volume (ConfigMaps and Secrets are also mounted as volumes, but they do not use PVs or PVCs)
- Log Aggregation: Persistent storage for log files
- CI/CD Artifacts: Build artifacts and caches
Understanding storage is crucial for running stateful applications successfully in Kubernetes.