Managing storage is a distinct problem from managing compute instances. The PersistentVolume subsystem provides an API for users and administrators that abstracts details of how storage is provided from how it is consumed. To do this, Kubernetes has two API resources: PersistentVolume and PersistentVolumeClaim.
Amazon Elastic Block Store (EBS) is an easy-to-use, scalable, high-performance block-storage service designed for Amazon Elastic Compute Cloud (EC2).
In the traditional model, an EBS volume is attached directly to a VM in AWS, and processes on the VM see it as a native disk drive. In a Kubernetes cluster (EKS) we can use the same EBS volumes and consume them directly inside application pods. The volume is still attached to a specific node in the cluster, but the PV and PVC abstractions make creating the volume and consuming it inside a pod easier.
I will walk you through a scenario where we dynamically create an EBS volume and use it inside a pod.
The first step is to create a PersistentVolumeClaim (PVC). A PVC is a request for storage; in it we define what type and size of persistent storage we want.
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: my-app-pvc
  annotations:
    volume.beta.kubernetes.io/storage-class: fast
spec:
  accessModes:
    - "ReadWriteOnce"
  resources:
    requests:
      storage: "1Gi"
Save this manifest as pvc.yaml.
The storage-class annotation is not required, but we can use it if we want a custom storage class. By default, an EKS cluster already has a storage class that PVCs use when no annotation is added.
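As an aside, on current Kubernetes versions the storageClassName field in the PVC spec is the preferred way to select a storage class; the beta annotation is kept for backward compatibility. An equivalent sketch of the claim above using that field:

```yaml
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: my-app-pvc
spec:
  storageClassName: fast   # replaces the volume.beta.kubernetes.io/storage-class annotation
  accessModes:
    - "ReadWriteOnce"
  resources:
    requests:
      storage: "1Gi"
```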
Apply this manifest to provision the persistent volume, which in turn creates an EBS volume.
kubectl apply -f pvc.yaml
Run this command to check the status of the PVC and PV:
kubectl get pvc,pv
Next we will consume this persistent volume by creating a simple deployment that uses the PVC.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: wordpress-mysql
  labels:
    app: wordpress
spec:
  selector:
    matchLabels:
      app: wordpress
      tier: mysql
  strategy:
    type: Recreate
  template:
    metadata:
      labels:
        app: wordpress
        tier: mysql
    spec:
      containers:
      - image: mysql:5.6
        name: mysql
        env:
        - name: MYSQL_ROOT_PASSWORD
          valueFrom:
            secretKeyRef:
              name: mysql-pass
              key: password
        ports:
        - containerPort: 3306
          name: mysql
        volumeMounts:
        - name: mysql-persistent-storage
          mountPath: /var/lib/mysql
      volumes:
      - name: mysql-persistent-storage
        persistentVolumeClaim:
          claimName: my-app-pvc
In the volumes block of our deployment manifest we use the persistentVolumeClaim key to link the PVC we created earlier.
After applying the deployment, the PV is bound to our claim, the volume is attached to the node running the new pod, and the application can write data to it.
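Note that the deployment above references a Secret named mysql-pass, which must exist before the pod can start. A minimal sketch of such a Secret (the password value here is a placeholder, not from the original walkthrough):

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: mysql-pass
type: Opaque
stringData:
  password: change-me   # placeholder; set your own MySQL root password
```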
If your cluster spans multiple availability zones, it is a good idea to create a new storage class pinned to specific availability zones so that volumes are created in a specific zone. Because an EBS volume can only be attached to nodes in its own zone, we will also need topology-aware pod scheduling to avoid problems when a pod is restarted or rescheduled.
You can create a new storage class with zone topology as follows.
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: gp2-east-1b
  annotations:
    storageclass.kubernetes.io/is-default-class: "true"
provisioner: kubernetes.io/aws-ebs
parameters:
  type: gp2
reclaimPolicy: Retain
allowedTopologies:
- matchLabelExpressions:
  - key: failure-domain.beta.kubernetes.io/zone
    values:
    - us-east-1b
You’ll also need to add a node affinity rule to your deployment's spec.template.spec block, so the pod is always scheduled onto a node in the same zone as the volume.
affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
      - matchExpressions:
        - key: failure-domain.beta.kubernetes.io/zone
          operator: In
          values:
          - us-east-1b
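Since the class above is annotated as the default, PVCs with no storage-class set will use it; be aware that the pre-existing gp2 class on EKS is typically also marked default, so you may want to remove that flag from one of them. Alternatively, reference the new class explicitly. A sketch, reusing the claim from earlier (my-app-pvc and gp2-east-1b are the names defined above):

```yaml
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: my-app-pvc
  annotations:
    volume.beta.kubernetes.io/storage-class: gp2-east-1b
spec:
  accessModes:
    - "ReadWriteOnce"
  resources:
    requests:
      storage: "1Gi"
```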