{"id":22983,"date":"2024-06-29T11:35:50","date_gmt":"2024-06-29T08:35:50","guid":{"rendered":"https:\/\/kifarunix.com\/?p=22983"},"modified":"2024-06-29T11:35:55","modified_gmt":"2024-06-29T08:35:55","slug":"statefulsets-in-kubernetes-everything-you-need-to-know","status":"publish","type":"post","link":"https:\/\/kifarunix.com\/statefulsets-in-kubernetes-everything-you-need-to-know\/","title":{"rendered":"StatefulSets in Kubernetes: Everything You Need to Know"},"content":{"rendered":"\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"1079\" height=\"589\" src=\"https:\/\/kifarunix.com\/wp-content\/uploads\/2024\/06\/kubernetes-statefulsets.png?v=1719649944\" alt=\"StatefulSets in Kubernetes: Everything You Need to Know\" class=\"wp-image-23005\" title=\"\" srcset=\"https:\/\/kifarunix.com\/wp-content\/uploads\/2024\/06\/kubernetes-statefulsets.png?v=1719649944 1079w, https:\/\/kifarunix.com\/wp-content\/uploads\/2024\/06\/kubernetes-statefulsets-768x419.png?v=1719649944 768w\" sizes=\"(max-width: 1079px) 100vw, 1079px\" \/><\/figure>\n\n\n\n<p>In this blog post, you will learn about statefulsets in Kubernetes and everything you need to know: the definition and purpose of StatefulSets, their importance in handling stateful applications, and how they differ from other Kubernetes objects like Deployments. 
By the end, you&#8217;ll have a thorough understanding of how to create, manage, and scale StatefulSets.<\/p>\n\n\n\n<div class=\"wp-block-rank-math-toc-block\" id=\"rank-math-toc\"><h2>Table of Contents<\/h2><nav><ul><li><a href=\"#understanding-stateful-sets-in-kubernetes\">Understanding StatefulSets in Kubernetes<\/a><ul><li><a href=\"#what-are-stateful-sets\">What are StatefulSets?<\/a><\/li><li><a href=\"#core-concepts-in-kubernetes-stateful-sets\">Core Concepts in Kubernetes StatefulSets<\/a><\/li><li><a href=\"#stateful-vs-stateless-applications\">Stateful vs Stateless Applications<\/a><ul><li><a href=\"#stateful-applications\">Stateful Applications<\/a><\/li><li><a href=\"#stateless-applications\">Stateless Applications<\/a><\/li><\/ul><\/li><li><a href=\"#stateful-sets-vs-deployments\">StatefulSets vs. Deployments<\/a><\/li><li><a href=\"#creating-and-managing-stateful-sets\">Creating and Managing StatefulSets<\/a><ul><li><a href=\"#which-method-to-create-stateful-set-declarative-or-imperative\">Which method to Create StatefulSet. 
Declarative or Imperative?<\/a><\/li><li><a href=\"#configure-persistent-storage-provisioner\">Configure Persistent Storage Provisioner<\/a><\/li><li><a href=\"#prepare-persistent-volume-pv-for-each-node\">Prepare Persistent Volume (PV) for each Node<\/a><\/li><li><a href=\"#create-a-persistent-volume-claim-pvc\">Create a PersistentVolumeClaim (PVC)<\/a><\/li><li><a href=\"#creating-a-stateful-set-in-kubernetes-cluster\">Creating a StatefulSet in Kubernetes Cluster<\/a><\/li><li><a href=\"#listing-stateful-sets\">Listing StatefulSets<\/a><\/li><li><a href=\"#scaling-stateful-sets\">Scaling StatefulSets<\/a><\/li><li><a href=\"#deleting-stateful-sets\">Deleting StatefulSets<\/a><\/li><\/ul><\/li><\/ul><\/li><li><a href=\"#conclusion\">Conclusion<\/a><\/li><\/ul><\/nav><\/div>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"understanding-stateful-sets-in-kubernetes\">Understanding StatefulSets in Kubernetes<\/h2>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"what-are-stateful-sets\">What are StatefulSets?<\/h3>\n\n\n\n<p>A StatefulSet in Kubernetes is a workload API object specifically designed for managing stateful applications that require persistent storage and stable network identities. They ensure order, identity and persistence of your stateful application. Imagine a database in a Kubernetes cluster. Each database Pod represents an instance, but the data itself needs to persist across restarts or Pod scaling. This is where StatefulSets come into play.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"core-concepts-in-kubernetes-stateful-sets\">Core Concepts in Kubernetes StatefulSets<\/h3>\n\n\n\n<p>There are a number of Kubernetes concepts that are used in StatefulSets that you should be familiar with. These include:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Pod<\/strong>: As you already know, a Pod is the smallest deployable unit that represents a set of one or more containers running together on a Kubernetes cluster. 
Pods are the fundamental building blocks of Kubernetes applications and serve as the basic unit of execution.<\/li>\n\n\n\n<li><strong>Cluster DNS<\/strong>: <em>The name of a StatefulSet object must be a valid&nbsp;DNS label<\/em>. Cluster DNS facilitates service discovery by assigning DNS names to Services and Pods.<\/li>\n\n\n\n<li><strong>Headless Service<\/strong>: In Kubernetes, regular Services use <strong>clusterIP<\/strong> to provide a unified virtual IP for load balancing requests across all Pods, suited for stateless applications. In contrast, headless services, used by StatefulSets, do not use clusterIP (<strong>clusterIP: None<\/strong>) and provide direct access to each Pod&#8217;s IP address or DNS name via internal DNS records served by the cluster&#8217;s DNS service. With the default cluster domain, each Pod gets a DNS name of the form <code>&lt;pod-name&gt;.&lt;service-name&gt;.&lt;namespace&gt;.svc.cluster.local<\/code>. This setup allows stateful applications to maintain stable and direct connections to specific Pods, crucial for tasks like data replication and configuration where individual Pod identity is essential for maintaining consistency and stability across the cluster.<\/li>\n\n\n\n<li><strong>PersistentVolumes (PVs):<\/strong> Kubernetes objects that represent persistent storage resources available in the cluster (e.g., local storage, cloud storage). They exist independently of Pods and persist data even when Pods are deleted or rescheduled.<\/li>\n\n\n\n<li><strong>PersistentVolumeClaims (PVCs)<\/strong>: PVCs act as requests for storage by Pods, specifying the access mode and storage requirements. StatefulSets manage PVCs to ensure Pods have persistent storage across restarts and scaling operations.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"stateful-vs-stateless-applications\">Stateful vs Stateless Applications<\/h3>\n\n\n\n<p>So, what exactly is a stateful application and how does it differ from a stateless application? 
Before diving further into StatefulSets, it&#8217;s essential to grasp the fundamental differences between stateful and stateless applications:<\/p>\n\n\n\n<h4 class=\"wp-block-heading\" id=\"stateful-applications\">Stateful Applications<\/h4>\n\n\n\n<p>These are applications that maintain data and state between sessions. They require persistent storage and unique identities to preserve data integrity. Examples include databases (MySQL, PostgreSQL), message queues (Kafka), and caching solutions (Redis).<\/p>\n\n\n\n<p>Managing stateful applications introduces complexities related to data persistence, reliable network communication, and lifecycle management, which StatefulSets address effectively in Kubernetes.<\/p>\n\n\n\n<h4 class=\"wp-block-heading\" id=\"stateless-applications\">Stateless Applications<\/h4>\n\n\n\n<p>In contrast, stateless applications do not store session data locally. They can be easily scaled horizontally by adding or removing instances without concerns about data persistence or state management. 
A web server serving static content is a typical example of a stateless application.<\/p>\n\n\n\n<p>In summary:<\/p>\n\n\n\n<figure class=\"wp-block-table\"><table><tbody><tr><td><strong>Feature<\/strong><\/td><td><strong>Stateful Applications<\/strong><\/td><td><strong>Stateless Applications<\/strong><\/td><\/tr><tr><td>State Management<\/td><td>Maintains state and data between sessions<\/td><td>Does not maintain state locally<\/td><\/tr><tr><td>Data Persistence<\/td><td>Requires persistent storage for data integrity<\/td><td>Data is ephemeral and does not persist<\/td><\/tr><tr><td>Scalability<\/td><td>Scaling requires careful management to maintain data consistency<\/td><td>Easily scale horizontally by adding or removing instances<\/td><\/tr><tr><td>Network Identity<\/td><td>Typically have stable network identities (hostnames)<\/td><td>Network identity is not crucial and may change<\/td><\/tr><tr><td>Restart Behavior<\/td><td>Pods may retain state across restarts and rescheduling<\/td><td>Each instance is independent; no state retained<\/td><\/tr><tr><td>Use Cases<\/td><td>Databases (MySQL, PostgreSQL), distributed systems (Kafka, Cassandra)<\/td><td>Web servers, load balancers, API gateways<\/td><\/tr><tr><td>Kubernetes Controller<\/td><td>Managed by StatefulSets for ordered deployment, scaling, and identity<\/td><td>Managed by Deployments for simple scaling and restarts<\/td><\/tr><\/tbody><\/table><\/figure>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"stateful-sets-vs-deployments\">StatefulSets vs. Deployments<\/h3>\n\n\n\n<p>How do StatefulSets compare to Kubernetes Deployments? 
While both are Kubernetes controllers used to manage Pods, they serve different purposes:<\/p>\n\n\n\n<figure class=\"wp-block-table\"><table><tbody><tr><td><strong>Feature<\/strong><\/td><td><strong>StatefulSet<\/strong><\/td><td><strong>Deployment<\/strong><\/td><\/tr><tr><td>Target Applications<\/td><td>Used with applications that require stable identities and persistent storage for data, such as databases, message queues, and caching solutions<\/td><td>Ideal for stateless applications that are easily scalable and don&#8217;t rely on persistent data, such as web servers, API gateways, and load balancers<\/td><\/tr><tr><td>Pod Identity<\/td><td>StatefulSets give each Pod a unique, persistent ordinal index (0, 1, 2, ...) that is part of the Pod name (for example <code>mysql-0<\/code>) and remains constant across restarts or rescheduling. This is crucial for stateful applications that depend on specific Pod order or communication<\/td><td>Deployments treat Pods as interchangeable units. That is, the Pods do not maintain local state or session data that needs to be preserved between restarts or rescheduling. Pods receive random, non-ordinal name suffixes (<code>Pod-abcdef1234-xyz<\/code>), and a replacement Pod gets a new name<\/td><\/tr><tr><td>Data Persistence<\/td><td>Requires persistent storage (integrates with PVs). This guarantees data survives Pod restarts or scaling operations<\/td><td>Data is ephemeral (not persisted)<\/td><\/tr><tr><td>Deployment and Scaling<\/td><td>When you scale a StatefulSet (adding or removing Pods), it follows a predictable order. This is essential for applications with dependencies between Pods or specific initialization sequences.<\/td><td>Pods are independent of each other. 
Deployments do not impose any specific order for these operations; Pods can start or stop in any sequence based on cluster conditions and scheduling<\/td><\/tr><tr><td>Updates<\/td><td>Facilitates rolling updates, but due to the ordered nature of StatefulSets, updates might require more planning to ensure data consistency across Pods. <\/td><td>Rolling updates: New Pods with the updated configuration are launched gradually, while old Pods are terminated.<\/td><\/tr><\/tbody><\/table><\/figure>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"creating-and-managing-stateful-sets\">Creating and Managing StatefulSets<\/h3>\n\n\n\n<h4 class=\"wp-block-heading\" id=\"which-method-to-create-stateful-set-declarative-or-imperative\">Which method to Create StatefulSet. Declarative or Imperative?<\/h4>\n\n\n\n<p>Kubernetes resources can be managed using the <strong>declarative<\/strong> or the <strong>imperative<\/strong> method.<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Declarative method<\/strong> allows you to define the desired state of your StatefulSet in a YAML manifest file which specifies the configuration details like container image, replicas, storage needs, and labels. You then use the <strong>kubectl apply<\/strong> command to apply the configurations defined in the YAML file and create the StatefulSet.<\/li>\n\n\n\n<li><strong>Imperative method<\/strong> means issuing <strong>kubectl<\/strong> commands directly instead of maintaining a manifest. 
Note that <code>kubectl create<\/code> has no <code>statefulset<\/code> subcommand (unlike <code>kubectl create deployment<\/code>), so a StatefulSet cannot be fully defined from command-line flags; imperative commands such as <code>kubectl scale<\/code> and <code>kubectl delete<\/code> are mostly useful for operating on a StatefulSet that already exists.<\/li>\n<\/ul>\n\n\n\n<p>Compared with the declarative method, the imperative method has these disadvantages:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Error Prone:<\/strong> Manual configuration through commands increases the risk of typos or syntax errors.<\/li>\n\n\n\n<li><strong>Difficult to Reproduce:<\/strong> It&#8217;s challenging to recreate the exact configuration later or deploy it consistently across environments without a manifest file.<\/li>\n\n\n\n<li><strong>Limited Functionality:<\/strong> Imperative commands might not offer the same level of flexibility and detail as a well-defined YAML manifest.<\/li>\n<\/ul>\n\n\n\n<h4 class=\"wp-block-heading\" id=\"configure-persistent-storage-provisioner\">Configure Persistent Storage Provisioner<\/h4>\n\n\n\n<p>Stateful applications often require persistent storage to store data that survives Pod restarts or rescheduling. For that reason, you need to configure a StorageClass with a specific <a href=\"https:\/\/kubernetes.io\/docs\/concepts\/storage\/storage-classes\/#provisioner\" target=\"_blank\" rel=\"noreferrer noopener\">provisioner<\/a>, which defines how storage volumes will be provisioned in your Kubernetes cluster. A dynamic provisioner allocates storage automatically based on the PersistentVolumeClaim (PVC) requests made by the StatefulSet. 
Some popular storage options include:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Local <\/strong>(local storage path on a node for testing purposes only)<\/li>\n\n\n\n<li><strong>Ceph RBD (Ceph block storage)<\/strong>: provisioner for block storage using Ceph.<\/li>\n\n\n\n<li><strong>AWS Elastic Block Store (EBS)<\/strong> (for AWS deployments)<\/li>\n\n\n\n<li><strong>Azure Managed Disks<\/strong> (for Azure deployments)<\/li>\n\n\n\n<li><strong>Google Persistent Disk (GPD)<\/strong>: Provisioner for Google Cloud Persistent Disks in Google Cloud Platform (GCP).<\/li>\n\n\n\n<li><strong>OpenEBS<\/strong>: OpenEBS provisioner for block storage using iSCSI.<\/li>\n\n\n\n<li><strong>NFS Client Provisioner<\/strong>: Provisioner for NFS volumes provided by an NFS server.<\/li>\n<\/ul>\n\n\n\n<p>In this guide, we are running a Kubernetes cluster on on-premises servers, and for that reason we will use <a href=\"https:\/\/kubernetes.io\/docs\/concepts\/storage\/storage-classes\/#local\" target=\"_blank\" rel=\"noreferrer noopener\">local<\/a> storage.<\/p>\n\n\n\n<div class=\"info-panel\">\n    <div class=\"info-panel-header\">Info\n    <\/div>\n    <div class=\"info-panel-content\">Using local storage is recommended for development or testing only. For production environments, consider a StorageClass backed by a dynamic provisioner with network-attached storage, which handles pod failures and scaling better. According to <a href=\"https:\/\/kubernetes.io\/docs\/concepts\/storage\/volumes\/#local\" target=\"_blank\" rel=\"noopener\">Kubernetes<\/a>, local volumes are subject to the availability of the underlying node and are not suitable for all applications. If a node becomes unhealthy, then the local volume becomes inaccessible by the pod. The pod using this volume is unable to run. 
Applications using local volumes must be able to tolerate this reduced availability, as well as potential data loss, depending on the durability characteristics of the underlying disk.\n    <\/div>\n<\/div>\n\n\n\n<p>Check to see if there are any existing StorageClasses in your Kubernetes cluster (StorageClasses are cluster-scoped, so no namespace option is needed):<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>kubectl get storageclass<\/code><\/pre>\n\n\n\n<p>If no suitable StorageClass exists in your cluster, create one based on your storage solution or requirements.<\/p>\n\n\n\n<p>Example StorageClass for local volumes:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>cat local-storageclass.yaml<\/code><\/pre>\n\n\n\n<pre class=\"scroll-box\"><code>apiVersion: storage.k8s.io\/v1\nkind: StorageClass\nmetadata:\n  name: local-storage\nprovisioner: kubernetes.io\/no-provisioner\nvolumeBindingMode: WaitForFirstConsumer\n<\/code><\/pre>\n\n\n\n<p>Where:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>metadata.name<\/strong>: Specifies the name of the StorageClass (<code>local-storage<\/code> in this example).<\/li>\n\n\n\n<li><strong>provisioner<\/strong>: Set to <code>kubernetes.io\/no-provisioner<\/code> since local volumes do not support dynamic provisioning.<\/li>\n\n\n\n<li><strong>volumeBindingMode<\/strong>: Determines when a PersistentVolume (PV) is bound. 
<code>WaitForFirstConsumer<\/code> delays binding of the PV until a Pod that uses the PVC is created, so the scheduler can take the Pod&#8217;s placement and the PV&#8217;s node affinity into account.<\/li>\n<\/ul>\n\n\n\n<p>Apply the YAML definition to create the StorageClass in your cluster:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>kubectl apply -f &lt;filename&gt;.yaml<\/code><\/pre>\n\n\n\n<p>For example:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>kubectl apply -f local-storageclass.yaml<\/code><\/pre>\n\n\n\n<p>Confirm:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>kubectl get storageclass<\/code><\/pre>\n\n\n\n<pre class=\"scroll-box\"><code>NAME            PROVISIONER                    RECLAIMPOLICY   VOLUMEBINDINGMODE      ALLOWVOLUMEEXPANSION   AGE\nlocal-storage   kubernetes.io\/no-provisioner   Delete          WaitForFirstConsumer   false                  26s\n<\/code><\/pre>\n\n\n\n<p>When a StorageClass does not explicitly specify a <code>reclaimPolicy<\/code>, it defaults to &#8220;<strong>Delete<\/strong>&#8221;. This means that a PV dynamically provisioned through the StorageClass is <strong>automatically deleted<\/strong> when the PVC bound to it is deleted. However, since we are using <code>kubernetes.io\/no-provisioner<\/code>, which implies no dynamic storage provisioning, no PVs are automatically created for this StorageClass, so its <code>reclaimPolicy<\/code> setting has no practical effect here. 
Therefore, when you create a PV manually, you set the reclaim policy on the PV itself using <code>persistentVolumeReclaimPolicy<\/code>.<\/p>\n\n\n\n<h4 class=\"wp-block-heading\" id=\"prepare-persistent-volume-pv-for-each-node\">Prepare Persistent Volume (PV) for each Node<\/h4>\n\n\n\n<p>Next, you need to manually create a Persistent Volume (PV) that matches the StorageClass created above for the local volume.<\/p>\n\n\n\n<p>If you have a single worker node, it is enough to create a PV for that single node only. However, if you have multiple worker nodes and want to schedule pods on each node, then you need to create a PV for each node.<\/p>\n\n\n\n<p>By creating a PV for each node, you tie each PV to a specific worker node in the cluster. This ensures that when a pod is scheduled on a node, it will use the PV attached to that node. Because of this node affinity, a pod that is restarted or rescheduled can only be placed on the node where its PV (and therefore its data) exists; if that node is unavailable, the pod stays Pending until the node returns.<\/p>\n\n\n\n<p>Note that you need to define the <strong>nodeAffinity<\/strong> when using\u00a0<code>local<\/code>\u00a0volumes. The reasons are:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Local volumes are tied to specific directories on worker nodes. Without nodeAffinity, a pod requesting a local volume could potentially be scheduled to any node in the cluster, even if the directory containing the data doesn&#8217;t exist there. This can lead to pod scheduling failures or unexpected behavior.<\/li>\n\n\n\n<li>Enforcing nodeAffinity for local volumes ensures explicit scheduling of pods to the intended nodes. 
This promotes clarity and consistency in your deployments, especially when dealing with multiple worker nodes.<\/li>\n<\/ul>\n\n\n\n<p>We have three worker nodes;<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>kubectl get nodes<\/code><\/pre>\n\n\n\n<pre class=\"scroll-box\"><code>NAME        STATUS   ROLES           AGE   VERSION\nmaster-01   Ready    control-plane   12d   v1.30.2\nmaster-02   Ready    control-plane   12d   v1.30.2\nmaster-03   Ready    control-plane   12d   v1.30.2\nworker-01   Ready    &lt;none&gt;      33h   v1.30.2\nworker-02   Ready    &lt;none&gt;      33h   v1.30.2\nworker-03   Ready    &lt;none&gt;      33h   v1.30.2\n<\/code><\/pre>\n\n\n\n<p>Here is our sample PV manifest file for each of our three worker nodes.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>cat mysql-pv.yaml<\/code><\/pre>\n\n\n\n<pre class=\"scroll-box\"><code>apiVersion: v1\nkind: PersistentVolume\nmetadata:\n  name: mysql-pv-worker-01\nspec:\n  capacity:\n    storage: 10Gi\n  volumeMode: Filesystem\n  accessModes:\n    - ReadWriteOnce\n  persistentVolumeReclaimPolicy: Retain\n  storageClassName: local-storage\n  local:\n    path: \/mnt\/k8s\/mysql\/data\/\n  nodeAffinity:\n    required:\n      nodeSelectorTerms:\n        - matchExpressions:\n            - key: kubernetes.io\/hostname\n              operator: In\n              values:\n                - worker-01\n---\napiVersion: v1\nkind: PersistentVolume\nmetadata:\n  name: mysql-pv-worker-02\nspec:\n  capacity:\n    storage: 10Gi\n  volumeMode: Filesystem\n  accessModes:\n    - ReadWriteOnce\n  persistentVolumeReclaimPolicy: Retain\n  storageClassName: local-storage\n  local:\n    path: \/mnt\/k8s\/mysql\/data\/\n  nodeAffinity:\n    required:\n      nodeSelectorTerms:\n        - matchExpressions:\n            - key: kubernetes.io\/hostname\n              operator: In\n              values:\n                - worker-02\n---\napiVersion: v1\nkind: PersistentVolume\nmetadata:\n  name: 
mysql-pv-worker-03\nspec:\n  capacity:\n    storage: 10Gi\n  volumeMode: Filesystem\n  accessModes:\n    - ReadWriteOnce\n  persistentVolumeReclaimPolicy: Retain\n  storageClassName: local-storage\n  local:\n    path: \/mnt\/k8s\/mysql\/data\/\n  nodeAffinity:\n    required:\n      nodeSelectorTerms:\n        - matchExpressions:\n            - key: kubernetes.io\/hostname\n              operator: In\n              values:\n                - worker-03\n<\/code><\/pre>\n\n\n\n<p>Where:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>apiVersion<\/strong>: Specifies the Kubernetes API version (<code>v1<\/code>).<\/li>\n\n\n\n<li><strong>kind<\/strong>: Specifies the resource type (<code>PersistentVolume<\/code>).<\/li>\n\n\n\n<li><strong>metadata<\/strong>: Provides metadata for the PV, including its name (<code>mysql-pv<\/code>-NODE-NAME).<\/li>\n\n\n\n<li><strong>spec<\/strong>: Specifies the specifications for the PV:\n<ul class=\"wp-block-list\">\n<li><strong>capacity<\/strong>: Defines the storage capacity of the PV (<code>10Gi<\/code>).<\/li>\n\n\n\n<li><strong>volumeMode<\/strong>: Indicates the volume mode (<code>Filesystem<\/code>). 
Can also be set to <strong>Block<\/strong> if using raw block devices.<\/li>\n\n\n\n<li><strong>accessModes<\/strong>: Specifies the access mode (<code>ReadWriteOnce<\/code>), allowing the volume to be mounted read-write by a single node at a time.<\/li>\n<\/ul>\n<\/li>\n\n\n\n<li><strong>persistentVolumeReclaimPolicy<\/strong>: Sets the reclaim policy for the PV to <code>Retain<\/code>, meaning that when the associated PersistentVolumeClaim (PVC) is deleted, the PV and its data are kept and must be reclaimed manually.<\/li>\n\n\n\n<li><strong>storageClassName<\/strong>: Associates the PV with the StorageClass named <code>local-storage<\/code> created earlier.<\/li>\n\n\n\n<li><strong>local<\/strong>: Specifies that this PV uses local storage on the Kubernetes nodes.\n<ul class=\"wp-block-list\">\n<li><strong>path<\/strong>: Defines the directory on the node (<code>\/mnt\/k8s\/mysql\/data<\/code>) that backs the PV. It must already exist on the node when you deploy the application that uses it.<\/li>\n<\/ul>\n<\/li>\n\n\n\n<li><strong>nodeAffinity<\/strong>: Defines node affinity rules, ensuring the PV is bound only to nodes that match specific criteria:\n<ul class=\"wp-block-list\">\n<li><strong>required<\/strong>: Specifies that the PV should be available on nodes that match the following conditions.\n<ul class=\"wp-block-list\">\n<li><strong>nodeSelectorTerms<\/strong>: Specifies a list of node selectors.\n<ul class=\"wp-block-list\">\n<li><strong>matchExpressions<\/strong>: Defines how nodes are selected based on label keys and values.\n<ul class=\"wp-block-list\">\n<li><strong>key<\/strong>: Specifies the label key (<code>kubernetes.io\/hostname<\/code>).<\/li>\n\n\n\n<li><strong>operator<\/strong>: Defines the operator for matching (<code>In<\/code>, meaning the label value must match one of the specified values).<\/li>\n\n\n\n<li><strong>values<\/strong>: Lists the worker node hostname (<code>worker-01<\/code>, <code>worker-02<\/code> or <code>worker-03<\/code>, one per PV) where the PV 
should be available.<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n\n\n\n<p>Now, create the PVs:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>kubectl apply -f mysql-pv.yaml<\/code><\/pre>\n\n\n\n<p>Confirm:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>kubectl get pv<\/code><\/pre>\n\n\n\n<pre class=\"scroll-box\"><code>NAME                   CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS        CLAIM   STORAGECLASS    VOLUMEATTRIBUTESCLASS   REASON   AGE\nmysql-pv-worker-01     10Gi       RWO            Retain           Available             local-storage   &lt;unset&gt;                          5s\nmysql-pv-worker-02     10Gi       RWO            Retain           Available             local-storage   &lt;unset&gt;                          5s\nmysql-pv-worker-03     10Gi       RWO            Retain           Available             local-storage   &lt;unset&gt;                          5s\n<\/code><\/pre>\n\n\n\n<h4 class=\"wp-block-heading\" id=\"create-a-persistent-volume-claim-pvc\">Create a PersistentVolumeClaim (PVC)<\/h4>\n\n\n\n<p>Normally, you would now create a PersistentVolumeClaim that the application pods in your StatefulSet use to request storage.<\/p>\n\n\n\n<p>However, to automate the PVC creation and ensure consistent storage configurations across all pods managed by the StatefulSet, we will utilize <strong>volumeClaimTemplates<\/strong>. <code>volumeClaimTemplates<\/code> allow you to define PVC specifications directly within the StatefulSet configuration. 
Kubernetes uses these templates to automatically create one PVC per Pod (for example <code>mysql-data-mysql-0<\/code>) whenever the StatefulSet creates a Pod.<\/p>\n\n\n\n<h4 class=\"wp-block-heading\" id=\"creating-a-stateful-set-in-kubernetes-cluster\">Creating a StatefulSet in Kubernetes Cluster<\/h4>\n\n\n\n<p>Now that we have the StorageClass and the PVs, and the PVCs will be generated from the template, you can create a StatefulSet that provides persistent storage for each pod.<\/p>\n\n\n\n<p>We will combine two resources in one manifest file: the headless Service that exposes the MySQL app, and the MySQL app StatefulSet.<\/p>\n\n\n\n<p>This is our sample manifest file:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>cat mysql-app.yaml<\/code><\/pre>\n\n\n\n<pre class=\"scroll-box\"><code>\napiVersion: v1\nkind: Service\nmetadata:\n  name: mysql\n  labels:\n    app: mysql\nspec:\n  ports:\n  - port: 3306\n    name: db\n  clusterIP: None\n  selector:\n    app: mysql\n---\napiVersion: apps\/v1\nkind: StatefulSet\nmetadata:\n  name: mysql\nspec:\n  serviceName: mysql\n  replicas: 3\n  selector:\n    matchLabels:\n      app: mysql\n  template:\n    metadata:\n      labels:\n        app: mysql\n    spec:\n      containers:\n      - name: mysql\n        image: mysql:8.0\n        env:\n        # Plain-text password for testing only; use a Secret in production\n        - name: MYSQL_ROOT_PASSWORD\n          value: password\n        ports:\n        - containerPort: 3306\n          name: db\n        volumeMounts:\n        - name: mysql-data\n          mountPath: \/var\/lib\/mysql\n  # One PVC per Pod is created from this template\n  volumeClaimTemplates:\n  - metadata:\n      name: mysql-data\n    spec:\n      accessModes:\n        - ReadWriteOnce\n      storageClassName: local-storage\n      resources:\n        requests:\n          storage: 10Gi\n<\/code><\/pre>\n\n\n\n<p>This sample YAML configuration consists of a headless Kubernetes Service (<strong>mysql<\/strong>) and a StatefulSet (<strong>mysql<\/strong>) that deploys MySQL Pods with persistent storage.<\/p>\n\n\n\n<p>In summary:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>This StatefulSet, named <code>mysql<\/code>, orchestrates a 
MySQL deployment with three replicas, each labeled <code>app: mysql<\/code>. Pods use the <code>mysql:8.0<\/code> image and expose MySQL on port 3306. Each Pod requests 10Gi of persistent storage through the <code>mysql-data<\/code> entry in <code>volumeClaimTemplates<\/code>, with <code>ReadWriteOnce<\/code> access mode, so every replica gets its own PVC and keeps its data across restarts.<\/li>\n<\/ul>\n\n\n\n<p>Ensure the local storage path already exists on <strong>ALL<\/strong> worker nodes:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>sudo mkdir -p \/mnt\/k8s\/mysql\/data<\/code><\/pre>\n\n\n\n<p>Then apply the StatefulSet configuration:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>kubectl apply -f mysql-app.yaml<\/code><\/pre>\n\n\n\n<h4 class=\"wp-block-heading\" id=\"listing-stateful-sets\">Listing StatefulSets<\/h4>\n\n\n\n<p>You can list StatefulSets:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>kubectl get statefulset<\/code><\/pre>\n\n\n\n<p>This shows the StatefulSets in the default namespace. 
To check a specific namespace, add the option <strong>--namespace &lt;namespace-name&gt;<\/strong> (or <strong>-n &lt;namespace-name&gt;<\/strong>).<\/p>\n\n\n\n<p>Sample output:<\/p>\n\n\n\n<pre class=\"scroll-box\"><code>NAME    READY   AGE\nmysql   3\/3     30s\n<\/code><\/pre>\n\n\n\n<p>To print a detailed description:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>kubectl describe statefulset mysql<\/code><\/pre>\n\n\n\n<pre class=\"scroll-box\"><code>Name:               mysql\nNamespace:          default\nCreationTimestamp:  Sat, 29 Jun 2024 07:33:20 +0000\nSelector:           app=mysql\nLabels:             &lt;none&gt;\nAnnotations:        &lt;none&gt;\nReplicas:           3 desired | 3 total\nUpdate Strategy:    RollingUpdate\n  Partition:        0\nPods Status:        3 Running \/ 0 Waiting \/ 0 Succeeded \/ 0 Failed\nPod Template:\n  Labels:  app=mysql\n  Containers:\n   mysql:\n    Image:      mysql:8.0\n    Port:       3306\/TCP\n    Host Port:  0\/TCP\n    Environment:\n      MYSQL_ROOT_PASSWORD:  password\n    Mounts:\n      \/var\/lib\/mysql from mysql-data (rw)\n  Volumes:         &lt;none&gt;\n  Node-Selectors:  &lt;none&gt;\n  Tolerations:     &lt;none&gt;\nVolume Claims:\n  Name:          mysql-data\n  StorageClass:  local-storage\n  Labels:        &lt;none&gt;\n  Annotations:   &lt;none&gt;\n  Capacity:      10Gi\n  Access Modes:  [ReadWriteOnce]\nEvents:\n  Type    Reason            Age    From                    Message\n  ----    ------            ----   ----                    -------\n  Normal  SuccessfulCreate  3m27s  statefulset-controller  create Claim mysql-data-mysql-0 Pod mysql-0 in StatefulSet mysql success\n  Normal  SuccessfulCreate  3m27s  statefulset-controller  create Pod mysql-0 in StatefulSet mysql successful\n  Normal  SuccessfulCreate  3m25s  statefulset-controller  create Claim mysql-data-mysql-1 Pod mysql-1 in StatefulSet mysql success\n  Normal  SuccessfulCreate  3m25s  statefulset-controller  create Pod mysql-1 in StatefulSet mysql successful\n  Normal  SuccessfulCreate  3m22s  
statefulset-controller  create Claim mysql-data-mysql-2 Pod mysql-2 in StatefulSet mysql success\n  Normal  SuccessfulCreate  3m22s  statefulset-controller  create Pod mysql-2 in StatefulSet mysql successful\n<\/code><\/pre>\n\n\n\n<p>You can also get PVCs;<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>kubectl get pvc<\/code><\/pre>\n\n\n\n<pre class=\"scroll-box\"><code>NAME                   STATUS   VOLUME               CAPACITY   ACCESS MODES   STORAGECLASS    VOLUMEATTRIBUTESCLASS   AGE\nmysql-data-mysql-0     Bound    mysql-pv-worker-03   10Gi       RWO            local-storage   &lt;unset&gt;                 5m1s\nmysql-data-mysql-1     Bound    mysql-pv-worker-02   10Gi       RWO            local-storage   &lt;unset&gt;                 4m59s\nmysql-data-mysql-2     Bound    mysql-pv-worker-01   10Gi       RWO            local-storage   &lt;unset&gt;                 4m56s\n<\/code><\/pre>\n\n\n\n<pre class=\"wp-block-code\"><code>kubectl get pv<\/code><\/pre>\n\n\n\n<pre class=\"scroll-box\"><code>NAME                 CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM                        STORAGECLASS    VOLUMEATTRIBUTESCLASS   REASON   AGE\nmysql-pv-worker-01   10Gi       RWO            Retain           Bound    default\/mysql-data-mysql-2   local-storage   &lt;unset&gt;                          7m43s\nmysql-pv-worker-02   10Gi       RWO            Retain           Bound    default\/mysql-data-mysql-1   local-storage   &lt;unset&gt;                          7m43s\nmysql-pv-worker-03   10Gi       RWO            Retain           Bound    default\/mysql-data-mysql-0   local-storage   &lt;unset&gt;                          7m43s\n<\/code><\/pre>\n\n\n\n<p>And the Pods associated with StatefulSet;<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>kubectl get pod<\/code><\/pre>\n\n\n\n<pre class=\"scroll-box\"><code>NAME                                  READY   STATUS    RESTARTS   AGE\nmysql-0                               1\/1     Running   0          
11m\nmysql-1                               1\/1     Running   0          11m\nmysql-2                               1\/1     Running   0          11m\n<\/code><\/pre>\n\n\n\n<h4 class=\"wp-block-heading\" id=\"scaling-stateful-sets\">Scaling StatefulSets<\/h4>\n\n\n\n<p>It is also possible to scale StatefulSets. As you can see in the output below, we have one StatefulSet, mysql, with all three of its replicas in the READY state;<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>kubectl get statefulset<\/code><\/pre>\n\n\n\n<pre class=\"scroll-box\"><code>NAME    READY   AGE\nmysql   3\/3     15m\n<\/code><\/pre>\n\n\n\n<p>For example, to scale up to six replicas;<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>kubectl scale statefulset mysql --replicas=6<\/code><\/pre>\n\n\n\n<p>However, note that in the setup used here, where there is one PV per worker node, you also have to create additional PVs for the new replicas to bind to; otherwise the new Pods stay in the Pending state. Where dynamic provisioning is supported, the scale command alone is enough.<\/p>\n\n\n\n<p>You can also scale down;<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>kubectl scale statefulset mysql --replicas=2<\/code><\/pre>\n\n\n\n<h4 class=\"wp-block-heading\" id=\"deleting-stateful-sets\">Deleting StatefulSets<\/h4>\n\n\n\n<p>Before deleting a StatefulSet, decide what should happen to its PVCs and the data they hold, so that you avoid data loss.<\/p>\n\n\n\n<p>You can delete the StatefulSet using the manifest file;<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>kubectl delete -f mysql-app.yaml<\/code><\/pre>\n\n\n\n<p>This assumes that the StatefulSet was defined in a manifest file called <strong>mysql-app.yaml<\/strong>.<\/p>\n\n\n\n<p>Alternatively, delete the StatefulSet by name. 
For example, to delete just the StatefulSet without touching the Pods or the data stored in the PersistentVolumes;<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>kubectl delete statefulset mysql --cascade=orphan<\/code><\/pre>\n\n\n\n<p>Omit <strong>--cascade=orphan<\/strong> to delete the Pods along with the StatefulSet. By default, the PVCs created from the volume claim templates of the StatefulSet, and the data they hold, are retained and have to be deleted separately.<\/p>\n\n\n\n<p>And that brings us to the end of our tutorial on understanding StatefulSets in Kubernetes.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"conclusion\">Conclusion<\/h2>\n\n\n\n<p>StatefulSets provide stable identities, persistent storage, and controlled scaling for stateful applications, making them essential for running databases, distributed systems, and other stateful workloads in Kubernetes environments.<\/p>\n\n\n\n<p>Read more on;<\/p>\n\n\n\n<p><a href=\"https:\/\/kubernetes.io\/docs\/concepts\/workloads\/controllers\/statefulset\/\" target=\"_blank\" rel=\"noreferrer noopener\">Kubernetes StatefulSets Documentation<\/a><\/p>\n\n\n\n<p><a href=\"https:\/\/kubernetes.io\/docs\/concepts\/storage\/volumes\/\" target=\"_blank\" rel=\"noreferrer noopener\">Kubernetes Volumes<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>In this blog post, you will learn about statefulsets in Kubernetes and everything you need to know: the definition and purpose of StatefulSets, their 
importance<\/p>\n","protected":false},"author":10,"featured_media":23005,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"rank_math_lock_modified_date":false,"footnotes":""},"categories":[1076,121,1668],"tags":[7549,7548,7547,7551,7550],"class_list":["post-22983","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-containers","category-howtos","category-kubernetes","tag-kubernetes-pv","tag-kubernetes-pvc","tag-kubernetes-statefulsets","tag-local-storage-kubernetes","tag-storageclass","generate-columns","tablet-grid-50","mobile-grid-100","grid-parent","grid-50","resize-featured-image"],"_links":{"self":[{"href":"https:\/\/kifarunix.com\/wp-json\/wp\/v2\/posts\/22983"}],"collection":[{"href":"https:\/\/kifarunix.com\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/kifarunix.com\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/kifarunix.com\/wp-json\/wp\/v2\/users\/10"}],"replies":[{"embeddable":true,"href":"https:\/\/kifarunix.com\/wp-json\/wp\/v2\/comments?post=22983"}],"version-history":[{"count":22,"href":"https:\/\/kifarunix.com\/wp-json\/wp\/v2\/posts\/22983\/revisions"}],"predecessor-version":[{"id":23007,"href":"https:\/\/kifarunix.com\/wp-json\/wp\/v2\/posts\/22983\/revisions\/23007"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/kifarunix.com\/wp-json\/wp\/v2\/media\/23005"}],"wp:attachment":[{"href":"https:\/\/kifarunix.com\/wp-json\/wp\/v2\/media?parent=22983"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/kifarunix.com\/wp-json\/wp\/v2\/categories?post=22983"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/kifarunix.com\/wp-json\/wp\/v2\/tags?post=22983"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}