{"id":22724,"date":"2024-06-08T15:01:47","date_gmt":"2024-06-08T12:01:47","guid":{"rendered":"https:\/\/kifarunix.com\/?p=22724"},"modified":"2024-06-15T13:05:44","modified_gmt":"2024-06-15T10:05:44","slug":"kubernetes-nodes-maintenance-drain-vs-cordon-demystified","status":"publish","type":"post","link":"https:\/\/kifarunix.com\/kubernetes-nodes-maintenance-drain-vs-cordon-demystified\/","title":{"rendered":"Kubernetes Nodes Maintenance: Drain vs. Cordon Demystified"},"content":{"rendered":"\n<figure class=\"wp-block-image size-full is-resized\"><img loading=\"lazy\" decoding=\"async\" width=\"1072\" height=\"598\" src=\"https:\/\/kifarunix.com\/wp-content\/uploads\/2024\/06\/kubect-drain-cordon-nodes.png?v=1717847922\" alt=\"kubectl cordon\/drain\" class=\"wp-image-22735\" style=\"width:820px;height:auto\" title=\"\" srcset=\"https:\/\/kifarunix.com\/wp-content\/uploads\/2024\/06\/kubect-drain-cordon-nodes.png?v=1717847922 1072w, https:\/\/kifarunix.com\/wp-content\/uploads\/2024\/06\/kubect-drain-cordon-nodes-768x428.png?v=1717847922 768w\" sizes=\"(max-width: 1072px) 100vw, 1072px\" \/><\/figure>\n\n\n\n<p>What is the difference between Kubernetes <strong>kubectl drain<\/strong> and <strong>kubectl cordon<\/strong> commands? Well, keeping your Kubernetes cluster healthy and running smoothly requires regular maintenance on individual nodes. The maintenance might involve software updates, hardware upgrades, or even a complete OS reinstall. But how do you prepare a node for maintenance without disrupting your running applications? That&#8217;s where <code>drain<\/code> and <code>cordon<\/code> come in, let&#8217;s explore!<\/p>\n\n\n\n<div class=\"wp-block-rank-math-toc-block\" id=\"rank-math-toc\"><h2>Table of Contents<\/h2><nav><ul><li><a href=\"#kubernetes-nodes-maintenance-drain-vs-cordon\">Kubernetes Nodes Maintenance: Drain vs. 
Cordon<\/a><ul><li><a href=\"#kubectl-drain-and-cordon-in-layman-terms\">Kubectl Drain and Cordon in Layman Terms<\/a><\/li><li><a href=\"#kubectl-cordon\">kubectl cordon<\/a><\/li><li><a href=\"#example-usage-of-kubectl-cordon-command\">Example usage of kubectl cordon command<\/a><\/li><li><a href=\"#kubectl-drain\">kubectl drain<\/a><\/li><li><a href=\"#example-usage-of-kubectl-drain-command\">Example Usage of kubectl drain command<\/a><\/li><li><a href=\"#kubectl-cordon-drain-illustration\">Kubectl Cordon\/Drain Illustration<\/a><\/li><li><a href=\"#kubectl-drain-node-gets-stuck-forever-apparmor-bug\">kubectl drain node gets stuck forever [Apparmor Bug]<\/a><\/li><li><a href=\"#dealing-with-pod-disruption-budget-pdb\">Dealing with Pod Disruption Budget (PDB)<\/a><\/li><li><a href=\"#re-enable-scheduling-of-pods-on-a-node\">Re-enable Scheduling of Pods on a Node<\/a><\/li><li><a href=\"#conclusion\">Conclusion<\/a><\/li><\/ul><\/li><\/ul><\/nav><\/div>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"kubernetes-nodes-maintenance-drain-vs-cordon\">Kubernetes Nodes Maintenance: Drain vs. Cordon<\/h2>\n\n\n\n<p><strong>kubectl drain<\/strong> and <strong>kubectl cordon<\/strong> are two essential commands in the Kubernetes toolbox that are used to manage the scheduling and eviction of pods on a node, typically during maintenance or upgrades.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"kubectl-drain-and-cordon-in-layman-terms\">Kubectl Drain and Cordon in Layman Terms<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Imagine you&#8217;re renovating a house. You want everyone to move out temporarily so you can work without disturbances. So, you politely ask the house occupants (pods) to move out to another house (node) for a while. You make sure they leave gracefully and settle into their new house (nodes) without any hassle. Once the renovations are done, they can move back in smoothly, and your house is ready to shine again! 
This is the act of <strong>kubectl drain<\/strong>.<\/li>\n\n\n\n<li>Now, let&#8217;s say you need to fix a few things in your <strong>apartment<\/strong> (node), but not everything. You put up a sign saying &#8220;closed for renovation,&#8221; indicating that no new tenants (pods) can move in for now. However, the ones already living there (pods running on the node) can stay put and continue with their daily routines. This way, you can manage the repairs without causing any disruptions to the existing occupants. This is the action of <strong>kubectl cordon<\/strong>.<\/li>\n<\/ul>\n\n\n\n<p>Diving further, what do <strong>kubectl drain<\/strong> and <strong>kubectl cordon<\/strong> do exactly?<\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"kubectl-cordon\">kubectl cordon<\/h3>\n\n\n\n<p>The <strong>kubectl cordon<\/strong> command marks a Kubernetes node as <strong>unschedulable<\/strong>. This means that no new pods can be scheduled on a node that has been cordoned. However, any existing pods hosted on that specific node are not affected and will continue to run as usual.<\/p>\n\n\n\n<p><strong>kubectl cordon<\/strong> is used when you want to prepare a node for maintenance or upgrades without affecting the currently running pods.<\/p>\n\n\n\n<p>Ideally, if you are planning on removing the pods running on a node, you can simply use the <strong>drain<\/strong> command instead.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"example-usage-of-kubectl-cordon-command\">Example usage of kubectl cordon command<\/h3>\n\n\n\n<p>First of all, if you want to prepare a node for maintenance, you need to identify that specific node. 
You can use the command below to get a list of nodes;<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>kubectl get nodes<\/code><\/pre>\n\n\n\n<p>Sample output;<\/p>\n\n\n\n<pre class=\"scroll-box\"><code>NAME        STATUS   ROLES           AGE     VERSION\nmaster-01   Ready    control-plane   3h29m   v1.30.1\nmaster-02   Ready    control-plane   3h21m   v1.30.1\nmaster-03   Ready    control-plane   3h16m   v1.30.1\nworker-01   Ready    &lt;none&gt;          174m    v1.30.1\nworker-02   Ready    &lt;none&gt;          172m    v1.30.1\nworker-03   Ready    &lt;none&gt;          172m    v1.30.1\n<\/code><\/pre>\n\n\n\n<p>Once you know the specific node, then you can go ahead and cordon it.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>kubectl cordon &lt;node-name&gt;<\/code><\/pre>\n\n\n\n<p>For example;<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>kubectl cordon master-01<\/code><\/pre>\n\n\n\n<p>So the command;<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>marks the <strong>master-01<\/strong> node as unschedulable.<\/li>\n\n\n\n<li>ensures that no new pods will be scheduled on this node.<\/li>\n\n\n\n<li>leaves existing pods on this node running until they are manually evicted or terminated.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"kubectl-drain\">kubectl drain<\/h3>\n\n\n\n<p>The <strong>kubectl drain<\/strong> command, on the other hand, performs the actions of <strong>kubectl cordon<\/strong> on a node and then evicts the pods from that node. That is to say, it marks the node as unschedulable so that no new pods are scheduled on it. After that, it gracefully evicts the pods from the node. It is typically used for maintenance tasks where the node needs to be emptied of running pods, for example, during a node upgrade or decommissioning.<\/p>\n\n\n\n<p>Note: The <strong>drain<\/strong> command will wait for graceful eviction of all pods. 
Therefore, you should not operate on the machine until the command completes.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"example-usage-of-kubectl-drain-command\">Example Usage of kubectl drain command<\/h3>\n\n\n\n<p>To drain a node, identify the node to drain:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>kubectl get nodes<\/code><\/pre>\n\n\n\n<p>Execute the drain command against a node:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>kubectl drain &lt;node-name&gt;<\/code><\/pre>\n\n\n\n<p>With regard to <strong>draining<\/strong> the pods running on a Kubernetes node, <strong>all pods except DaemonSet-managed pods and static\/mirror pods are safely evicted.<\/strong><\/p>\n\n\n\n<p>Therefore, if there are DaemonSet-managed pods on a node, the drain command will not proceed without the <strong>--ignore-daemonsets<\/strong> option, which tells it to skip such pods.<\/p>\n\n\n\n<p>For more command line options to use, refer to;<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>kubectl drain --help<\/code><\/pre>\n\n\n\n<p>Evicted pods that are managed by a controller, such as a Deployment or ReplicaSet, are recreated on other available nodes.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"kubectl-cordon-drain-illustration\">Kubectl Cordon\/Drain Illustration<\/h3>\n\n\n\n<p>The process of cordoning and evicting pods from the nodes is illustrated in the diagram below.<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"1110\" height=\"913\" src=\"https:\/\/kifarunix.com\/wp-content\/uploads\/2024\/06\/kubectl-drain-illustration.png\" alt=\"kubectl cordon\/drain\" class=\"wp-image-22733\" title=\"\" srcset=\"https:\/\/kifarunix.com\/wp-content\/uploads\/2024\/06\/kubectl-drain-illustration.png 1110w, https:\/\/kifarunix.com\/wp-content\/uploads\/2024\/06\/kubectl-drain-illustration-768x632.png 768w\" sizes=\"(max-width: 1110px) 100vw, 1110px\" \/><figcaption 
class=\"wp-element-caption\">https:\/\/kubernetes.io\/images\/docs\/kubectl_drain.svg<\/figcaption><\/figure>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"kubectl-drain-node-gets-stuck-forever-apparmor-bug\">kubectl drain node gets stuck forever [Apparmor Bug]<\/h3>\n\n\n\n<p>Have you tried to drain a node, only for it to take too long or even fail? Well, my worker nodes, hosted on Ubuntu 24.04 systems, are using the <strong>containerd<\/strong> CRI. And as of this writing, when I tried to drain them, they got stuck forever!<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>kubectl drain worker-01 --ignore-daemonsets<\/code><\/pre>\n\n\n\n<pre class=\"scroll-box\"><code>node\/worker-01 cordoned\nWarning: ignoring DaemonSet-managed Pods: calico-system\/calico-node-7lg6j, calico-system\/csi-node-driver-vbztn, kube-system\/kube-proxy-9g7c6\nevicting pod calico-system\/calico-typha-bd8d4bc69-8mnjl\nevicting pod apps\/nginx-app-6ff7b5d8f6-hsljs\n...\n<\/code><\/pre>\n\n\n\n<p>And the pods got stuck in the <strong>Terminating<\/strong> state!<\/p>\n\n\n\n<p>This prompted me to check the logs on the worker nodes and alas! 
<strong>AppArmor<\/strong>!<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>sudo tail -f \/var\/log\/kern.log<\/code><\/pre>\n\n\n\n<pre class=\"wp-block-code\"><code>2024-06-14T19:04:43.331091+00:00 worker-01 kernel: audit: type=1400 audit(1718391883.329:221): apparmor=\"<strong>DENIED<\/strong>\" operation=\"signal\" class=\"signal\" profile=\"cri-containerd.apparmor.d\" pid=7445 comm=\"runc\" requested_mask=\"receive\" denied_mask=\"receive\" signal=kill peer=\"runc\"<\/code><\/pre>\n\n\n\n<p>The log entry basically says;<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>runc<\/strong>, the runtime that a CRI implementation such as containerd uses to manage containers, attempted to receive the <strong>kill signal<\/strong> to terminate the containers.<\/li>\n\n\n\n<li>However, when it tried to receive the <strong>kill signal<\/strong>, this action was <strong>denied<\/strong> by the AppArmor security policy under the profile, <strong>cri-containerd.apparmor.d<\/strong>. AppArmor is a Linux kernel security module that restricts what actions processes can perform based on security profiles.<\/li>\n<\/ul>\n\n\n\n<p>Current versions of containerd\/runc;<\/p>\n\n\n\n<pre class=\"scroll-box\"><code>root@worker-03:~# containerd --version\ncontainerd github.com\/containerd\/containerd 1.7.12 \nroot@worker-03:~# runc --version\nrunc version 1.1.12-0ubuntu3\nspec: 1.0.2-dev\ngo: go1.22.2\nlibseccomp: 2.5.5\nroot@worker-03:~# crictl --version\ncrictl version v1.30.0\nroot@worker-03:~# \n<\/code><\/pre>\n\n\n\n<p>You may realize that while you can circumvent this nuisance by stopping the <strong>apparmor<\/strong> service, you might actually be opening a Pandora&#8217;s box on your system. 
Therefore, to avoid having to do this, you can simply use the <strong>aa-remove-unknown<\/strong> command to unload the <strong>cri-containerd.apparmor.d<\/strong> profile, which by default is not stored under the AppArmor profiles directory, <code>\/etc\/apparmor.d\/<\/code>.<\/p>\n\n\n\n<p>According to the man page;<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>aa-remove-unknown will inventory all profiles in \/etc\/apparmor.d\/, compare that list to the profiles currently loaded into the kernel, and then remove all of the loaded profiles that were not found in \/etc\/apparmor.d\/. It will also report the name of each profile that it removes on standard out.<\/code><\/pre>\n\n\n\n<p>Therefore, execute the command;<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>sudo aa-remove-unknown<\/code><\/pre>\n\n\n\n<p>Your drain command should now execute to completion.<\/p>\n\n\n\n<p>Alternatively, you can temporarily stop the <strong>apparmor<\/strong> service while you are draining the nodes or deleting pods, and start it again once the nodes are drained and the pods terminated.<\/p>\n\n\n\n<p>Fortunately, <a href=\"https:\/\/launchpad.net\/~sebastian-podjasek\" target=\"_blank\" rel=\"noreferrer noopener\">Sebastian<\/a> managed to hand-craft a <strong>cri-containerd.apparmor.d<\/strong> profile that handles this bug well. 
You can install the profile by running the command below.<\/p>\n\n\n\n<pre class=\"scroll-box\"><code>sudo tee \/etc\/apparmor.d\/cri-containerd.apparmor.d &lt;&lt; 'EOL'\n#include &lt;tunables\/global&gt;\n\nprofile cri-containerd.apparmor.d flags=(attach_disconnected,mediate_deleted) {\n  #include &lt;abstractions\/base&gt;\n\n  network,\n  capability,\n  file,\n  umount,\n  # Host (privileged) processes may send signals to container processes.\n  signal (receive) peer=unconfined,\n  # runc may send signals to container processes.\n  signal (receive) peer=runc,\n  # crun may send signals to container processes.\n  signal (receive) peer=crun,\n  # Manager may send signals to container processes.\n  signal (receive) peer=cri-containerd.apparmor.d,\n  # Container processes may send signals amongst themselves.\n  signal (send,receive) peer=cri-containerd.apparmor.d,\n\n  deny @{PROC}\/* w,   # deny write for all files directly in \/proc (not in a subdir)\n  # deny write to files not in \/proc\/&lt;number&gt;\/** or \/proc\/sys\/**\n  deny @{PROC}\/{[^1-9],[^1-9][^0-9],[^1-9s][^0-9y][^0-9s],[^1-9][^0-9][^0-9][^0-9]*}\/** w,\n  deny @{PROC}\/sys\/[^k]** w,  # deny \/proc\/sys except \/proc\/sys\/k* (effectively \/proc\/sys\/kernel)\n  deny @{PROC}\/sys\/kernel\/{?,??,[^s][^h][^m]**} w,  # deny everything except shm* in \/proc\/sys\/kernel\/\n  deny @{PROC}\/sysrq-trigger rwklx,\n  deny @{PROC}\/mem rwklx,\n  deny @{PROC}\/kmem rwklx,\n  deny @{PROC}\/kcore rwklx,\n\n  deny mount,\n\n  deny \/sys\/[^f]*\/** wklx,\n  deny \/sys\/f[^s]*\/** wklx,\n  deny \/sys\/fs\/[^c]*\/** wklx,\n  deny \/sys\/fs\/c[^g]*\/** wklx,\n  deny \/sys\/fs\/cg[^r]*\/** wklx,\n  deny \/sys\/firmware\/** rwklx,\n  deny \/sys\/devices\/virtual\/powercap\/** rwklx,\n  deny \/sys\/kernel\/security\/** rwklx,\n\n  # allow processes within the container to trace each other,\n  # provided all other LSM and yama setting allow it.\n  ptrace (trace,tracedby,read,readby) 
peer=cri-containerd.apparmor.d,\n}\nEOL\n<\/code><\/pre>\n\n\n\n<p>Then, check the file for syntactical errors and load it into the AppArmor security subsystem.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>sudo apparmor_parser -r \/etc\/apparmor.d\/cri-containerd.apparmor.d<\/code><\/pre>\n\n\n\n<p>This should also sort the issue perfectly. <a href=\"https:\/\/bugs.launchpad.net\/ubuntu\/+source\/containerd-app\/+bug\/2065423\" target=\"_blank\" rel=\"noreferrer noopener\">More on the likely bug<\/a>.<\/p>\n\n\n\n<p>Be sure to watch the logs, <strong>kern.log<\/strong> or <strong>syslog<\/strong>, for any issues related to AppArmor.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"dealing-with-pod-disruption-budget-pdb\">Dealing with Pod Disruption Budget (PDB)<\/h3>\n\n\n\n<p>A Pod Disruption Budget (PDB) is a Kubernetes resource that specifies the minimum number or percentage of an application&#8217;s pods that must remain available (or, alternatively, the maximum that may be unavailable) at any given time. It is used to ensure that a certain number of pods remain operational during voluntary disruptions, such as maintenance or updates. PDBs are especially important for high availability and stability of applications, ensuring that critical services remain available even during planned disruptions.<\/p>\n\n\n\n<p>When you drain a node using <strong>kubectl drain<\/strong>, Kubernetes checks the PDBs to ensure the defined disruption budget is not violated. If evicting a pod would violate the PDB, Kubernetes will block the eviction and display an error message.<\/p>\n\n\n\n<p>If the drain command is blocked due to PDB violations, you have a few options:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Adjust PDBs:<\/strong> Temporarily adjust the PDBs to allow more flexibility during the maintenance window.<\/li>\n\n\n\n<li><strong>Bypass Eviction:<\/strong> Note that the <strong>--force<\/strong> flag does not override PDBs; it only lets the drain proceed when a node has pods that are not managed by a controller. To bypass the PDB checks, you can, as a last resort, use the <strong>--disable-eviction<\/strong> flag, which makes drain delete the pods instead of evicting them. 
This should be done with extreme caution as it can cause application downtime.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"re-enable-scheduling-of-pods-on-a-node\">Re-enable Scheduling of Pods on a Node<\/h3>\n\n\n\n<p>When you are ready to put the node back into service, you can make it schedulable again using the command;<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>kubectl uncordon &lt;cordoned-node-name&gt;<\/code><\/pre>\n\n\n\n<p>For example;<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>kubectl uncordon master-01<\/code><\/pre>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"conclusion\">Conclusion<\/h3>\n\n\n\n<p>In summary;<\/p>\n\n\n\n<figure class=\"wp-block-table\"><table><thead><tr><th>Command<\/th><th>Scheduling New Pods<\/th><th>Pod Eviction<\/th><\/tr><\/thead><tbody><tr><td>kubectl cordon<\/td><td>Prevents new pods from being scheduled on the node<\/td><td>Does not evict existing pods.<\/td><\/tr><tr><td>kubectl drain<\/td><td>Prevents new pods from being scheduled on the node and evicts existing pods.<\/td><td>Evicts existing pods, excluding DaemonSets and static pods by default.<\/td><\/tr><\/tbody><\/table><\/figure>\n\n\n\n<p>Read more on:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>kubectl cordon --help<\/code><\/pre>\n\n\n\n<pre class=\"wp-block-code\"><code>kubectl drain --help<\/code><\/pre>\n","protected":false},"excerpt":{"rendered":"<p>What is the difference between Kubernetes kubectl drain and kubectl cordon commands? 
Well, keeping your Kubernetes cluster healthy and running smoothly requires regular maintenance on<\/p>\n","protected":false},"author":10,"featured_media":22735,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"rank_math_lock_modified_date":false,"footnotes":""},"categories":[1076,121,1668],"tags":[7529,7519,7518,7517],"class_list":["post-22724","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-containers","category-howtos","category-kubernetes","tag-apparmordenied-profilecri-containerd-apparmor-d","tag-kubectl-cordon","tag-kubectl-drain","tag-kubernetes-cordon","generate-columns","tablet-grid-50","mobile-grid-100","grid-parent","grid-50","resize-featured-image"],"_links":{"self":[{"href":"https:\/\/kifarunix.com\/wp-json\/wp\/v2\/posts\/22724"}],"collection":[{"href":"https:\/\/kifarunix.com\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/kifarunix.com\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/kifarunix.com\/wp-json\/wp\/v2\/users\/10"}],"replies":[{"embeddable":true,"href":"https:\/\/kifarunix.com\/wp-json\/wp\/v2\/comments?post=22724"}],"version-history":[{"count":21,"href":"https:\/\/kifarunix.com\/wp-json\/wp\/v2\/posts\/22724\/revisions"}],"predecessor-version":[{"id":22853,"href":"https:\/\/kifarunix.com\/wp-json\/wp\/v2\/posts\/22724\/revisions\/22853"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/kifarunix.com\/wp-json\/wp\/v2\/media\/22735"}],"wp:attachment":[{"href":"https:\/\/kifarunix.com\/wp-json\/wp\/v2\/media?parent=22724"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/kifarunix.com\/wp-json\/wp\/v2\/categories?post=22724"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/kifarunix.com\/wp-json\/wp\/v2\/tags?post=22724"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}