{"id":23030,"date":"2024-07-05T21:14:49","date_gmt":"2024-07-05T18:14:49","guid":{"rendered":"https:\/\/kifarunix.com\/?p=23030"},"modified":"2024-07-06T00:05:01","modified_gmt":"2024-07-05T21:05:01","slug":"kubernetes-resource-optimization-with-vertical-pod-autoscaler-vpa","status":"publish","type":"post","link":"https:\/\/kifarunix.com\/kubernetes-resource-optimization-with-vertical-pod-autoscaler-vpa\/","title":{"rendered":"Kubernetes Resource Optimization with Vertical Pod Autoscaler (VPA)"},"content":{"rendered":"\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"1072\" height=\"602\" src=\"https:\/\/kifarunix.com\/wp-content\/uploads\/2024\/07\/kubernetes-vpa.png?v=1720203195\" alt=\"Kubernetes Resource Optimization with Vertical Pod Autoscaler (VPA)\" class=\"wp-image-23065\" title=\"\" srcset=\"https:\/\/kifarunix.com\/wp-content\/uploads\/2024\/07\/kubernetes-vpa.png?v=1720203195 1072w, https:\/\/kifarunix.com\/wp-content\/uploads\/2024\/07\/kubernetes-vpa-768x431.png?v=1720203195 768w\" sizes=\"(max-width: 1072px) 100vw, 1072px\" \/><\/figure>\n\n\n\n<p>This blog post will take you through Kubernetes resource optimization with <a href=\"https:\/\/github.com\/kubernetes\/autoscaler\/tree\/master\/vertical-pod-autoscaler\" target=\"_blank\" rel=\"noopener\">Vertical Pod Autoscaler.<\/a> Managing resource allocation efficiently in Kubernetes is crucial for optimizing costs and ensuring consistent application performance. Vertical Pod Autoscaler (VPA) addresses these challenges by dynamically adjusting resource requests and limits for individual pods based on their actual usage patterns. 
This blog post explores VPA&#8217;s functionalities, benefits over Horizontal Pod Autoscaler (HPA), installation steps, configuration, and practical examples.<\/p>\n\n\n\n<div class=\"wp-block-rank-math-toc-block\" id=\"rank-math-toc\"><h2>Table of Contents<\/h2><nav><ul><li><a href=\"#optimizing-kubernetes-resources-with-vertical-pod-autoscaler-vpa\">Optimizing Kubernetes Resources with Vertical Pod Autoscaler (VPA)<\/a><ul><li><a href=\"#what-is-vertical-pod-autoscaler-vpa-in-kubernetes\">What is Vertical Pod Autoscaler (VPA) in Kubernetes?<\/a><\/li><li><a href=\"#how-does-vertical-pod-autoscaler-work-in-kubernetes\">How does Vertical Pod Autoscaler Work in Kubernetes<\/a><\/li><li><a href=\"#vpa-vs-hpa-understanding-the-differences\">VPA vs. HPA: Understanding the Differences<\/a><\/li><li><a href=\"#getting-started-with-vpa-in-kubernetes-cluster\">Getting Started with VPA in Kubernetes Cluster<\/a><ul><li><a href=\"#install-and-setup-kubernetes-cluster\">Install and Setup Kubernetes cluster<\/a><\/li><li><a href=\"#install-and-setup-metrics-server\">Install and Setup Metrics Server<\/a><\/li><li><a href=\"#install-vpa-controller\">Install VPA Controller<\/a><\/li><li><a href=\"#create-vpa-custom-resource\">Create VPA Custom Resource<\/a><\/li><li><a href=\"#simulating-events-to-trigger-vertical-scaling\">Simulating Events to Trigger Vertical Scaling<\/a><\/li><li><a href=\"#defining-resource-requests-and-limits-for-a-vpa\">Defining Resource Requests and Limits for a VPA<\/a><\/li><li><a href=\"#updating-the-vpa-configuration\">Updating the VPA Configuration<\/a><\/li><li><a href=\"#deleting-the-vpa-in-kubernetes-cluster\">Deleting the VPA in Kubernetes Cluster<\/a><\/li><\/ul><\/li><li><a href=\"#best-practices-when-using-vpa\">Best Practices when using VPA<\/a><\/li><li><a href=\"#further-reading\">Further Reading<\/a><\/li><\/ul><\/li><li><a href=\"#conclusion\">Conclusion<\/a><\/li><\/ul><\/nav><\/div>\n\n\n\n<h2 class=\"wp-block-heading\" 
id=\"optimizing-kubernetes-resources-with-vertical-pod-autoscaler-vpa\">Optimizing Kubernetes Resources with Vertical Pod Autoscaler (VPA)<\/h2>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"what-is-vertical-pod-autoscaler-vpa-in-kubernetes\">What is Vertical Pod Autoscaler (VPA) in Kubernetes?<\/h3>\n\n\n\n<p>Vertical Pod Autoscaler (VPA) is a Kubernetes API resource that automatically adjusts the resource requests (CPU and memory) of pods to better match their actual usage patterns. Unlike Horizontal Pod Autoscaler (HPA), which scales the number of pod instances in the cluster based on metrics like CPU or memory utilization, VPA focuses on optimizing the resource requests of individual pods.<\/p>\n\n\n\n<p>You can read more about Kubernetes Horizontal Pod Autoscaler (HPA) in the guide below;<\/p>\n\n\n\n<p><a href=\"https:\/\/kifarunix.com\/mastering-kubernetes-autoscaling-horizontal-vs-vertical-scaling\/#horizontal-scaling\" target=\"_blank\" rel=\"noreferrer noopener\">Horizontal Scaling in Kubernetes<\/a><\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"how-does-vertical-pod-autoscaler-work-in-kubernetes\">How does Vertical Pod Autoscaler Work in Kubernetes<\/h3>\n\n\n\n<p>VPA analyzes historical resource usage patterns of pods and adjusts their resource requests (CPU and memory) accordingly. VPA works in a multi-step process to adjust the Pods&#8217; resource requests. This process is summarized below;<\/p>\n\n\n\n<p>VPA is made up of three components that work together to ensure efficient resource optimization.<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>VPA Recommender<\/strong>: The recommender continuously monitors and analyzes the Pods&#8217; resource usage (CPU and memory). 
Based on the analysis, it provides recommended values for the containers&#8217; CPU and memory requests.<\/li>\n\n\n\n<li><strong>VPA Updater:<\/strong> Based on the recommendations by the recommender, the <strong>Updater<\/strong> checks if the managed Pods have the correct recommended resource requests set. If not, it initiates actions to adjust resource requests and limits of the pods. The action it takes depends on the chosen VPA mode. There are four modes under which a VPA can operate:\n<ul class=\"wp-block-list\">\n<li><strong>Auto Mode (Recommended):<\/strong>&nbsp;This mode allows VPA to automatically assign resource requests to Pods when they are being created. It&nbsp;also updates resource requests and limits for existing pods based on the recommendations from the <strong>recommender<\/strong>. Therefore,&nbsp;<strong>you don&#8217;t necessarily need to set initial requests and limits<\/strong>&nbsp;for pods managed by VPA in Auto mode.<\/li>\n\n\n\n<li><strong>Recreate<\/strong>: Just like the Auto mode, this mode allows VPA to update the resource requests of existing pods by evicting and recreating them when the resource requests differ significantly from the new recommendation. It honors a PodDisruptionBudget (PDB) if one is set. When using this mode, be cautious as it can lead to downtime.<\/li>\n\n\n\n<li><strong>Initial Mode:<\/strong>&nbsp;This mode allows VPA to assign resource requests to Pods only on creation and never updates them once the Pods are running. As such, you need to manually set both resource requests and limits for your pods.<\/li>\n\n\n\n<li><strong>Off:<\/strong> In this mode, VPA does not automatically update the resource requirements of the pods. The recommendations are calculated and can be inspected in the VPA object. It is up to the admin to apply them. 
Similarly, you need to manually set both resource requests and limits for your pods.<\/li>\n<\/ul>\n<\/li>\n\n\n\n<li><strong>VPA Admission Controller<\/strong>: This component works alongside the Kubernetes API server. When a new pod managed by the VPA is created, the admission controller intercepts the request and injects the recommended resource values from the recommender into the pod specification before scheduling the pod on a node. This ensures the pod starts with the optimal resource allocation from the beginning.<\/li>\n<\/ul>\n\n\n\n<p>In essence:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>The VPA recommender gathers resource utilization data from the Kubernetes metrics server. This data typically includes CPU usage, memory usage, and other relevant metrics depending on your configuration.<\/li>\n\n\n\n<li>Based on this data, the recommender analyzes historical usage patterns for each pod. It considers factors like peak usage, average usage, and resource constraints.<\/li>\n\n\n\n<li>The recommender then calculates the optimal resource requests and limits for each pod. Resource requests define the minimum amount of resources a pod needs to function properly, while limits set the maximum resources a pod can consume.<\/li>\n<\/ul>\n\n\n\n<p>Master Kubernetes with a beginner-friendly book, The Kubernetes Book 2024 Edition by Nigel Poulton;<\/p>\n\n\n\n<figure class=\"wp-block-embed aligncenter is-type-rich is-provider-amazon wp-block-embed-amazon\"><div class=\"wp-block-embed__wrapper\">\n<iframe loading=\"lazy\" title=\"The Kubernetes Book: 2024 Edition\" type=\"text\/html\" width=\"1200\" height=\"550\" frameborder=\"0\" allowfullscreen style=\"max-width:100%\" src=\"https:\/\/read.amazon.com\/kp\/card?preview=inline&#038;linkCode=ll1&#038;ref_=k4w_oembed_laGe4BkBechdKi&#038;asin=B072TS9ZQZ&#038;tag=dc42a8f60962-20\"><\/iframe>\n<\/div><\/figure>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"vpa-vs-hpa-understanding-the-differences\">VPA vs. 
HPA: Understanding the Differences<\/h3>\n\n\n\n<p>When you deploy your Kubernetes application, you may estimate its resource requirements and probably set the initial CPU and memory requests\/limits. But what happens when the workload fluctuates? Two things can happen:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Over-provisioning:<\/strong>&nbsp;You may overestimate the resource requirements, leading to wastage and higher cloud costs.<\/li>\n\n\n\n<li><strong>Under-provisioning:<\/strong>&nbsp;Underestimating resource requirements can result in pod performance degradation and potential application outages.<\/li>\n<\/ul>\n\n\n\n<p>With HPA, you have to manually define the resource (CPU and memory) requests and limits, which may lead to the situations above. This is what VPA aims to solve by automatically adjusting resource requests and limits for individual pods based on their actual usage. As such, it is suitable for controlling resource requirements for workloads with fluctuating resource demands.<\/p>\n\n\n\n<p>While Kubernetes natively supports HPA, you need to manually install VPA using custom resource definitions (CRDs) to utilize it.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"getting-started-with-vpa-in-kubernetes-cluster\">Getting Started with VPA in Kubernetes Cluster<\/h3>\n\n\n\n<p>So, how can you implement VPA in your Kubernetes cluster to control workload resources?<\/p>\n\n\n\n<h4 class=\"wp-block-heading\" id=\"install-and-setup-kubernetes-cluster\">Install and Setup Kubernetes cluster<\/h4>\n\n\n\n<p>You can check any of our guides below to install and set up a Kubernetes cluster.<\/p>\n\n\n\n<p><a href=\"https:\/\/kifarunix.com\/install-and-setup-kubernetes-cluster-on-ubuntu-24-04\/\" target=\"_blank\" rel=\"noreferrer noopener\">Install and Setup Kubernetes Cluster on Ubuntu 24.04<\/a><\/p>\n\n\n\n<p><a href=\"https:\/\/kifarunix.com\/setup-highly-available-kubernetes-cluster-with-haproxy-and-keepalived\/\" target=\"_blank\" rel=\"noreferrer 
noopener\">Setup Highly Available Kubernetes Cluster with Haproxy and Keepalived<\/a><\/p>\n\n\n\n<h4 class=\"wp-block-heading\" id=\"install-and-setup-metrics-server\">Install and Setup Metrics Server<\/h4>\n\n\n\n<p>You need a Metrics Server to collect resource metrics from Pods and expose them to the VPA components.<\/p>\n\n\n\n<p><a href=\"https:\/\/kifarunix.com\/install-kubernetes-metrics-server-on-a-kubernetes-cluster\/\" target=\"_blank\" rel=\"noreferrer noopener\">Install Kubernetes Metrics Server on a Kubernetes Cluster<\/a><\/p>\n\n\n\n<h4 class=\"wp-block-heading\" id=\"install-vpa-controller\">Install VPA Controller<\/h4>\n\n\n\n<p>VPA (Vertical Pod Autoscaler) is not natively available in Kubernetes like HPA (Horizontal Pod Autoscaler). Therefore, you have to install it.<\/p>\n\n\n\n<p>To do so, install git if it is not already present and clone the <a href=\"https:\/\/github.com\/kubernetes\/autoscaler\" target=\"_blank\" rel=\"noreferrer noopener\">kubernetes\/autoscaler<\/a> git repo;<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>sudo apt install git<\/code><\/pre>\n\n\n\n<pre class=\"wp-block-code\"><code>git clone https:\/\/github.com\/kubernetes\/autoscaler.git<\/code><\/pre>\n\n\n\n<p>Then navigate to the VPA directory and install it as follows;<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>cd autoscaler\/vertical-pod-autoscaler\/<\/code><\/pre>\n\n\n\n<pre class=\"wp-block-code\"><code>.\/hack\/vpa-up.sh<\/code><\/pre>\n\n\n\n<p>Sample installation output;<\/p>\n\n\n\n<pre class=\"scroll-box\"><code>customresourcedefinition.apiextensions.k8s.io\/verticalpodautoscalercheckpoints.autoscaling.k8s.io created\ncustomresourcedefinition.apiextensions.k8s.io\/verticalpodautoscalers.autoscaling.k8s.io created\nclusterrole.rbac.authorization.k8s.io\/system:metrics-reader created\nclusterrole.rbac.authorization.k8s.io\/system:vpa-actor created\nclusterrole.rbac.authorization.k8s.io\/system:vpa-status-actor created\nclusterrole.rbac.authorization.k8s.io\/system:vpa-checkpoint-actor 
created\nclusterrole.rbac.authorization.k8s.io\/system:evictioner created\nclusterrolebinding.rbac.authorization.k8s.io\/system:metrics-reader created\nclusterrolebinding.rbac.authorization.k8s.io\/system:vpa-actor created\nclusterrolebinding.rbac.authorization.k8s.io\/system:vpa-status-actor created\nclusterrolebinding.rbac.authorization.k8s.io\/system:vpa-checkpoint-actor created\nclusterrole.rbac.authorization.k8s.io\/system:vpa-target-reader created\nclusterrolebinding.rbac.authorization.k8s.io\/system:vpa-target-reader-binding created\nclusterrolebinding.rbac.authorization.k8s.io\/system:vpa-evictioner-binding created\nserviceaccount\/vpa-admission-controller created\nserviceaccount\/vpa-recommender created\nserviceaccount\/vpa-updater created\nclusterrole.rbac.authorization.k8s.io\/system:vpa-admission-controller created\nclusterrolebinding.rbac.authorization.k8s.io\/system:vpa-admission-controller created\nclusterrole.rbac.authorization.k8s.io\/system:vpa-status-reader created\nclusterrolebinding.rbac.authorization.k8s.io\/system:vpa-status-reader-binding created\ndeployment.apps\/vpa-updater created\ndeployment.apps\/vpa-recommender created\nGenerating certs for the VPA Admission Controller in \/tmp\/vpa-certs.\nCertificate request self-signature ok\nsubject=CN = vpa-webhook.kube-system.svc\nUploading certs to the cluster.\nsecret\/vpa-tls-certs created\nDeleting \/tmp\/vpa-certs.\ndeployment.apps\/vpa-admission-controller created\nservice\/vpa-webhook created\n<\/code><\/pre>\n\n\n\n<p>Confirm that the main VPA components have been installed  under the <strong>kube-system<\/strong> namespace;<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>kubectl get pod -n kube-system<\/code><\/pre>\n\n\n\n<pre class=\"scroll-box\"><code>...\nvpa-admission-controller-c5c5b4fcc-lcb5p   1\/1     Running   0             5m38s\nvpa-recommender-6c4585968-l4jnk            1\/1     Running   0             5m39s\nvpa-updater-7686fd5bf9-kmgxt               1\/1     Running   0     
        5m39s\n<\/code><\/pre>\n\n\n\n<h4 class=\"wp-block-heading\" id=\"create-vpa-custom-resource\">Create VPA Custom Resource<\/h4>\n\n\n\n<p>Let&#8217;s create a VPA resource to specify which deployments or pods VPA should manage.<\/p>\n\n\n\n<p>In this example setup, we have an Nginx app deployment in the apps namespace.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>kubectl get deployment -n apps<\/code><\/pre>\n\n\n\n<pre class=\"scroll-box\"><code>NAME        READY   UP-TO-DATE   AVAILABLE   AGE\nnginx-app   1\/1     1            1           17d\n<\/code><\/pre>\n\n\n\n<pre class=\"wp-block-code\"><code>kubectl get deployment nginx-app -n apps -o yaml<\/code><\/pre>\n\n\n\n<pre class=\"scroll-box\"><code>apiVersion: apps\/v1\nkind: Deployment\nmetadata:\n  annotations:\n    deployment.kubernetes.io\/revision: \"8\"\n  creationTimestamp: \"2024-06-17T07:28:33Z\"\n  generation: 15\n  labels:\n    app: nginx-app\n  name: nginx-app\n  namespace: apps\n  resourceVersion: \"3811115\"\n  uid: d0437a23-6574-4d3f-b1b9-d9995f7188ca\nspec:\n  progressDeadlineSeconds: 600\n  replicas: 1\n  revisionHistoryLimit: 10\n  selector:\n    matchLabels:\n      app: nginx-app\n  strategy:\n    rollingUpdate:\n      maxSurge: 25%\n      maxUnavailable: 25%\n    type: RollingUpdate\n  template:\n    metadata:\n      creationTimestamp: null\n      labels:\n        app: nginx-app\n    spec:\n      containers:\n      - image: nginx:latest\n        imagePullPolicy: Always\n        name: nginx\n        resources: {}\n        terminationMessagePath: \/dev\/termination-log\n        terminationMessagePolicy: File\n        volumeMounts:\n        - mountPath: \/usr\/share\/nginx\/html\n          name: html-volume\n      dnsPolicy: ClusterFirst\n      restartPolicy: Always\n      schedulerName: default-scheduler\n      securityContext: {}\n      terminationGracePeriodSeconds: 30\n      volumes:\n      - configMap:\n          defaultMode: 420\n          name: html-page\n        name: 
html-volume\nstatus:\n  availableReplicas: 1\n  conditions:\n  - lastTransitionTime: \"2024-07-01T20:29:21Z\"\n    lastUpdateTime: \"2024-07-01T20:29:21Z\"\n    message: Deployment has minimum availability.\n    reason: MinimumReplicasAvailable\n    status: \"True\"\n    type: Available\n  - lastTransitionTime: \"2024-06-17T07:28:33Z\"\n    lastUpdateTime: \"2024-07-04T17:16:31Z\"\n    message: ReplicaSet \"nginx-app-6ff7b5d8f6\" has successfully progressed.\n    reason: NewReplicaSetAvailable\n    status: \"True\"\n    type: Progressing\n  observedGeneration: 15\n  readyReplicas: 1\n  replicas: 1\n  updatedReplicas: 1\n<\/code><\/pre>\n\n\n\n<p>As you can see from our Deployment details above, we haven&#8217;t defined the resource requests and limits because we are going to create an <strong>Auto<\/strong> mode VPA.<\/p>\n\n\n\n<p>This is our VPA custom resource manifest, named <strong>nginx-app-vpa<\/strong>, for auto-scaling the Nginx deployment workload;<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>cat vpa-auto.yaml<\/code><\/pre>\n\n\n\n<pre class=\"scroll-box\"><code>apiVersion: autoscaling.k8s.io\/v1\nkind: VerticalPodAutoscaler\nmetadata:\n  name: nginx-app-vpa\n  namespace: apps\nspec:\n  targetRef:\n    apiVersion: \"apps\/v1\"\n    kind: Deployment\n    name: nginx-app\n  updatePolicy:\n    updateMode: \"Auto\"\n<\/code><\/pre>\n\n\n\n<p>Apply the manifest file;<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>kubectl apply -f vpa-auto.yaml<\/code><\/pre>\n\n\n\n<p>Confirm;<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>kubectl get vpa -n apps<\/code><\/pre>\n\n\n\n<pre class=\"scroll-box\"><code>NAME            MODE   CPU   MEM   PROVIDED   AGE\nnginx-app-vpa   Auto                          6s\n<\/code><\/pre>\n\n\n\n<p>You can describe it to see more details;<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>kubectl describe vpa nginx-app-vpa -n apps<\/code><\/pre>\n\n\n\n<pre class=\"scroll-box\"><code>Name:         nginx-app-vpa\nNamespace:    
apps\nLabels:       &lt;none&gt;\nAnnotations:  &lt;none&gt;\nAPI Version:  autoscaling.k8s.io\/v1\nKind:         VerticalPodAutoscaler\nMetadata:\n  Creation Timestamp:  2024-07-04T17:47:23Z\n  Generation:          1\n  Resource Version:    3816539\n  UID:                 5ea19dff-6bd8-43cc-b758-bf7c1487b8c4\nSpec:\n  Target Ref:\n    API Version:  apps\/v1\n    Kind:         Deployment\n    Name:         nginx-app\n  Update Policy:\n    Update Mode:  Auto\nStatus:\n  Conditions:\n    Last Transition Time:  2024-07-04T17:48:21Z\n    Status:                True\n    Type:                  RecommendationProvided\n  Recommendation:\n    Container Recommendations:\n      Container Name:  nginx\n      Lower Bound:\n        Cpu:     25m\n        Memory:  262144k\n      Target:\n        Cpu:     25m\n        Memory:  262144k\n      Uncapped Target:\n        Cpu:     25m\n        Memory:  262144k\n      Upper Bound:\n        Cpu:     25m\n        Memory:  262144k\nEvents:          &lt;none&gt;\n\n<\/code><\/pre>\n\n\n\n<p>As you can see, the VPA will monitor and adjust the resource requests of the pods managed by the nginx-app Deployment. 
It is set to automatically update the resource requests of the pods of the <code>nginx-app<\/code> Deployment based on their resource usage patterns.<\/p>\n\n\n\n<p>From the description output above, the VPA recommender has already analyzed the resource usage of the current Deployment pods.<\/p>\n\n\n\n<pre class=\"scroll-box\"><code>Status:\n  Conditions:\n    Last Transition Time:  2024-07-04T17:48:21Z\n    Status:                True\n    Type:                  RecommendationProvided\n<\/code><\/pre>\n\n\n\n<p>And the recommendation suggests allocating <strong>25 millicores<\/strong> of CPU and <strong>262144k<\/strong> (262,144,000 bytes, which is 250 MiB) of memory for the <code class=\"\">nginx<\/code> container;<\/p>\n\n\n\n<pre class=\"scroll-box\"><code>  Recommendation:\n    Container Recommendations:\n      Container Name:  nginx\n      Lower Bound:\n        Cpu:     25m\n        Memory:  262144k\n      Target:\n        Cpu:     25m\n        Memory:  262144k\n      Uncapped Target:\n        Cpu:     25m\n        Memory:  262144k\n      Upper Bound:\n        Cpu:     25m\n        Memory:  262144k\n<\/code><\/pre>\n\n\n\n<p>Where:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Lower Bound:<\/strong>&nbsp;Minimum recommended resource allocation (25m CPU,&nbsp;262144k memory).<\/li>\n\n\n\n<li><strong>Target:<\/strong>&nbsp;Ideal resource allocation based on observed usage (25m CPU,&nbsp;262144k memory).<\/li>\n\n\n\n<li><strong>Uncapped Target:<\/strong>&nbsp;The target computed from usage alone, before any resource policy bounds are applied. It is the same as Target in this case.<\/li>\n\n\n\n<li><strong>Upper Bound:<\/strong>&nbsp;Maximum recommended resource allocation (25m CPU,&nbsp;262144k memory).<\/li>\n<\/ul>\n\n\n\n<p>If you check the VPA, now it shows the recommended resources;<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>kubectl get vpa -n apps<\/code><\/pre>\n\n\n\n<pre class=\"scroll-box\"><code>NAME            MODE   CPU   MEM       PROVIDED   AGE\nnginx-app-vpa   Auto   25m   262144k   True       40m\n<\/code><\/pre>\n\n\n\n<p>As you can see:<\/p>\n\n\n\n<ul 
class=\"wp-block-list\">\n<li><strong>MODE<\/strong>: Indicates the mode in which the VPA is operating. In this case, it is set to <code>Auto<\/code>, which means the VPA automatically adjusts resource requests based on observed usage patterns.<\/li>\n\n\n\n<li><strong>CPU<\/strong>: Specifies the recommended CPU request in milliCPU (m). In this case, it recommends 25 milliCPU, which is equivalent to 0.025 CPU cores.<\/li>\n\n\n\n<li><strong>MEM<\/strong>: Specifies the recommended memory request. The <code>k<\/code> suffix denotes kilobytes of 1000 bytes, so 262144k is 262,144,000 bytes, which is 262144000 \/ 1048576 = 250 mebibytes (Mi).<\/li>\n\n\n\n<li><strong>PROVIDED<\/strong>: Indicates whether the recommender has computed a recommendation for this VPA (<code>True<\/code>), matching the <code>RecommendationProvided<\/code> condition in the describe output. If <code>False<\/code>, no recommendation is available yet.<\/li>\n\n\n\n<li><strong>AGE<\/strong>: Shows how long ago the VPA resource was created.<\/li>\n<\/ul>\n\n\n\n<h4 class=\"wp-block-heading\" id=\"simulating-events-to-trigger-vertical-scaling\">Simulating Events to Trigger Vertical Scaling<\/h4>\n\n\n\n<p>Now, let&#8217;s see VPA in action. 
How does it react to an increased load on the Deployment workload?<\/p>\n\n\n\n<p>We will use&nbsp;<strong>ApacheBench<\/strong>&nbsp;(<strong>ab<\/strong>) to perform load testing on our web app.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>while true; do sleep 0.01; ab -n 500000 -c 1000 http:\/\/192.168.122.62:30833\/; done<\/code><\/pre>\n\n\n\n<p>The command runs in an endless loop. Each iteration sends 500,000 HTTP requests to&nbsp;<code>http:\/\/192.168.122.62:30833\/<\/code>&nbsp;with a concurrency level of 1000 requests at a time, pausing for 0.01 seconds before each iteration. Press Ctrl+C to stop it.<\/p>\n\n\n\n<p>Before we run the stress test command above, let&#8217;s watch the Pod resource usage;<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>watch -n 1 'kubectl top pod -n apps -l app=nginx-app'<\/code><\/pre>\n\n\n\n<p>Next, execute the stress test command above.<\/p>\n\n\n\n<p>After a short while, this is the output of the watch command above;<\/p>\n\n\n\n<pre class=\"scroll-box\"><code>Every 1.0s: kubectl top pod -n apps -l app=nginx-app                                                                     master-02: Thu Jul  4 19:27:00 2024\n\nNAME                         CPU(cores)   MEMORY(bytes)\nnginx-app-6ff7b5d8f6-km5j5   1127m        6Mi\n<\/code><\/pre>\n\n\n\n<p>As you can see, the Pod&#8217;s CPU usage has risen sharply under the load. Note that this output shows actual usage, which VPA uses to update its recommendations, and not the Pod&#8217;s resource requests.<\/p>\n\n\n\n<h4 class=\"wp-block-heading\" id=\"defining-resource-requests-and-limits-for-a-vpa\">Defining Resource Requests and Limits for a VPA<\/h4>\n\n\n\n<p>If you understand the resource requirements for your app, then you can bound the values that VPA recommends, either for all the containers in the Pods of a Deployment or for just a specific container.<\/p>\n\n\n\n<p>See our updated VPA resource manifest yaml.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>cat vpa-auto.yaml<\/code><\/pre>\n\n\n\n<pre class=\"scroll-box\"><code>apiVersion: autoscaling.k8s.io\/v1\nkind: VerticalPodAutoscaler\nmetadata:\n  name: nginx-app-vpa\n  namespace: apps\nspec:\n  targetRef:\n    apiVersion: 
\"apps\/v1\"\n    kind: Deployment\n    name: nginx-app\n  updatePolicy:\n    updateMode: \"Auto\"\n  resourcePolicy:\n    containerPolicies:\n      - containerName: '*'\n        minAllowed:\n          cpu: 100m\n          memory: 56Mi\n        maxAllowed:\n          cpu: 200m  \n          memory: 512Mi\n        controlledResources: [\"cpu\", \"memory\"]\n<\/code><\/pre>\n\n\n\n<ul class=\"wp-block-list\">\n<li><code>minAllowed<\/code> specifies the lower bounds (minimum resource requests).<\/li>\n\n\n\n<li><code>maxAllowed<\/code> specifies the upper bounds (maximum resource requests).<\/li>\n<\/ul>\n\n\n\n<p>You can then apply;<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>kubectl apply -f vpa-auto.yaml<\/code><\/pre>\n\n\n\n<p>Or edit the VPA directly and update;<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>kubectl edit vpa nginx-app-vpa -n apps<\/code><\/pre>\n\n\n\n<p>And add the resource policy;<\/p>\n\n\n\n<pre class=\"scroll-box\"><code>...\n  name: nginx-app-vpa\n  namespace: apps\n  resourceVersion: \"3842404\"\n  uid: 5ea19dff-6bd8-43cc-b758-bf7c1487b8c4\nspec:\n <strong> resourcePolicy:\n    containerPolicies:\n    - containerName: '*'\n      controlledResources:\n      - cpu\n      - memory\n      maxAllowed:\n        cpu: 200m\n        memory: 512Mi\n      minAllowed:\n        cpu: 100m\n        memory: 56Mi<\/strong>\n  targetRef:\n    apiVersion: apps\/v1\n    kind: Deployment\n    name: nginx-app\n...\n<\/code><\/pre>\n\n\n\n<p>Note that the <strong>minAllowed<\/strong> and <strong>maxAllowed<\/strong> settings cap the <strong>Target<\/strong> recommendation that VPA applies. VPA, based on the continuous analysis of the resource usage, may still compute a value outside this range, which it reports as the <strong>Uncapped Target<\/strong>. It strives to balance resource utilization with the need to prevent resource contention or performance degradation. 
 <\/p>\n\n\n\n<p>You can simulate the app stress test to check how VPA will respond to application load.<\/p>\n\n\n\n<h4 class=\"wp-block-heading\" id=\"updating-the-vpa-configuration\">Updating the VPA Configuration<\/h4>\n\n\n\n<p>If you want to make changes to your VPA configuration, there are two ways;<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>edit the VPA custom resource manifest file and make the changes. Once done, apply the manifest file.<\/li>\n\n\n\n<li>edit the VPA directly using the <strong>kubectl edit<\/strong> command and make the changes. See the section above.<\/li>\n<\/ul>\n\n\n\n<h4 class=\"wp-block-heading\" id=\"deleting-the-vpa-in-kubernetes-cluster\">Deleting the VPA in Kubernetes Cluster<\/h4>\n\n\n\n<p>You can get a list of available VPAs across all namespaces;<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>kubectl get vpa -A<\/code><\/pre>\n\n\n\n<p>This lists the VPAs in all the namespaces.<\/p>\n\n\n\n<p>You can then delete a specific VPA in its namespace;<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>kubectl delete vpa &lt;name-of-vpa&gt; -n &lt;namespace&gt;<\/code><\/pre>\n\n\n\n<p>E.g.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>kubectl delete vpa nginx-app-vpa -n apps<\/code><\/pre>\n\n\n\n<p>Or if you created the VPA using a manifest file, then run;<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>kubectl delete -f &lt;manifest.yaml&gt;<\/code><\/pre>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"best-practices-when-using-vpa\">Best Practices when using VPA<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Before enabling Auto Mode in production, thoroughly test the impact of VPA recommendations in a staging environment. Monitor metrics closely during and after updates to ensure they align with expected outcomes.<\/li>\n\n\n\n<li>Implement rolling updates or other deployment strategies to mitigate downtime. 
Kubernetes provides mechanisms like Deployment strategies and PodDisruptionBudgets (PDBs) to manage updates and maintain application availability.<\/li>\n\n\n\n<li>Continuously monitor the recommendations provided by the VPA. Validate these recommendations against actual application performance and adjust VPA configurations as necessary to optimize resource allocation.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"further-reading\">Further Reading<\/h3>\n\n\n\n<p>You can read more about Kubernetes Autoscaling on the <a href=\"https:\/\/kubernetes.io\/docs\/concepts\/workloads\/autoscaling\/\" target=\"_blank\" rel=\"noreferrer noopener\">Documentation page<\/a>.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"conclusion\">Conclusion<\/h2>\n\n\n\n<p>VPA is a powerful tool for optimizing resource management in Kubernetes. By dynamically adjusting resource allocation for individual pods, VPA helps reduce costs, improve application performance, and achieve a more efficient and cost-effective containerized environment.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>This blog post will take you through Kubernetes resource optimization with Vertical Pod Autoscaler. 
Managing resource allocation efficiently in Kubernetes is crucial for optimizing costs<\/p>\n","protected":false},"author":10,"featured_media":23065,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"rank_math_lock_modified_date":false,"footnotes":""},"categories":[1076,121,1668],"tags":[7555,7557,7556,7560,7558,7559],"class_list":["post-23030","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-containers","category-howtos","category-kubernetes","tag-hpa-vs-vpa","tag-kubernetes-vertical-pod-autoscaler","tag-kubernetes-vpa","tag-vpa","tag-vpa-auto-mode","tag-vpa-recommender","generate-columns","tablet-grid-50","mobile-grid-100","grid-parent","grid-50","resize-featured-image"],"_links":{"self":[{"href":"https:\/\/kifarunix.com\/wp-json\/wp\/v2\/posts\/23030"}],"collection":[{"href":"https:\/\/kifarunix.com\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/kifarunix.com\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/kifarunix.com\/wp-json\/wp\/v2\/users\/10"}],"replies":[{"embeddable":true,"href":"https:\/\/kifarunix.com\/wp-json\/wp\/v2\/comments?post=23030"}],"version-history":[{"count":38,"href":"https:\/\/kifarunix.com\/wp-json\/wp\/v2\/posts\/23030\/revisions"}],"predecessor-version":[{"id":23091,"href":"https:\/\/kifarunix.com\/wp-json\/wp\/v2\/posts\/23030\/revisions\/23091"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/kifarunix.com\/wp-json\/wp\/v2\/media\/23065"}],"wp:attachment":[{"href":"https:\/\/kifarunix.com\/wp-json\/wp\/v2\/media?parent=23030"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/kifarunix.com\/wp-json\/wp\/v2\/categories?post=23030"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/kifarunix.com\/wp-json\/wp\/v2\/tags?post=23030"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}