<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0"><channel><title><![CDATA[Hemant's DevOps Journey Day 58]]></title><description><![CDATA[Hemant's DevOps Journey Day 58]]></description><link>https://90daysofdevops-hemantdhavale-day-58.hashnode.dev</link><image><url>https://cdn.hashnode.com/uploads/logos/6989064fae91a6ce9ba02aca/1a293dc2-3297-477e-9089-4420d785b5f0.jpg</url><title>Hemant&apos;s DevOps Journey Day 58</title><link>https://90daysofdevops-hemantdhavale-day-58.hashnode.dev</link></image><generator>RSS for Node</generator><lastBuildDate>Wed, 24 Jun 2026 01:23:28 GMT</lastBuildDate><atom:link href="https://90daysofdevops-hemantdhavale-day-58.hashnode.dev/rss.xml" rel="self" type="application/rss+xml"/><language><![CDATA[en]]></language><ttl>60</ttl><item><title><![CDATA[Day 58 – Metrics Server and Horizontal Pod Autoscaler (HPA)]]></title><description><![CDATA[On Day 58 of my 90 Days of DevOps journey, I learned how Kubernetes automatically scales applications based on resource usage using Metrics Server and Horizontal Pod Autoscaler (HPA).
Yesterday I work]]></description><link>https://90daysofdevops-hemantdhavale-day-58.hashnode.dev/day-58-metrics-server-and-horizontal-pod-autoscaler-hpa</link><guid isPermaLink="true">https://90daysofdevops-hemantdhavale-day-58.hashnode.dev/day-58-metrics-server-and-horizontal-pod-autoscaler-hpa</guid><dc:creator><![CDATA[Hemant Dhavale]]></dc:creator><pubDate>Wed, 20 May 2026 03:24:42 GMT</pubDate><content:encoded><![CDATA[<p>On Day 58 of my 90 Days of DevOps journey, I learned how Kubernetes automatically scales applications based on resource usage using Metrics Server and Horizontal Pod Autoscaler (HPA).</p>
<p>Yesterday I worked with resource requests and limits.</p>
<p>Today I learned how Kubernetes monitors CPU and memory usage in near real time and automatically adds or removes Pods when traffic changes.</p>
<p>This felt very close to real production Kubernetes environments.</p>
<h3><strong>Metrics Server</strong></h3>
<p>Today I installed the Kubernetes Metrics Server.</p>
<p>Metrics Server collects resource usage data like:</p>
<p>CPU usage<br />Memory usage</p>
<p>from nodes and Pods inside the cluster.</p>
<p>After enabling Metrics Server, I tested:</p>
<p>kubectl top nodes</p>
<p>and:</p>
<p>kubectl top pods -A</p>
<p>For the first time, I could see real-time CPU and memory usage inside my Kubernetes cluster.</p>
<p>I also learned:</p>
<p>kubectl top shows actual resource usage</p>
<p>while:</p>
<p>kubectl describe pod</p>
<p>shows configured requests and limits.</p>
<p>That difference was important.</p>
<h3><strong>Exploring kubectl top</strong></h3>
<p>Next, I explored different kubectl top commands.</p>
<p>I checked:</p>
<p>Node CPU usage<br />Node memory usage<br />Pod resource usage<br />Pods consuming highest CPU</p>
<p>using:</p>
<p>kubectl top pods -A --sort-by=cpu</p>
<p>This helped me understand how Kubernetes monitors workloads continuously.</p>
<h3><strong>Deployment with CPU Requests</strong></h3>
<p>To scale on CPU utilization, the HPA needs the target Pods to declare CPU requests.</p>
<p>I created a Deployment using the registry.k8s.io/hpa-example image.</p>
<p>Then I added:</p>
<p>resources.requests.cpu: 200m</p>
<p>Without CPU requests, the HPA cannot calculate a utilization percentage, because utilization is measured as a share of the requested CPU.</p>
<p>That is one of the most common mistakes while configuring HPA.</p>
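<p>As a rough sketch of what such a Deployment could look like, the snippet below writes a manifest to a file. Only the image and the 200m CPU request come from the steps above; the name php-apache, the app label, the file name hpa-deployment.yaml, and containerPort 80 are assumptions for illustration.</p>

```shell
# Write a Deployment manifest whose container declares a CPU request.
# "php-apache" and "hpa-deployment.yaml" are illustrative names, not from the post.
cat > hpa-deployment.yaml <<'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: php-apache
spec:
  replicas: 1
  selector:
    matchLabels:
      app: php-apache
  template:
    metadata:
      labels:
        app: php-apache
    spec:
      containers:
        - name: php-apache
          image: registry.k8s.io/hpa-example
          ports:
            - containerPort: 80
          resources:
            requests:
              cpu: 200m   # HPA utilization is measured against this value
EOF
```

<p>The file could then be applied with kubectl apply -f hpa-deployment.yaml.</p>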
<h3><strong>Horizontal Pod Autoscaler (HPA)</strong></h3>
<p>Next, I created an HPA using:</p>
<p>kubectl autoscale</p>
<p>The HPA was configured with:</p>
<p>minimum replicas: 1<br />maximum replicas: 10<br />target CPU utilization: 50%</p>
<p>Initially, the TARGETS column showed:</p>
<p>&lt;unknown&gt;/50%</p>
<p>because Metrics Server needed some time to collect metrics.</p>
<p>After a short wait, Kubernetes started showing actual CPU utilization values.</p>
<p>This was really interesting to watch.</p>
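<p>To make the utilization value concrete, here is a small sketch of the arithmetic behind it: CPU usage expressed as a percentage of the CPU request. The function name and the sample numbers are assumptions for illustration, and a real HPA averages this across all Pods of the target.</p>

```shell
# cpu_utilization_percent USAGE_MILLICORES REQUEST_MILLICORES
# Usage as a percentage of the CPU request (integer division, so it truncates).
cpu_utilization_percent() {
  usage_millicores=$1
  request_millicores=$2
  echo $(( usage_millicores * 100 / request_millicores ))
}

# A Pod using 100m of CPU against a 200m request sits exactly at a 50% target.
cpu_utilization_percent 100 200   # prints 50
```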
<h3><strong>Generating Load and Auto Scaling</strong></h3>
<p>The most exciting part today was testing auto scaling practically.</p>
<p>I created a BusyBox load generator that continuously sent requests to the application.</p>
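<p>A load generator like this might be written as a small Pod manifest. This is only a sketch: it assumes a Service named php-apache exposes the application inside the cluster, and the Pod name, file name, image tag, and sleep interval are illustrative choices.</p>

```shell
# Write a Pod manifest for a BusyBox load generator.
# Assumes a Service named "php-apache" exposes the application (hypothetical name).
cat > load-generator.yaml <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: load-generator
spec:
  restartPolicy: Never
  containers:
    - name: busybox
      image: busybox:1.28
      command:
        - /bin/sh
        - -c
        - "while sleep 0.01; do wget -q -O- http://php-apache; done"
EOF
```

<p>Applying the file with kubectl apply -f load-generator.yaml starts the request loop, and kubectl delete pod load-generator stops it.</p>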
<p>As average CPU utilization rose above the 50% target, Kubernetes automatically increased the number of replicas.</p>
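<p>The replica count the HPA aims for follows the documented formula: desired replicas = ceil(current replicas × current metric ÷ target metric). The sketch below reproduces only that core calculation. The function name and sample numbers are assumptions, and it leaves out the tolerance band, the min/max bounds, and the stabilization that a real HPA also applies.</p>

```shell
# desired_replicas CURRENT_REPLICAS CURRENT_UTILIZATION TARGET_UTILIZATION
# Computes ceil(current_replicas * current_utilization / target_utilization)
# using integer arithmetic: (a + b - 1) / b rounds the division up.
desired_replicas() {
  current_replicas=$1
  current_utilization=$2
  target_utilization=$3
  echo $(( (current_replicas * current_utilization + target_utilization - 1) / target_utilization ))
}

# 1 replica running at 250% of its CPU request, with a 50% target.
desired_replicas 1 250 50   # prints 5
```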
<p>I watched the scaling in real time using:</p>
<p>kubectl get hpa --watch</p>
<p>The Deployment automatically scaled from 1 replica to multiple replicas.</p>
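<p>After I deleted the load generator Pod, Kubernetes gradually scaled the Deployment back down. Scale-down is deliberately slow: by default, the HPA waits through a 5-minute stabilization window before removing replicas.</p>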
<p>That felt like real cloud infrastructure behavior.</p>
<h3><strong>Declarative HPA using YAML</strong></h3>
<p>Finally, I created the HPA using YAML with:</p>
<p>autoscaling/v2</p>
<p>I also learned about the behavior section.</p>
<p>The behavior section controls:</p>
<p>how fast Kubernetes scales up<br />how slowly Kubernetes scales down</p>
<p>This gives more control over scaling behavior in production environments.</p>
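<p>For reference, an autoscaling/v2 manifest with a behavior section could look roughly like the sketch below. The replica bounds and the 50% CPU target match the values used earlier; the names php-apache and hpa-v2.yaml and every number inside behavior are assumptions for illustration, not the exact settings I used.</p>

```shell
# Write an autoscaling/v2 HPA manifest with explicit scaling behavior.
# Names and the behavior values are illustrative assumptions.
cat > hpa-v2.yaml <<'EOF'
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: php-apache
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: php-apache
  minReplicas: 1
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 50
  behavior:
    scaleUp:
      stabilizationWindowSeconds: 0     # react to load immediately
      policies:
        - type: Percent
          value: 100                    # at most double the replicas per period
          periodSeconds: 15
    scaleDown:
      stabilizationWindowSeconds: 300   # wait 5 minutes before removing replicas
      policies:
        - type: Percent
          value: 100
          periodSeconds: 15
EOF
```

<p>The file could then be applied with kubectl apply -f hpa-v2.yaml and inspected with kubectl describe hpa php-apache.</p>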
<p>I also learned the difference between autoscaling/v1 and autoscaling/v2.</p>
<p>v1 supports scaling on CPU utilization only.</p>
<p>v2 supports:</p>
<p>CPU<br />memory<br />custom metrics<br />advanced scaling behavior</p>
<h3><strong>What I Learned Today</strong></h3>
<p>Today I learned:</p>
<p>Metrics Server<br />kubectl top<br />Horizontal Pod Autoscaler<br />CPU-based auto scaling<br />Load testing<br />autoscaling/v1 vs v2<br />Scaling behavior configuration<br />Real-time Kubernetes monitoring</p>
<h3><strong>Final Thoughts</strong></h3>
<p>Today was one of the most interesting Kubernetes learning days so far.</p>
<p>Watching Kubernetes automatically add and remove Pods as the load changed felt very powerful.</p>
<p>This is how modern applications handle changing traffic in production environments.</p>
<p>Understanding HPA is helping me understand how scalable cloud-native applications work.</p>
<h3><strong>See you on Day 59</strong></h3>
]]></content:encoded></item></channel></rss>