Guided Exercise: Application Autoscaling

Configure an autoscaler for an application and then load test that application to observe scaling up.

Outcomes

You should be able to manually scale up a deployment, configure a horizontal pod autoscaler resource, and monitor the autoscaler.

As the student user on the workstation machine, use the lab command to prepare your system for this exercise.

This command ensures that all resources are available for this exercise. It also creates the reliability-autoscaling project.

[student@workstation ~]$ lab start reliability-autoscaling

Procedure 6.5. Instructions

  1. Log in to the OpenShift cluster as the developer user with the developer password. Use the reliability-autoscaling project.

    1. Log in to the OpenShift cluster.

       [student@workstation ~]$ oc login -u developer -p developer \
         https://api.ocp4.example.com:6443
       Login successful.
       ...output omitted...
      
    2. Set the reliability-autoscaling project as the active project.

       [student@workstation ~]$ oc project reliability-autoscaling
       ...output omitted...
      
  2. Create the loadtest deployment, service, and route. The deployment uses the registry.ocp4.example.com:8443/redhattraining/loadtest:v1.0 container image that provides a web application. The web application exposes an API endpoint that creates a CPU-intensive task when queried.

    1. Review the ~/DO180/labs/reliability-autoscaling/loadtest.yml resource file that the lab command prepared. The container specification does not include the resources section that you use to specify CPU requests and limits. You configure that section in another step. Do not change the file for now.

       apiVersion: v1
       kind: List
       metadata: {}
       items:
         - apiVersion: apps/v1
           kind: Deployment
       ...output omitted...
               spec:
                 containers:
                 - image: registry.ocp4.example.com:8443/redhattraining/loadtest:v1.0
                   name: loadtest
                   readinessProbe:
                     failureThreshold: 3
                     httpGet:
                       path: /api/loadtest/v1/healthz
                       port: 8080
                       scheme: HTTP
                     periodSeconds: 10
                     successThreshold: 1
                     timeoutSeconds: 1
      
         - apiVersion: v1
           kind: Service
       ...output omitted...
      
         - apiVersion: route.openshift.io/v1
           kind: Route
       ...output omitted...
      
    2. Use the oc apply command to create the application. Ignore the warning message.

       [student@workstation ~]$ oc apply -f \
       ~/DO180/labs/reliability-autoscaling/loadtest.yml
       Warning: would violate PodSecurity "restricted:v1.24":
       ...output omitted...
       deployment.apps/loadtest created
       service/loadtest created
       route.route.openshift.io/loadtest created
      
    3. Wait for the pod to start. You might have to rerun the command several times for the pod to report a Running status. The name of the pod on your system probably differs.

       [student@workstation ~]$ oc get pods
       NAME                       READY   STATUS    RESTARTS   AGE
       loadtest-65c55b7dc-r4s4s   1/1     Running   0          49s
      
  3. Configure a horizontal pod autoscaler resource for the loadtest deployment. Set the minimum number of replicas to 2 and the maximum to 20. Set the average CPU usage to 50% of the CPU requests attribute.

    The horizontal pod autoscaler does not work, because the loadtest deployment does not specify CPU requests.

    1. Use the oc autoscale command to create the horizontal pod autoscaler resource.

       [student@workstation ~]$ oc autoscale deployment/loadtest --min 2 --max 20 \
       --cpu-percent 50
       horizontalpodautoscaler.autoscaling/loadtest autoscaled
      
    2. Retrieve the status of the loadtest horizontal pod autoscaler resource. The <unknown> value in the TARGETS column indicates that OpenShift cannot compute the current CPU usage of the loadtest deployment. The deployment must include the CPU requests attribute for OpenShift to be able to compute the CPU usage.

       [student@workstation ~]$ oc get hpa loadtest
       NAME      REFERENCE            TARGETS         MINPODS  MAXPODS   REPLICAS   AGE
       loadtest  Deployment/loadtest  <unknown>/50%   2        20        2          74s
      
    3. Get more details about the resource status. You might have to rerun the command several times. It can take up to three minutes for the command to report the warning message.

       [student@workstation ~]$ oc describe hpa loadtest
       Warning: autoscaling/v2beta2 HorizontalPodAutoscaler is deprecated in v1.23+, unavailable in v1.26+; use autoscaling/v2 HorizontalPodAutoscaler
       Name:                                                  loadtest
       Namespace:                                             reliability-autoscaling
       ...output omitted...
       Conditions:
         Type           Status  Reason                   Message
         ----           ------  ------                   -------
         AbleToScale    True    SucceededGetScale        the HPA controller was able to get the target's current scale
         ScalingActive  False   FailedGetResourceMetric  the HPA was unable to compute the replica count: failed to get cpu utilization: missing request for cpu
       Events:
         Type     ... Message
         ----     ... -------
       ...output omitted...
         Warning  ... failed to get cpu utilization: missing request for cpu
       ...output omitted...
      
    4. Delete the horizontal pod autoscaler resource. You re-create the resource in another step, after you fix the loadtest deployment.

       [student@workstation ~]$ oc delete hpa loadtest
       horizontalpodautoscaler.autoscaling "loadtest" deleted
      
    5. Delete the loadtest application.

       [student@workstation ~]$ oc delete all -l app=loadtest
       pod "loadtest-65c55b7dc-r4s4s" deleted
       pod "loadtest-65c55b7dc-mgqr6" deleted
       service "loadtest" deleted
       deployment.apps "loadtest" deleted
       replicaset.apps "loadtest-65c55b7dc" deleted
       route.route.openshift.io "loadtest" deleted
      
  4. Add a CPU resource section to the ~/DO180/labs/reliability-autoscaling/loadtest.yml file. Redeploy the application from the file.

    1. Edit the ~/DO180/labs/reliability-autoscaling/loadtest.yml file, and configure the CPU limits and requests for the loadtest deployment. The pods need 25 millicores to operate, and must not consume more than 100 millicores.

      You can compare your work with the completed ~/DO180/solutions/reliability-autoscaling/loadtest.yml file that the lab command prepared.

       ...output omitted...
               spec:
                 containers:
                 - image: registry.ocp4.example.com:8443/redhattraining/loadtest:v1.0
                   name: loadtest
                   readinessProbe:
                     failureThreshold: 3
                     httpGet:
                       path: /api/loadtest/v1/healthz
                       port: 8080
                       scheme: HTTP
                     periodSeconds: 10
                     successThreshold: 1
                     timeoutSeconds: 1
                   resources:
                     requests:
                       cpu: 25m
                     limits:
                       cpu: 100m
       ...output omitted...
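As a side note, the autoscaler's utilization target is expressed as a percentage of this CPU request. With requests.cpu: 25m and the 50% target that you configure later in this exercise, scaling triggers when the average usage per pod exceeds about 12.5 millicores. A quick sketch of that arithmetic (the variable names are illustrative only):

```shell
# CPU utilization targets are percentages of the container's CPU request.
request_mcores=25   # requests.cpu from the deployment (25m)
target_percent=50   # the --cpu-percent value used later in this exercise
awk -v r="$request_mcores" -v t="$target_percent" \
    'BEGIN { printf "scale-up threshold: %.1f millicores per pod\n", r * t / 100 }'
```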
      
    2. Use the oc apply command to deploy the application from the file. Ignore the warning message.

       [student@workstation ~]$ oc apply -f \
       ~/DO180/labs/reliability-autoscaling/loadtest.yml
       Warning: would violate PodSecurity "restricted:v1.24":
       ...output omitted...
       deployment.apps/loadtest created
       service/loadtest created
       route.route.openshift.io/loadtest created
      
    3. Wait for the pod to start. You might have to rerun the command several times for the pod to report a Running status. The name of the pod on your system probably differs.

       [student@workstation ~]$ oc get pods
       NAME                        READY   STATUS    RESTARTS   AGE
       loadtest-667bdcdc99-vhc9x   1/1     Running   0          36s
      
  5. Manually scale the loadtest deployment by first increasing and then decreasing the number of running pods.

    1. Scale up the loadtest deployment to five pods.

       [student@workstation ~]$ oc scale deployment/loadtest --replicas 5
       deployment.apps/loadtest scaled
      
    2. Confirm that all five application pods are running. You might have to rerun the command several times for all the pods to report a Running status. The names of the pods on your system probably differ.

       [student@workstation ~]$ oc get pods
       NAME                        READY   STATUS    RESTARTS   AGE
       loadtest-667bdcdc99-5fcvh   1/1     Running   0          43s
       loadtest-667bdcdc99-dpspr   1/1     Running   0          42s
       loadtest-667bdcdc99-hkssk   1/1     Running   0          43s
       loadtest-667bdcdc99-vhc9x   1/1     Running   0          8m11s
       loadtest-667bdcdc99-z5n9q   1/1     Running   0          43s
      
    3. Scale down the loadtest deployment back to one pod.

       [student@workstation ~]$ oc scale deployment/loadtest --replicas 1
       deployment.apps/loadtest scaled
      
    4. Confirm that only one application pod is running. You might have to rerun the command several times for the pods to terminate.

       [student@workstation ~]$ oc get pods
       NAME                        READY   STATUS    RESTARTS   AGE
       loadtest-667bdcdc99-vhc9x   1/1     Running   0          11m
      
  6. Configure a horizontal pod autoscaler resource for the loadtest deployment. Set the minimum number of replicas to 2 and the maximum to 20. Set the average CPU usage to 50% of the CPU request attribute.

    1. Use the oc autoscale command to create the horizontal pod autoscaler resource.

       [student@workstation ~]$ oc autoscale deployment/loadtest --min 2 --max 20 \
       --cpu-percent 50
       horizontalpodautoscaler.autoscaling/loadtest autoscaled
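The oc autoscale command is shorthand for creating a horizontal pod autoscaler resource. A declarative equivalent, shown here as a sketch using the autoscaling/v2 API with the same values as the command above, would look like this (not required for the exercise):

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: loadtest
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: loadtest
  minReplicas: 2
  maxReplicas: 20
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 50
```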
      
    2. Open a new terminal window and run the watch command to monitor the oc get hpa loadtest command. Wait five minutes for the loadtest horizontal pod autoscaler to report usage in the TARGETS column.

      Notice that the horizontal pod autoscaler scales up the deployment to two replicas, to conform with the minimum number of pods that you configured.

       [student@workstation ~]$ watch oc get hpa loadtest
       Every 2.0s: oc get hpa loadtest            workstation: Fri Mar  3 06:26:24 2023
      
       NAME       REFERENCE             TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
       loadtest   Deployment/loadtest   0%/50%    2         20        2          52s
      

      Leave the command running, and do not interrupt it.

  7. Increase the CPU usage by sending requests to the loadtest application API.

    1. Use the oc get route command to retrieve the URL of the application.

       [student@workstation ~]$ oc get route loadtest
       NAME       HOST/PORT                                                ...
       loadtest   loadtest-reliability-autoscaling.apps.ocp4.example.com   ...
      
    2. Send a request to the application API to simulate additional CPU pressure on the container. Do not wait for the curl command to complete, and continue with the exercise. After a minute, the command reports a timeout error that you can ignore.

       [student@workstation ~]$ curl \
       loadtest-reliability-autoscaling.apps.ocp4.example.com/api/loadtest/v1/cpu/1
       <html><body><h1>504 Gateway Time-out</h1>
       The server didn't respond in time.
       </body></html>
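If a single request does not generate enough CPU load, a short loop such as this sketch can send several requests in the background. The URL is the one that you retrieved from the route; each backgrounded curl command eventually reports a timeout error that you can ignore:

```shell
# Sketch: send five concurrent requests to the CPU-intensive endpoint.
URL="loadtest-reliability-autoscaling.apps.ocp4.example.com/api/loadtest/v1/cpu/1"
for i in 1 2 3 4 5; do
    curl -s "$URL" > /dev/null &   # run each request in the background
done
wait   # wait for all background requests to finish
```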
      
    3. Watch the output of the oc get hpa loadtest command in the second terminal. After a minute, the horizontal pod autoscaler detects an increase in the CPU usage and deploys additional pods.

      NOTE

      The increased activity of the application does not immediately trigger the autoscaler. Wait a few moments if you do not see any changes to the number of replicas.

      You might need to run the curl command multiple times before the application uses enough CPU to trigger the autoscaler.

      The CPU usage and the number of replicas on your system probably differ.

       Every 2.0s: oc get hpa loadtest            workstation: Fri Mar  3 07:20:19 2023
      
       NAME      REFERENCE             TARGETS    MINPODS   MAXPODS   REPLICAS   AGE
       loadtest  Deployment/loadtest   220%/50%   2         20        9          16m
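The replica count in this output follows the documented horizontal pod autoscaler formula: desiredReplicas = ceil(currentReplicas × currentUsage / targetUsage). A quick check against the values shown above (your numbers probably differ):

```shell
# HPA scaling decision: desiredReplicas = ceil(currentReplicas * usage / target)
current_replicas=2     # replicas before the load increase
current_usage=220      # observed average CPU usage (percent of the request)
target_usage=50        # configured target (percent of the request)
# Integer ceiling division: ceil(a / b) computed as (a + b - 1) / b
desired=$(( (current_replicas * current_usage + target_usage - 1) / target_usage ))
echo "desired replicas: $desired"   # 9, matching the REPLICAS column
```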
      
    4. Wait five minutes after the curl command completes. The oc get hpa loadtest command shows that the CPU load decreases.

      NOTE

      Although the horizontal pod autoscaler resource can be quick to scale up, it is slower to scale down.

       Every 2.0s: oc get hpa loadtest            workstation: Fri Mar  3 07:23:11 2023
      
       NAME      REFERENCE             TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
       loadtest  Deployment/loadtest   0%/50%    2         20        9          18m
      
    5. Optional: Wait for the loadtest application to scale down. It takes five additional minutes for the horizontal pod autoscaler to scale down to two replicas.

       Every 2.0s: oc get hpa loadtest            workstation: Fri Mar  3 07:29:12 2023
      
       NAME      REFERENCE             TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
       loadtest  Deployment/loadtest   0%/50%    2         20        2          24m
      
    6. Press Ctrl+C to quit the watch command. Close that second terminal when done.

Finish

On the workstation machine, use the lab command to complete this exercise. This step is important to ensure that resources from previous exercises do not impact upcoming exercises.

[student@workstation ~]$ lab finish reliability-autoscaling