IaaS Automated Powersaving, Green Sustainability - Pt.2

Up and running with powersaving

The goal of this article is to show an example implementation of scheduled powering off (and on) of VMs in our own on-premises datacenter, the same way hyperscalers such as Google Cloud (GCP), AWS, and Azure do it for their VMs.

This is the second article, containing a practical walkthrough of one way to do this with VMware technology.

Note: If you missed the discussion of how hyperscalers actually power VMs off and on according to a schedule, and also “the WHY”, then please go ahead and read the article IaaS Automated Powersaving, Green Sustainability - Pt.1.

An example: how you can do it

Since you are aware of the costs of your infrastructure, you are probably using more than one technology from VMware.

In this article I suggest a simple way to implement Power Off and Power On schedules for your datacenter using Aria Orchestrator, Python, and Aria Automation.

Requirements

To follow along you need Aria Automation. Aria Automation contains Aria Orchestrator and Aria Service Broker, both of which we use here. Aria Automation also contains SaltStack Config, but we’re not using that this time.

A little Python knowledge helps but is not necessary, because we provide the scripts you need.

Using the Self Service portal

Requesting a deployment with power save

We simplify the consumption of IT services for users by offering self-service provisioning in a portal. In this Multi-Cloud Management service catalog, we as an IT team have predefined our service offering. Here I have a Catalog Item called Save Power that will deploy virtual machines and tag them with the tag “powersave”.

Request

When we click Request, we are presented with a request form where we can change Power Save and VM Size, both of which will affect our savings:

By clicking the information icons on those two options, we can get more information about the different options.

Note that we’ve chosen to keep it simple, with an enforced power off at 18:00 and power on at 06:00. This could of course be made customizable, but there are multiple reasons to keep it simple.

Powersave mode

Below is the explanation of the impact of the Power Save mode:

image-20230424102531354

The deployment Size

Below is the explanation of the sizes of the servers we’re about to deploy:

image-20230424102718775

Price

Since our automation system (Aria Automation) has the ability to use price cards, or to be connected to the operations system (Aria Operations) with price cards, we can also calculate the monthly price for the different options, e.g. when we choose an X-large or a small server.

Slack Notification

The end result is that every morning the servers tagged with powersave = true are powered on, and every evening they are powered off. A Slack notification is sent each time:

Slack notifications

See further down for an explanation of the Slack portion of the Python script that makes this happen.

**Behind the Scenes**

Aria Automation Cloud Template

The blueprint, a.k.a. Cloud Template: Behind the self-service choice there is a simple Cloud Template, in other words Infrastructure as Code (IaC) written in a declarative language (YAML) that defines the desired state of our cloud infrastructure.

You can find a copy of the template's IaC YAML code on GitHub here

The tagging: The main thing about the Cloud Template and the VM you are about to create is that it has a specific tag. The tag is created with this code snippet within the Cloud Template:

  tags:
    - key: os
      value: windows
    - key: powersave
      value: ${input.powersave}

vSphere Tags: The tagging is also reflected on the VM in vSphere, as seen in vCenter

The Orchestrator workflow and Schedule

To make sure that machines tagged with powersave = true are powered off and on as scheduled, we have created two scheduled workflows: one at 06:00:00 in the morning that powers on VMs, and one at 18:00:00 (6 pm) that powers off VMs. Both scheduled workflows call the workflow named “bgro-powersave-schedule”.

Here is an example of the scheduled task

Python code / Orchestrator Workflow

Get the code

Get your copy of the code from this GitHub page

We use just a single workflow called “bgro-powersave-schedule”. In that workflow, we have one scriptable task with a Python script that contains most of the intelligence. It finds all deployments with the power save tag set to true, then powers them on or off accordingly.
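The flow described above (find the tagged resources, then apply one power action to all of them) can be sketched roughly as follows. This is only an illustration, not the script from the repository: `run_powersave` and its parameters are hypothetical names, and the lookup and power functions are passed in as callables so the flow itself stays easy to read and try out.

```python
from typing import Callable, Iterable, List


def run_powersave(
    action: str,
    find_tagged: Callable[[], Iterable[str]],
    power_on: Callable[[List[str]], None],
    power_off: Callable[[List[str]], None],
) -> List[str]:
    """Find the tagged resources and apply one power action to all of them.

    All names here are hypothetical; the real script wires these steps
    together in its own way.
    """
    if action not in ("on", "off"):
        raise ValueError(f"Unknown action: {action!r}")

    resource_ids = list(find_tagged())
    if not resource_ids:
        # Nothing is tagged with powersave = true, so there is nothing to do.
        return []

    if action == "on":
        power_on(resource_ids)
    else:
        power_off(resource_ids)
    return resource_ids
```

In the real workflow, the callables would wrap functions such as get_resource_ids_with_powersave_tag, power_on_resources, and power_off_resources, which are described further down.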

Orchestrator Scriptable Task

The Python script behind the scriptable task in the workflow “bgro-powersave-schedule” in Aria Orchestrator manages power for the VMs by powering them on or off based on a set time window.
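How the script maps the current time to a power action is not shown in this article, so the following is only a minimal sketch of one way to do it, assuming the 06:00 to 18:00 window described earlier. The names `POWER_ON_AT`, `POWER_OFF_AT`, and `desired_action` are hypothetical, and the time passed in is assumed to already be in the datacenter's local time zone.

```python
from datetime import time

# Window boundaries taken from the schedule described in this article.
POWER_ON_AT = time(6, 0)    # 06:00
POWER_OFF_AT = time(18, 0)  # 18:00


def desired_action(now: time) -> str:
    """Return "on" inside the 06:00-18:00 window and "off" outside it."""
    return "on" if POWER_ON_AT <= now < POWER_OFF_AT else "off"
```

With a check like this, both scheduled runs can call the same workflow: the 06:00 run falls inside the window and powers on, while the 18:00 run falls outside it and powers off.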

Python

The Python script uses the Aria Automation (vRA) API to control the machines and also has a Slack web-hook to send notifications when machines are powered off or on.
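The authentication step is not reproduced in this article. As a rough sketch, and assuming the two-step login commonly used with on-premises vRA 8.x (username and password are exchanged for a refresh token, which is then exchanged for a bearer token), it might look like the code below. The function names and the input keys `vra_username` and `vra_password` are hypothetical; only `vra_url` appears in the script shown here. Check the endpoints against the API documentation for your version.

```python
from typing import Any, Callable, Dict

PostJson = Callable[[str, Dict[str, Any]], Dict[str, Any]]


def post_json_with_requests(url: str, body: Dict[str, Any]) -> Dict[str, Any]:
    """POST a JSON body and return the decoded JSON response."""
    import requests  # imported here so the token logic below runs without it

    resp = requests.post(url, json=body, timeout=30)
    resp.raise_for_status()
    return resp.json()


def vraauth_sketch(
    inputs: Dict[str, str],
    post_json: PostJson = post_json_with_requests,
) -> str:
    """Return a bearer token (assumed two-step vRA 8.x login flow)."""
    base = inputs["vra_url"].rstrip("/")

    # Step 1 (assumption): exchange username/password for a refresh token.
    login = post_json(
        f"{base}/csp/gateway/am/api/login?access_token",
        {
            "username": inputs["vra_username"],  # hypothetical input key
            "password": inputs["vra_password"],  # hypothetical input key
        },
    )

    # Step 2 (assumption): exchange the refresh token for a bearer token.
    token = post_json(
        f"{base}/iaas/api/login",
        {"refreshToken": login["refresh_token"]},
    )
    return token["token"]
```

Passing the HTTP call in as `post_json` keeps the token logic separate from the transport, which makes it possible to exercise the flow without a live vRA instance.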

Functions

The script has these functions:

  1. power_off_resources(resource_ids, inputs, bearer_token): Powers off resources given their IDs.
  2. power_on_resources(resource_ids, inputs, bearer_token): Powers on resources given their IDs.
  3. get_resource_ids_with_powersave_tag(bearer_token, inputs): Retrieves resource IDs with the "powersave" tag.
  4. vraauth(inputs): Authenticates with the vRA API (returns a bearer token)
  5. send_to_slack(message, inputs): Sends message to Slack.
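The tag lookup in item 3 is not shown in this article. The sketch below illustrates only the filtering idea, on data that has already been fetched. It assumes each resource is a dictionary with an `id` and a list of `key`/`value` tags under `properties.tags`; that shape is an assumption made for illustration, and the function name `ids_with_powersave_tag` is hypothetical.

```python
from typing import Any, Dict, Iterable, List


def ids_with_powersave_tag(resources: Iterable[Dict[str, Any]]) -> List[str]:
    """Return the IDs of resources tagged powersave = true.

    Assumes (for illustration only) that tags live under
    resource["properties"]["tags"] as a list of {"key": ..., "value": ...}.
    """
    matches = []
    for resource in resources:
        tags = resource.get("properties", {}).get("tags") or []
        for tag in tags:
            is_powersave = tag.get("key") == "powersave"
            # Accept both the boolean True and the string "true".
            is_true = str(tag.get("value")).lower() == "true"
            if is_powersave and is_true:
                matches.append(resource["id"])
                break  # one matching tag is enough for this resource
    return matches
```

Comparing the value as a lowercased string is a deliberate choice here, since a template input can arrive either as a boolean or as text.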

The intelligence is of course in the power_off_resources and power_on_resources functions, which loop through the provided resource IDs and power them off or on using the vRA API.

Function to power on resources

    import requests

    # Function to power on resources
    def power_on_resources(resource_ids, inputs, bearer_token):
        # vRA API URL
        url = inputs["vra_url"]
        # vRA API headers with bearer token
        vraheaders = {
            "accept": "application/json",
            "content-type": "application/json",
            "Authorization": "Bearer " + bearer_token
        }
        # Loop through each resource ID and power it on
        with requests.Session() as session:
            for resource_id in resource_ids:
                # vRA API payload to power on the resource
                payload = {
                    "actionId": "Cloud.vSphere.Machine.PowerOn",
                    "inputs": {},
                    "reason": "Power On"
                }
                # Send the power on request to vRA using the requests library.
                # Note: verify=False disables TLS certificate verification.
                resp = session.post(f"{url}/deployment/api/resources/{resource_id}/requests", headers=vraheaders, json=payload, verify=False)
                try:
                    # Raise an HTTPError if the response status code is 4xx or 5xx
                    resp.raise_for_status()
                    # Send a message to Slack to inform that the resource is being powered on
                    send_to_slack(f"POWERSAVE: Power on successfully called for resource ID: {resource_id}", inputs)
                except requests.exceptions.HTTPError as err:
                    # If the status code is 400, log the error and continue to the next resource
                    if err.response.status_code == 400:
                        print(f"Power on failed for resource ID {resource_id}: {err}. Is it already powered on?")
                    else:
                        # If the status code is not 400, raise the error
                        raise

Function to power off resources

    import requests

    # Function to power off resources
    def power_off_resources(resource_ids, inputs, bearer_token):
        # vRA API URL
        url = inputs["vra_url"]
        # vRA API headers with bearer token
        vraheaders = {
            "accept": "application/json",
            "content-type": "application/json",
            "Authorization": "Bearer " + bearer_token
        }
        # Loop through each resource ID and power it off
        with requests.Session() as session:
            for resource_id in resource_ids:
                # vRA API payload to power off (shut down) the resource
                payload = {
                    "actionId": "Cloud.vSphere.Machine.Shutdown",
                    "inputs": {},
                    "reason": "Power Off"
                }
                # Send the power off request to vRA using the requests library.
                # Note: verify=False disables TLS certificate verification.
                resp = session.post(f"{url}/deployment/api/resources/{resource_id}/requests", headers=vraheaders, json=payload, verify=False)
                try:
                    # Raise an HTTPError if the response status code is 4xx or 5xx
                    resp.raise_for_status()
                    # Send a message to Slack to inform that the resource is being powered off
                    send_to_slack(f"POWERSAVE: Power off successfully called for resource ID: {resource_id}", inputs)
                except requests.exceptions.HTTPError as err:
                    # If the status code is 400, log the error and continue to the next resource
                    if err.response.status_code == 400:
                        print(f"Power off failed for resource ID {resource_id}: {err}. Is it already powered off?")
                    else:
                        # If the status code is not 400, raise the error
                        raise
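The two functions above differ only in the `actionId` and the `reason` they send. As a possible simplification, not something the original script does, the request URL and payload could be built by one small helper. The names `ACTIONS` and `build_power_request` are hypothetical; the action IDs, reasons, and URL path are taken from the code above.

```python
from typing import Any, Dict, Tuple

# Action IDs and reasons copied from the two functions shown in this article.
ACTIONS = {
    "on": ("Cloud.vSphere.Machine.PowerOn", "Power On"),
    "off": ("Cloud.vSphere.Machine.Shutdown", "Power Off"),
}


def build_power_request(
    vra_url: str, resource_id: str, action: str
) -> Tuple[str, Dict[str, Any]]:
    """Return the request URL and JSON payload for one power action."""
    action_id, reason = ACTIONS[action]  # raises KeyError for unknown actions
    url = f"{vra_url}/deployment/api/resources/{resource_id}/requests"
    payload = {"actionId": action_id, "inputs": {}, "reason": reason}
    return url, payload
```

A single loop could then post the result for each resource ID, with the error handling written once instead of twice.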

The send_to_slack function

    import json

    import requests

    # Function to send a message to a Slack channel
    def send_to_slack(message, inputs):
        # Slack webhook URL
        webhook_url = inputs["slack_webhook_url"]

        # Slack message payload
        payload = {
            "text": message
        }

        # Send the message to Slack using the requests library
        response = requests.post(
            webhook_url, data=json.dumps(payload),
            headers={'Content-Type': 'application/json'}
        )

        # Raise an error if the response status code is not 200 OK
        if response.status_code != 200:
            raise ValueError(
                f'Request to Slack returned an error {response.status_code}, the response is:\n{response.text}'
            )
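The script sends one Slack message per resource. If that gets noisy with many VMs, one possible variation is to collect the results and send a single summary per run. The sketch below only builds the payload, in the same `{"text": ...}` form used above; `build_summary_payload` is a hypothetical name and is not part of the original script.

```python
from typing import Dict, List


def build_summary_payload(
    action: str, succeeded: List[str], failed: List[str]
) -> Dict[str, str]:
    """Build one Slack payload summarising a whole power on/off run."""
    lines = [
        f"POWERSAVE: power {action} requested for {len(succeeded)} resource(s)"
    ]
    if succeeded:
        lines.append("OK: " + ", ".join(succeeded))
    if failed:
        lines.append("Failed: " + ", ".join(failed))
    return {"text": "\n".join(lines)}
```

The resulting dictionary can be posted to the webhook the same way send_to_slack posts its payload.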

Conclusion

If you download everything needed from the Git repository, as mentioned above under Get the code, the rest is fairly well documented within the code itself. Pay attention to what it does.