What We Do
In this text, we try to describe what we mean when we talk about Green Lean. The goal is to align with the sustainability movement without sacrificing developer productivity. We must strive not only to develop the most energy-efficient code and the greenest systems but also to guide our end users away from the pitfalls of generating unnecessary emissions. There are practical things we can do, but ultimately, it comes down to raising awareness about what matters.
The renewable energy revolution will solve many problems, but we're not there yet. The aim is to meet somewhere in the middle. Stay vigilant!
Green code is not just about code
Naming is hard. The first thing that comes to mind is that sustainability is achieved through efficient code. Code efficiency is one tool in the toolbox, but it shouldn't be the first one most engineers reach for. Why? Rewriting your code is a huge task and can lead to less readable code, thus undermining developer productivity. If you still decide to rewrite, only rewrite the parts with the highest ROI (return on investment) and call those parts from your existing code.
And what about the chosen language? Does it matter? According to a recent study, the programming language itself does not directly determine energy efficiency. Instead, the key factors are how many processing cores are active and for how long. Reducing the number of active cores and the time they stay busy reduces energy consumption.
However, efficient application implementations do influence how many cores are used and for how long. Reusing libraries or modules, particularly those written in low-level, performance-oriented languages like C, C++, or Rust, can significantly enhance energy efficiency in any project.
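As a minimal sketch of that idea, the snippet below compares a pure-Python loop with the same calculation delegated to NumPy, whose inner loops run in compiled C. The function names and the mean-of-squares calculation are only illustrative; the point is that the hot path moves out of the interpreter without rewriting the surrounding application.

```python
import numpy as np

def mean_square_py(values):
    # Pure-Python loop: every element passes through the interpreter.
    total = 0.0
    for v in values:
        total += v * v
    return total / len(values)

def mean_square_np(values):
    # Same result, but the arithmetic runs inside NumPy's compiled C routines,
    # which typically use far fewer CPU cycles for large inputs.
    arr = np.asarray(values, dtype=np.float64)
    return float(np.mean(arr * arr))

data = list(range(1_000_000))
print(mean_square_py(data), mean_square_np(data))  # nearly identical results, very different CPU cost
```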
A few tips for efficiency:
Avoid computation as much as possible and try to reuse computation results (see the caching sketch after this list).
Execute code more on end-user devices like smartphones, which are usually more energy efficient than, for example, always-on servers.
Execute the code only when there is plenty of renewable energy available on a windy or sunny day.
Serve results appropriate to the situation. For example, serve lower resolution images and videos to devices that can’t take full advantage of the high resolution.
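To make the first tip concrete, here is a minimal caching sketch. The daily_summary function and its workload are hypothetical stand-ins for any expensive, repeatable computation; functools.lru_cache simply ensures the work is done once per distinct input.

```python
from functools import lru_cache

@lru_cache(maxsize=128)
def daily_summary(day: str) -> float:
    # Hypothetical stand-in for an expensive aggregation over one day's data.
    return sum((hash((day, i)) % 100) for i in range(1_000_000)) / 1_000_000

daily_summary("2024-06-01")  # computed once
daily_summary("2024-06-01")  # served from the cache, no CPU burned
```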
This next point edges from code towards hardware efficiency, but it needs to be said. Since the embodied carbon of devices plays a considerable part, try to ensure that applications stay backward compatible. This way, we won't need to throw away our otherwise working devices.
Yes, “green code” can mean improvements we make at the code level, but the low-hanging fruit is elsewhere. So, what should we generally do at this level? Design for performance. Performance often serves as a reliable proxy for efficiency. Just keep in mind that they’re not the same thing: the context matters. What we want is to use as few machines as possible, at the highest utilization possible. Let’s dive into infrastructure resource management Tetris next.
As shown in the images, there’s always some power draw, even when utilization is at zero. This is why it’s more efficient to focus processing on fewer computers. By doing so, we save energy and help the machines last longer by reducing unnecessary wear and tear.
Container orchestration: automate the Tetris
When speaking of the energy consumption of IT infrastructure, servers, and hardware, one of the key things is to maximize utilization. Container orchestration platforms such as Kubernetes are a great option for this, as one of the main points of containers is being able to package and deploy smaller, lighter-weight units. There’s less overhead when compared to deploying virtual machines: containers share the host’s kernel, while each virtual machine runs its own guest operating system on top of a hypervisor that divides the host’s memory and processors between the guests.
It is not only the energy consumption aspect that is at play. When using a hosted container platform such as Google’s GKE or AKS in Azure, or even when running self-hosted on top of virtual machines in a data center or on bare metal, what you are actually paying for is the underlying capacity. In the cloud, there are other options, for example, Google’s GKE Autopilot mode where you pay for just your workload’s resource consumption. But for the most part, when thinking of sizing and scaling together with cost, you need to think of the worker nodes and control plane rather than the containers.
Kubernetes is also a fairly complicated piece of machinery. It has many different knobs. Some of these need to be tweaked and turned to fit the use case. Some of those knobs are related to deployment and answer the questions: how big? How many? Where? These then have a direct effect on the utilization.
Of special importance are container resource requests, since the scheduler uses them to place pods on the worker nodes. You can think of pods as Tetris blocks, and you want to fit as many as possible onto a node. Smaller blocks are generally easier to stack together than larger blocks. For instance, if the next container to be scheduled is very large in terms of resource requests, you may need to scale out the worker nodes, incurring extra cost and using more electricity, rather than fitting it onto the currently available nodes.
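The arithmetic of that Tetris game can be sketched in a few lines. This is a deliberately simplified model (the real scheduler also accounts for system daemons, taints, affinities, and other constraints), but it shows how the size of resource requests decides how many pods fit on a node before new nodes, and new watts, are needed.

```python
def pods_per_node(node_cpu_m: int, node_mem_mi: int,
                  req_cpu_m: int, req_mem_mi: int) -> int:
    # A pod only fits where both its CPU and memory requests fit,
    # so the tighter dimension limits the packing.
    return min(node_cpu_m // req_cpu_m, node_mem_mi // req_mem_mi)

# Illustrative node: 4000m CPU and 16Gi of allocatable memory.
print(pods_per_node(4000, 16384, 250, 512))    # 16 small blocks per node
print(pods_per_node(4000, 16384, 1500, 6144))  # 2 large blocks, capacity stranded
```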
Through autoscaling, you can match your workloads to demand by setting the parameters properly for the number of replicas as well as the resource utilization thresholds. Scaling is important for maximizing the usage of resources. In more recent Kubernetes versions, you can also scale containers vertically to set the resource requests dynamically.
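As a rough sketch of how horizontal scaling chases a utilization target, the function below mirrors the proportional rule the Kubernetes HorizontalPodAutoscaler documents for resource metrics, ignoring its tolerance band and stabilization windows. The numbers are illustrative.

```python
import math

def desired_replicas(current_replicas: int,
                     current_utilization: float,
                     target_utilization: float) -> int:
    # Scale the replica count in proportion to how far the observed
    # utilization is from the configured target.
    return math.ceil(current_replicas * current_utilization / target_utilization)

print(desired_replicas(4, 90, 70))  # under pressure: grow to 6 replicas
print(desired_replicas(4, 20, 70))  # mostly idle: shrink to 2 replicas
```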
In addition to scaling based on resource usage, you can also scale containers based on a fixed schedule or triggered events. You can use these patterns to run workloads during a time when electricity is cleaner or in regions where less carbon dioxide is emitted. In Kubernetes, this is achievable with the help of the Carbon Aware KEDA (Kubernetes Event-Driven Autoscaling) Operator. As you can imagine, this then has a very direct impact on your workload’s carbon intensity. But even before taking KEDA or other advanced methods into use, you can concentrate on maximizing the utilization of your workloads with the means built into Kubernetes.
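Below is a minimal sketch of the same idea outside any operator: defer a flexible batch job until grid carbon intensity drops below a threshold, with a deadline so the work still happens on dirty days. The threshold, the polling interval, and the current_carbon_intensity stub are all assumptions; in practice you would query a real grid-intensity source, which is what the Carbon Aware KEDA Operator automates for you.

```python
import time

CARBON_THRESHOLD = 200  # assumed gCO2eq/kWh threshold; tune per region

def current_carbon_intensity() -> float:
    # Stub: replace with a call to a real grid carbon-intensity data source.
    return 180.0

def run_when_clean(job, max_wait_s: int = 6 * 3600, poll_s: int = 900):
    # Wait for a cleaner grid, but never beyond max_wait_s.
    waited = 0
    while current_carbon_intensity() > CARBON_THRESHOLD and waited < max_wait_s:
        time.sleep(poll_s)
        waited += poll_s
    job()

run_when_clean(lambda: print("running the nightly batch"))
```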
All that said, each software instance and each container orchestration platform carries its own share of overhead. If your goal is to reduce energy consumption, make sure to consider how much extra processing power you need for running clusters and splitting services into microservices.
Serverless: outsource the Tetris
Multiple cloud vendors provide serverless services, which are part of the cloud evolution, just like IaaS and PaaS. Here, the optimal handling of hardware is outsourced to cloud vendors. In theory, this should reduce energy consumption since there is an incentive to improve these services when the user base grows. Spoiler: the incentive is probably money, but it does not matter as long as progress is made.
If the use case allows it, serverless can be an excellent choice from multiple perspectives. Firstly, it reduces operational time and cost. Both can lead to improved developer productivity, a benefit already mentioned above.
Secondly, you truly pay for what you use: when there is no processing in, for example, AWS Lambda, there is no expense. Just remember to clean up your CloudWatch logs (if you even need to generate them in the first place), as they can quietly accumulate into large bills over time!
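One low-effort guardrail, sketched below with boto3, is to cap retention on log groups that would otherwise keep data forever. The 30-day retention is an assumed policy, not a recommendation; pick whatever your audit requirements allow.

```python
import boto3

logs = boto3.client("logs")

# Find CloudWatch log groups set to "Never expire" and give them a finite retention.
paginator = logs.get_paginator("describe_log_groups")
for page in paginator.paginate():
    for group in page["logGroups"]:
        if "retentionInDays" not in group:  # no retention policy means keep forever
            logs.put_retention_policy(
                logGroupName=group["logGroupName"],
                retentionInDays=30,  # assumed policy; adjust to your needs
            )
```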
Thirdly, serverless architectures naturally reward efficient coding and resource management since you pay only for what you use. This approach pushes teams to build lean, event-driven solutions that avoid unnecessary computation. Regarding computing, scaling should be infinite, at least in theory. As we see next, this can be both good and bad.
What about the cost then? In the FinOps "Iron Triangle" of cost management (quality, cost, and time), cost is the corner serverless can handle most poorly in some cases. Serverless isn’t ideal for high-throughput, constant workloads, like video transcoding, where processing runs continuously with little idle time. The pay-per-invocation model, concurrency limits, and billing increments can make serverless solutions like AWS Lambda prohibitively expensive. For example, cold starts can become costly if a function is frequently cold-booted. It's not just the boot time; with JVM or Node.js runtimes, the runtime may not have time to optimize the code before the instance shuts down.
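A back-of-envelope calculation makes the point. The prices below are illustrative placeholders, not current list prices, and the workload shape (a constantly busy, transcoding-style service) is assumed; plug in your own numbers before drawing conclusions.

```python
# Illustrative unit prices only; check your provider's current pricing.
PRICE_PER_GB_SECOND = 0.0000166667
PRICE_PER_REQUEST = 0.0000002

def lambda_monthly_cost(invocations: int, avg_ms: int, memory_gb: float) -> float:
    gb_seconds = invocations * (avg_ms / 1000) * memory_gb
    return gb_seconds * PRICE_PER_GB_SECOND + invocations * PRICE_PER_REQUEST

# Assumed constant workload: ~50 requests/s, 2 s each, 2 GB of memory.
busy = lambda_monthly_cost(invocations=50 * 3600 * 24 * 30, avg_ms=2000, memory_gb=2.0)
print(f"~${busy:,.0f}/month")  # likely far more than a few right-sized, always-on instances
```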
Luckily, FinOps is not primarily about saving money but about squeezing the maximum value out of the cloud. Let’s continue.
Reducing resource use by reducing cost
Another way to approach resource reduction is through cloud cost optimization. Spending less money on cloud resources most often means using fewer resources, which is better for the environment. At the moment, we still have better visibility into money spent than into CO2 emissions as a metric.
Before optimizing cloud costs, they need to be evaluated against the value that cloud utilization generates to ensure they are meaningful. In essence, it is important to understand what would be a significant enough reduction in cloud spending to improve business profitability.
To successfully run a cost optimization initiative, you need people from multiple departments of the organization working together. In a modern tech organization, it is often the engineers or automations created by the engineers (e.g., autoscaling) that are making cloud purchase decisions. On the other hand, engineering often lacks the visibility and knowledge of an organization's financial data to help make appropriate purchasing decisions for the business. Therefore, as we’ve learned from DevOps, SRE, and Agile, you will want engineering to join forces with finance, product, and leadership to understand how much money should be spent on the cloud and when cost optimization matters.
A key part of timely cost optimization is precise and timely cost reporting. The more real-time, the better, as quick feedback loops encourage continuous improvement. An effective way to respond to changes in cloud costs is to have regularly scheduled reports that break down spending and resource utilization on a per-component basis and show how costs have changed over time. To ensure the reports are accurate, clear and consistent resource tagging is required.
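As a small sketch of what such a breakdown can look like on AWS, the boto3 call below pulls a week of daily costs grouped by a cost-allocation tag. The "team" tag key and the date range are assumptions; use whatever tagging convention and cadence your organization enforces.

```python
import boto3

ce = boto3.client("ce")  # AWS Cost Explorer

response = ce.get_cost_and_usage(
    TimePeriod={"Start": "2024-06-01", "End": "2024-06-08"},  # example week
    Granularity="DAILY",
    Metrics=["UnblendedCost"],
    GroupBy=[{"Type": "TAG", "Key": "team"}],  # assumed cost-allocation tag
)

for day in response["ResultsByTime"]:
    for group in day["Groups"]:
        amount = float(group["Metrics"]["UnblendedCost"]["Amount"])
        print(day["TimePeriod"]["Start"], group["Keys"][0], round(amount, 2))
```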
Once you have visibility on where your money in the cloud is spent, it’s time to find areas where optimizations can be made. Here are some of the usual ways to improve spending:
Eliminating unused resources. Cloud resources may be left running even though they are no longer used. This is often the case when there are gaps in automated resource cleanup tasks, data lifecycle policies are not defined precisely enough, or it is unclear whether anyone still uses the resources. Removing these resources eliminates costs with minimal to no impact on production (a small example of hunting one common case follows this list).
Automated scaling. Scaling resources up and down on demand is one of the key qualities the cloud provides, but it is often left unused in order to play it safe with service stability. By optimizing software startup time and scaling thresholds, you can spin down resources automatically when there is less demand for them and thus save both resources and money. An alternative strategy is scheduled scaling for situations where resource utilization correlates with the time of day or the calendar.
Compute types. Cloud platforms often provide various ways to run software, ranging from bare-metal hardware to virtual machines to containers to serverless functions. Rather than forcing every piece of software onto the same computing platform, you can gain cost savings by running software on a platform that better suits its runtime profile. For example, occasional short-lived tasks are better run on a serverless platform than as a standalone virtual machine.
Capacity types. Cloud resources often come in multiple capacity types optimized for specific kinds of tasks. For example, object storage solutions provide multiple storage tiers, each trading off storage cost against retrieval latency and throughput. Some platforms provide spot capacity that trades low cost for limited availability, which is optimal for short-lived tasks or tasks where interrupting the computation doesn’t cost much.
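As the example promised above, here is a minimal boto3 sketch for one very common kind of waste: EBS volumes that are no longer attached to anything. It only lists candidates; actual deletion should go through review, and the same pattern applies to other orphaned resources.

```python
import boto3

ec2 = boto3.client("ec2")

# Volumes in the "available" state are not attached to any instance,
# yet they keep incurring storage costs every month.
volumes = ec2.describe_volumes(
    Filters=[{"Name": "status", "Values": ["available"]}]
)["Volumes"]

for vol in volumes:
    print(vol["VolumeId"], f'{vol["Size"]} GiB', "created", vol["CreateTime"])
```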
Additionally, there are ways to optimize cloud spending using purchase agreements. Many of the vendors provide multi-year purchase commitment plans and enterprise agreements to buy the same resource at a lower cost than the on-demand price. While these do not reduce resource utilization, they can significantly reduce the money spent compared to paying the list price for the same duration of time.
Green Lean: the integrated solutions
As we can see from the points in this text alone, there are multiple things we can do at multiple levels. The last takeaways we want to leave you with are the following mindset shifts.
From Individual Contribution to Shared Contribution. We need to do this together. Share your knowledge and reuse already implemented energy-efficient code!
From Single Execution to Cumulative Executions. It’s never about that one-time execution but about the accumulation of many executions on many devices over a long period. You can make a difference!
From User Responsibility to Vendor Responsibility. Improvement done at a higher level, such as hyperscaler clouds or SaaS providers, will benefit us all. We must insist on greener solutions and better metrics!
Technically, there are no solutions. There are only trade-offs. The most efficient code is no code at all. Now go outside and enjoy some fresh air!