New Providers for Crossplane donated by Upbound bring up to 4x cost-savings
Have you noticed your providers are faster?
Following the announcement that Upbound is donating its control plane provider technology to Crossplane, we are happy to announce that Upbound’s engineering team has achieved a breakthrough that significantly improves the overall efficiency of Upjet-based providers. The improvements not only speed up provisioning and reconciliation of resources but also reduce the running cost of the providers, helping users lower their cloud spend.
The change that provided the breakthrough
Upbound created Upjet to build on the Terraform community’s efforts to integrate with cloud service providers (CSPs). When Upbound initially created Upjet to generate Crossplane-compatible providers from Terraform providers, we relied on the Terraform CLI to orchestrate the calls between Crossplane and the CSPs for provisioning resources. Unfortunately, the Terraform CLI was never designed for this kind of continuous reconciliation, so it added significant overhead when running as part of a control plane.
In evaluating options to improve the provider efficiency, we realized that improving Terraform CLI would be too significant an effort. To achieve our goals, we needed to bypass the Terraform CLI and integrate directly with the Terraform providers to eliminate the inefficiency.
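To make the source of the overhead concrete, here is a toy sketch that contrasts starting a separate process on every reconcile with calling the same logic in-process. It is not Upjet's code: `observe_via_cli`, `observe_in_process`, and `reconcile_all` are hypothetical names, and a fresh Python interpreter stands in for the CLI.

```python
import subprocess
import sys
import time


def observe_in_process(resource_id: str) -> str:
    """Hypothetical stand-in for calling provider logic as a library call."""
    return f"{resource_id}:ready"


def observe_via_cli(resource_id: str) -> str:
    """Hypothetical stand-in for launching a CLI process on every reconcile."""
    # A fresh interpreter is started for each call, similar in spirit to
    # forking a separate CLI process per reconciliation.
    code = "import sys; print(sys.argv[1] + ':ready')"
    result = subprocess.run(
        [sys.executable, "-c", code, resource_id],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


def reconcile_all(observe, resource_ids):
    """Run one reconcile pass and return (states, elapsed_seconds)."""
    start = time.perf_counter()
    states = [observe(rid) for rid in resource_ids]
    return states, time.perf_counter() - start


if __name__ == "__main__":
    ids = [f"attachment-{i}" for i in range(20)]
    cli_states, cli_secs = reconcile_all(observe_via_cli, ids)
    lib_states, lib_secs = reconcile_all(observe_in_process, ids)
    assert cli_states == lib_states  # same result, very different cost
    print(f"process per reconcile: {cli_secs:.3f}s, in-process: {lib_secs:.6f}s")
```

Both paths return the same states. Only the per-call process startup differs, and that cost is paid again on every reconciliation of every resource.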
By eliminating the Terraform CLI, the Upjet-generated providers also avoid the BSL licensing challenge introduced when HashiCorp relicensed many of its formerly open-source products. The community didn’t have to worry about this before, because we stayed on a license-compatible version of the Terraform CLI, but the issue is now eliminated entirely, on top of the significant efficiency improvements. The Terraform providers continue to be released under the MPL 2.0 license.
Benchmark findings
While benchmarking the new provider architecture, we observed significant improvements compared to the previous approach using the Terraform CLI.
For example, we observed the following results when evaluating the efficiency of the new provider architecture in provisioning AWS RolePolicyAttachment Managed Resources (MRs) to attach managed IAM policies to IAM roles.
Note these results are for provisioning 1,000 RolePolicyAttachment resources on a node with 8 CPU cores.
Metric | Old Architecture | New Architecture | Improvement |
Average Time To Ready State | 19.55 min | 1.012 sec | 1,159x speedup |
Peak Time To Ready State | 95.45 min | 4.00 sec | 1,432x speedup |
Average Memory Utilization | 1.23 GiB | 405 MiB | 60% reduction |
Peak Memory | 1.79 GiB | 421 MiB | 77% reduction |
Average CPU | 95.77% | 2.91% | 92.86 percentage points lower |
Peak CPU | 98.17% | 3.46% | 94.71 percentage points lower |
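The speedup and CPU figures in the table can be reproduced from the raw measurements. The sketch below shows the arithmetic, assuming the speedup is the old time converted to seconds and divided by the new time, and that the CPU improvement is a plain subtraction of the two utilization percentages. The helper names are illustrative only.

```python
def speedup(old_minutes: float, new_seconds: float) -> float:
    """Ratio of old to new time, after converting the old time to seconds."""
    return old_minutes * 60 / new_seconds


def cpu_point_drop(old_pct: float, new_pct: float) -> float:
    """Difference between two utilization percentages, in percentage points."""
    return old_pct - new_pct


if __name__ == "__main__":
    print(round(speedup(19.55, 1.012)))            # average time to ready: 1159
    print(round(speedup(95.45, 4.00)))             # peak time to ready: 1432
    print(round(cpu_point_drop(95.77, 2.91), 2))   # average CPU: 92.86
    print(round(cpu_point_drop(98.17, 3.46), 2))   # peak CPU: 94.71
```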
When we pushed this further to provision 10,000 RolePolicyAttachment MRs, a number that was previously not possible with the Terraform CLI, resource usage grew far less than tenfold, keeping overall utilization low. These are the results when comparing the provisioning of 1,000 and 10,000 MRs using the new provider architecture.
Metric | 1,000 MRs | 10,000 MRs |
Average Time To Ready State | 1.012 sec | 2.211 sec |
Peak Time To Ready State | 4.00 sec | 8.00 sec |
Average Memory Utilization | 405 MiB | 358 MiB |
Peak Memory | 421 MiB | 523 MiB |
Average CPU | 2.91% | 7.60% |
Peak CPU | 3.46% | 12.95% |
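One way to read this comparison is to compute how much each metric grew for the tenfold increase in MRs. The sketch below does that with the values copied from the table. The dictionary keys and the `growth_factors` helper are illustrative names, not part of any benchmark tooling.

```python
# Measurements copied from the 1,000 vs 10,000 MR comparison table.
RESULTS = {
    "avg_time_to_ready_sec": (1.012, 2.211),
    "peak_time_to_ready_sec": (4.00, 8.00),
    "avg_memory_mib": (405, 358),
    "peak_memory_mib": (421, 523),
    "avg_cpu_pct": (2.91, 7.60),
    "peak_cpu_pct": (3.46, 12.95),
}

LOAD_FACTOR = 10_000 / 1_000  # 10x more managed resources


def growth_factors(results):
    """Return how much each metric grew between the two runs."""
    return {name: large / small for name, (small, large) in results.items()}


if __name__ == "__main__":
    for name, factor in growth_factors(RESULTS).items():
        print(f"{name}: {factor:.2f}x for a {LOAD_FACTOR:.0f}x increase in MRs")
```

Every factor comes out well below 10, and average memory is slightly lower in the larger run.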
Cost savings and other improvements
The improvements in CPU and memory utilization bring with them the opportunity for a significant reduction in the running costs of the worker nodes for Crossplane. In our benchmarking, we found that we could successfully run and provision resources using the new AWS provider on an m5.large (2 vCPU, 8GB RAM) AWS compute instance, whereas the previous provider needed a c4.4xlarge (16 vCPU, 32GB RAM) AWS compute instance. This can result in up to a 4x cost saving when comparing the running costs of the two Amazon EC2 instances.
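As a rough sketch of how such a comparison can be checked for your own region and pricing model, the snippet below computes the capacity ratios from the instance sizes quoted above and a cost-saving factor from hourly prices you supply. The `Instance` class and `cost_saving_factor` helper are hypothetical, and the prices in the example call are placeholders, not AWS list prices.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instance:
    name: str
    vcpus: int
    ram_gb: int


# Instance sizes as quoted in the text above.
OLD = Instance("c4.4xlarge", vcpus=16, ram_gb=32)
NEW = Instance("m5.large", vcpus=2, ram_gb=8)


def cost_saving_factor(old_hourly: float, new_hourly: float) -> float:
    """How many times cheaper the new instance is per hour."""
    if new_hourly <= 0:
        raise ValueError("new_hourly must be positive")
    return old_hourly / new_hourly


if __name__ == "__main__":
    print(f"vCPU ratio: {OLD.vcpus / NEW.vcpus:.0f}x")
    print(f"RAM ratio: {OLD.ram_gb / NEW.ram_gb:.0f}x")
    # Placeholder prices for illustration only; look up current prices
    # for your region before drawing conclusions.
    print(f"cost saving: {cost_saving_factor(old_hourly=1.00, new_hourly=0.25):.1f}x")
```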
Crossplane community members have shared their positive experiences with the new providers and the improvements to Upjet.
Sid Palas reported significant CPU and memory reduction when upgrading to the new providers at Nominal. Sid noted, “I just upgraded a bunch of my upbound/provider-aws providers from 0.38.0 -> 0.47.1, and the CPU + Memory usage dropped significantly 🤩. Previously, I had to give some providers multiple GiB (gibibytes) of memory to avoid OOM (out of memory) kills… no longer!”
Rachel Sweeny saw that when upgrading to the new providers, “the controllers were using less than 1Gb when we previously had to provision up to 20Gb. 🙂 We were over 10,000 MRs for a while, so this was much needed.”
John Thompson shared that the improvements in Upjet also benefited their providers. John said, “We’ve updated 3 of our homemade Upjet providers to the new Upjet, and it did the usual drop in CPU and memory, but also dropped the max reconciliation time from close to 40 minutes down to 30 seconds for some of our resources.”
Availability
The AWS, GCP and Azure providers are available and free to use for any Crossplane user. Community members who have previously generated their providers using Upjet are encouraged to try Upjet 1.0+ to benefit from the new architecture improvements. For support on using Upjet, reach out to us in #upjet in Crossplane Slack.
Upbound’s latest Official Providers are all available with the new architecture improvements.