Simplifying Cloud Infrastructure Management with IAC and GitOps

June 21, 2024 | Steven Black

Infrastructure as Code (IAC) allows engineers to securely define and replicate cloud environments. Storing infrastructure in text files might not seem sexy, but the simplicity of plain text is one of its strengths. Like code written in typical programming languages, IAC can be version controlled in Git and run through CI/CD processes. This means you can view the evolution of your cloud architecture as you discover your infrastructure requirements. You can look at the source code, and by posting artifacts like Terraform plans to pull requests you can also see the specific changes made to the cloud resources.

Creating privileged CI/CD servers allows developers to make these changes without any access to the underlying AWS accounts. Not all organizations require that level of sophistication, and even those that do typically go through several evolutionary steps to get there. This article will cover those steps in order.

Getting Started with IAC: Version Control and Best Practices

The DevOps ecosystem is large. When starting out, it’s best to remember KISS: Keep It Simple, Stupid. This will help save you from over-engineering solutions that are too complex to share or too fragile to maintain. For IAC projects where there are only a few developers who rarely make changes, it’s probably overkill to set up a full CI/CD process for your Terraform. Git, a Terraform backend, and a task runner like make are enough.

The key here is to develop a reliable process so that when a developer does go to make IAC changes, they can quickly get started and do so without stepping on anyone else’s toes. In our projects, there are typically multiple directories, each with several workspaces, and each workspace corresponds to a different environment. When making IAC changes, we check out feature branches and test the changes on lower environments. It’s considerate to push the feature branch and open a draft PR right away. That way, if another developer wants to work on the same bit of code and sees a Terraform plan producing a bunch of unanticipated changes, they can open GitHub and find the open PR. This helps prevent frustrated developers from posting “whodunnit” to Slack. A good Terraform automation tool like Atlantis will lock the workspace for you, so developers don’t even get to the point where they are looking at unanticipated changes and scratching their heads.

Other good practices at this stage replicate functionality that a tool like Atlantis offers. For example, Atlantis posts Terraform plans to pull requests, and you can do the same by hand. Plans can be quite long, so wrapping them in a details tag will save reviewers some scrolling.
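As a rough illustration of the details-tag idea, here is a minimal Python sketch that turns plan output into a collapsible pull request comment. The function name `wrap_plan_for_pr` and the default summary text are made up for this example; how the comment gets posted (the GitHub web UI, the CLI, or an API call) is left out.

```python
def wrap_plan_for_pr(plan_text: str, summary: str = "Show Terraform plan") -> str:
    """Return a Markdown comment body with the plan collapsed in a <details> tag."""
    fence = "`" * 3  # built at runtime so this example does not nest code fences
    body = plan_text.strip("\n")
    return (
        "<details>\n"
        f"<summary>{summary}</summary>\n\n"  # blank line so Markdown renders inside the tag
        f"{fence}\n{body}\n{fence}\n\n"
        "</details>\n"
    )


if __name__ == "__main__":
    print(wrap_plan_for_pr("Plan: 1 to add, 0 to change, 0 to destroy."))
```

Reviewers see a single clickable summary line and expand the plan only when they need the detail.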

Making it easy for developers to get started is key too. No one wants to sit down to work and realize they need to look up the magic incantations to set up a Terraform backend. We tend to use Makefiles to simplify oft-used commands like terraform init, plan, and apply.
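The article’s own setup uses Makefiles; the sketch below is a Python stand-in for the same idea, shown only to make the pattern concrete. The target names, the `tfplan` file name, and the `network`/`staging` values are assumptions for illustration. It presumes the workspace already exists.

```python
import subprocess
import sys


def terraform_commands(target: str, workspace: str, directory: str = ".") -> list[list[str]]:
    """Build the Terraform CLI calls hidden behind one short task name."""
    base = ["terraform", f"-chdir={directory}"]
    init = base + ["init", "-input=false"]
    select = base + ["workspace", "select", workspace]
    steps = {
        "init": [init],
        # plan writes a plan file so that apply runs exactly what was reviewed
        "plan": [init, select, base + ["plan", "-out=tfplan"]],
        "apply": [init, select, base + ["apply", "tfplan"]],
    }
    if target not in steps:
        raise ValueError(f"unknown target: {target}")
    return steps[target]


if __name__ == "__main__":
    # Example: python tasks.py plan staging network
    target, workspace, directory = sys.argv[1], sys.argv[2], sys.argv[3]
    for cmd in terraform_commands(target, workspace, directory):
        subprocess.run(cmd, check=True)
```

Whether the wrapper is a Makefile or a script matters less than having one documented entry point that every developer uses.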

Continuous Integration (CI) for IAC: Validating and Securing Your Terraform Code

The next step is to start running the easy stuff in automation. Checks that don’t require cloud permissions are low-risk, and they can still confirm that your Terraform is valid and secure. Terraform has built-in tools for this, like validate and the newly added test, and the community has added others, such as tfsec and tflint, just to name two.
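A minimal sketch of such a credential-free check run, written in Python, might look like the following. The function names are hypothetical, and it assumes `terraform`, `tflint`, and `tfsec` are on the PATH (missing tools are skipped). Running `terraform init` with `-backend=false` installs providers and modules without touching the state backend, which is what lets `validate` run without cloud access.

```python
import shutil
import subprocess


def static_checks() -> list[list[str]]:
    """Checks that need no cloud credentials, in the order they should run."""
    return [
        # validate needs providers and modules installed, but not the backend
        ["terraform", "init", "-backend=false", "-input=false"],
        ["terraform", "validate"],
        ["tflint"],
        ["tfsec", "."],
    ]


def run_checks(directory: str = ".") -> list[str]:
    """Run each check in `directory` and return the commands that failed."""
    failed = []
    for cmd in static_checks():
        if shutil.which(cmd[0]) is None:
            print(f"skipping {cmd[0]}: not installed")
            continue
        result = subprocess.run(cmd, cwd=directory)
        if result.returncode != 0:
            failed.append(" ".join(cmd))
    return failed


if __name__ == "__main__":
    failures = run_checks()
    raise SystemExit(1 if failures else 0)
```

The non-zero exit code on failure is what lets a CI job or a commit hook block the change.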

At Rhythmic Technologies, we make heavy use of the Terraform pre-commit hooks, which run these checks before you create a commit. It’s better to find a mistake locally than to push it up for a collaborator to find. The same tools can also be run in GitHub Actions or a similar CI process. Our template module has this set up.

Continuous Deployment (CD) for IAC: Automating Environment Creation

One of the advantages of IAC is how easily you can spin up new environments. As Terraform usage across an organization increases, dynamic environment creation is typically the next evolutionary step. This often happens before a wholesale commitment to running Terraform in GitOps.

Once the infrastructure requirements for an application are clearly visible in Terraform, the natural next question is “how do we spin up another one?” The Minimum Viable Product (MVP) to do this in automation is usually a GitHub Action that clones a repository with a deploy key, assumes an AWS IAM role, creates a new Terraform workspace, and then runs some Terraform. As hacky as that may seem, we’ve had good results at small-scale clients. This is also essentially what AWS Account Factory for Terraform does.
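The four steps in that MVP can be outlined in a short Python sketch. Everything specific here is an assumption for illustration: the repository URL, the role ARN, the `infra` checkout directory, and the idea of deriving the workspace name from a free-form label such as a branch name. The function only builds the command list; a real workflow would also need to export the credentials returned by the assume-role call before running Terraform.

```python
import re


def workspace_name(label: str, max_len: int = 32) -> str:
    """Turn a free-form label (e.g. a branch name) into a safe workspace name."""
    slug = re.sub(r"[^a-z0-9]+", "-", label.lower()).strip("-")
    slug = slug[:max_len].rstrip("-")
    if not slug:
        raise ValueError(f"cannot derive a workspace name from {label!r}")
    return slug


def environment_steps(repo_url: str, role_arn: str, label: str) -> list[list[str]]:
    """Outline of the commands: clone, assume role, new workspace, apply."""
    ws = workspace_name(label)
    tf = ["terraform", "-chdir=infra"]
    return [
        ["git", "clone", repo_url, "infra"],
        # prints temporary credentials; exporting them is not shown here
        ["aws", "sts", "assume-role", "--role-arn", role_arn,
         "--role-session-name", f"tf-{ws}"],
        tf + ["init", "-input=false"],
        tf + ["workspace", "new", ws],
        tf + ["apply", "-input=false", "-auto-approve"],
    ]


if __name__ == "__main__":
    steps = environment_steps(
        "git@github.com:example/infra.git",                 # hypothetical repo
        "arn:aws:iam::123456789012:role/example-deploy",    # hypothetical role
        "feature/Add-S3 bucket",
    )
    for step in steps:
        print(" ".join(step))
```

Keeping the workspace name derived from one input makes it easy to find and tear down the matching environment later.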

Implementing GitOps with Atlantis: The Best Tool for Managing Terraform

At Rhythmic, we’ve had good results using Atlantis for GitOps with Terraform. If you’re not familiar with Atlantis, it’s great: you push up changes and then Atlantis will comment with the resulting Terraform plan. You can leave a comment to have Atlantis apply and merge the PR. It takes care of locking Terraform workspaces so developers don’t end up stepping on each other’s toes.
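To make the locking behavior concrete, here is a toy Python model of it. This is not Atlantis’s implementation or API, only a sketch of the idea under the assumption that a lock is keyed on a directory and workspace pair and owned by one pull request until that pull request is finished. The class and method names are invented for the example.

```python
from dataclasses import dataclass, field


@dataclass
class ProjectLocks:
    """Toy model: one lock per (directory, workspace), owned by a PR number."""
    held: dict[tuple[str, str], int] = field(default_factory=dict)

    def try_lock(self, directory: str, workspace: str, pr: int) -> bool:
        """Take the lock if it is free; succeed again if this PR already owns it."""
        owner = self.held.setdefault((directory, workspace), pr)
        return owner == pr

    def release_pr(self, pr: int) -> None:
        """Drop every lock owned by a PR, e.g. once it is merged or closed."""
        self.held = {key: owner for key, owner in self.held.items() if owner != pr}


if __name__ == "__main__":
    locks = ProjectLocks()
    print(locks.try_lock("network", "staging", pr=12))  # first PR takes the lock
    print(locks.try_lock("network", "staging", pr=15))  # second PR is refused
    locks.release_pr(12)
    print(locks.try_lock("network", "staging", pr=15))  # free again after release
```

The second developer learns about the conflict when they try to plan, rather than after puzzling over a plan full of someone else’s changes.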

There are other tools that can more or less do this, some without quite as much investment. Atlantis does require a running server to work, typically deployed to ECS or EKS. But if you are looking for the best GitOps tool to manage Terraform, Atlantis delivers. Because it’s deployed to your own infrastructure, you don’t need to worry about GitHub Actions getting hacked and leaking your secrets. And it runs Terraform for you, meaning you can make edits to your infrastructure from your text editor or even your browser. There’s something magical about the way this comes full circle: you start out provisioning infrastructure by clicking around in the browser and come back to editing infrastructure through the browser. But in the latter case, it’s version controlled and peer-reviewed.

Conclusion

Defining infrastructure configurations using IAC and storing that IAC in version control can work miracles for organizations unsure of their exact infrastructure requirements. In most organizations, IAC adoption typically evolves through several stages that start out establishing an IAC process and clarifying infrastructure configurations.

Once enough of the environment is defined in IAC, it can enable developer agility and organizational security. If you’re at an organization that makes heavy use of IAC tools, then it would be wise to invest in a GitOps tool like Atlantis to make infrastructure changes standardized and visible.
