Managing GitHub with Terraform
Rich Keenan
Terraform is a tool for managing infrastructure as code. It allows you to define your infrastructure in a declarative way, and then apply it to an environment. It's a great tool for managing infrastructure, but you might not realise that it's also a great tool for managing GitHub.
Terraform can manage GitHub?
Terraform is typically used to define cloud resources. At Countingup we use AWS, so we use it to define our EC2 instances, RDS databases, and S3 buckets.
If you've only ever used Terraform for a single cloud provider you might not have realised that Terraform itself doesn't know anything about AWS* (or GCP, or Azure...). Terraform knows about resources, dependencies and state management. I really like the lifecycle diagram in Terraform In Action for understanding what Terraform is actually doing.
* This isn't strictly true because it knows how to use S3 for storing/retrieving the terraform.tfstate file, but it doesn't expose a generic S3 resource to user land.
To use Terraform to manage specific resources you need to install a provider for that resource, so for AWS you need to install the "AWS provider". This provider manages AWS resources through the AWS API using the Go SDK.
Generally speaking, if something can be managed through an API, it can be managed through Terraform.
GitHub has an API for managing its resources - repositories, users, teams, etc. - and that means it's possible to manage GitHub resources with Terraform too, as long as there's a provider (or we're willing to write one).
Terraform GitHub provider
Luckily, there's a community-supported GitHub provider as part of GitHub's Integrations organisation that does exactly what we need 🎉.
OK, but, like why? 🤨
There's a whole list of reasons why you might want to manage GitHub with Terraform and it's the same list of reasons why you might want to manage any other infrastructure with Terraform:
- Decreased risk of errors - Manually clicking and typing in the UI is error-prone
- Increased visibility - All changes are reviewed and tracked in version control
- Increased consistency - All changes are applied in the same way across all repositories
- Single source of truth - All changes are applied through Terraform, not through the UI
Countingup doesn't have a mono-repo. Each backend service has its own repo, and there are various frontend and tooling repositories. At the time of writing, we have about 100 all in. Ensuring consistency and correctness between those is hard to do without automation, so a little over a year ago we decided to start using Terraform to manage our GitHub organisation.
Setup
I won't go into setting up Terraform and how to manage the state file; I'll jump straight into the GitHub-specific bits.
First, create a GitHub Personal Access Token with appropriate permissions, then set it as an environment variable:
export GITHUB_TOKEN=<token>
Then add the provider configuration to your Terraform file.
provider "github" {
  owner = "Countingup"
  // You can set the token here instead but it will be publicly visible
}
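On Terraform 0.13 and later you will most likely also need to tell Terraform where the provider comes from, otherwise it looks under the hashicorp namespace. A minimal sketch, assuming the provider published by the Integrations organisation; the version constraint is an assumption, so pin whichever release you have tested:

```hcl
terraform {
  required_providers {
    github = {
      # Registry address of the GitHub provider from the Integrations organisation
      source = "integrations/github"
      # Assumed constraint for illustration only
      version = "~> 5.0"
    }
  }
}
```

With this in place, terraform init downloads the provider and the provider "github" block above configures it.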
Resources
Once the provider is set up, resources are defined the same way as any other Terraform resource: you declare a resource with a type prefixed with the provider name.
resource "github_repository" "my_repo" {
  name       = "my-repo"
  visibility = "private"
}
Note that all of the supported resource types and their attributes are documented in the GitHub provider documentation.
Here are a few of the resources we manage.
Repositories
We rely heavily on Terraform modules to manage our GitHub repositories as there are a lot of options and not a lot of differences between our repositories. Here's a slightly trimmed down version of what a github_repository looks like for us:
resource "github_repository" "repo" {
  name               = var.name
  visibility         = "private"
  topics             = concat(local.defaultTopics, var.topics)
  allow_merge_commit = false
  allow_rebase_merge = false
  allow_squash_merge = true
  auto_init          = true
  archive_on_destroy = true
}
- name - Module variable as this obviously changes per repo
- visibility - None of our public repositories are managed by Terraform
- topics - We set some default topics based on module options but also allow arbitrary topics
- allow_merge_commit et al. - Ensures consistent merge strategy between repos
- auto_init - Creates a README.md on master with title and description
- archive_on_destroy - A terraform destroy action will archive the repository rather than delete it. This is a very nice safety feature.
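To make the module shape concrete, here is a rough sketch of the inputs behind that resource and one call site. The variable names name and topics and the local defaultTopics come from the snippet above; the types, the default topic value, the module path and the repository name are all assumptions for illustration:

```hcl
# Inside the repository module (hypothetical variables.tf)
variable "name" {
  description = "Repository name"
  type        = string
}

variable "topics" {
  description = "Extra topics added on top of the defaults"
  type        = list(string)
  default     = []
}

locals {
  # Assumed value; the real defaults are derived from module options
  defaultTopics = ["terraform-managed"]
}

# In the root configuration: one module call per repository
module "payments_service" {
  source = "../modules/repository" # hypothetical path
  name   = "payments-service"      # hypothetical repository name
  topics = ["backend"]
}
```

Every repository created this way picks up the same merge settings and archive-on-destroy behaviour without repeating them at each call site.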
Branch Protection
We have branch protection rules enabled for all our repositories. This adds an extra layer of access control to our code, and the rules are consistent across all of our backend service repositories.
The syntax is a little complex and there's probably a better way of writing this in Terraform, but this works for us.
resource "github_branch_protection_v3" "protection" {
  # Support multiple branches
  for_each   = var.protected_branches
  # Refers to the github_repository resource named "repo" shown earlier
  repository = github_repository.repo.name
  branch     = each.key
  restrictions {
    # Specify teams with push-access, default to none
    teams = try(each.value.push_access_teams, [])
  }
  dynamic "required_pull_request_reviews" {
    # For each branch that requires pull requests...
    for_each = each.value.require_pull_requests ? [1] : []
    content {
      # ...require at least one review, etc
      required_approving_review_count = 1
      require_code_owner_reviews      = true
      dismiss_stale_reviews           = true
    }
  }
}
This resource is defined in a Terraform module with a default value for the protected_branches variable that most of our repositories don't override:
variable "protected_branches" {
  description = "Set protection rules"
  default = {
    master = {
      require_pull_requests = true
      # No direct pushes to master
      push_access_teams = []
    }
  }
}
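For the few repositories that do override the default, the call site might look something like the sketch below. It assumes the branch protection resource lives in the same module as the repository; the module path, repository name, release branch and team slug are hypothetical:

```hcl
module "payments_service" {
  source = "../modules/repository" # hypothetical path
  name   = "payments-service"      # hypothetical repository name

  protected_branches = {
    master = {
      require_pull_requests = true
      push_access_teams     = []
    }
    release = {
      # No pull request reviews required on this branch, so the dynamic
      # required_pull_request_reviews block is not generated for it
      require_pull_requests = false
      # Hypothetical team slug allowed to push directly
      push_access_teams = ["release-managers"]
    }
  }
}
```

Each key in the map becomes one github_branch_protection_v3 instance, so adding a protected branch is a matter of adding one more entry.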
Organisation and Team access
When a new developer starts at Countingup we add them to the developers team and any subteams that they need to be in. After applying the change, GitHub sends an invite to the user (assuming there are enough seats in our organisation account, which we almost always forget to check and GitHub doesn't have a public API to manage, so sadly we can't include that in the Terraform).
Our teams module looks like this:
resource "github_team" "team" {
  name           = var.name
  description    = var.description
  privacy        = "closed"
  parent_team_id = var.parent_team_id
}
resource "github_team_membership" "team_members" {
  for_each = toset(var.members)
  team_id  = github_team.team.id
  username = each.value
  role     = "member"
}
resource "github_team_membership" "team_maintainers" {
  for_each = toset(var.maintainers)
  team_id  = github_team.team.id
  username = each.value
  role     = "maintainer"
}
This lets us easily specify who is in what team and what role they have within that team.
And an example of how we use it:
module "developers_team" {
  source      = "../modules/team"
  name        = "Developers"
  description = "All Countingup Developers"
  maintainers = [
    "username_1",
    "username_2",
  ]
  members = [
    "username_3",
    "username_4",
  ]
}
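Subteams can reuse the same module through parent_team_id. The sketch below assumes the team module exposes the team ID as an output named id, which isn't shown in the module above, and the Backend team is a made-up example:

```hcl
# Inside the team module (assumed outputs.tf, not part of the original snippet)
output "id" {
  description = "ID of the team, usable as parent_team_id for subteams"
  value       = github_team.team.id
}

# In the root configuration: a subteam nested under Developers
module "backend_team" {
  source         = "../modules/team"
  name           = "Backend"            # hypothetical team name
  description    = "Backend developers" # hypothetical description
  parent_team_id = module.developers_team.id
  maintainers    = ["username_1"]
  members        = ["username_3"]
}
```

Because the subteam references the parent module's output, Terraform creates the Developers team first and the subteam afterwards.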
Downsides
It's been about a year since we started using Terraform to manage GitHub. It's mostly been an excellent decision, but there are some downsides.
- It's slow - The GitHub API seems to be quite slow and the way some of the resources are managed by the provider isn't optimised. A terraform plan can take about 5 minutes to run, which gets really annoying if you've made a mistake and need to re-run it. There's an open issue on GitHub that has some interesting suggestions for fixes that we should probably look into, but we don't tend to make frequent changes so it's never been a high enough priority.
- We can still make changes in the UI - There's a really strong culture around not making changes to our AWS infrastructure using the AWS console - even for our development environment. This is very much not the case for GitHub, so things can go out of sync. This tends to show up most often when a team member joins or leaves and we need to update the team membership quickly. There's some balance between pragmatism and consistency here that we could do better on.
- We don't manage everything with Terraform - Not all repositories are managed by Terraform, sometimes it just doesn't make sense for prototypes, one-offs, etc. This has led to confusion where some people aren't sure which repos are managed and which aren't. We added the terraform-managed label to the default set to help solve this problem.
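If a prototype later graduates into something worth managing, it can be adopted rather than recreated. This is only a sketch of one option, not something described above: it assumes Terraform 1.5 or newer (which introduced import blocks), the hypothetical repository module path used for illustration, and a made-up repository name. The github_repository resource is imported by repository name:

```hcl
# Root configuration: adopt an existing, manually created repository
import {
  # Address of the resource inside the module call below
  to = module.prototype.github_repository.repo
  # Existing repository name in the organisation (hypothetical)
  id = "my-prototype"
}

module "prototype" {
  source = "../modules/repository" # hypothetical path
  name   = "my-prototype"
}
```

The next terraform plan then shows any differences between the repository's current settings and the module's defaults, so they can be reviewed before anything is applied.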
Future
Once you realise that "if something can be managed through an API, it can be managed through Terraform", you start to see possibilities everywhere.
We use Auth0 for our Company Formations product. This was set up through the Auth0 web UI but there's totally an official Terraform provider for this. We could (and probably should) be using this to manage our Auth0 configuration and I suspect we'll be taking a look at this soon.
We use Segment for moving data between our products and our analytics tools. This is manually configured through their web UI and it's getting a little unwieldy. Unfortunately, there isn't an officially supported Terraform provider and there doesn't seem to be a de facto community provider either. I think this is a great opportunity for Segment to invest in the community and help to develop a fully featured provider - I suspect we'll have developers eager to contribute.
Whilst I'm waiting for the slow terraform plan to finish maybe I'll use Nat Henderson's (in-)famous Dominos Terraform provider to order a margherita 🍕.