19 June, 2023 · 2 minute read

Be wary of bad Terraform defaults

Terraform and other infrastructure-as-code tools are incredibly powerful, but there are some footguns to be wary of if you’re not yet an experienced cloud practitioner. API design is hard: the developers of Terraform providers need to pick defaults which don’t alienate new users, and that pressure often results in defaults which are insecure or otherwise suboptimal.

Consider the process of setting up a database using RDS. If you provision your database via ClickOps then the AWS Console helpfully toggles on storage encryption by default, which improves your security posture by encrypting your database’s disk at rest.

Using the AWS Terraform provider is a different story: the storage_encrypted option on the aws_db_instance resource defaults to false! If you don’t opt in to encryption manually, you will end up provisioning less secure infrastructure than you would have using ClickOps. This particular example is especially egregious because you can’t simply toggle on encryption at a later date; there’s a whole process you need to follow instead.
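As a rough sketch, opting in looks something like the following. The identifier, engine, and sizing values are placeholders; the only line that matters here is storage_encrypted.

```hcl
resource "aws_db_instance" "example" {
  identifier        = "example-db" # placeholder values throughout
  engine            = "postgres"
  instance_class    = "db.t3.micro"
  allocated_storage = 20
  username          = "dbadmin"
  password          = var.db_password # assumes a variable defined elsewhere

  # The provider defaults this to false. Set it explicitly at creation time,
  # because you can't simply flip it on for an existing instance later.
  storage_encrypted = true
}
```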

Encryption is a bit of a sore spot across the AWS provider. While AWS has enforced server-side encryption on all S3 buckets since January 2023, for a long time it was possible to create S3 buckets without SSE. EBS volumes are also unencrypted by default, and the list goes on.
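For EBS the fix is a single attribute, shown in this minimal sketch (the availability zone and size are placeholders):

```hcl
resource "aws_ebs_volume" "example" {
  availability_zone = "us-east-1a" # placeholder
  size              = 20

  # Unencrypted by default in the provider; opt in per volume
  # (or enable EBS encryption by default at the account level).
  encrypted = true
}
```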

Outside of encryption there are other suboptimal defaults. Basic ECR image scanning is free, but it’s opt-in. ECR image tags are mutable by default, leaving you open to attacks where an existing tag is re-pointed at a malicious image.
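Both can be addressed with a couple of lines on the repository resource. A rough sketch, with a placeholder repository name:

```hcl
resource "aws_ecr_repository" "example" {
  name = "my-service" # placeholder

  # Defaults to "MUTABLE", which allows tags to be re-pointed at new images.
  image_tag_mutability = "IMMUTABLE"

  image_scanning_configuration {
    # Basic scanning is free, but you still have to opt in.
    scan_on_push = true
  }
}
```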

Some other defaults are more understandable, but still aren’t necessarily what you want. Point-in-time recovery for DynamoDB is usually something you want turned on, but it’s an opt-in feature in both the AWS Console and the Terraform provider. The feature does add cost, so defaulting to off isn’t totally unreasonable, but a different API design could have been to provide no default at all and require users to explicitly specify a value.
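If you do want it, it’s a small block on the table resource. A minimal sketch, with placeholder table attributes:

```hcl
resource "aws_dynamodb_table" "example" {
  name         = "example-table" # placeholder
  billing_mode = "PAY_PER_REQUEST"
  hash_key     = "id"

  attribute {
    name = "id"
    type = "S"
  }

  # Opt-in in both the Console and the Terraform provider.
  point_in_time_recovery {
    enabled = true
  }
}
```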

Examples like these are why I recommend that product teams not write their own Terraform code and instead leverage modules. The ideal setup at larger companies is a platform team that owns the organization’s basic Terraform modules and is responsible for baking secure, opinionated defaults into them. This frees up feature teams to focus on writing code and delivering user value without also needing to become cloud experts. Specialization is good for velocity.
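In practice that means feature teams consume a module rather than raw resources. The module source, version, and inputs below are hypothetical, purely to illustrate the shape:

```hcl
# Hypothetical internal module owned by a platform team; the source URL,
# version, and inputs are illustrative only.
module "orders_db" {
  source = "git::https://github.com/acme-corp/terraform-modules.git//rds-postgres?ref=v1.4.0"

  identifier     = "orders"
  instance_class = "db.t3.medium"

  # Encryption, backups, deletion protection, and so on are enforced inside
  # the module, so feature teams don't need to remember each provider default.
}
```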

At smaller organizations you should ask whether it’s truly necessary to use Terraform in the first place. Modern PaaS offerings have gotten absurdly good and can take your company extremely far without ever needing to manually define infrastructure. It’s almost always a good play to leverage the platform teams at companies like Vercel and PlanetScale instead of diverting resources from product development. Beneath a certain—quite high—level of scale the difference in cost is minor, and the difference in speed of delivery is immense. Unless you’re scaling to millions of customers you just don’t need the flexibility and control that a raw AWS account gives you.

But if you are going to be writing your own Terraform, make sure to pay attention to the default values your providers are setting. They’re not always the best for your use case.
