Terraform Security Anti-Patterns: 10 Misconfigurations Found in Real Production Code
Every cloud breach investigation that starts with an exposed credential or an open S3 bucket ends the same way: someone finds a .tf file, or a terraform.tfstate in an S3 bucket, or a CI pipeline that ran terraform apply with admin keys baked into an environment variable. Terraform is not inherently insecure, but the patterns that make it fast to use are precisely the patterns that create the largest attack surface. Hardcoded secrets survive in Git history after deletion. State files store every resource attribute in plaintext, including passwords, private keys, and connection strings, regardless of whether you marked them sensitive. Security groups drift to 0.0.0.0/0 during a 2 AM incident and never get corrected. IAM policies accumulate wildcards because the initial prototype was never tightened. These are not hypothetical risks; they are recurring findings in cloud IR engagements over the past five years.
This post documents ten anti-patterns extracted from real production Terraform code, grouped into five sections that follow the natural attack path through a compromised cloud environment: secret exfiltration, network access, data store exposure, privilege escalation, and persistent visibility gaps. For each anti-pattern you get the exact broken code, the minimal-change fix, and a working Checkov or tfsec policy-as-code rule that blocks it in CI before terraform plan ever runs. The final section covers complete tfsec, Checkov, and Terrascan integration into GitHub Actions and GitLab CI pipelines with enforcement gates. By the end, your IaC pipeline will reject the ten most dangerous Terraform configurations before they reach a cloud account.
Section 1: Hardcoded Secrets and State File Exposure: Why terraform.tfstate Is the Most Dangerous File Your Pipeline Produces
The most common Terraform secret anti-pattern is also the most obvious: credentials typed directly into resource blocks or variable default values. What is less understood is that fixing the .tf file is not sufficient. Terraform state files store the resolved value of every resource attribute at apply time, including attributes you never explicitly set: RDS master passwords, generated API keys, private key material from tls_private_key resources, and connection strings assembled by the provider. Every terraform apply writes these values into terraform.tfstate as plaintext JSON, regardless of whether you declared the variable with sensitive = true. The sensitive flag only masks values in plan and apply output; it has no effect on what gets written to state.
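To make the point concrete, the sketch below walks a state document and lists attributes whose names suggest a secret and whose values are non-empty strings. It assumes the version 4 state layout (a top-level resources list whose entries carry instances with an attributes map). The function name, keyword list, and sample state are illustrative assumptions, not part of any Terraform tooling.

```python
# Keywords that suggest an attribute holds secret material (illustrative list)
SENSITIVE_KEYWORDS = ("password", "secret", "private_key", "token")


def find_plaintext_secrets(state: dict) -> list:
    """Return (resource address, attribute name) pairs that look like stored secrets."""
    findings = []
    for resource in state.get("resources", []):
        address = f"{resource.get('type')}.{resource.get('name')}"
        for instance in resource.get("instances", []):
            for key, value in (instance.get("attributes") or {}).items():
                looks_secret = any(word in key.lower() for word in SENSITIVE_KEYWORDS)
                if looks_secret and isinstance(value, str) and value:
                    findings.append((address, key))
    return findings


# Hand-written fragment shaped like a v4 state file (hypothetical values)
SAMPLE_STATE = {
    "version": 4,
    "resources": [
        {
            "mode": "managed",
            "type": "aws_db_instance",
            "name": "production",
            "instances": [
                {
                    "attributes": {
                        "identifier": "prod-postgres",
                        "username": "dbadmin",
                        "password": "SuperSecret123!",
                    }
                }
            ],
        }
    ],
}

if __name__ == "__main__":
    for address, attribute in find_plaintext_secrets(SAMPLE_STATE):
        print(f"{address}: attribute '{attribute}' is stored in plaintext")
```

In practice the input would come from json.load on a copy of the state file pulled from the backend; the check is deliberately name-based, so it will miss secrets stored under neutral attribute names.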
The Git history problem compounds this. A developer hardcodes access_key = "AKIAIOSFODNN7EXAMPLE" in a provider.tf, realizes the mistake, removes it, and commits the fix. The key is gone from the current file, but it remains in Git history in every commit before the deletion unless the history is rewritten. git log -p -- provider.tf retrieves it in seconds. Tools like TruffleHog, gitleaks, and git-secrets scan for exactly this pattern, and so do automated scanners that continuously index public GitHub repositories. The 2016 Uber breach began with AWS credentials stored in a private GitHub repository that attackers reached using compromised developer logins.
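The history problem can be illustrated with a short sketch that scans the text produced by git log -p for access key IDs on added or removed diff lines. It assumes the default git log -p layout (a "commit <sha>" header followed by unified diff lines); the function name and sample patch are hypothetical, and dedicated scanners such as gitleaks cover far more secret types.

```python
import re

# AWS access key IDs: AKIA (long-term) or ASIA (temporary) followed by 16 characters
ACCESS_KEY_RE = re.compile(r"\b(?:AKIA|ASIA)[0-9A-Z]{16}\b")


def leaked_keys_in_patch(patch_text: str) -> list:
    """Return (commit, key) pairs for keys found on added or removed diff lines."""
    findings = []
    commit = "unknown"
    for line in patch_text.splitlines():
        if line.startswith("commit "):
            commit = line.split()[1]
        elif line.startswith(("+++", "---")):
            continue  # file headers, not content
        elif line.startswith(("+", "-")):
            for key in ACCESS_KEY_RE.findall(line):
                findings.append((commit, key))
    return findings


# Hypothetical output of: git log -p -- provider.tf
SAMPLE_PATCH = """commit abc1234
Author: dev <dev@example.com>

    Remove hardcoded key

--- a/provider.tf
+++ b/provider.tf
-  access_key = "AKIAIOSFODNN7EXAMPLE"
+  # credentials now come from the environment
"""

if __name__ == "__main__":
    for commit, key in leaked_keys_in_patch(SAMPLE_PATCH):
        print(f"{key} is recoverable from commit {commit}")
```

A removed line ("-") is reported just like an added one, which is the point: deleting the key in a later commit does not remove it from the repository.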
The second state-specific anti-pattern is storing terraform.tfstate in a Git repository or in an unprotected S3 bucket. State defaults to local storage (terraform.tfstate in the working directory), which gets committed by developers who never configure a remote backend. When the state backend is S3, the bucket is frequently configured without server-side encryption, versioning, or access logging, and without state locking via DynamoDB. An attacker who gains read access to that bucket gets every secret in your infrastructure with a single GetObject call.
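One inexpensive guard against committed state is a check over the list of files staged for commit. The sketch below is a minimal version of that idea; the pattern list and function name are assumptions for illustration, and in a real hook the paths would come from git diff --cached --name-only.

```python
from fnmatch import fnmatch

# File name patterns that indicate local Terraform state (illustrative list)
STATE_PATTERNS = ("*.tfstate", "*.tfstate.*")


def blocked_paths(paths):
    """Return the paths whose file name matches a Terraform state pattern."""
    blocked = []
    for path in paths:
        file_name = path.rsplit("/", 1)[-1]
        if any(fnmatch(file_name, pattern) for pattern in STATE_PATTERNS):
            blocked.append(path)
    return blocked


if __name__ == "__main__":
    staged = [
        "main.tf",
        "terraform.tfstate",
        "envs/prod/terraform.tfstate.backup",
        "README.md",
    ]
    for path in blocked_paths(staged):
        print(f"refusing to commit state file: {path}")
```

This complements, rather than replaces, a .gitignore entry and a remote backend: it only stops new commits and does nothing about state already in history.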
Attack Flow: From Exposed State to Full Account Compromise
Anti-Pattern 1: Hardcoded Credentials in Provider and Resource Blocks
# ❌ ANTI-PATTERN: Hardcoded AWS credentials in provider block
# Found in production code from a fintech startup in 2024 IR engagement
# These credentials survive in git history even after the commit is reverted
terraform {
required_providers {
aws = { source = "hashicorp/aws", version = "~> 5.0" }
}
}
provider "aws" {
region = "us-east-1"
access_key = "AKIAIOSFODNN7EXAMPLE" # Hardcoded: stays in git history after removal
secret_key = "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY" # Same
}
resource "aws_db_instance" "production" {
identifier = "prod-postgres"
engine = "postgres"
instance_class = "db.t3.medium"
allocated_storage = 100
# ❌ Password hardcoded: also written to tfstate in plaintext
username = "dbadmin"
password = "SuperSecret123!" # Recoverable from tfstate AND git history
# ❌ Publicly accessible: addressed in Section 2
publicly_accessible = true
}
resource "aws_instance" "bastion" {
ami = "ami-0c55b159cbfafe1f0"
instance_type = "t3.micro"
# ❌ Key material embedded in user_data: committed to Git, stored in tfstate,
# and readable from the instance metadata service by any process on the host
user_data = <<-EOF
#!/bin/bash
echo "ssh-rsa AAAAB3NzaC1yc2EAAA...PRIVATE_KEY_MATERIAL" >> /root/.ssh/authorized_keys
EOF
}
# ✅ FIX: Use provider-level credential resolution chain + AWS Secrets Manager
# Credentials come from: instance profile, environment variables, or ~/.aws/credentials
# Never from .tf files
provider "aws" {
region = var.aws_region
# No access_key or secret_key here; the default credential chain applies:
# 1. Environment variables (AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY)
# 2. Shared credentials file (~/.aws/credentials)
# 3. EC2/ECS/Lambda instance profile (preferred in CI/CD)
# 4. EKS service account IRSA token
}
# RDS password: generate randomly and store in Secrets Manager, never in .tf source
resource "random_password" "db_password" {
length = 32
special = true
override_special = "!#$%&*()-_=+[]{}<>:?"
# This value IS stored in tfstate, so an encrypted, access-controlled state backend is still required
}
resource "aws_secretsmanager_secret" "db_password" {
name = "prod/rds/postgres/master-password"
recovery_window_in_days = 7
# KMS encryption for the secret itself
kms_key_id = aws_kms_key.secrets.arn
}
resource "aws_secretsmanager_secret_version" "db_password" {
secret_id = aws_secretsmanager_secret.db_password.id
secret_string = jsonencode({
username = "dbadmin"
password = random_password.db_password.result
})
}
resource "aws_db_instance" "production" {
identifier = "prod-postgres"
engine = "postgres"
instance_class = "db.t3.medium"
allocated_storage = 100
username = "dbadmin"
# Reference the random password: the value is in tfstate but not in source code
# Rotate via Secrets Manager rotation Lambda, not terraform apply
password = random_password.db_password.result
# random_password.result is marked sensitive, which masks it in terraform plan/apply output
# NOTE: this does NOT prevent writing to tfstate; state encryption is the control
lifecycle {
ignore_changes = [password] # Prevents Terraform from resetting rotated passwords
}
}
Checkov Policy: Block Hardcoded Credentials
# checkov/custom_checks/check_hardcoded_aws_credentials.py
# Custom Checkov check: place it in a directory passed to Checkov via --external-checks-dir
# Runs against every provider "aws" block in your repository
from checkov.common.models.enums import CheckCategories, CheckResult
from checkov.terraform.checks.provider.base_check import BaseProviderCheck


class HardcodedAWSCredentials(BaseProviderCheck):
    """
    Detects hardcoded AWS access keys and secret keys in provider blocks.
    CKV_CUSTOM_1: No hardcoded AWS credentials in provider configuration.
    """

    def __init__(self):
        name = "Ensure no hardcoded AWS credentials in provider block"
        id = "CKV_CUSTOM_1"
        # Provider blocks are scanned by provider checks, not resource checks
        supported_provider = ["aws"]
        categories = [CheckCategories.SECRETS]
        super().__init__(name=name, id=id,
                         categories=categories,
                         supported_provider=supported_provider)

    def scan_provider_conf(self, conf):
        for attribute in ("access_key", "secret_key"):
            value = conf.get(attribute, [""])
            # Checkov wraps attribute values in a single-element list
            if isinstance(value, list):
                value = value[0] if value else ""
            value = str(value or "").strip()
            if value in ("", "null"):
                continue
            # Variable references are acceptable; they may be rendered as var.x or ${var.x}
            if value.startswith(("var.", "${var.")):
                continue
            # Any other literal value is a hardcoded credential
            return CheckResult.FAILED
        return CheckResult.PASSED


# Checkov registers a check when the class is instantiated at import time
check = HardcodedAWSCredentials()
# Built-in Checkov checks for credential patterns (run these by default):
# CKV_AWS_41: Ensure no hard-coded credentials exist in aws provider
# CKV_SECRET_*: Checkov Secrets scanning (Bridgecrew Secrets module)
# tfsec equivalent: .tfsec/config.yml holds per-repo tfsec configuration (tfsec reads YAML or JSON)
# tfsec ships with built-in checks for hardcoded secrets
# Run tfsec with machine-readable output for CI:
#   --include-passed          show passed checks too
#   --format json             machine-readable output for CI
#   --minimum-severity HIGH   only report HIGH and CRITICAL findings
tfsec . \
  --include-passed \
  --format json \
  --out tfsec-results.json \
  --minimum-severity HIGH \
  --config-file .tfsec/config.yml
# .tfsec/config.yml: configure which checks to enforce
severity_overrides:
  # Elevate these to CRITICAL (fail the pipeline)
  aws-iam-no-policy-wildcards: CRITICAL
  aws-s3-no-public-access-with-acl: CRITICAL
  general-secrets-sensitive-in-variable: CRITICAL
exclude: []
  # Exclude checks only with documented justification, never for convenience
  # - aws-vpc-no-public-ingress-sg  # Bastion host exception tracked in ticket SEC-1234
# Gitleaks: scan git history for committed secrets (run in CI on every PR)
# Catches secrets removed in subsequent commits but present in history
#   --source .                       scan the repository in the current directory
#   --log-opts "origin/main..HEAD"   only scan commits in this PR, not full history
gitleaks detect \
  --source . \
  --verbose \
  --report-format json \
  --report-path gitleaks-report.json \
  --log-opts "origin/main..HEAD"
# For full history scan (run once on existing repos):
gitleaks detect --source . --log-opts "--all"
# .gitleaks.toml: custom rules for Terraform-specific secrets
[[rules]]
description = "Terraform tfvars password"
id = "terraform-password-variable"
regex = '''(?i)(password|passwd|pwd|secret)\s*=\s*["'][^"']{8,}["']'''
path = '''.*\.tfvars$'''
tags = ["terraform", "secret"]
[[rules]]
description = "Terraform state file with sensitive data"
id = "terraform-state-secret"
regex = '''"(password|secret_key|private_key|token)"\s*:\s*"[^"]{8,}"'''
path = '''.*\.tfstate.*'''
tags = ["terraform", "state"]
KQL: Detect Terraform State File Access in AWS
// KQL: Detect reads of Terraform state S3 buckets from unexpected principals
// Source: AWS CloudTrail forwarded to Sentinel (S3 data events must be enabled on the trail)
// Terraform state buckets should only be accessed by CI/CD service accounts and Terraform runners
AWSCloudTrail
| where TimeGenerated > ago(7d)
// GetObject on a state file means reading every secret it contains
| where EventName == "GetObject"
// Filter to known state bucket names (maintain as a Watchlist)
| where RequestParameters has_any (toscalar(
    _GetWatchlist('TerraformStateBuckets')
    | summarize make_list(BucketName)
))
| extend
    CallerPrincipal = tostring(UserIdentityArn),
    CallerType = tostring(UserIdentityType),
    SourceIP = tostring(SourceIpAddress)
// Flag access from anything that isn't your CI/CD service account or Terraform runner role
| where CallerPrincipal !contains "terraform-ci-role"
    and CallerPrincipal !contains "github-actions"
    and CallerPrincipal !contains "gitlab-runner"
    and CallerType != "AWSService"
| project TimeGenerated, CallerPrincipal, CallerType, SourceIP,
    EventName, RequestParameters, ResponseElements
| order by TimeGenerated desc
Secret Injection Method Comparison
| Method | Secrets in .tf Source? | Secrets in tfstate? | Rotation Support | CI/CD Compatible | Recommended For |
|---|---|---|---|---|---|
| Hardcoded in .tf / .tfvars | Yes (critical risk) | Yes | Manual only | No | Never |
| TF_VAR_* environment variable | No | Yes | Manual | Yes (inject at CI runtime) | Low-sensitivity vars only |
| aws_secretsmanager_secret data source | No | No when only the ARN is referenced; yes if the value is read through aws_secretsmanager_secret_version | Yes (Lambda rotation) | Yes | All production secrets |
| HashiCorp Vault provider | No | Yes if read through a data source (a short lease TTL limits exposure) | Yes (Vault lease TTL) | Yes (Vault agent sidecar) | Secrets requiring audit trail |
| random_password + sensitive = true | No | Yes (plaintext in tfstate) | Via terraform taint | Yes | Only with encrypted state backend |
| AWS SSM Parameter Store data source | No | No when only the parameter name is referenced; yes if the value is read | Yes (manual or Lambda) | Yes | Mid-sensitivity configuration |
Fixing how secrets enter Terraform eliminates one attack vector, but it does nothing about the network-level exposure that's configured by the resources Terraform creates. The next anti-pattern class is the one that most cloud IR engagements still open with: a security group with port 22 open to 0.0.0.0/0 that was created during an incident three years ago and never closed.
Section 2: Overpermissive Network Rules: Why 0.0.0.0/0 on Port 22 Is Still the #1 Finding After a Decade of Cloud Security
Security group misconfiguration is the oldest cloud security finding and the one that keeps reappearing in production Terraform code because it is the path of least resistance during development. When something doesn't connect, the debugging reflex is to open the port wider. When it connects after opening to 0.0.0.0/0, the debug change gets committed as the "fix." Ransomware operators routinely gain initial access through internet-facing RDP on port 3389, which is so common that it blends into background noise. Shodan continuously indexes ports 22 and 3389, and automated credential-stuffing bots probe exposed endpoints within minutes of a new public IP appearing.
The specific Terraform anti-patterns are: ingress rules in aws_security_group resources with cidr_blocks = ["0.0.0.0/0"] on administrative ports (22, 3389, 1433, 3306, 5432, 6379, 27017), aws_db_instance resources with publicly_accessible = true, and aws_s3_bucket_public_access_block either absent or with any of the four block flags set to false. Each of these creates a distinct attack surface, and each has a different detection profile in CloudTrail.
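The first of these patterns can also be caught from plan output rather than source. The sketch below inspects the JSON produced by terraform show -json on a saved plan and reports security groups with internet-sourced ingress on administrative ports. It assumes the documented resource_changes layout with inline ingress blocks; the function name and the sample plan are illustrative assumptions, and standalone rule resources are not handled.

```python
ADMIN_PORTS = {22, 3389, 1433, 3306, 5432, 6379, 27017}
PUBLIC_CIDRS = {"0.0.0.0/0", "::/0"}


def public_admin_ingress(plan: dict) -> list:
    """Return (resource address, port) pairs exposed to the internet."""
    findings = []
    for change in plan.get("resource_changes", []):
        if change.get("type") != "aws_security_group":
            continue
        after = (change.get("change") or {}).get("after") or {}
        for rule in after.get("ingress") or []:
            sources = set(rule.get("cidr_blocks") or [])
            sources |= set(rule.get("ipv6_cidr_blocks") or [])
            if not sources & PUBLIC_CIDRS:
                continue
            low = rule.get("from_port") or 0
            high = rule.get("to_port") or 0
            all_traffic = rule.get("protocol") == "-1"
            for port in sorted(ADMIN_PORTS):
                if all_traffic or low <= port <= high:
                    findings.append((change["address"], port))
    return findings


# Hand-written fragment shaped like `terraform show -json plan.out` (hypothetical)
SAMPLE_PLAN = {
    "resource_changes": [
        {
            "address": "aws_security_group.web_server",
            "type": "aws_security_group",
            "change": {"after": {"ingress": [
                {"from_port": 22, "to_port": 22, "protocol": "tcp",
                 "cidr_blocks": ["0.0.0.0/0"], "ipv6_cidr_blocks": []},
                {"from_port": 443, "to_port": 443, "protocol": "tcp",
                 "cidr_blocks": ["0.0.0.0/0"], "ipv6_cidr_blocks": []},
            ]}},
        },
        {
            "address": "aws_security_group.database",
            "type": "aws_security_group",
            "change": {"after": {"ingress": [
                {"from_port": 5432, "to_port": 5432, "protocol": "tcp",
                 "cidr_blocks": ["10.0.0.0/16"], "ipv6_cidr_blocks": []},
            ]}},
        },
    ]
}

if __name__ == "__main__":
    for address, port in public_admin_ingress(SAMPLE_PLAN):
        print(f"{address}: port {port} is open to the internet")
```

Checking the plan has one advantage over scanning source: variables and module outputs are already resolved, so a CIDR passed in through a variable is evaluated as the value that would actually be applied.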
The publicly_accessible = true flag on RDS is particularly dangerous because it appears in many tutorial configurations and is easy to carry into production. When set, the instance endpoint resolves to a public IP address, so reachability depends entirely on subnet routing and the security group; a single permissive ingress rule then exposes the database to the internet. Attackers who obtain the RDS endpoint (from tfstate, from DNS enumeration, or from an exposed error message) can attempt connections directly, bypassing any VPN or bastion host requirement. The Capital One 2019 breach shows how such misconfigurations chain together: an SSRF flaw in a misconfigured web application firewall reached the instance metadata service, and the role credentials obtained there were used to read data from S3.
Anti-Pattern Network Exposure Decision Tree
Anti-Pattern 3 & 4: Open Security Groups and Public RDS
# ❌ ANTI-PATTERN 3: Security group open to internet on administrative ports
# Observed verbatim in startup production environments during 2023-2024 IR engagements
resource "aws_security_group" "web_server" {
name = "web-server-sg"
description = "Web server security group"
vpc_id = aws_vpc.main.id
# ❌ SSH open to entire internet: indexed by Shodan within minutes
ingress {
from_port = 22
to_port = 22
protocol = "tcp"
cidr_blocks = ["0.0.0.0/0"] # Every IP on the internet
description = "SSH access" # No description of WHY this is open
}
# ❌ All outbound traffic: standard, but worth reviewing for data exfiltration risk
egress {
from_port = 0
to_port = 0
protocol = "-1"
cidr_blocks = ["0.0.0.0/0"]
}
}
resource "aws_security_group" "database" {
name = "db-sg"
vpc_id = aws_vpc.main.id
# ❌ Database port open to internet, not just VPC internal
ingress {
from_port = 5432
to_port = 5432
protocol = "tcp"
cidr_blocks = ["0.0.0.0/0"] # PostgreSQL accessible from anywhere
}
}
# ❌ ANTI-PATTERN 4: RDS publicly accessible
resource "aws_db_instance" "production" {
identifier = "prod-db"
engine = "postgres"
instance_class = "db.t3.medium"
allocated_storage = 100
publicly_accessible = true # ❌ DNS name resolves to a public IP
db_subnet_group_name = aws_db_subnet_group.public.name # ❌ In public subnets
vpc_security_group_ids = [aws_security_group.database.id]
skip_final_snapshot = true # ❌ No backup on destroy
}
# ✅ FIX: Principle of least network access: restrict every ingress to the minimum required source
resource "aws_security_group" "web_server" {
name = "web-server-sg"
description = "Web server: HTTPS from ALB only, no direct SSH"
vpc_id = aws_vpc.main.id
# ✅ HTTPS from Application Load Balancer security group only, not from the internet directly
ingress {
from_port = 443
to_port = 443
protocol = "tcp"
security_groups = [aws_security_group.alb.id] # Reference SG, not CIDR
description = "HTTPS from ALB only"
}
# ✅ SSH removed entirely: use AWS Systems Manager Session Manager instead
# SSM Session Manager: no open port 22, full session logging to CloudWatch/S3
# aws ssm start-session --target i-0123456789abcdef0
# No security group rule needed: the SSM agent connects outbound to the SSM endpoint
# ✅ Egress: restrict to only what the application needs
egress {
from_port = 443
to_port = 443
protocol = "tcp"
cidr_blocks = ["0.0.0.0/0"]
description = "HTTPS outbound for AWS API calls and package updates"
}
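egress {
from_port = 5432
to_port = 5432
protocol = "tcp"
# CIDR instead of the database SG ID: db-sg already references this group in its
# ingress rule, and two inline rules referencing each other create a dependency cycle
cidr_blocks = [aws_vpc.main.cidr_block]
description = "PostgreSQL to database tier inside the VPC"
}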
}
resource "aws_security_group" "database" {
name = "db-sg"
description = "Database: accepts connections from app tier SG only"
vpc_id = aws_vpc.main.id
# ✅ Database accessible only from application server security group
# No CIDR block SG reference means only instances in the app SG can connect
ingress {
from_port = 5432
to_port = 5432
protocol = "tcp"
security_groups = [aws_security_group.web_server.id]
description = "PostgreSQL from app tier only"
}
# ✅ No egress block: databases don't initiate outbound connections
# Terraform removes the default allow-all egress rule when no egress block is declared
}
resource "aws_db_instance" "production" {
identifier = "prod-db"
engine = "postgres"
instance_class = "db.t3.medium"
allocated_storage = 100
publicly_accessible = false # ✅ Endpoint resolves to a private IP only
db_subnet_group_name = aws_db_subnet_group.private.name # ✅ Private subnets only
vpc_security_group_ids = [aws_security_group.database.id]
deletion_protection = true # ✅ Prevents accidental destroy
skip_final_snapshot = false # ✅ Backup on destroy
# ✅ Encryption at rest: covered in Section 5
storage_encrypted = true
kms_key_id = aws_kms_key.rds.arn
}
tfsec and Checkov Rules: Enforce Network Security
# Run tfsec specifically for network security findings
tfsec . \
  --filter-results aws-ec2-no-public-ingress-sgr,aws-rds-no-public-db-access,aws-ec2-no-public-ip-subnet \
  --format lovely \
  --minimum-severity HIGH
# Checkov equivalent: targeted network checks
checkov -d . \
  --check CKV_AWS_24,CKV_AWS_25,CKV_AWS_23,CKV_AWS_17,CKV_AWS_88 \
  --output cli \
  --compact
# CKV_AWS_24: Ensure no security groups allow ingress from 0.0.0.0:0 to port 22
# CKV_AWS_25: Ensure no security groups allow ingress from 0.0.0.0:0 to port 3389
# CKV_AWS_23: Ensure every security group and rule has a description
# CKV_AWS_17: Ensure RDS is not publicly accessible
# CKV_AWS_88: Ensure EC2 instances do not have a public IP
# Custom Checkov check: detect ANY 0.0.0.0/0 ingress on non-web ports
# Catches database ports, Redis, MongoDB, Elasticsearch, and others
from checkov.common.models.enums import CheckCategories, CheckResult
from checkov.terraform.checks.resource.base_resource_check import BaseResourceCheck

# Ports that should NEVER be open to 0.0.0.0/0
SENSITIVE_PORTS = {
    22,     # SSH
    23,     # Telnet
    25,     # SMTP (outbound spam risk)
    3389,   # RDP
    1433,   # MSSQL
    3306,   # MySQL
    5432,   # PostgreSQL
    6379,   # Redis (no auth by default)
    27017,  # MongoDB (no auth by default)
    9200,   # Elasticsearch
    9300,   # Elasticsearch transport
    2379,   # etcd
    2380,   # etcd peer
}


def _flatten(value):
    """Checkov wraps attribute values in lists; flatten any nesting into one list."""
    if isinstance(value, list):
        return [item for element in value for item in _flatten(element)]
    return [] if value is None else [value]


def _port(rule, key):
    """Return the port as an int, or None when unset or an unresolved variable."""
    values = _flatten(rule.get(key))
    try:
        return int(values[0]) if values else None
    except (TypeError, ValueError):
        return None


class NoPublicIngressSensitivePorts(BaseResourceCheck):
    """
    CKV_CUSTOM_2: No 0.0.0.0/0 ingress on sensitive ports.
    Catches ports not covered by built-in CKV_AWS_24/25.
    """

    def __init__(self):
        name = "Ensure no security group allows ingress from internet on sensitive ports"
        id = "CKV_CUSTOM_2"
        supported_resources = ["aws_security_group", "aws_security_group_rule"]
        categories = [CheckCategories.NETWORKING]
        super().__init__(name=name, id=id,
                         categories=categories,
                         supported_resources=supported_resources)

    def scan_resource_conf(self, conf):
        if "ingress" in conf:
            # aws_security_group: one dict per inline ingress block
            rules = [r for r in _flatten(conf["ingress"]) if isinstance(r, dict)]
        elif "ingress" in _flatten(conf.get("type")):
            # aws_security_group_rule: the resource itself is the rule
            rules = [conf]
        else:
            rules = []
        for rule in rules:
            sources = _flatten(rule.get("cidr_blocks")) + _flatten(rule.get("ipv6_cidr_blocks"))
            if "0.0.0.0/0" not in sources and "::/0" not in sources:
                continue
            # Protocol -1 (all traffic) from the internet always fails
            if any(str(p) in ("-1", "all") for p in _flatten(rule.get("protocol"))):
                return CheckResult.FAILED
            from_port, to_port = _port(rule, "from_port"), _port(rule, "to_port")
            if from_port is None or to_port is None:
                continue
            # Fail if any sensitive port falls in the from_port:to_port range
            if any(from_port <= port <= to_port for port in SENSITIVE_PORTS):
                return CheckResult.FAILED
        return CheckResult.PASSED


# Checkov registers a check when the class is instantiated at import time
check = NoPublicIngressSensitivePorts()
KQL: CloudTrail Detection of Security Group Changes Opening Internet Access
// KQL: Detect security group rules being added with 0.0.0.0/0 source
// Source: AWS CloudTrail forwarded to Microsoft Sentinel
AWSCloudTrail
| where TimeGenerated > ago(1d)
| where EventName in ("AuthorizeSecurityGroupIngress", "CreateSecurityGroup",
"ModifyNetworkInterfaceAttribute")
| extend RequestParams = parse_json(RequestParameters)
| extend IpPermissions = RequestParams.ipPermissions
| mv-expand IpPermission = IpPermissions.items
| extend
    FromPort = toint(IpPermission.fromPort),
    ToPort = toint(IpPermission.toPort),
    Protocol = tostring(IpPermission.ipProtocol),
    // IPv4 ranges are under ipRanges, IPv6 ranges under ipv6Ranges; only the first entry of each is inspected
    CidrRange = coalesce(
        tostring(IpPermission.ipRanges.items[0].cidrIp),
        tostring(IpPermission.ipv6Ranges.items[0].cidrIpv6)),
    GroupId = tostring(RequestParams.groupId)
// Flag: internet-sourced ingress
| where CidrRange in ("0.0.0.0/0", "::/0")
// Further flag sensitive ports; all traffic (-1) is automatically critical
| extend IsSensitivePort = case(
Protocol == "-1", true,
FromPort <= 22 and ToPort >= 22, true,
FromPort <= 3389 and ToPort >= 3389, true,
FromPort <= 3306 and ToPort >= 3306, true,
FromPort <= 5432 and ToPort >= 5432, true,
FromPort <= 6379 and ToPort >= 6379, true,
false
)
| extend Severity = iif(IsSensitivePort, "CRITICAL", "HIGH")
| extend CallerArn = tostring(UserIdentityArn)
| project TimeGenerated, CallerArn, EventName, GroupId,
Protocol, FromPort, ToPort, CidrRange, Severity
| order by Severity asc, TimeGenerated desc
Security Group Misconfiguration Risk Matrix
| Port / Protocol | Service | Risk if Open to 0.0.0.0/0 | Automated Attack Time | Observed in IR Engagements |
|---|---|---|---|---|
| 22 (SSH) | Secure Shell | Critical: credential brute force, key theft | Minutes (Shodan bots) | Nearly every cloud IR engagement |
| 3389 (RDP) | Remote Desktop | Critical: BlueKeep (CVE-2019-0708), NLA bypass, credential stuffing | Minutes | Ransomware intrusions, healthcare sector breaches 2023 |
| 5432 / 3306 / 1433 | PostgreSQL / MySQL / MSSQL | Critical: direct DB query if credentials obtained | Hours (after cred harvest) | Multiple 2024 fintech incidents |
| 6379 (Redis) | Redis Cache | Critical: no auth by default in Redis < 6; RCE via module load | Minutes | Numerous exposed Redis campaigns 2022–2024 |
| 9200 (Elasticsearch) | Elasticsearch | Critical: no auth by default in OSS version; full data read | Minutes | Elasticsearch exposure is a perennial data leak source |
| 443 (HTTPS) | Web traffic | Low if intentional load balancer; High if direct-to-app | N/A (expected for public endpoints) | Only flag if the target is not a load balancer |
| 0–65535 (protocol -1) | All traffic | Critical: complete network exposure | Immediate | Developer "fix everything" debug change left in prod |
Open ports are the most detectable anti-pattern because they appear in CloudTrail and can be scanned from outside. The next anti-pattern class is subtler: S3 bucket misconfigurations where the exposure comes from the intersection of three independent settings (ACLs, bucket policies, and the public access block) that can contradict each other in ways that create unexpected public access even when each setting appears correct in isolation.
Section 3: S3 Misconfiguration: Why Three Overlapping Access Control Layers Produce Unexpected Public Buckets Even When Each Layer Looks Correct
S3 access control has three independent layers that must all be configured correctly, and they interact in non-obvious ways. The first layer is the bucket ACL, a legacy mechanism that predates bucket policies and is still present in Terraform as acl = "public-read" or acl = "private". The second is the bucket policy, a JSON resource policy attached to the bucket that can grant s3:GetObject to Principal: "*", making all objects public regardless of object ACLs. The third is the S3 Block Public Access setting: four independent Boolean flags that override ACLs and policies, all of which must be true to fully block public access. The misconfiguration trap is that setting block_public_acls = true does not block a public bucket policy; you need block_public_policy = true to reject new public policies and restrict_public_buckets = true to neutralize existing ones. Terraform code that adds aws_s3_bucket_public_access_block often gets only one or two of the four flags right.
AWS introduced S3 Block Public Access in 2018 as a circuit breaker after a wave of public-bucket breaches; Terraform exposes it as aws_s3_bucket_public_access_block. The four flags work as follows. block_public_acls rejects requests that try to set a new public ACL on the bucket or its objects; it does not change how existing ACLs are evaluated. ignore_public_acls makes S3 ignore all public ACLs already present on the bucket and its objects, which is separate from blocking new ones. block_public_policy rejects attempts to attach a bucket policy that grants public access. restrict_public_buckets is the one that enforces restriction on existing policies: when a bucket has a public policy, access is limited to AWS service principals and authorized users in the bucket owner's account, even if block_public_policy is false. If you set only block_public_acls = true and block_public_policy = true but leave ignore_public_acls = false and restrict_public_buckets = false, an existing public ACL still grants public access because it is not being ignored. This combination shows up frequently in Terraform code that was ported from console settings one flag at a time.
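The interaction described above can be written down as a small truth function. This is a deliberately simplified model for reasoning about the flags, not a reimplementation of S3's authorization logic: it only considers whether an existing public ACL or an existing public bucket policy still takes effect, and the class and function names are assumptions made for this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PublicAccessBlock:
    block_public_acls: bool = False        # rejects NEW public ACLs
    block_public_policy: bool = False      # rejects NEW public bucket policies
    ignore_public_acls: bool = False       # neutralizes EXISTING public ACLs
    restrict_public_buckets: bool = False  # neutralizes EXISTING public policies


def publicly_readable(pab: PublicAccessBlock,
                      has_public_acl: bool,
                      has_public_policy: bool) -> bool:
    """Simplified model: does an anonymous reader still get access?"""
    acl_grants = has_public_acl and not pab.ignore_public_acls
    policy_grants = has_public_policy and not pab.restrict_public_buckets
    return acl_grants or policy_grants


if __name__ == "__main__":
    # Only the two "block" flags are set, as in the incomplete configuration
    incomplete = PublicAccessBlock(block_public_acls=True, block_public_policy=True)
    complete = PublicAccessBlock(True, True, True, True)

    print(publicly_readable(incomplete, has_public_acl=True, has_public_policy=False))
    print(publicly_readable(complete, has_public_acl=True, has_public_policy=True))
```

The model makes the asymmetry visible: the two block_* flags never appear in the return expression, because they only stop future changes and do nothing about grants that already exist.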
The encryption and logging gaps compound the ACL issue. Default encryption with aws:kms is not applied retroactively: objects written before the default was configured keep whatever encryption they had at write time and are not re-encrypted with your KMS key when Terraform starts managing the bucket. S3 access logging, when enabled, is frequently pointed at the same bucket it monitors, which means the logs can be deleted along with the objects and may be publicly readable if the bucket itself is misconfigured. The correct pattern is a dedicated, locked-down logging bucket with a separate access policy.
S3 Access Control Layer Interaction
Anti-Pattern 5 & 6: Public S3 Bucket Configuration
# ❌ ANTI-PATTERN 5 & 6: Multiple S3 misconfigurations in a single bucket resource
# This pattern was found in a healthcare data platform during a 2024 compliance audit
resource "aws_s3_bucket" "data_lake" {
bucket = "company-data-lake-prod"
# ❌ acl argument: deprecated in provider v4+ but still valid and dangerous
acl = "public-read" # Makes ALL objects readable by anyone
}
# ❌ Public access block present but INCOMPLETE: only blocks new public ACLs
# Does NOT ignore existing public ACLs, does NOT restrict public bucket policies
resource "aws_s3_bucket_public_access_block" "data_lake" {
bucket = aws_s3_bucket.data_lake.id
block_public_acls = true # ✅ Blocks new public ACLs
block_public_policy = false # ❌ Allows public bucket policies
ignore_public_acls = false # ❌ Existing public-read ACL still active
restrict_public_buckets = false # ❌ Public bucket policies still enforced
}
# ❌ No encryption configuration: objects are not encrypted with a customer-managed KMS key
# ❌ No versioning: objects can be deleted without recovery
# ❌ No access logging: no record of who read what
# ❌ ANTI-PATTERN: Bucket policy that makes the bucket public
resource "aws_s3_bucket_policy" "data_lake" {
bucket = aws_s3_bucket.data_lake.id
policy = jsonencode({
Version = "2012-10-17"
Statement = [
{
Sid = "PublicReadGetObject"
Effect = "Allow"
Principal = "*" # ❌ All principals: any internet user
Action = "s3:GetObject"
Resource = "${aws_s3_bucket.data_lake.arn}/*"
# No Condition block: no IP restriction, no VPC endpoint restriction
}
]
})
}
# ✅ FIX: Complete S3 hardening: all four public access block flags, encryption, logging
# Dedicated logging bucket, separate from the data bucket it monitors
resource "aws_s3_bucket" "access_logs" {
bucket = "company-s3-access-logs"
force_destroy = false
}
resource "aws_s3_bucket_public_access_block" "access_logs" {
bucket = aws_s3_bucket.access_logs.id
block_public_acls = true
block_public_policy = true
ignore_public_acls = true
restrict_public_buckets = true
}
resource "aws_s3_bucket_server_side_encryption_configuration" "access_logs" {
bucket = aws_s3_bucket.access_logs.id
rule {
apply_server_side_encryption_by_default {
sse_algorithm = "aws:kms"
kms_master_key_id = aws_kms_key.s3_logs.arn
}
bucket_key_enabled = true # Reduces KMS API call costs significantly
}
}
# Main data bucket: fully hardened
resource "aws_s3_bucket" "data_lake" {
bucket = "company-data-lake-prod"
force_destroy = false # ✅ Prevents accidental destruction
}
# ✅ ALL FOUR flags set to true: complete public access prevention
resource "aws_s3_bucket_public_access_block" "data_lake" {
bucket = aws_s3_bucket.data_lake.id
block_public_acls = true # Block new public ACLs
block_public_policy = true # Block new public bucket policies
ignore_public_acls = true # Ignore existing public ACLs
restrict_public_buckets = true # Override existing public policies
}
# ✅ KMS encryption: all objects encrypted on write
resource "aws_s3_bucket_server_side_encryption_configuration" "data_lake" {
bucket = aws_s3_bucket.data_lake.id
rule {
apply_server_side_encryption_by_default {
sse_algorithm = "aws:kms"
kms_master_key_id = aws_kms_key.s3_data.arn
}
bucket_key_enabled = true
}
}
# ✅ Versioning: enables object recovery and MFA-Delete protection
resource "aws_s3_bucket_versioning" "data_lake" {
bucket = aws_s3_bucket.data_lake.id
versioning_configuration {
status = "Enabled"
# MFA delete requires out-of-band MFA confirmation to delete object versions
# Limits ransomware and insider deletion
# NOTE: enabling it requires the bucket owner's root credentials plus an MFA code
# (the resource's mfa argument); a normal CI role cannot turn it on
mfa_delete = "Enabled"
}
}
# ✅ Access logging: delivered to the separate logging bucket
resource "aws_s3_bucket_logging" "data_lake" {
bucket = aws_s3_bucket.data_lake.id
target_bucket = aws_s3_bucket.access_logs.id
target_prefix = "data-lake/"
}
# ✅ Bucket policy: restrict to VPC endpoint only, no public internet access
resource "aws_s3_bucket_policy" "data_lake" {
# This policy depends on the public access block being in place first
depends_on = [aws_s3_bucket_public_access_block.data_lake]
bucket = aws_s3_bucket.data_lake.id
policy = jsonencode({
Version = "2012-10-17"
Statement = [
{
Sid = "DenyNonVPCEndpointAccess"
Effect = "Deny"
Principal = "*"
Action = "s3:*"
Resource = [
aws_s3_bucket.data_lake.arn,
"${aws_s3_bucket.data_lake.arn}/*"
]
Condition = {
StringNotEquals = {
# Only allow access through the VPC endpoint
"aws:SourceVpce" = aws_vpc_endpoint.s3.id
}
}
},
{
Sid = "EnforceSSLOnly"
Effect = "Deny"
Principal = "*"
Action = "s3:*"
Resource = [
aws_s3_bucket.data_lake.arn,
"${aws_s3_bucket.data_lake.arn}/*"
]
Condition = {
Bool = { "aws:SecureTransport" = "false" }
}
}
]
})
}
Checkov: S3 Security Checks
# Full S3 security scan: run only the Checkov checks relevant to S3 resources.
# A backslash continuation cannot be followed by a comment, so the IDs are documented here:
#   CKV_AWS_18   S3 access logging enabled
#   CKV_AWS_19   S3 encryption enabled
#   CKV_AWS_20   S3 bucket not publicly readable via ACL
#   CKV_AWS_21   S3 versioning enabled
#   CKV_AWS_52   S3 MFA delete enabled
#   CKV_AWS_53   S3 block public ACLs
#   CKV_AWS_54   S3 block public policy
#   CKV_AWS_55   S3 ignore public ACLs
#   CKV_AWS_56   S3 restrict public buckets
#   CKV_AWS_145  S3 encryption uses KMS (not AES256)
#   CKV2_AWS_6   S3 bucket has a public access block
#   CKV2_AWS_62  S3 event notifications configured
checkov -d . --check CKV_AWS_18,CKV_AWS_19,CKV_AWS_20,CKV_AWS_21,CKV_AWS_52,CKV_AWS_53,CKV_AWS_54,CKV_AWS_55,CKV_AWS_56,CKV_AWS_145,CKV2_AWS_6,CKV2_AWS_62
// KQL: Detect the S3 public access block being weakened, or a bucket policy granting public access
// Source: AWSCloudTrail table (Microsoft Sentinel AWS connector)
AWSCloudTrail
| where TimeGenerated > ago(1d)
| where EventName in (
    "PutBucketPublicAccessBlock",     // Public access block being modified
    "DeleteBucketPublicAccessBlock",  // Public access block being removed
    "PutBucketPolicy",                // Bucket policy being set or changed
    "PutBucketAcl"                    // ACL being changed (collected, but not evaluated by the filters below)
  )
| extend RequestParams = parse_json(RequestParameters)
| extend BucketName = tostring(RequestParams.bucketName)
// For PutBucketPublicAccessBlock: flag if any of the four flags is being set to false
| extend PublicAccessConfig = RequestParams.PublicAccessBlockConfiguration
| extend IsWeakening = (
    EventName == "PutBucketPublicAccessBlock" and (
      tostring(PublicAccessConfig.BlockPublicAcls) =~ "false" or
      tostring(PublicAccessConfig.BlockPublicPolicy) =~ "false" or
      tostring(PublicAccessConfig.IgnorePublicAcls) =~ "false" or
      tostring(PublicAccessConfig.RestrictPublicBuckets) =~ "false"
    )
  ) or EventName == "DeleteBucketPublicAccessBlock"
// For PutBucketPolicy: flag if the policy contains Principal "*".
// This also matches Deny statements, so hardened policies will appear and need triage.
| extend PolicyDocument = tostring(RequestParams.bucketPolicy)
| extend HasPublicPrincipal = PolicyDocument contains '"Principal":"*"' or
    PolicyDocument contains '"Principal": "*"'
| where IsWeakening or HasPublicPrincipal
| extend CallerArn = UserIdentityArn
| project TimeGenerated, CallerArn, EventName, BucketName,
    IsWeakening, HasPublicPrincipal, PolicyDocument
| order by TimeGenerated desc
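For teams that process raw CloudTrail records outside a SIEM, the same decision logic can be sketched in Python. This is a minimal illustration, not a production detector: the function name `is_s3_exposure_event` is hypothetical, and the record layout (`eventName`, `requestParameters.PublicAccessBlockConfiguration`, `requestParameters.bucketPolicy`) is assumed to match the fields the query above reads. Unlike the string match in the query, it only flags `Allow` statements with a public principal and no `Condition`.

```python
def _as_list(value):
    """Policy fields may hold a single item or a list; normalize to a list."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


# The four public access block flags, as they appear in the request parameters
FLAGS = ("BlockPublicAcls", "BlockPublicPolicy", "IgnorePublicAcls", "RestrictPublicBuckets")


def is_s3_exposure_event(record: dict) -> bool:
    """Return True if a CloudTrail record weakens S3 public access protection."""
    name = record.get("eventName")
    params = record.get("requestParameters") or {}

    if name == "DeleteBucketPublicAccessBlock":
        return True

    if name == "PutBucketPublicAccessBlock":
        config = params.get("PublicAccessBlockConfiguration") or {}
        # A flag that is missing or false leaves that protection switched off
        return any(str(config.get(flag, False)).lower() != "true" for flag in FLAGS)

    if name == "PutBucketPolicy":
        policy = params.get("bucketPolicy") or {}
        for stmt in _as_list(policy.get("Statement")):
            principal = stmt.get("Principal")
            is_public = principal == "*" or (
                isinstance(principal, dict) and "*" in _as_list(principal.get("AWS"))
            )
            # Deny statements and conditioned Allows are not treated as exposure here
            if stmt.get("Effect") == "Allow" and is_public and not stmt.get("Condition"):
                return True

    return False
```

Skipping conditioned `Allow` statements keeps noise down but can hide a weak condition, so treat a negative result as "not obviously public" rather than "safe".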
S3 Security Control Coverage Matrix
| Control | Terraform Resource | Checkov Check | tfsec Check | Risk if Missing |
|---|---|---|---|---|
| Block public ACLs | aws_s3_bucket_public_access_block (block_public_acls) | CKV_AWS_53 | aws-s3-block-public-acls | ACL-based public access possible |
| Block public policy | aws_s3_bucket_public_access_block (block_public_policy) | CKV_AWS_54 | aws-s3-block-public-policy | Public bucket policy applicable |
| Ignore public ACLs | aws_s3_bucket_public_access_block (ignore_public_acls) | CKV_AWS_55 | aws-s3-ignore-public-acls | Existing public ACLs still enforced |
| Restrict public buckets | aws_s3_bucket_public_access_block (restrict_public_buckets) | CKV_AWS_56 | aws-s3-no-public-buckets | Public policies still enforced despite other flags |
| KMS encryption | aws_s3_bucket_server_side_encryption_configuration | CKV_AWS_145 | aws-s3-enable-bucket-encryption | Objects fall back to S3-managed keys: no key policy to restrict decryption and no per-key audit trail |
| Access logging | aws_s3_bucket_logging | CKV_AWS_18 | aws-s3-enable-bucket-logging | No record of who accessed what data; blind to exfiltration |
| Versioning | aws_s3_bucket_versioning | CKV_AWS_21 | aws-s3-enable-versioning | Deleted objects unrecoverable; ransomware and insider risk |
| MFA Delete | aws_s3_bucket_versioning (mfa_delete = Enabled) | CKV_AWS_52 | N/A | Object versions deletable without second factor |
The S3 anti-patterns have one thing in common with the IAM anti-patterns in the next section: they require understanding how multiple independent controls interact, rather than applying a single fix. IAM takes this further: a Terraform-managed IAM policy with a single * wildcard can be the only thing standing between an attacker who compromised a Lambda function and full account takeover.
Section 4: IAM Wildcard Policies and Trust Relationship Abuse How Terraform's Convenience Defaults Create Privilege Escalation Paths Into Your Account
IAM misconfigurations in Terraform fall into two categories that are often treated as the same problem but have completely different attack surfaces. The first is overpermissive action/resource wildcards in IAM policies: Action: "*" or Resource: "*" in a policy attached to a role or user. The second is overpermissive trust relationships on IAM roles, meaning who is allowed to call sts:AssumeRole on that role. Both appear constantly in Terraform code because developers copy the AWS documentation examples (which use * for clarity) and never tighten them. The consequences differ: wildcard actions let a compromised principal do anything in your account; wildcard trust relationships let any principal in your account (or any account) escalate to that role's permissions.
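The first category is easy to check mechanically. The sketch below walks an IAM policy document (the same JSON that `jsonencode` produces) and reports `Allow` statements with bare or service-wide wildcards. The function name `find_wildcard_statements` is hypothetical and the logic is deliberately simpler than what Checkov or tfsec do; it ignores `NotAction`, conditions, and partial wildcards such as `s3:Get*`.

```python
import json
from typing import Any


def _as_list(value: Any) -> list:
    """IAM accepts a single string or a list for Action/Resource; normalize to a list."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def find_wildcard_statements(policy_json: str) -> list[str]:
    """Return findings for Allow statements whose Action or Resource is a wildcard."""
    policy = json.loads(policy_json)
    findings = []
    for index, stmt in enumerate(_as_list(policy.get("Statement"))):
        if stmt.get("Effect") != "Allow":
            continue  # A wildcard in a Deny statement narrows access; it is not the risk here
        label = stmt.get("Sid", f"statement[{index}]")
        actions = _as_list(stmt.get("Action"))
        resources = _as_list(stmt.get("Resource"))
        if "*" in actions:
            findings.append(f"{label}: Action is '*'")
        # Service-wide wildcards such as "iam:*" are flagged separately
        for action in actions:
            if action != "*" and action.endswith(":*"):
                findings.append(f"{label}: service-wide wildcard {action}")
        if "*" in resources:
            findings.append(f"{label}: Resource is '*'")
    return findings
```

A `Resource` of `"*"` is sometimes unavoidable, because some actions do not support resource-level permissions, so treat that finding as a prompt to review the statement rather than an automatic failure.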
The iam:PassRole privilege escalation vector is the specific mechanism that makes Terraform-managed IAM particularly dangerous. iam:PassRole allows a principal to attach an IAM role to a service such as Lambda, EC2, ECS, or Glue. If an attacker compromises a developer's AWS credentials that have iam:PassRole and lambda:CreateFunction (both common in developer accounts), they can create a new Lambda function, pass it any role in the account (including an admin role), and invoke the Lambda to execute arbitrary AWS API calls as that admin role. A Terraform-created developer role with Action: "lambda:*" plus iam:PassRole on Resource: "*" is a complete privilege escalation path to admin. This specific vector is documented in Rhino Security Labs' AWS privilege escalation research and observed in real cloud intrusions.
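The combination described above can be tested for directly. The following sketch checks whether a policy document grants both `lambda:CreateFunction` and `iam:PassRole` on `Resource: "*"`. The function name `lambda_passrole_escalation` is hypothetical, and the check is a simplification: it does not evaluate `Deny` statements, conditions such as `iam:PassedToService`, or permissions spread across several attached policies.

```python
from fnmatch import fnmatchcase


def _as_list(value):
    """IAM accepts a single string or a list; normalize to a list."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def _grants(stmt: dict, action: str) -> bool:
    """True if an Allow statement's Action patterns cover the given action.

    IAM action names are case-insensitive and support '*' wildcards.
    """
    if stmt.get("Effect") != "Allow":
        return False
    return any(
        fnmatchcase(action.lower(), pattern.lower())
        for pattern in _as_list(stmt.get("Action"))
    )


def lambda_passrole_escalation(policy: dict) -> bool:
    """True if the policy allows creating a function and passing it any role."""
    statements = _as_list(policy.get("Statement"))
    can_create = any(_grants(s, "lambda:CreateFunction") for s in statements)
    can_pass_any_role = any(
        _grants(s, "iam:PassRole") and "*" in _as_list(s.get("Resource"))
        for s in statements
    )
    return can_create and can_pass_any_role
```

Scoping `iam:PassRole` to a named execution role is enough to make this check return false, which is the same change the fixed policy later in this section applies.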
The assume role trust policy anti-pattern is subtler. A trust policy with Principal: { AWS: "arn:aws:iam::123456789012:root" }, which looks like it restricts access to a specific account, actually delegates the decision to that account's IAM policies: any principal there whose identity policy allows sts:AssumeRole on the role can assume it, including newly created users, compromised API keys, and Lambda function execution roles. The account root principal means "anyone in account 123456789012 who has sts:AssumeRole permissions," not "only the root user." This is a well-documented misunderstanding of how trust policies work.
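A trust policy can be screened for this pattern with a few lines of Python. The sketch below is illustrative only: `root_trust_findings` is a hypothetical name, and it treats a bare 12-digit account ID the same as the `:root` ARN because IAM accepts both forms for the account principal.

```python
import re


def _as_list(value):
    """Principal entries may be a single string or a list; normalize to a list."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


# Matches the account principal in either ARN form or bare account-ID form
ACCOUNT_PRINCIPAL = re.compile(r"^(arn:aws:iam::\d{12}:root|\d{12})$")


def root_trust_findings(trust_policy: dict) -> list[str]:
    """Report Allow statements that trust a whole account or every AWS principal."""
    findings = []
    for stmt in _as_list(trust_policy.get("Statement")):
        if stmt.get("Effect") != "Allow":
            continue
        principal = stmt.get("Principal")
        if principal == "*":
            findings.append("trusts any principal ('*')")
            continue
        if not isinstance(principal, dict):
            continue
        condition_note = "with" if stmt.get("Condition") else "without"
        for arn in _as_list(principal.get("AWS")):
            if arn == "*":
                findings.append("trusts any AWS principal ('*')")
            elif ACCOUNT_PRINCIPAL.match(arn):
                findings.append(f"trusts account root {arn} {condition_note} a Condition")
    return findings
```

The finding records whether a `Condition` is present because an account-wide principal combined with an MFA or external ID condition is a much weaker signal than one with no condition at all.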
IAM Privilege Escalation Chain
Anti-Pattern 7 & 8: IAM Wildcards and Trust Policy Misconfigurations
# ❌ ANTI-PATTERN 7: Wildcard IAM policy, observed in Terraform code for developer roles
# "Let's give them everything for now and restrict later"; "later" never comes
resource "aws_iam_policy" "developer_policy" {
name = "developer-full-access"
policy = jsonencode({
Version = "2012-10-17"
Statement = [
{
Sid = "DeveloperAccess"
Effect = "Allow"
Action = "*" # ❌ Every AWS API action: admin-equivalent
Resource = "*" # ❌ Every resource in the account
}
]
})
}
# ❌ ANTI-PATTERN 8: Trust policy allowing entire account root
resource "aws_iam_role" "ci_deploy_role" {
name = "ci-deploy-role"
assume_role_policy = jsonencode({
Version = "2012-10-17"
Statement = [
{
Effect = "Allow"
Principal = {
# ❌ Account root = any principal in this account can assume this role
# Not just the root user: every IAM user and role with sts:AssumeRole permission
AWS = "arn:aws:iam::${data.aws_caller_identity.current.account_id}:root"
}
Action = "sts:AssumeRole"
# ❌ No Condition: no IP restriction, no MFA requirement, no service restriction
}
]
})
}
# ❌ Overly permissive policy attached to the CI role
resource "aws_iam_role_policy" "ci_deploy" {
name = "ci-deploy-permissions"
role = aws_iam_role.ci_deploy_role.id
policy = jsonencode({
Version = "2012-10-17"
Statement = [
{
Effect = "Allow"
Action = ["iam:*", "lambda:*", "s3:*", "ec2:*"] # ❌ Broad wildcards
Resource = "*" # ❌ On all resources
}
]
})
}
# ✅ FIX: Least privilege IAM policies with specific actions, resources, and conditions
# ✅ Developer policy: specific actions for specific resources
# Replace with the actual services your developers need; this is not a universal template
resource "aws_iam_policy" "developer_policy" {
name = "developer-limited-access"
description = "Developer access limited to sandbox account resources"
policy = jsonencode({
Version = "2012-10-17"
Statement = [
{
Sid = "LambdaDevelopment"
Effect = "Allow"
Action = [
"lambda:CreateFunction",
"lambda:UpdateFunctionCode",
"lambda:UpdateFunctionConfiguration",
"lambda:InvokeFunction",
"lambda:GetFunction",
"lambda:DeleteFunction"
]
# ✅ Resource scoped to functions with a specific prefix, not all functions
Resource = "arn:aws:lambda:${var.aws_region}:${data.aws_caller_identity.current.account_id}:function:${var.team_prefix}-*"
},
{
Sid = "LambdaList"
Effect = "Allow"
# lambda:ListFunctions does not support resource-level permissions,
# so it has to be granted on "*" (read-only listing)
Action = ["lambda:ListFunctions"]
Resource = "*"
},
{
Sid = "LambdaPassRoleScoped"
Effect = "Allow"
Action = ["iam:PassRole"]
# ✅ PassRole only to the specific execution role for this team's functions
# Cannot pass an admin role, only the designated Lambda execution role
Resource = "arn:aws:iam::${data.aws_caller_identity.current.account_id}:role/${var.team_prefix}-lambda-execution-role"
Condition = {
StringEquals = {
"iam:PassedToService" = "lambda.amazonaws.com"
}
}
},
{
Sid = "DenyIAMAdminActions"
Effect = "Deny"
# ✅ Explicit deny on IAM privilege escalation actions; a Deny overrides any Allow
Action = [
"iam:CreateUser",
"iam:AttachUserPolicy",
"iam:AttachRolePolicy",
"iam:CreatePolicy",
"iam:CreateRole",
"iam:PutRolePolicy",
"iam:PutUserPolicy"
]
Resource = "*"
}
]
})
}
# ✅ CI deploy role: trust only the specific CI role, not the entire account
resource "aws_iam_role" "ci_deploy_role" {
name = "ci-deploy-role"
assume_role_policy = jsonencode({
Version = "2012-10-17"
Statement = [
{
Effect = "Allow"
Principal = {
# ✅ Specific IAM role that the CI runner uses, not the account root
AWS = "arn:aws:iam::${data.aws_caller_identity.current.account_id}:role/github-actions-runner"
}
Action = "sts:AssumeRole"
Condition = {
StringEquals = {
# ✅ External ID prevents confused deputy attacks from third-party services
"sts:ExternalId" = var.ci_external_id
}
# ✅ Also restrict to specific source IP range if CI runners have fixed IPs
# IpAddress = { "aws:SourceIp" = ["10.0.0.0/8"] }
}
}
]
})
}
# ✅ CI policy: deploy-only permissions, no IAM mutation
resource "aws_iam_role_policy" "ci_deploy" {
name = "ci-deploy-permissions"
role = aws_iam_role.ci_deploy_role.id
policy = jsonencode({
Version = "2012-10-17"
Statement = [
{
Sid = "ECRAuth"
Effect = "Allow"
# ecr:GetAuthorizationToken does not support resource-level permissions
Action = ["ecr:GetAuthorizationToken"]
Resource = "*"
},
{
Sid = "ECRPush"
Effect = "Allow"
Action = [
"ecr:BatchCheckLayerAvailability",
"ecr:PutImage",
"ecr:InitiateLayerUpload",
"ecr:UploadLayerPart",
"ecr:CompleteLayerUpload"
]
Resource = "arn:aws:ecr:${var.aws_region}:${data.aws_caller_identity.current.account_id}:repository/${var.app_name}-*"
},
{
Sid = "ECSUpdateService"
Effect = "Allow"
Action = ["ecs:UpdateService", "ecs:DescribeServices"]
Resource = "arn:aws:ecs:${var.aws_region}:${data.aws_caller_identity.current.account_id}:service/${var.cluster_name}/${var.app_name}-*"
},
{
Sid = "ECSDescribeTaskDefinition"
Effect = "Allow"
# ecs:DescribeTaskDefinition does not support resource-level permissions
Action = ["ecs:DescribeTaskDefinition"]
Resource = "*"
},
{
Sid = "DenyAllIAM"
Effect = "Deny"
Action = "iam:*" # ✅ CI pipeline should never mutate IAM
Resource = "*"
}
]
})
}
Checkov and tfsec: IAM Wildcard Detection
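# Checkov IAM checks, run against all .tf files in your repository.
# A backslash continuation cannot be followed by a comment, so the IDs are documented here.
# IDs and titles shift between releases; confirm them with `checkov --list` before pinning.
#   CKV_AWS_40   IAM policies attached only to groups or roles (not directly to users)
#   CKV_AWS_60   IAM role allows only specific services or principals to assume it
#   CKV_AWS_61   IAM role trust does not allow the whole account (account root) to assume it
#   CKV_AWS_62   IAM policies do not grant full "*" on "*" administrative privileges
#   CKV_AWS_63   IAM policy documents do not allow "*" as a statement's actions
#   CKV_AWS_110  IAM policies do not allow privilege escalation
#   CKV_AWS_355  IAM policy documents do not allow "*" as a resource for restrictable actions
checkov -d . --check CKV_AWS_40,CKV_AWS_60,CKV_AWS_61,CKV_AWS_62,CKV_AWS_63,CKV_AWS_110,CKV_AWS_355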
# tfsec for IAM analysis (note the aws-iam check ID prefix)
tfsec . --include-checks \
aws-iam-no-policy-wildcards,\
aws-iam-no-root-access-key,\
aws-iam-enforce-mfa,\
aws-iam-no-user-attached-policies
// KQL: Detect IAM privilege escalation sequences being executed
// Specifically: a Lambda created or reconfigured with a passed role, and backdoor user/role creation
// Source: AWSCloudTrail table (Microsoft Sentinel AWS connector)
// Note: iam:PassRole is a permission, not an API call, so it never appears as an EventName.
// The passed role shows up in the request parameters of the call that uses it.
// Lambda management events carry an API version suffix (e.g. CreateFunction20150331).
let PrivEscActivity = AWSCloudTrail
| where TimeGenerated > ago(1h)
| where EventName in ("CreateFunction20150331", "UpdateFunctionConfiguration20150331v2",
    "CreateUser", "AttachUserPolicy", "CreateRole", "AttachRolePolicy")
| extend CallerArn = UserIdentityArn
| extend PassedRole = tostring(parse_json(RequestParameters).role)
| summarize
    Actions = make_set(EventName, 10),
    PassedRoles = make_set_if(PassedRole, isnotempty(PassedRole), 10),
    ActionCount = count(),
    FirstAction = min(TimeGenerated),
    LastAction = max(TimeGenerated)
    by CallerArn, bin(TimeGenerated, 1h);
// Flag principals performing privilege escalation sequences.
// Every function creation passes a role, so filter PassedRoles against known execution roles to cut noise.
PrivEscActivity
| where array_length(PassedRoles) > 0 // Lambda created or reconfigured with a passed role
    or (set_has_element(Actions, "CreateUser") and set_has_element(Actions, "AttachUserPolicy")) // Backdoor user creation
    or (set_has_element(Actions, "CreateRole") and set_has_element(Actions, "AttachRolePolicy")) // Backdoor role creation
| extend WindowMinutes = datetime_diff('minute', LastAction, FirstAction)
| project FirstAction, LastAction, CallerArn, Actions, PassedRoles, ActionCount, WindowMinutes
| order by FirstAction desc
IAM Privilege Escalation Via Terraform Anti-Patterns
| Anti-Pattern | Terraform Code | Attack Path | Escalation Result | Checkov Check |
|---|---|---|---|---|
| Action: "*" wildcard | Action = "*" in policy Statement | Any compromised principal with this policy can call any AWS API | Full account admin via API | CKV_AWS_63 |
| Resource: "*" wildcard | Resource = "*" with sensitive actions | Actions apply to all resources; no isolation between environments | Cross-environment access | CKV_AWS_355 |
| Account root in trust policy | Principal: {AWS: "arn:aws:iam::ACCT:root"} | Any IAM entity in the account can assume the role if they have sts:AssumeRole | Lateral movement to high-priv role | CKV_AWS_61 |
| iam:PassRole on * | Action: ["iam:PassRole"], Resource: "*" | Can pass any role (including admin) to Lambda/EC2/ECS | Full account takeover via service | CKV_AWS_110 |
| No external ID in cross-account trust | Trust allows another account with no ExternalId | Confused deputy attack if third-party SaaS is compromised | Access to your account via compromised partner | No built-in check; use a custom policy |
| Inline policies instead of managed | aws_iam_role_policy (inline) | Inline policies don't appear in the managed policy list; harder to audit | Hidden permissions surviving policy reviews | No built-in check; use a custom policy |
The IAM anti-patterns are the hardest to detect reactively because wildcard policies don't generate alerts; they generate capability. The attacker's use of that capability generates CloudTrail events, but by then the action has already been taken. The final anti-pattern class covers the gaps that make reactive detection harder: missing encryption, missing audit logs, and an unprotected state backend, the three conditions that turn a recoverable incident into a catastrophic one.
Section 5: Encryption Gaps, Missing Audit Logs, and CI/CD Integration The Anti-Patterns That Survive Manual Review and How to Automate Their Detection
The final class of Terraform anti-patterns is the most dangerous precisely because it is invisible at deploy time. An unencrypted RDS instance serves queries the same way an encrypted one does. An EC2 instance with an unencrypted root volume boots and runs applications identically to one with encryption enabled. VPC Flow Logs, when absent, leave no artifact indicating their absence; there is simply no traffic data. These gaps do not cause failures; they only matter after an attacker has already accessed a database or exfiltrated data, and even then they matter only to the IR team trying to determine what happened. The Capital One 2019 breach investigation was significantly complicated by gaps in access logging. The Uber 2022 breach investigators had incomplete data about which systems the attacker accessed because S3 server access logging was not enabled on all buckets. These are not theoretical risks.
The encryption gap applies to RDS, EBS, and SQS specifically: AWS does not enable encryption at rest by default for every resource type in every region. RDS instances created without storage_encrypted = true store database files in plaintext on the underlying EBS volumes. If an attacker obtains a snapshot (by calling rds:CreateDBSnapshot and then copying or sharing it to another account), they get plaintext data. The attack requires only the IAM permissions to create and share snapshots, which are frequently present in developer roles; it does not require direct database access. EBS volume encryption is likewise off unless the account-level default is enabled or the resource sets encrypted = true, and unencrypted EBS snapshots that are shared or copied (for example via aws_ebs_snapshot_copy) are another exfiltration path.
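Because these attributes are plain booleans, they are straightforward to check in the JSON produced by `terraform show -json`. The sketch below assumes its input is a list of resource entries taken from `planned_values` (each with `address`, `type`, and `values`); the function name `unencrypted_storage` and the attribute map are illustrative, not an exhaustive list of storage resources.

```python
# Resource type -> boolean attribute that must be true for encryption at rest
ENCRYPTION_ATTRS = {
    "aws_db_instance": "storage_encrypted",
    "aws_ebs_volume": "encrypted",
}


def unencrypted_storage(resources: list[dict]) -> list[str]:
    """Return findings for storage resources whose encryption flag is not true.

    Attributes that are unset or unknown at plan time are absent from the
    plan JSON, so they are reported as well: the flag is not explicitly on.
    """
    findings = []
    for res in resources:
        attr = ENCRYPTION_ATTRS.get(res.get("type"))
        if attr is None:
            continue  # Not a resource type this sketch knows about
        values = res.get("values") or {}
        if values.get(attr) is not True:
            findings.append(f"{res.get('address')}: {attr} is not set to true")
    return findings
```

Key arguments such as `kms_key_id` are left out on purpose: when they reference a key created in the same plan, their value is unknown until apply, so a naive presence check would produce false positives.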
The aws_cloudtrail Terraform resource is frequently either absent or misconfigured: include_global_service_events = false misses IAM and STS events (where credential theft and role assumption happen), is_multi_region_trail = false misses activity in regions the attacker pivots to, and enable_log_file_validation = false means tampering with CloudTrail log files after the fact goes undetected. The last two are the AWS provider's defaults when the argument is omitted; include_global_service_events defaults to true, so that gap appears only when someone disables it explicitly. The provider enforces none of them.
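The three settings can be verified together against plan JSON. As a minimal sketch, the function below takes resource entries from `planned_values` in `terraform show -json` output and reports every `aws_cloudtrail` resource where one of the flags is not true; `cloudtrail_gaps` is a hypothetical name, and the check deliberately ignores CloudWatch and KMS integration.

```python
# Flags that should all be true on a production trail
REQUIRED_TRUE = (
    "is_multi_region_trail",
    "include_global_service_events",
    "enable_log_file_validation",
)


def cloudtrail_gaps(resources: list[dict]) -> list[str]:
    """Return one finding per CloudTrail flag that is false or missing."""
    findings = []
    for res in resources:
        if res.get("type") != "aws_cloudtrail":
            continue
        values = res.get("values") or {}
        for attr in REQUIRED_TRUE:
            if values.get(attr) is not True:
                findings.append(f"{res.get('address')}: {attr} is not true")
    return findings
```

Requiring an explicit `true` for all three means a trail that relies on a provider default still has to state its intent in code, which makes review easier.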
CI/CD Policy Gate Architecture
Anti-Pattern 9 & 10: Encryption and Logging Gaps
# ❌ ANTI-PATTERN 9: Unencrypted storage resources
# Unencrypted RDS
resource "aws_db_instance" "production" {
identifier = "prod-db"
engine = "postgres"
instance_class = "db.t3.medium"
allocated_storage = 100
storage_encrypted = false # ❌ Plaintext database files on EBS
# No kms_key_id: even with encryption on, the default AWS-managed key gives a weaker audit trail
}
# Unencrypted EBS volume
resource "aws_ebs_volume" "data" {
availability_zone = "us-east-1a"
size = 100
encrypted = false # ❌ Plaintext volume; its snapshots are also plaintext
}
# SQS queue without a customer-managed key (message data may include PII or credentials)
resource "aws_sqs_queue" "processing" {
name = "data-processing-queue"
# No kms_master_key_id: at best the queue relies on SQS-managed encryption, with no key policy or KMS audit trail
}
# ❌ ANTI-PATTERN 10: Missing audit trail configuration
# CloudTrail with critical gaps
resource "aws_cloudtrail" "main" {
name = "main-trail"
s3_bucket_name = aws_s3_bucket.cloudtrail_logs.id
include_global_service_events = false # ❌ Misses IAM, STS, Route53 events
is_multi_region_trail = false # ❌ Only captures events in one region
enable_log_file_validation = false # ❌ Log file tampering goes undetected post-incident
# No CloudWatch Logs integration: logs only go to S3, not searchable in near-real-time
# No KMS encryption on CloudTrail log files
}
# No VPC Flow Logs: no record of network connections to or from any resource
# No S3 access logging: no record of who accessed which S3 objects
# ✅ FIX: Complete encryption and audit logging configuration
# ✅ Encrypted RDS with customer-managed KMS key
resource "aws_db_instance" "production" {
identifier = "prod-db"
engine = "postgres"
instance_class = "db.t3.medium"
allocated_storage = 100
storage_encrypted = true # ✅ EBS encryption at rest
kms_key_id = aws_kms_key.rds.arn # ✅ Customer-managed key full audit trail in CloudTrail
deletion_protection = true
}
# ✅ KMS key for RDS with rotation enabled
resource "aws_kms_key" "rds" {
description = "KMS key for RDS encryption"
deletion_window_in_days = 30
enable_key_rotation = true # ✅ Annual automatic key rotation
policy = data.aws_iam_policy_document.kms_rds.json
}
# ✅ Encrypted EBS volumes
resource "aws_ebs_volume" "data" {
availability_zone = "us-east-1a"
size = 100
encrypted = true
kms_key_id = aws_kms_key.ebs.arn
}
# ✅ Encrypted SQS with KMS
resource "aws_sqs_queue" "processing" {
name = "data-processing-queue"
kms_master_key_id = aws_kms_key.sqs.arn
# Enforce HTTPS-only access via queue policy
}
# ✅ CloudTrail: comprehensive, multi-region, validated
resource "aws_cloudtrail" "main" {
name = "main-trail"
s3_bucket_name = aws_s3_bucket.cloudtrail_logs.id
include_global_service_events = true # ✅ Captures IAM, STS, Route53, CloudFront
is_multi_region_trail = true # ✅ Captures events in ALL regions, including ones an attacker pivots to
enable_log_file_validation = true # ✅ SHA-256 digest of each log file detects tampering
is_organization_trail = var.is_organization_account # ✅ Covers all member accounts if using AWS Orgs
# ✅ CloudWatch Logs integration: searchable in near-real-time, not just S3 batch
cloud_watch_logs_group_arn = "${aws_cloudwatch_log_group.cloudtrail.arn}:*"
cloud_watch_logs_role_arn = aws_iam_role.cloudtrail_cloudwatch.arn
# ✅ KMS encryption of log files: even if the S3 bucket policy is misconfigured, logs stay encrypted
kms_key_id = aws_kms_key.cloudtrail.arn
# ✅ Log S3 data events: who accessed which S3 objects (captures exfiltration)
event_selector {
read_write_type = "All"
include_management_events = true
data_resource {
type = "AWS::S3::Object"
values = ["arn:aws:s3:::"] # All S3 objects in all buckets
}
data_resource {
type = "AWS::Lambda::Function"
values = ["arn:aws:lambda"] # All Lambda invocations
}
}
}
# ✅ VPC Flow Logs: record all network connections
resource "aws_flow_log" "main" {
vpc_id = aws_vpc.main.id
traffic_type = "ALL" # Capture both accepted and rejected traffic
iam_role_arn = aws_iam_role.flow_logs.arn
log_destination = aws_cloudwatch_log_group.vpc_flow_logs.arn
log_destination_type = "cloud-watch-logs"
# ✅ Custom format adds flow direction and traffic path to the default fields
# Field names must match the VPC Flow Logs record spec: start/end and flow-direction
log_format = "$${version} $${account-id} $${interface-id} $${srcaddr} $${dstaddr} $${srcport} $${dstport} $${protocol} $${packets} $${bytes} $${start} $${end} $${action} $${flow-direction} $${traffic-path}"
}
# ✅ Terraform remote state backend: encrypted, locked, versioned
terraform {
backend "s3" {
bucket = "company-terraform-state"
key = "production/terraform.tfstate"
region = "us-east-1"
encrypt = true # ✅ State file encrypted at rest
kms_key_id = "arn:aws:kms:us-east-1:ACCT:key/KEY-ID" # ✅ CMK encryption
dynamodb_table = "terraform-state-lock" # ✅ State locking prevents concurrent applies
# ✅ Access logging on this bucket via aws_s3_bucket_logging (separate resource)
}
}
Complete CI/CD Integration: tfsec, Checkov, Terrascan in GitHub Actions
# .github/workflows/terraform-security.yml
# Complete IaC security gate for pull requests: blocks merge on CRITICAL findings
name: Terraform Security Analysis
on:
  pull_request:
    branches: [main, staging]
    paths:
      - '**.tf'
      - '**.tfvars'
      - '.github/workflows/terraform-security.yml'
permissions:
  contents: read
  pull-requests: write    # Required to post findings as PR comments
  security-events: write  # Required to upload SARIF to GitHub Security tab
  id-token: write         # Required for OIDC role assumption in the conftest-plan job
jobs:
  secrets-scan:
    name: "Secret Detection (gitleaks)"
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0  # Full history needed for git log scanning
      - name: Run gitleaks on PR commits
        uses: gitleaks/gitleaks-action@v2
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
          GITLEAKS_LICENSE: ${{ secrets.GITLEAKS_LICENSE }}  # Required for org repos
  tfsec:
    name: "tfsec Analysis"
    runs-on: ubuntu-latest
    needs: secrets-scan
    steps:
      - uses: actions/checkout@v4
      - name: Run tfsec
        uses: aquasecurity/tfsec-action@v1.0.3
        with:
          working_directory: .
          format: sarif  # Upload findings to GitHub Security tab
          github_token: ${{ secrets.GITHUB_TOKEN }}
          # Severity threshold is passed as a tfsec CLI flag: HIGH and CRITICAL fail the build
          additional_args: >
            --minimum-severity HIGH
            --config-file .tfsec/config.toml
            --out tfsec-results.sarif
      - name: Upload tfsec SARIF
        uses: github/codeql-action/upload-sarif@v3
        if: always()  # Upload even if tfsec found issues
        with:
          sarif_file: tfsec-results.sarif
  checkov:
    name: "Checkov Analysis"
    runs-on: ubuntu-latest
    needs: secrets-scan
    steps:
      - uses: actions/checkout@v4
      - name: Run Checkov
        id: checkov
        uses: bridgecrewio/checkov-action@v12
        with:
          directory: .
          framework: terraform
          output_format: sarif
          output_file_path: checkov-results.sarif
          # Only CRITICAL findings hard-fail; HIGH and below are reported but do not block.
          # Severity-based thresholds need a platform API key so Checkov can look up severities.
          soft_fail_on: HIGH
          # External checks directory for custom checks (like the ones in this post)
          external_checks_dirs: ./checkov/custom_checks
          # Suppress known accepted risks with documented justification
          skip_check: ""  # Add CKV IDs here only with ticket reference
      - name: Upload Checkov SARIF
        uses: github/codeql-action/upload-sarif@v3
        if: always()
        with:
          sarif_file: checkov-results.sarif
  terrascan:
    name: "Terrascan OPA Policy Evaluation"
    runs-on: ubuntu-latest
    needs: [tfsec, checkov]
    steps:
      - uses: actions/checkout@v4
      - name: Run Terrascan
        id: terrascan
        # Tracking main is convenient but mutable; pin to a release tag or commit SHA in production
        uses: tenable/terrascan-action@main
        with:
          iac_type: terraform
          iac_version: v15
          policy_type: aws
          only_warn: false  # Fail on policy violations
          sarif_upload: true
          # Custom OPA policies from your policies directory
          policy_path: ./terrascan/policies/
      - name: Upload Terrascan SARIF
        uses: github/codeql-action/upload-sarif@v3
        if: always()
        with:
          sarif_file: terrascan.sarif
  conftest-plan:
    name: "OPA Conftest on Terraform Plan"
    runs-on: ubuntu-latest
    needs: [tfsec, checkov]
    env:
      AWS_ROLE_ARN: ${{ secrets.TF_PLAN_ROLE_ARN }}
    steps:
      - uses: actions/checkout@v4
      - name: Configure AWS credentials (read-only plan role)
        uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: ${{ env.AWS_ROLE_ARN }}
          aws-region: us-east-1
      - name: Setup Terraform
        uses: hashicorp/setup-terraform@v3
        with:
          terraform_version: 1.8.0
      - name: Terraform Init
        run: terraform init -backend-config="key=pr-${{ github.event.pull_request.number }}/terraform.tfstate"
      - name: Terraform Plan (JSON output for OPA)
        run: |
          terraform plan -out=tfplan.binary
          # Convert binary plan to JSON for OPA evaluation
          terraform show -json tfplan.binary > tfplan.json
      - name: Install conftest
        run: |
          curl -Lo conftest.tar.gz https://github.com/open-policy-agent/conftest/releases/download/v0.53.0/conftest_0.53.0_Linux_x86_64.tar.gz
          tar xzf conftest.tar.gz
          chmod +x conftest
      - name: Evaluate OPA policies against plan
        run: |
          # conftest evaluates the terraform plan JSON against OPA Rego policies
          # Fails the pipeline if any policy returns a deny
          ./conftest test tfplan.json \
            --policy ./opa/policies/ \
            --namespace terraform \
            --output github
# opa/policies/terraform_security.rego
# OPA Rego policies evaluated against terraform plan JSON
# These catch issues that static analysis misses, such as resource relationships
# Note: these rules read root-module resources only; resources declared inside
# child modules (planned_values.root_module.child_modules) need an additional walk.
package terraform

import future.keywords.contains
import future.keywords.if
import future.keywords.in

# ---- Policy 1: No security group allows 0.0.0.0/0 ingress on sensitive ports ----
SENSITIVE_PORTS := {22, 3389, 1433, 3306, 5432, 6379, 27017}

deny contains msg if {
    resource := input.planned_values.root_module.resources[_]
    resource.type == "aws_security_group"
    ingress := resource.values.ingress[_]
    cidr := ingress.cidr_blocks[_]
    cidr in {"0.0.0.0/0", "::/0"}
    # "some" iterates over the set; a bare "port in ..." would leave port unbound
    some port in SENSITIVE_PORTS
    ingress.from_port <= port
    ingress.to_port >= port
    msg := sprintf("CRITICAL: Security group '%s' allows internet access (0.0.0.0/0) on sensitive port %d",
        [resource.address, port])
}

# ---- Policy 2: RDS must not be publicly accessible ----
deny contains msg if {
    resource := input.planned_values.root_module.resources[_]
    resource.type == "aws_db_instance"
    resource.values.publicly_accessible == true
    msg := sprintf("CRITICAL: RDS instance '%s' is publicly accessible; must be false",
        [resource.address])
}

# ---- Policy 3: All S3 buckets must have all four public access block flags true ----
deny contains msg if {
    bucket := input.planned_values.root_module.resources[_]
    bucket.type == "aws_s3_bucket"
    bucket_name := bucket.values.bucket
    # Find the public_access_block resource that targets this bucket
    # (a bucket with no public access block at all is not caught by this rule)
    pab := input.planned_values.root_module.resources[_]
    pab.type == "aws_s3_bucket_public_access_block"
    pab.values.bucket == bucket_name
    # Any flag set to false is a violation
    some flag_value in [
        pab.values.block_public_acls,
        pab.values.block_public_policy,
        pab.values.ignore_public_acls,
        pab.values.restrict_public_buckets,
    ]
    flag_value == false
    msg := sprintf("CRITICAL: S3 bucket '%s' has incomplete public access block configuration",
        [bucket_name])
}

# ---- Policy 4: RDS storage must be encrypted ----
deny contains msg if {
    resource := input.planned_values.root_module.resources[_]
    resource.type == "aws_db_instance"
    # Fires when storage_encrypted is false or unset; it does not verify
    # that a customer-managed key (kms_key_id) is used
    not resource.values.storage_encrypted == true
    msg := sprintf("CRITICAL: RDS instance '%s' does not have storage encryption enabled",
        [resource.address])
}

# ---- Policy 5: CloudTrail must be multi-region and include global service events ----
deny contains msg if {
    resource := input.planned_values.root_module.resources[_]
    resource.type == "aws_cloudtrail"
    not resource.values.is_multi_region_trail == true
    msg := sprintf("HIGH: CloudTrail '%s' is not multi-region; events in other regions will not be captured",
        [resource.address])
}

deny contains msg if {
    resource := input.planned_values.root_module.resources[_]
    resource.type == "aws_cloudtrail"
    not resource.values.include_global_service_events == true
    msg := sprintf("HIGH: CloudTrail '%s' does not include global service events; IAM/STS events will be missed",
        [resource.address])
}
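The Rego rules read `input.planned_values.root_module.resources`, which contains only resources declared in the root module. Resources created inside modules sit under nested `child_modules` entries in the plan JSON. The Python sketch below shows the recursive walk needed to reach them; the function name `iter_planned_resources` is hypothetical, and the same traversal can be written in Rego with the built-in `walk`.

```python
from typing import Iterator


def iter_planned_resources(plan: dict) -> Iterator[dict]:
    """Yield every resource in a `terraform show -json` plan, including child modules."""

    def walk(module: dict) -> Iterator[dict]:
        # Resources declared directly in this module
        yield from module.get("resources", [])
        # Recurse into nested module calls, which may nest further
        for child in module.get("child_modules", []):
            yield from walk(child)

    root = plan.get("planned_values", {}).get("root_module", {})
    yield from walk(root)
```

A typical use is `json.load` on `tfplan.json` followed by filtering the yielded entries on their `type` field, so that a check covers module-managed databases and buckets as well as root-level ones.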
# .pre-commit-config.yaml: local hooks that run before any git commit
# Install: pip install pre-commit && pre-commit install
# After install: runs automatically on git commit
repos:
  # Secret detection: catches hardcoded secrets before they reach git
  - repo: https://github.com/gitleaks/gitleaks
    rev: v8.18.4
    hooks:
      - id: gitleaks
        name: Detect secrets in staged files
        # Fails the commit if any secret pattern is found
  # tfsec: fast local security check
  - repo: https://github.com/antonbabenko/pre-commit-terraform
    rev: v1.90.0
    hooks:
      - id: terraform_tfsec
        args:
          - --args=--minimum-severity=HIGH
          - --args=--config-file=.tfsec/config.toml
        # Blocks commit if tfsec finds HIGH or CRITICAL issues
      - id: terraform_checkov
        args:
          - --args=--quiet
          - --args=--compact
          - --args=--framework=terraform
          # Only CRITICAL findings block local commits
          - --args=--soft-fail-on=HIGH,MEDIUM,LOW
      # Also run: terraform fmt and terraform validate
      - id: terraform_fmt
        name: Terraform format check
      - id: terraform_validate
        name: Terraform validation
IaC Security Tool Comparison
| Tool | Check Types | Plan Evaluation | Custom Policies | CI Integration | False Positive Rate | Best For |
|---|---|---|---|---|---|---|
| tfsec | Static .tf analysis, 150+ built-in checks | No (source only) | YAML/JSON rules | GitHub Actions, GitLab, Jenkins | Low | Fast network + encryption checks in PR |
| Checkov (Bridgecrew) | Static .tf + plan JSON, 1000+ checks | Yes (via terraform show -json) | Python class-based | All major CI platforms | Medium | Comprehensive policy coverage; integrates with Prisma |
| Terrascan | Static .tf + OPA Rego policies | Yes | OPA Rego | All major CI platforms | Low–Medium | Custom OPA policies; Kubernetes too |
| conftest + OPA | Plan JSON only | Yes (primary use case) | OPA Rego | Any (runs as binary) | Controllable | Complex cross-resource policies; the aws_s3_bucket + aws_s3_bucket_public_access_block relationship check |
| Semgrep (IaC rules) | Static .tf pattern matching | No | YAML rules | GitHub Actions native | Low | Simple pattern-based checks; fast |
| AWS Config + Conformance Packs | Runtime (post-apply) | N/A (continuous) | CDK/CloudFormation | CloudWatch Events | Low | Drift detection after apply; complements pre-apply checks |
CISO Action The Gate Model That Stops These Anti-Patterns Before They Reach a Cloud Account
The ten anti-patterns in this post have one thing in common: they are all faster to create than to fix in production. Hardcoding a secret takes one line; removing it from Git history requires git filter-repo across all branches and invalidating all team members' local clones. Opening 0.0.0.0/0 on port 22 takes thirty seconds; auditing which systems have been accessed through that exposure takes a forensic engagement. Enabling encryption retroactively requires snapshot-and-restore cycles with downtime. The investment in pre-commit hooks, CI gates, and OPA policies pays back the first time it blocks a CRITICAL finding, which, given how often these anti-patterns appear in production code, is likely to happen within the first week of deployment.
DevSecOps Pipeline Security Architecture
Prioritized Control Table
| Control | Impact | Effort | Implementation Pointer |
|---|---|---|---|
| Pre-commit hooks: tfsec + gitleaks | Critical | Low (1 day) | Run pip install pre-commit && pre-commit install with the .pre-commit-config.yaml from Section 5. Catches secrets and network misconfigurations before any code leaves the developer's machine. Zero CI cost: it runs locally on every commit. |
| CI gate: Checkov CRITICAL findings block merge | Critical | Low–Medium (2–3 days) | Add bridgecrewio/checkov-action@v12 to every Terraform PR workflow. Set soft_fail_on: HIGH so that findings of HIGH severity and below only warn and only CRITICAL findings block the merge. Upload SARIF to the GitHub Security tab. Add external_checks_dir for the custom checks from Section 1 and Section 2. |
| OPA conftest evaluation against terraform plan JSON | Critical | Medium (1 week) | The five OPA Rego policies in Section 5 cover the most critical cross-resource relationships that static analysis misses. Run terraform show -json tfplan.binary > tfplan.json && conftest test tfplan.json --policy ./opa/policies/. |
| Encrypted Terraform state backend (S3 + DynamoDB + KMS) | Critical | Low (hours) | Add a backend "s3" block to terraform {} with encrypt = true, kms_key_id, and dynamodb_table for state locking. Use a CMK rather than the default AWS key: CMK usage is logged in CloudTrail per decrypt. |
| Multi-region CloudTrail with global events + log validation | High | Low (hours) | Set is_multi_region_trail = true, include_global_service_events = true, enable_log_file_validation = true, and kms_key_id in aws_cloudtrail. Integrate with CloudWatch Logs for real-time alerting. One trail covers all regions. |
| S3 public access block: all four flags true at account level | High | Low (hours) | aws_s3_account_public_access_block with all four flags true sets the account-level default that blocks all public access regardless of individual bucket configuration. Add an aws_s3_bucket_public_access_block resource per bucket for defense in depth. |
| AWS Config Rules for runtime drift detection | High | Medium (1 week) | Enable AWS Config Conformance Pack Operational-Best-Practices-for-CIS-AWS-v1.4-Level2. This continuously evaluates all resources against 100+ security rules and feeds findings to Security Hub. Catches drift after terraform apply and manual console changes. |
| KMS CMK encryption for RDS, EBS, SQS with key rotation | High | Medium (1 week) | Create aws_kms_key resources with enable_key_rotation = true and deletion_window_in_days = 30. Reference the key in each resource's encryption attribute. With a CMK, every decrypt is logged in CloudTrail and governed by a key policy you control, which gives a forensic access trail and access boundary that AWS-owned keys do not. |
| VPC Flow Logs with extended format to CloudWatch | High | Low (hours) | Add aws_flow_log resource to every VPC with traffic_type = "ALL" and the extended log format from Section 5. Ship to CloudWatch Logs for KQL/Sentinel queries. Without Flow Logs, lateral movement and exfiltration over the network leave no artifact. |
| IAM Access Analyzer: continuous IAM policy analysis | Medium–High | Low (hours) | aws_accessanalyzer_analyzer resource with type = "ORGANIZATION". Automatically flags IAM roles and policies that grant external access, including the trust policy anti-patterns from Section 4. Results appear in Security Hub. |
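Several of the low-effort rows in the table above fit in a handful of Terraform blocks. The following is a minimal sketch, not a drop-in module: the bucket names, lock table, account ID, key ARN, trail name, and both input variables are hypothetical placeholders, and it assumes the state bucket, lock table, and CloudTrail log bucket already exist with suitable policies.

```hcl
# Sketch only. Every name, ARN, and account ID below is a hypothetical placeholder.

terraform {
  # Backend blocks cannot reference variables or resources, so values are literal.
  backend "s3" {
    bucket         = "example-org-terraform-state"    # hypothetical state bucket
    key            = "prod/security/terraform.tfstate" # hypothetical state path
    region         = "us-east-1"
    encrypt        = true
    kms_key_id     = "arn:aws:kms:us-east-1:111122223333:key/00000000-0000-0000-0000-000000000000"
    dynamodb_table = "terraform-state-lock"            # hypothetical lock table
  }
}

variable "cloudtrail_bucket_name" {
  description = "Existing S3 bucket whose bucket policy allows CloudTrail to write logs"
  type        = string
}

variable "cloudtrail_kms_key_arn" {
  description = "Existing CMK whose key policy allows CloudTrail to encrypt log files"
  type        = string
}

# Account-level default: blocks public access regardless of per-bucket settings.
resource "aws_s3_account_public_access_block" "account" {
  block_public_acls       = true
  block_public_policy     = true
  ignore_public_acls      = true
  restrict_public_buckets = true
}

# CMK for data stores, with rotation enabled and the maximum deletion window.
resource "aws_kms_key" "data" {
  description             = "CMK for RDS, EBS, and SQS encryption"
  enable_key_rotation     = true
  deletion_window_in_days = 30
}

# One multi-region trail with global service events and log file validation.
resource "aws_cloudtrail" "main" {
  name                          = "org-trail" # hypothetical trail name
  s3_bucket_name                = var.cloudtrail_bucket_name
  is_multi_region_trail         = true
  include_global_service_events = true
  enable_log_file_validation    = true
  kms_key_id                    = var.cloudtrail_kms_key_arn
}
```

The CloudTrail key is passed in as a variable rather than reusing aws_kms_key.data because a trail can only use a key whose policy explicitly grants CloudTrail access, which the default key policy does not. Keeping the state backend in its own root configuration is also worth considering, since the state bucket and its key have to exist before any other configuration can use them.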
The ten anti-patterns in this post are not edge cases found in poorly managed environments. They appear in code written by experienced engineers under time pressure, in repositories that pass code review, and in infrastructure that runs cleanly for years until it becomes part of a breach investigation. The static analysis tools and OPA policies documented here take one sprint to deploy, and they will catch every anti-pattern in this post before it reaches a cloud account. Every finding these tools surface costs almost nothing to fix at PR time and potentially millions to fix after an incident.
Tags: terraform, infrastructure-as-code, cloud-security, devsecops, checkov, tfsec, opa, aws-security, iam, s3, detection-engineering
Audience: Cloud Security Engineers · DevSecOps Engineers · Platform Engineers · CISOs
Mapped Controls: CIS AWS Foundations Benchmark v2.0, AWS Well-Architected Security Pillar, NIST SP 800-190 (Container Security), SOC 2 CC6.1/CC6.6/CC7.1