Every AWS account records management events in CloudTrail by default. Most teams treat it as an audit log, something you look at after an incident. That framing misses most of its value. CloudTrail, combined with CloudWatch metric filters, Cost Anomaly Detection, and AWS Config, becomes a near-real-time detection layer for account compromise, cost spikes, and misconfiguration. Here’s how to build it without subscribing to a SIEM.
The Three-Layer Model
Effective AWS security monitoring doesn’t need to be complex. Three layers cover the majority of real incidents:
- CloudTrail → CloudWatch Metric Filters → Alarms: Detect specific API events within minutes of the call
- AWS Cost Anomaly Detection: Catch unusual spend before it becomes a large bill
- AWS Config Rules: Detect drift from your security baseline continuously
None of these require third-party tooling, and the baseline cost is under $5/month for most accounts.
Layer 1: CloudWatch Metric Filters for Critical Events
Metric filters convert CloudTrail log entries into CloudWatch metrics. Your trail must deliver events to a CloudWatch Logs log group for this to work. You define a filter pattern, CloudWatch increments a counter for each match, and you alarm on that counter.
```hcl
# Count CloudTrail events made with root credentials, excluding AWS service events.
resource "aws_cloudwatch_log_metric_filter" "root_usage" {
  name           = "root-account-usage"
  log_group_name = aws_cloudwatch_log_group.cloudtrail.name
  pattern        = "{ $.userIdentity.type = \"Root\" && $.userIdentity.invokedBy NOT EXISTS && $.eventType != \"AwsServiceEvent\" }"

  metric_transformation {
    name      = "RootAccountUsage"
    namespace = "Security/CloudTrail"
    value     = "1"
  }
}

# Alarm as soon as one root event is counted in a one-minute period.
resource "aws_cloudwatch_metric_alarm" "root_usage" {
  alarm_name          = "root-account-usage"
  alarm_description   = "Root account activity detected"
  metric_name         = "RootAccountUsage"
  namespace           = "Security/CloudTrail"
  statistic           = "Sum"
  period              = 60
  evaluation_periods  = 1
  threshold           = 1
  comparison_operator = "GreaterThanOrEqualToThreshold"
  # The metric only has data when the filter matches, so treat gaps as healthy.
  treat_missing_data  = "notBreaching"
  alarm_actions       = [aws_sns_topic.security_alerts.arn]
}
```
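The filter above reads from `aws_cloudwatch_log_group.cloudtrail`, which has to be fed by a trail. A minimal sketch of that wiring follows. The resource names, the log group name, the retention period, and the `cloudtrail_bucket_name` variable are all assumptions for illustration, and the S3 bucket is assumed to exist already with a bucket policy that lets CloudTrail write to it.

```hcl
# Hypothetical variable: an existing bucket whose policy allows CloudTrail writes.
variable "cloudtrail_bucket_name" {
  type = string
}

resource "aws_cloudwatch_log_group" "cloudtrail" {
  name              = "/aws/cloudtrail/management-events" # illustrative name
  retention_in_days = 90
}

# Role that CloudTrail assumes to write into the log group.
resource "aws_iam_role" "cloudtrail_to_cloudwatch" {
  name = "cloudtrail-to-cloudwatch" # illustrative name

  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect    = "Allow"
      Principal = { Service = "cloudtrail.amazonaws.com" }
      Action    = "sts:AssumeRole"
    }]
  })
}

resource "aws_iam_role_policy" "cloudtrail_to_cloudwatch" {
  name = "write-cloudtrail-logs"
  role = aws_iam_role.cloudtrail_to_cloudwatch.id

  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect   = "Allow"
      Action   = ["logs:CreateLogStream", "logs:PutLogEvents"]
      Resource = "${aws_cloudwatch_log_group.cloudtrail.arn}:*"
    }]
  })
}

resource "aws_cloudtrail" "main" {
  name                       = "main-trail" # illustrative name
  s3_bucket_name             = var.cloudtrail_bucket_name
  is_multi_region_trail      = true
  enable_log_file_validation = true

  # CloudTrail expects the log group ARN with a trailing ":*".
  cloud_watch_logs_group_arn = "${aws_cloudwatch_log_group.cloudtrail.arn}:*"
  cloud_watch_logs_role_arn  = aws_iam_role.cloudtrail_to_cloudwatch.arn
}
```

A multi-region trail is used here so that API calls made in regions you don't normally use still reach the filters.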
Critical events to monitor:
| Event pattern | What it detects |
|---|---|
| `{ $.userIdentity.type = "Root" }` | Root account usage |
| `{ $.eventName = "ConsoleLogin" && $.errorMessage = "Failed authentication" }` | Console brute force |
| `{ $.eventName = "DeleteTrail" \|\| $.eventName = "StopLogging" }` | CloudTrail tampering |
| `{ $.eventName = "AuthorizeSecurityGroupIngress" && $.requestParameters.ipPermissions.items[0].ipRanges.items[0].cidrIp = "0.0.0.0/0" }` | Security group opened to world (first rule in the request) |
| `{ $.eventName = "CreateAccessKey" }` | New IAM access key created |
| `{ $.eventName = "PutBucketPolicy" }` | Bucket policy changed |
| `{ $.eventName = "GetSecretValue" && $.errorCode = "AccessDenied*" }` | Secrets Manager access denied |
The last one is particularly useful: repeated GetSecretValue denials often indicate stolen credentials being used to probe what the compromised identity can access.
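Writing one filter and one alarm per row gets repetitive. One way to cut the boilerplate, sketched here under the assumption that every event uses the same namespace, threshold, and SNS topic, is to drive both resources from a map with `for_each`. The map keys and the 300-second period are illustrative choices, and only three of the patterns are shown.

```hcl
locals {
  # Metric name => filter pattern. Keys are illustrative names.
  security_filters = {
    CloudTrailTampering = "{ ($.eventName = \"DeleteTrail\") || ($.eventName = \"StopLogging\") }"
    AccessKeyCreated    = "{ $.eventName = \"CreateAccessKey\" }"
    BucketPolicyChanged = "{ $.eventName = \"PutBucketPolicy\" }"
  }
}

# One metric filter per map entry.
resource "aws_cloudwatch_log_metric_filter" "security" {
  for_each       = local.security_filters
  name           = each.key
  log_group_name = aws_cloudwatch_log_group.cloudtrail.name
  pattern        = each.value

  metric_transformation {
    name      = each.key
    namespace = "Security/CloudTrail"
    value     = "1"
  }
}

# One alarm per map entry, all routed to the same topic.
resource "aws_cloudwatch_metric_alarm" "security" {
  for_each            = local.security_filters
  alarm_name          = "security-${each.key}"
  metric_name         = each.key
  namespace           = "Security/CloudTrail"
  statistic           = "Sum"
  period              = 300
  evaluation_periods  = 1
  threshold           = 1
  comparison_operator = "GreaterThanOrEqualToThreshold"
  treat_missing_data  = "notBreaching"
  alarm_actions       = [aws_sns_topic.security_alerts.arn]
}
```

Adding a new detection then means adding one line to the map. Events that need a different threshold, such as failed console logins, are probably better kept as standalone resources.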
Layer 2: Cost Anomaly Detection
A compromised IAM role spinning up GPU instances or exfiltrating data through S3 often shows up as a cost spike before anyone notices it elsewhere. AWS Cost Anomaly Detection uses ML to establish a spend baseline and alerts when spend deviates significantly from it. It works from billing data, so expect detection within about a day rather than within minutes.
```hcl
resource "aws_ce_anomaly_monitor" "services" {
  name              = "services-monitor"
  monitor_type      = "DIMENSIONAL"
  monitor_dimension = "SERVICE"
}

resource "aws_ce_anomaly_subscription" "daily" {
  name             = "daily-anomaly-alert"
  frequency        = "DAILY"
  monitor_arn_list = [aws_ce_anomaly_monitor.services.arn]

  subscriber {
    type    = "EMAIL"
    address = var.billing_alert_email
  }

  threshold_expression {
    dimension {
      key           = "ANOMALY_TOTAL_IMPACT_ABSOLUTE"
      values        = ["10"]
      match_options = ["GREATER_THAN_OR_EQUAL"]
    }
  }
}
```
Set the threshold to something meaningful but not too sensitive: $10 of total anomaly impact is a reasonable trigger for a small account. Tune it based on your normal spend patterns.
One account limitation: you can have only one DIMENSIONAL monitor for the SERVICE dimension per account. If one already exists (for example, created from the console) and you try to create a second, the API returns an error. Import the existing one into Terraform state rather than creating a new resource.
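If the monitor already exists, a declarative `import` block is one way to bring it into state. This sketch assumes Terraform 1.5 or later, and the ARN is a placeholder that must be replaced with the real one, which `aws ce get-anomaly-monitors` will list.

```hcl
# Placeholder ARN: substitute the value returned by `aws ce get-anomaly-monitors`.
import {
  to = aws_ce_anomaly_monitor.services
  id = "arn:aws:ce::123456789012:anomalymonitor/00000000-0000-0000-0000-000000000000"
}
```

On older Terraform versions, `terraform import aws_ce_anomaly_monitor.services <monitor-arn>` does the same thing from the command line. Either way, run `terraform plan` afterwards to confirm the imported monitor matches the resource definition.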
Layer 3: AWS Config Rules
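Config rules evaluate your resources against security policies continuously, not just at deploy time. Depending on the rule, evaluation runs when a resource's configuration changes or on a periodic schedule. When a resource drifts from the expected state, Config flags it as noncompliant.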
```hcl
# Flags buckets that do not have bucket-level public access blocked.
resource "aws_config_config_rule" "s3_public_access_blocked" {
  name = "s3-bucket-level-public-access-prohibited"

  source {
    owner             = "AWS"
    source_identifier = "S3_BUCKET_LEVEL_PUBLIC_ACCESS_PROHIBITED"
  }
}

# Flags IAM users with a console password but no MFA device.
resource "aws_config_config_rule" "mfa_enabled_for_iam_console" {
  name = "mfa-enabled-for-iam-console-access"

  source {
    owner             = "AWS"
    source_identifier = "MFA_ENABLED_FOR_IAM_CONSOLE_ACCESS"
  }
}

# Flags active access keys older than 90 days.
resource "aws_config_config_rule" "access_keys_rotated" {
  name = "access-keys-rotated"

  source {
    owner             = "AWS"
    source_identifier = "ACCESS_KEYS_ROTATED"
  }

  input_parameters = jsonencode({
    maxAccessKeyAge = "90"
  })
}

# Flags a root user without MFA.
resource "aws_config_config_rule" "root_account_mfa" {
  name = "root-account-mfa-enabled"

  source {
    owner             = "AWS"
    source_identifier = "ROOT_ACCOUNT_MFA_ENABLED"
  }
}
```
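Config rules only evaluate anything once a configuration recorder is running in the account. If Config has never been enabled, something like the following sketch is needed first. The resource names and the `config_bucket_name` variable are assumptions for illustration, and the bucket is assumed to exist already with a policy that lets AWS Config write to it.

```hcl
# Hypothetical variable: an existing bucket that AWS Config is allowed to write to.
variable "config_bucket_name" {
  type = string
}

# Role that AWS Config assumes to read resource configuration.
resource "aws_iam_role" "config" {
  name = "aws-config-recorder" # illustrative name

  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect    = "Allow"
      Principal = { Service = "config.amazonaws.com" }
      Action    = "sts:AssumeRole"
    }]
  })
}

resource "aws_iam_role_policy_attachment" "config" {
  role       = aws_iam_role.config.name
  policy_arn = "arn:aws:iam::aws:policy/service-role/AWS_ConfigRole"
}

resource "aws_config_configuration_recorder" "main" {
  name     = "default"
  role_arn = aws_iam_role.config.arn

  recording_group {
    all_supported                 = true
    include_global_resource_types = true # needed for IAM-related rules
  }
}

resource "aws_config_delivery_channel" "main" {
  name           = "default"
  s3_bucket_name = var.config_bucket_name
  depends_on     = [aws_config_configuration_recorder.main]
}

# The recorder does nothing until it is explicitly switched on.
resource "aws_config_configuration_recorder_status" "main" {
  name       = aws_config_configuration_recorder.main.name
  is_enabled = true
  depends_on = [aws_config_delivery_channel.main]
}
```

With this in place, adding `depends_on = [aws_config_configuration_recorder.main]` to each rule avoids a first-apply failure where Terraform tries to create a rule before the recorder exists.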
AWS provides several hundred managed rules covering common compliance frameworks (CIS, PCI, HIPAA). Start with the CIS AWS Foundations Benchmark conformance pack, which bundles the highest-impact rules as a curated set:
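```hcl
resource "aws_config_conformance_pack" "cis" {
  name = "CIS-AWS-Foundations-Benchmark-Level-2"

  # Confirm this template path exists before applying. AWS also publishes sample
  # conformance pack templates in the awslabs/aws-config-rules GitHub repository.
  template_s3_uri = "s3://aws-configservice-us-east-1/conformance-packs/CIS_AWS_Foundations_Benchmark_v3.0.0.yaml"
}
```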
SNS and Alert Routing
Route all alarms to a single SNS topic, then fan out to email, Slack, or PagerDuty depending on severity.
```hcl
resource "aws_sns_topic" "security_alerts" {
  name = "security-alerts"
}

resource "aws_sns_topic_subscription" "email" {
  topic_arn = aws_sns_topic.security_alerts.arn
  protocol  = "email"
  endpoint  = var.security_alert_email
}
```
For Slack: use SNS → Lambda → Slack webhook. For PagerDuty: subscribe the integration URL that PagerDuty generates for your service to the topic as an HTTPS endpoint; no Lambda is needed.
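The PagerDuty route can be sketched as one more subscription on the same topic. The `pagerduty_integration_url` variable is an assumption: it stands for the HTTPS integration URL that PagerDuty shows when you add an Amazon CloudWatch integration to a service.

```hcl
# Hypothetical variable: the integration URL copied from the PagerDuty service.
variable "pagerduty_integration_url" {
  type      = string
  sensitive = true
}

resource "aws_sns_topic_subscription" "pagerduty" {
  topic_arn = aws_sns_topic.security_alerts.arn
  protocol  = "https"
  endpoint  = var.pagerduty_integration_url

  # PagerDuty confirms the SNS subscription on its own, so Terraform need not wait.
  endpoint_auto_confirms = true
}
```

To route by severity, one option is a second topic reserved for paging alarms, so that low-severity events only reach email.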
What This Doesn’t Cover
This setup handles the most common account-level threats. What it doesn’t replace:
- GuardDuty: ML-based threat detection including DNS exfiltration, unusual API call patterns, and compromised instance detection. Worth enabling for accounts handling sensitive data ($3-15/month for most accounts).
- Security Hub: Aggregates findings from GuardDuty, Inspector, Config, and Macie into a single dashboard with prioritisation. Good for teams managing multiple accounts.
- VPC Flow Logs analysis: Detecting east-west movement inside your VPC requires flow logs. CloudTrail records API calls, not network traffic.
For a personal project or small startup, the CloudTrail + Config + Anomaly Detection stack described here covers most of what matters. Enable GuardDuty when you start handling customer data or have a team large enough that insider threat becomes a concern.
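When that point comes, turning GuardDuty on is a small change. A minimal sketch, with the resource name and publishing frequency as illustrative choices:

```hcl
# GuardDuty is regional: this enables it only in the provider's configured region.
resource "aws_guardduty_detector" "main" {
  enable                       = true
  finding_publishing_frequency = "FIFTEEN_MINUTES"
}
```

Covering every region you use means repeating the resource with a provider alias per region.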