In Q3 2024, a mid-sized SaaS client came to us with an AWS bill that had grown from $8,400/month to $14,700/month over 18 months, without a corresponding increase in traffic. Over six weeks, we reduced it to roughly $8,900/month. Here is exactly what we found and fixed.

1. The Starting Point

The infrastructure consisted of: 12 EC2 instances (mix of t3.xlarge and m5.2xlarge), 3 RDS PostgreSQL instances, 40TB of S3 storage, and CloudFront distributions for 4 applications. All EC2 instances were on-demand. No Reserved Instances. No Savings Plans.

2. Right-Sizing EC2 Instances

The first step was 2 weeks of CloudWatch monitoring to understand actual CPU, memory, and network utilisation (memory requires the CloudWatch agent, since EC2 does not publish it by default). Findings: 7 of 12 instances were running at under 15% average CPU. Three t3.xlarge instances were running at under 5% CPU: effectively idle, kept around as "just in case" capacity.

We right-sized to t3.large for low-traffic services, kept m5.2xlarge only for the two genuinely compute-bound workloads, and eliminated three redundant instances entirely. Saving: $2,100/month.
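The triage behind those decisions can be sketched in a few lines. This is a minimal illustration, not the tooling used on the engagement: the 5% and 15% thresholds come from the findings above, while the instance names, sample values, and the `classify_instance` helper are hypothetical.

```python
from statistics import mean

# Thresholds taken from the findings above; treat them as starting points, not rules.
IDLE_CPU_PCT = 5.0
LOW_CPU_PCT = 15.0


def classify_instance(cpu_samples):
    """Return a right-sizing hint from a list of CPU utilisation percentages."""
    avg = mean(cpu_samples)
    if avg < IDLE_CPU_PCT:
        return "review-for-removal"
    if avg < LOW_CPU_PCT:
        return "downsize-candidate"
    return "keep"


# Hypothetical CPU samples per instance (illustrative values only).
fleet = {
    "web-1": [3.1, 4.0, 2.7],
    "api-1": [11.5, 13.2, 9.8],
    "batch-1": [62.0, 71.4, 58.3],
}

hints = {name: classify_instance(samples) for name, samples in fleet.items()}
for name, hint in hints.items():
    print(f"{name}: {hint}")
```

Average CPU is only a first filter. Peak load, memory, and network figures should be checked before any instance is actually resized or removed.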

3. Reserved Instances & Savings Plans

Converting baseline EC2 usage from on-demand to 1-year Reserved Instances (no upfront) delivered an immediate 40% reduction on that portion of compute spend. We used Compute Savings Plans for workloads with variable instance types. Saving: $1,800/month.
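Why commit only to the baseline? A rough model makes the trade-off visible. The sketch below assumes a flat 40% discount as quoted above. The hourly rate, usage pattern, and `monthly_cost` helper are hypothetical and exist only to illustrate the shape of the calculation.

```python
def monthly_cost(hourly_usage, committed, on_demand_rate, discount):
    """Cost of a month of hourly instance counts with `committed` instances reserved."""
    reserved_rate = on_demand_rate * (1 - discount)
    cost = 0.0
    for used in hourly_usage:
        # The commitment is billed every hour, whether or not it is used.
        cost += committed * reserved_rate
        # Anything above the commitment is billed at the on-demand rate.
        cost += max(0, used - committed) * on_demand_rate
    return cost


# Hypothetical 720-hour month: 4 instances most of the time, 6 at peak.
usage = [4] * 500 + [6] * 220
RATE = 0.10      # hypothetical on-demand price per instance-hour
DISCOUNT = 0.40  # discount quoted above for 1-year, no-upfront commitments

for committed in (0, 4, 6):
    cost = monthly_cost(usage, committed, RATE, DISCOUNT)
    print(f"commit {committed}: ${cost:.2f}")
```

In this toy month, committing to the 4-instance baseline is cheapest. Committing to the 6-instance peak pays for capacity that sits unused most of the time.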

4. RDS Query Optimisation

The largest RDS instance (db.r5.2xlarge at $1,200/month) was running at 8% CPU utilisation on average, spiking to 90% during report generation. We identified 3 unindexed queries responsible for the spikes using pg_stat_statements, added appropriate indexes, and right-sized the instance to db.r5.large. Saving: $680/month.
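For readers who have not used pg_stat_statements, a query along these lines is a common starting point. It assumes PostgreSQL 13 or later, where the timing columns are named `total_exec_time` and `mean_exec_time` (older versions use `total_time` and `mean_time`). The sample rows, the 500 ms threshold, and the `slow_statements` helper are hypothetical, not the client's data.

```python
# Query for PostgreSQL 13+; older versions name these columns total_time / mean_time.
TOP_STATEMENTS_SQL = """
SELECT query, calls, total_exec_time, mean_exec_time
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 20;
"""


def slow_statements(rows, min_mean_ms=500.0):
    """Filter (query, calls, total_ms, mean_ms) rows to those with a high mean time."""
    return [r for r in rows if r[3] >= min_mean_ms]


# Hypothetical rows in the shape the query returns (times are in milliseconds).
sample_rows = [
    ("SELECT * FROM orders WHERE customer_id = $1", 1200, 960000.0, 800.0),
    ("SELECT id FROM users WHERE email = $1", 50000, 25000.0, 0.5),
]

for query, calls, total_ms, mean_ms in slow_statements(sample_rows):
    print(f"{mean_ms:.0f} ms avg over {calls} calls: {query}")
```

Statements that surface here are candidates for `EXPLAIN (ANALYZE)`, which shows whether a sequential scan is the cause and an index would help.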

5. S3 Intelligent-Tiering

All 40TB of S3 storage was in the Standard class. Analysis of access patterns showed 60% of objects had not been accessed in over 90 days. Enabling S3 Intelligent-Tiering moved cold objects to its Infrequent Access and Archive Instant Access tiers automatically. Saving: $420/month.
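One way to move existing objects into Intelligent-Tiering is a bucket lifecycle rule. The sketch below only builds the configuration document. The rule ID, the empty prefix, and the `intelligent_tiering_rule` helper are assumptions for illustration, and the rule the client actually used may have differed.

```python
import json


def intelligent_tiering_rule(rule_id="move-to-intelligent-tiering", prefix=""):
    """Build a lifecycle rule that transitions objects to S3 Intelligent-Tiering."""
    return {
        "ID": rule_id,
        "Status": "Enabled",
        "Filter": {"Prefix": prefix},  # an empty prefix applies to the whole bucket
        "Transitions": [
            # Days=0 makes objects eligible for transition immediately.
            {"Days": 0, "StorageClass": "INTELLIGENT_TIERING"}
        ],
    }


lifecycle_configuration = {"Rules": [intelligent_tiering_rule()]}
print(json.dumps(lifecycle_configuration, indent=2))
```

A document of this shape can be passed as `LifecycleConfiguration` to boto3's `put_bucket_lifecycle_configuration`. Intelligent-Tiering charges a small per-object monitoring fee and does not auto-tier objects under 128 KB, so buckets dominated by tiny objects may see less benefit.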

6. CloudFront Cache Tuning

CloudFront cache hit rate was 34%, meaning 66% of requests were passing through to the origin. We audited cache behaviours, fixed incorrect Cache-Control headers on static assets (they were set to no-cache by the application framework's default config), and raised the default TTL. Cache hit rate rose to 89%. The share of requests reaching the origin fell from 66% to 11%, a drop of 55 percentage points (roughly 83% fewer origin requests at constant traffic), reducing both EC2 load and data transfer costs. Saving: $380/month.
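The arithmetic behind the hit-rate figures is worth spelling out, because a 55-point gain in hit rate is a much larger relative drop in origin traffic. A minimal sketch, assuming total request volume stayed constant (the helper names are ours):

```python
def origin_share(hit_rate):
    """Fraction of requests that miss the cache and reach the origin."""
    return 1.0 - hit_rate


def relative_origin_reduction(hit_before, hit_after):
    """Relative drop in origin requests, assuming total request volume is unchanged."""
    before = origin_share(hit_before)
    after = origin_share(hit_after)
    return (before - after) / before


before, after = 0.34, 0.89  # hit rates reported above
print(f"origin share: {origin_share(before):.0%} -> {origin_share(after):.0%}")
print(f"relative reduction: {relative_origin_reduction(before, after):.0%}")
# origin share: 66% -> 11%
# relative reduction: 83%
```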

7. Eliminating Idle Resources

A full resource audit found: 4 unattached EBS volumes ($120/month), 2 idle load balancers ($36/month), 3 NAT Gateways that could be consolidated into 1 ($180/month), and old EC2 snapshots dating back 3 years ($95/month). Total: $431/month eliminated.
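The unattached-volume check is the easiest part of such an audit to automate. The sketch below filters records shaped like the entries in EC2 DescribeVolumes output, where a `State` of `available` means the volume is not attached to any instance. The volume IDs and sizes are hypothetical.

```python
def unattached_volumes(volumes):
    """Return volumes in the 'available' state, i.e. not attached to any instance."""
    return [v for v in volumes if v.get("State") == "available"]


# Hypothetical records shaped like entries in EC2 DescribeVolumes output.
volumes = [
    {"VolumeId": "vol-aaa", "State": "in-use", "Size": 100},
    {"VolumeId": "vol-bbb", "State": "available", "Size": 500},
    {"VolumeId": "vol-ccc", "State": "available", "Size": 250},
]

orphans = unattached_volumes(volumes)
total_gib = sum(v["Size"] for v in orphans)
print(f"{len(orphans)} unattached volumes, {total_gib} GiB")
```

Snapshotting a volume before deleting it is a cheap safeguard if there is any doubt about whether the data is still needed.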

Results

Total monthly saving: $5,811, a reduction of roughly 40% (from $14,700 to about $8,900 per month). Performance metrics (p95 response time, error rate, uptime) were unchanged. The engagement paid for itself within the first month. The client now has a quarterly infrastructure review process to prevent cost creep from recurring.