Surviving on Spot: Running Production ECS Workloads for Pennies
- SnowLake Consulting
- Mar 2
- 1 min read
Updated: 2 days ago

Fargate Spot offers discounts of up to 70% compared to On-Demand pricing. For a startup burning $50k/month on compute, switching to Spot can extend runway by months. But most enterprises are scared of the "2-minute warning": the notice AWS gives before reclaiming the capacity, delivered as a task state change event in EventBridge and a SIGTERM signal to the running task.
Architecture for Interruption
To run production APIs on Spot, your service must tolerate losing any individual task at two minutes' notice without users noticing. We recommend:
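The 50/50 Split: Use an ECS capacity provider strategy on the service. (An Auto Scaling Group (ASG) mixed instances policy is the equivalent for EC2-backed clusters; Fargate has no instances to manage.) Set the base value on the FARGATE provider to your minimum replica count so your "Base" capacity stays on On-Demand, then use the weight values to send scale-out tasks to FARGATE_SPOT.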
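Graceful Shutdowns: Your application must handle SIGTERM. If the container entrypoint is a shell script, it must trap the signal or exec the application so the signal actually reaches it. When SIGTERM is received, the application should stop accepting new HTTP traffic (fail health checks) but continue processing in-flight requests. ECS follows up with SIGKILL once the container's stopTimeout elapses, which defaults to 30 seconds, so raise it to the Fargate maximum of 120 seconds if requests need the full window.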
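Queue-Based Decoupling: For background workers, this is even easier. As long as a worker deletes a message only after its job succeeds, a worker that dies mid-job simply lets the SQS visibility timeout expire, and another worker picks the message up. Because the same message can then be processed more than once, idempotency is your best friend here.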
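The graceful shutdown item above can be sketched in a few lines. This is a minimal illustration, not a drop-in implementation: the class name GracefulShutdown and its methods are hypothetical, and the 120-second budget assumes the task definition sets stopTimeout to 120. The idea is to flip a "draining" flag on SIGTERM, refuse new work, and wait for in-flight work to finish within the budget.

```python
import signal
import threading
import time

# Assumption: the task definition sets stopTimeout to 120 seconds.
DRAIN_SECONDS = 120.0


class GracefulShutdown:
    """Hypothetical helper: tracks in-flight work and drains on SIGTERM."""

    def __init__(self, drain_seconds: float = DRAIN_SECONDS) -> None:
        self.drain_seconds = drain_seconds
        self._draining = False  # plain bool: safe to set from a signal handler
        self._in_flight = 0
        self._cond = threading.Condition()

    def install(self) -> None:
        # signal.signal must be called from the main thread.
        signal.signal(signal.SIGTERM, self._on_sigterm)

    def _on_sigterm(self, signum, frame) -> None:
        # Only set a flag here; the real work happens outside the handler.
        self._draining = True

    def healthy(self) -> bool:
        # Wire this into the health-check endpoint: False means "fail the check".
        return not self._draining

    def try_begin(self) -> bool:
        """Register a new request; returns False once draining has started."""
        with self._cond:
            if self._draining:
                return False
            self._in_flight += 1
            return True

    def end(self) -> None:
        """Mark one request as finished."""
        with self._cond:
            self._in_flight -= 1
            self._cond.notify_all()

    def wait_for_drain(self) -> bool:
        """Block until in-flight work finishes; False if the budget runs out."""
        deadline = time.monotonic() + self.drain_seconds
        with self._cond:
            while self._in_flight > 0:
                remaining = deadline - time.monotonic()
                if remaining <= 0:
                    return False
                self._cond.wait(remaining)
            return True
```

In a real service, each request handler would call try_begin() on entry (answering 503 if it returns False) and end() in a finally block. The health endpoint would return healthy(), and the main thread would call wait_for_drain() before exiting once the flag is set.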
We recently cut a SaaS client's bill by 55% with zero downtime by implementing this "Spot-First" strategy.
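To see how the Spot share of a fleet relates to the overall saving, here is a rough back-of-the-envelope sketch. It is not the client's actual numbers: the function names are hypothetical, the 70% discount is the advertised ceiling rather than a guaranteed rate, and the model assumes the whole bill is compute priced at a single On-Demand rate.

```python
def blended_monthly_cost(on_demand_cost: float, spot_share: float,
                         spot_discount: float) -> float:
    """Monthly cost when `spot_share` of the compute moves to Spot.

    on_demand_cost: what the whole fleet costs on On-Demand pricing.
    spot_share:     fraction of compute running on Spot (0.0 to 1.0).
    spot_discount:  Spot discount versus On-Demand (0.0 to 1.0), assumed constant.
    """
    if not 0.0 <= spot_share <= 1.0:
        raise ValueError("spot_share must be between 0 and 1")
    if not 0.0 <= spot_discount <= 1.0:
        raise ValueError("spot_discount must be between 0 and 1")
    on_demand_part = on_demand_cost * (1.0 - spot_share)
    spot_part = on_demand_cost * spot_share * (1.0 - spot_discount)
    return on_demand_part + spot_part


def spot_share_for_savings(target_savings: float, spot_discount: float) -> float:
    """Fraction of compute that must run on Spot to reach a target overall saving."""
    if spot_discount <= 0.0:
        raise ValueError("spot_discount must be positive")
    share = target_savings / spot_discount
    if share > 1.0:
        raise ValueError("target saving exceeds what the discount allows")
    return share
```

Under these assumptions, a $50k/month fleet with half its compute on Spot at a 70% discount costs about $32.5k, a 35% saving. Reaching a 55% overall saving at the same discount would need roughly 79% of compute on Spot, which is why the On-Demand base should be kept as small as availability requirements allow.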



