
Surviving on Spot: Running Production ECS Workloads for Pennies

  • Writer: SnowLake Consulting
  • Mar 2

Updated: 2 days ago




Fargate Spot offers discounts of up to 70% compared with On-Demand pricing. For a startup burning $50k/month on compute, switching to Spot can extend runway by months. But most enterprises are scared of the "2-minute warning": when AWS needs the capacity back, it sends the task a SIGTERM signal (along with an EventBridge task state change event) two minutes before reclaiming it.


Architecture for Interruption


To run production APIs on Spot, your application must tolerate losing any individual task at two minutes' notice without users noticing. We recommend:

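  • The 50/50 Split: Use an ECS capacity provider strategy that combines FARGATE and FARGATE_SPOT. (An Auto Scaling Group mixed instances policy is the equivalent for EC2-backed clusters; Fargate tasks are not managed by an ASG.) Keep your "Base" capacity (min replicas) on On-Demand through the strategy's base setting, and let the weights send scale-out tasks to Spot; equal weights split the tasks above the base evenly.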

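  • Graceful Shutdowns: Your container's main process must handle SIGTERM. If it is started from an entrypoint script, the script must trap the signal or exec the application so that the signal reaches it. When received, the application should stop accepting new HTTP traffic (fail health checks) but continue processing in-flight requests for up to 120 seconds. ECS force-kills the container once its stopTimeout expires, and that defaults to 30 seconds, so raise it to the Fargate maximum of 120 seconds to use the full warning.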

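  • Queue-Based Decoupling: For background workers, this is even easier. If a worker dies mid-job, the message visibility timeout in SQS expires, and another worker picks it up, provided workers delete a message only after the job has finished. Because the same message can then be processed more than once, idempotency is your best friend here.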
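
The graceful-shutdown bullet above can be sketched roughly as follows. This is a minimal illustration, not a drop-in implementation: the class name GracefulShutdown, its methods, and the 110-second default are assumptions made for this example, and wiring it into a real web framework is left out.

```python
import signal
import threading
import time
from contextlib import contextmanager


class GracefulShutdown:
    """Tracks drain state and in-flight requests for a SIGTERM-aware service."""

    def __init__(self, grace_seconds: float = 110.0):
        # Assumption: stay a little under the two-minute Spot warning.
        self.grace_seconds = grace_seconds
        self.draining = False
        self._in_flight = 0
        self._idle = threading.Condition()

    def install(self) -> None:
        """Register the SIGTERM handler; must be called from the main thread."""
        signal.signal(signal.SIGTERM, self.handle_sigterm)

    def handle_sigterm(self, signum, frame) -> None:
        # Only flip a flag here; signal handlers should do as little as possible.
        self.draining = True

    def health_status(self) -> int:
        """HTTP status for the health endpoint: 503 once draining has started."""
        return 503 if self.draining else 200

    @contextmanager
    def track_request(self):
        """Wrap each request handler so in-flight work is counted."""
        with self._idle:
            self._in_flight += 1
        try:
            yield
        finally:
            with self._idle:
                self._in_flight -= 1
                if self._in_flight == 0:
                    self._idle.notify_all()

    def wait_for_drain(self) -> bool:
        """Block until in-flight requests finish or the grace period ends.

        Returns True if everything drained, False if the deadline passed first.
        """
        deadline = time.monotonic() + self.grace_seconds
        with self._idle:
            while self._in_flight > 0:
                remaining = deadline - time.monotonic()
                if remaining <= 0:
                    return False
                self._idle.wait(remaining)
        return True
```

In a service, the main thread would call install() at startup, serve health_status() from the health-check route, wrap each request in track_request(), and, once draining is set, call wait_for_drain() before exiting.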

We recently cut a SaaS client's bill by 55% with zero downtime by implementing this "Spot-First" strategy.
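
For illustration, a "Spot-First" mix on Fargate might look like the sketch below. The list has the shape of the capacityProviderStrategy parameter accepted by the ECS CreateService and UpdateService APIs. The base of 2 and the equal weights are assumed values, not the client's configuration, and estimate_split is a hypothetical helper whose rounding is this sketch's own choice rather than documented ECS behavior.

```python
from typing import Dict, List

# Assumption: 2 On-Demand tasks as the base, equal weights above it.
# Only one provider in a strategy may define a base.
SPOT_FIRST_STRATEGY: List[dict] = [
    {"capacityProvider": "FARGATE", "base": 2, "weight": 1},
    {"capacityProvider": "FARGATE_SPOT", "weight": 1},
]


def estimate_split(desired_count: int, strategy: List[dict]) -> Dict[str, int]:
    """Approximate tasks per provider: fill the base first, then split by weight."""
    counts = {p["capacityProvider"]: 0 for p in strategy}
    remaining = desired_count

    # Step 1: satisfy the base (On-Demand floor) before anything else.
    for p in strategy:
        take = min(p.get("base", 0), remaining)
        counts[p["capacityProvider"]] += take
        remaining -= take

    total_weight = sum(p.get("weight", 0) for p in strategy)
    if remaining == 0 or total_weight == 0:
        return counts

    # Step 2: split what is left in proportion to the weights.
    shares = [
        (p["capacityProvider"], remaining * p.get("weight", 0) / total_weight)
        for p in strategy
    ]
    for name, share in shares:
        counts[name] += int(share)

    # Step 3: hand out tasks lost to rounding, largest fractional part first.
    leftover = remaining - sum(int(share) for _, share in shares)
    by_fraction = sorted(shares, key=lambda s: s[1] - int(s[1]), reverse=True)
    for name, _ in by_fraction[:leftover]:
        counts[name] += 1
    return counts
```

With these assumed values, estimate_split(10, SPOT_FIRST_STRATEGY) puts 6 tasks on FARGATE and 4 on FARGATE_SPOT: the base of 2 plus half of the remaining 8 on On-Demand, and the other half on Spot.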


 
 
 
