How I run an online business on AWS for less than $5/month
I've been building several small online applications: simple tools (edit PDFs, other single-purpose tasks) with Stripe integration for payments. I'm using AWS as my cloud provider (it just happens to be what I'm familiar with).
Previously, I was paying something like $50 USD/month per application.
Now, I've optimized my stack to run them for less than $5 USD/month each.
Typical Setup
Typically, in a large organization, the stack would be something like:
- Backend: ECS or EKS
- DB: RDS
- Frontend: S3 + CloudFront
I deploy via Infrastructure-as-Code (IaC), and to keep things tidy, each application uses its own separate set of resources - one AWS RDS DB for each and so on. This isolates applications, and I can provision or destroy each application independently of the rest, with no dependencies between each one.
The above setup costs (all prices are for the ap-southeast-1 region, based on 750 hours/month):
Service | Price | Notes |
---|---|---|
ECS | $3.91 | 0.25 vCPU, 0.5GB RAM on Fargate Spot |
Static IP for ECS | $3.75 | ECS task is in a public subnet |
RDS | $18.75 | db.t4g.micro - the smallest |
ALB | $16.88 | |
S3 | $0.00 | Frontend |
CloudFront | $0.00 | Free tier includes 1TB of data transfer out per month |
Total: $43.29 USD/month
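The total is just the sum of the line items above; as a quick sanity check in Python:

```python
# Monthly cost of the typical ECS + RDS + ALB stack, using the
# per-service prices from the table (ap-southeast-1, 750 hours/month).
typical_stack = {
    "ECS (Fargate Spot, 0.25 vCPU / 0.5GB)": 3.91,
    "Static IPv4 for the ECS task": 3.75,
    "RDS (db.t4g.micro)": 18.75,
    "ALB": 16.88,
    "S3 (frontend)": 0.00,
    "CloudFront (within free tier)": 0.00,
}

total = sum(typical_stack.values())
print(f"${total:.2f}/month")  # $43.29/month
```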
Why is the ECS task in a public subnet?
The ECS task is placed in a public subnet so that it has outbound internet access, which is why we pay for the public IPv4 address. Alternatively, it could go in a private subnet, but then you'd either have to pay for a NAT Gateway or use something like fck-nat.
ECS without ALB
It is possible to run ECS without an ALB by using Cloud Map and an HTTP API Gateway, as shown here. However, Server-Sent Events (SSE) are not supported by API Gateway, so this isn't viable for my applications that use it (e.g. LLM streaming).
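For context, SSE is just a long-lived HTTP response streamed to the client as `data:` lines. A stdlib-only sketch of the kind of endpoint API Gateway can't proxy (a real backend would use a framework's streaming response instead; this only illustrates the wire format):

```python
# Minimal Server-Sent Events endpoint using only the Python stdlib.
from http.server import BaseHTTPRequestHandler, HTTPServer

class SSEHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-Type", "text/event-stream")
        self.send_header("Cache-Control", "no-cache")
        self.end_headers()
        # Each event is a "data: ..." line followed by a blank line;
        # the client receives events incrementally as they are flushed.
        for i in range(3):
            self.wfile.write(f"data: token {i}\n\n".encode())
            self.wfile.flush()

    def log_message(self, *args):  # silence per-request logging
        pass
```

An LLM backend streams each generated token as one such event, which is why a proxy that buffers the whole response (or rejects streaming) breaks the experience.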
Optimized Setup
After thinking it through, I realized the pricey components could be swapped out:
- ECS: Replace with App Runner, a container service that can automatically scale vCPU usage to zero. While idle, you only pay for the provisioned memory. Scaling down is fast - within a minute.
- RDS: Replace with a serverless Postgres provider - I use Neon.
- ALB: Not required with App Runner.
The cost:
Service | Price | Notes |
---|---|---|
App Runner | ~$2.60 | $0.007/GB-hour idle, $0.064/vCPU-hour active. Based on 0.25vCPU/0.5GB. |
DB (Neon) | ~$0.50 | $0.14/CU-hour + $0.35/GB-month storage. First 50 hours free. |
S3 | ~$0 | Frontend |
CloudFront | ~$0 | Free tier includes 1TB of data transfer out per month |
Note: For Neon, the smallest configuration is 0.25CU, which is 0.25vCPU and 1GB RAM.
Total: ~$3.10 USD/month
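The App Runner figure in the table comes from its idle pricing: when the service has scaled vCPU to zero, you pay only the memory rate. A quick check of that number:

```python
# App Runner idle cost: provisioned memory only, no vCPU charge.
idle_gb_hour = 0.007   # USD per GB-hour (idle memory rate)
memory_gb = 0.5        # smallest configuration used here
hours = 750            # hours in the billing month

app_runner_idle = idle_gb_hour * memory_gb * hours
print(f"${app_runner_idle:.3f}/month")  # $2.625/month, the "~$2.60" above
```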
Compute: App Runner vs ECS
Assuming the App Runner instance (with 0.25vCPU and 0.5GB RAM) is continually active, this will cost \((\$0.064/4 + \$0.007/2) \times 750 = \$14.625\).
For ECS, an equivalent Fargate Spot instance together with the static IP and ALB would cost \(\$3.91 + \$3.75 + \$16.88 = \$24.54\).
For low-to-moderate traffic, App Runner is the winner here.
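The comparison above can be spelled out as a small calculation (rates as quoted in this post; actual pricing may vary by region):

```python
# Worst case for App Runner: the instance is active 100% of the month.
# Active pricing bills vCPU and memory separately.
vcpu_hour, gb_hour = 0.064, 0.007   # USD per active vCPU-hour / GB-hour
vcpu, memory_gb, hours = 0.25, 0.5, 750

app_runner_active = (vcpu * vcpu_hour + memory_gb * gb_hour) * hours
ecs_equivalent = 3.91 + 3.75 + 16.88   # Fargate Spot + static IP + ALB

print(f"App Runner (always active): ${app_runner_active:.3f}")  # $14.625
print(f"ECS + IP + ALB:             ${ecs_equivalent:.2f}")     # $24.54
```

Even fully pegged, App Runner undercuts the ECS stack because the ALB's fixed cost dominates; the gap only grows as actual utilization drops.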
Note: It might be possible to use CloudFront, Cloud Map and ECS (with public IPs). This would still allow for SSE and cost \(\$3.91 + \$3.75 = \$7.66\), making it more attractive if the App Runner cost were to go above this. I haven't explored this yet.
Database: Neon vs RDS
Neon starts to get expensive when the DB is being constantly queried.
If the database were to be active throughout the whole month, the cost (for the smallest 0.25CU instance, which has 0.25vCPU and 1GB RAM) would be \(\$0.14/4 \times 750 = \$26.25\).
In contrast, an RDS db.t4g.micro instance (with 2 vCPUs and 1GB RAM) costs $18.75.
So if application traffic picks up, I'm switching back to AWS RDS.
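To see where the crossover sits, here's a rough break-even calculation (storage costs ignored; rates as quoted above):

```python
# When does Neon's per-hour pricing overtake an always-on RDS
# db.t4g.micro?
neon_cu_hour = 0.14     # USD per CU-hour
cu = 0.25               # smallest Neon compute size
rds_monthly = 18.75     # db.t4g.micro, ap-southeast-1

neon_always_on = neon_cu_hour * cu * 750
breakeven_hours = rds_monthly / (neon_cu_hour * cu)

print(f"Neon active all month: ${neon_always_on:.2f}")       # $26.25
print(f"Break-even: ~{breakeven_hours:.0f} active hours")    # ~536 hours
```

In other words, Neon stays cheaper as long as the database is active less than roughly 70% of the month.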
FAQ
Why not use AWS Lambdas?
Lambdas have changed quite rapidly: container images are now supported, and you can even wrap an entire web framework (FastAPI/Express/Laravel) into a Lambda via the AWS Lambda Web Adapter. You pay nothing for Lambdas when they are not in use (apart from ECR fees if you are using a container image). Function URLs allow Lambdas to stream their responses, and CloudFront supports Function URLs as an origin.
Unfortunately, the biggest drawback is still the cold-start.
I ran my backend on both App Runner and AWS Lambdas:
Service | Cold Start Time |
---|---|
App Runner | ~300ms |
Lambda | ~6000ms |
The UX is quite bad with such a long cold start time, and it is going to turn away a lot of potential users.
That being said, if you don't require SSE and can tolerate the cold starts, Lambdas can be cheaper than App Runner.
What about AWS Lambdas with Provisioned Concurrency/SnapStart?
Provisioned Concurrency pre-initializes a set number of execution environments for your function, reducing cold start times significantly. SnapStart takes a snapshot of the initialized execution environment and uses it to create new execution environments, further reducing cold start times.
The base costs (assuming 0 incoming requests, based on a single 0.5GB instance) are:
Service | Cost | Notes |
---|---|---|
App Runner | $2.60/month | $0.007/GB-hour idle |
Lambda (Provisioned) | $5.48/month | 2,628,000 seconds × 0.5GB × $0.0000041667/GB-second |
Lambda (SnapStart) | $1.98/month | $0.0000015046 per GB-second for the cache |
Unfortunately, SnapStart does not work with container images, which doesn't fit my use case.
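The Provisioned Concurrency figure works out as follows (a sketch using the quoted per-GB-second rate; 730 hours/month = 2,628,000 seconds):

```python
# Baseline cost of keeping one warm 0.5GB Lambda execution environment
# via Provisioned Concurrency for a whole month.
seconds_per_month = 730 * 3600   # 2,628,000 seconds
memory_gb = 0.5
provisioned_rate = 0.0000041667  # USD per GB-second (quoted rate)

provisioned_monthly = seconds_per_month * memory_gb * provisioned_rate
print(f"${provisioned_monthly:.2f}/month")  # $5.48/month
```

This is a fixed floor per warm environment, so it scales linearly with the concurrency you reserve, unlike App Runner's single-instance idle charge.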
Why not AWS Lightsail for the database?
AWS Lightsail (a VPS service) is a cheap way to run Postgres; the cheapest plan is $5/month. You'll need to manage updates and scaling yourself, however. It also makes my IaC a bit less isolated, since multiple applications would share the same database server - I'd have to be careful not to reuse database names.
Why not use a NoSQL DB like Firestore/DynamoDB?
I'm personally not a fan of NoSQL: there's less type safety, less constraint enforcement, and migrations are more troublesome. Also, at larger scale you start paying per read/write, unlike traditional SQL databases where you just pay for the runtime.
Why not Aurora Serverless V2 for the DB?
Aurora Serverless v2 supports auto-pause, which scales the instance to zero after inactivity. However, resuming takes around 15 seconds, which is quite slow.