11 January, 2024 • 9 minute read

How to think about serverless pricing

The cloud business model is a simple one. Amazon, Google, Microsoft, and smaller players like DigitalOcean are all just following the age-old runbook of buying in bulk and reselling individually. There’s no magic or “innovation” here. The business model of AWS is identical to that of your local supermarket.

The difference lies only in the type of commodity being sold. Your supermarket might order in 500 apples at $1 each before selling them to you for $1.20 each. Amazon runs one big server at a cost of $1,000/mo and carves it up into six smaller EC2 instances which it then rents to you for $200/mo each. In both cases the reseller turns a profit ($100 for the supermarket, $200 for Amazon) by leveraging an economy of scale.

This model works out pretty well for everyone. A supermarket buying a few thousand kilograms of apples in one bulk order makes a lot of sense, but my household doing the same would be silly. There’s no way I would be able to eat all of those apples before they started to rot. Not only that, but my local supermarket invests significantly more into their supply chain management and operations than my household does. Combine these two factors, and it’s possible the $1.20 price they’re offering me is the best deal around.

AWS and other cloud providers are a similar story. Most companies do not require very much compute at all, and they also tend to have somewhat variable workloads. It makes a lot of sense in this case to flexibly rent small portions of a server from Amazon on an as-needed basis, rather than committing to a big capital outlay and figuring out which colocation provider you want to use.

And, of course, there are different SKUs available. If your household really likes apples then the supermarket will happily sell you a larger bag at a cheaper per-unit price. Amazon, too, will quite happily let you rent out the entirety of that big server at a discounted rate if you use something like EC2 Reserved Instances.

The key thing to realize here is that as you buy larger bags of apples, your per-unit cost will decrease. The worst-case scenario for your household’s unit-level economics would be to purchase apples individually. If you only ever eat one apple between trips to the supermarket, then that’s probably OK. But if you know you need five kilograms of apples to make a large apple pie, then purchasing them individually would be pretty silly.

Serverless pricing models function kind of like purchasing your apples one at a time. Whether that makes sense or not depends on your business model.

Serverless is like buying apples one at a time

This is really the key comparison to keep in mind when evaluating serverless technologies for your business (by “serverless”, I mean “managed services with a pay-for-usage scale-to-zero pricing model”). When you use AWS Lambda, Upstash Redis, or PlanetScale’s managed Vitess platform, you are essentially buying individual apples from the grocery store.

Here’s a really concrete example: a t4g.micro EC2 instance in the us-east-2 region will cost you $0.0084/hour and give you 2 vCPUs and 1 GB of RAM to play with. Whether you use up all that RAM or not, your cost is identical because when you use EC2 you pay for capacity, and not what you actually end up using.

On the other hand, you could provision an ARM architecture Lambda function with 1 GB of RAM. In the same us-east-2 region, this Lambda function will cost you $0.0000133334 for every second it executes.

Multiply that per-second cost by 3,600 to convert it into an hourly cost, and you wind up with $0.0480. That’s 5.7x more than our EC2 instance!¹
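If you'd rather check that arithmetic than take my word for it, here is a minimal sketch. The two prices are the us-east-2 figures quoted above; the constant and function names are my own, and the prices should be checked against the current AWS pricing pages before you rely on them.

```python
# Prices as quoted in this post for us-east-2 (assumed current; verify before use).
EC2_T4G_MICRO_PER_HOUR = 0.0084            # USD per hour for a t4g.micro
LAMBDA_ARM_1GB_PER_SECOND = 0.0000133334   # USD per second for a 1 GB ARM function

SECONDS_PER_HOUR = 3_600


def lambda_hourly_cost(per_second: float) -> float:
    """Cost of keeping a single function busy for one full hour."""
    return per_second * SECONDS_PER_HOUR


lambda_per_hour = lambda_hourly_cost(LAMBDA_ARM_1GB_PER_SECOND)
cost_ratio = lambda_per_hour / EC2_T4G_MICRO_PER_HOUR

print(f"Lambda: ${lambda_per_hour:.4f} per hour of execution")
print(f"Lambda costs {cost_ratio:.1f}x as much as the EC2 instance")
```

This only compares compute duration. It leaves out Lambda's per-request fee and the difference in CPU allocation.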

And that price difference doesn’t account for the difference in CPU performance. Our EC2 instance boasted 2 vCPUs, but our Lambda function only receives ~58% of a single vCPU when it is provisioned with only one gig of memory.

If you need ten kilograms of apples to bake a big batch of apple pies for the Fourth of July, then you’d be a fool to buy them individually. Likewise, if you know you need a full hour’s worth of compute, you’d be a fool to buy it one function at a time.

So when should you use technologies like Lambda?

Serverless pricing alignment

Serverless is a good fit in the following three cases:

You’re trying to find product-market fit
You have extremely unpredictable traffic
Serving a request earns more revenue than it costs

Let’s break them down.

Finding product-market fit

If you don’t have product-market fit, or are otherwise early in your company’s lifecycle, then serverless technologies enable you to move much faster than you would otherwise be able to. I wouldn’t say that patching an EC2 instance or setting up autoscaling policies is “hard” (most cloud engineers can do this work just fine), but it is undifferentiated heavy lifting.

If you don’t have product-market fit then your startup is default dead. That’s an existential emergency that you **must** solve, and every single minute you spend managing EC2 instances is a minute you ***aren’t*** investing into making your product better.

Serverless frees up your team’s capacity to work on actual R&D, and that additional R&D could just be the difference between you failing and making it through to your next fundraising round. The more darts you get to throw, the better your odds are.

Serverless is a really good choice in this situation where you need to make a lot happen with few resources in order to make it to the next checkpoint alive.

You have highly unpredictable traffic patterns

Here the “scale-to-zero” and “pay-for-use” models really shine. The EC2 instance is only cheaper if you are actually in a position to use up all of that compute over the course of the next hour. At a 5.7x cost difference, if you average less than 17.5% utilization of your EC2 instance then you’re actually worse off from a cost perspective.
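The break-even point is just the ratio of the two hourly prices. Here is a rough sketch using the same us-east-2 prices as before; the function names are mine, and the model deliberately ignores Lambda's per-request fee and the CPU difference, so treat the result as a ballpark figure.

```python
# Hourly prices from earlier in the post (us-east-2, assumed current).
EC2_PER_HOUR = 0.0084
LAMBDA_PER_HOUR = 0.0000133334 * 3_600


def break_even_utilization(ec2_per_hour: float, lambda_per_hour: float) -> float:
    """Fraction of the hour you must be busy before EC2 becomes the cheaper option."""
    return ec2_per_hour / lambda_per_hour


def cheaper_option(utilization: float) -> str:
    """Compare one hour of EC2 against Lambda billed only for the busy fraction."""
    lambda_cost = LAMBDA_PER_HOUR * utilization
    return "lambda" if lambda_cost < EC2_PER_HOUR else "ec2"


break_even = break_even_utilization(EC2_PER_HOUR, LAMBDA_PER_HOUR)
print(f"Break-even utilization: {break_even:.1%}")

for utilization in (0.05, 0.10, 0.25, 0.50):
    print(f"{utilization:.0%} busy -> {cheaper_option(utilization)}")
```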

EC2 instances simply don’t give you the same granular level of scaling that Lambda offers. Lambda lets you adjust memory in increments of a single megabyte, whereas EC2 forces you to select between pre-defined instance types which start at 512 MB of RAM.

Lambda is also much more elastic than EC2; if you experience a big burst in traffic then Lambda cold starts are measured in milliseconds, whereas adding a new EC2 instance to your autoscaling group can take minutes.

Note that the requirement here is for your traffic to be unpredictable. If you run an online store then it’s pretty easy to know ahead of time that on Black Friday you’ll see a big spike in traffic. In cases where you can actually see the surge out on the horizon it’s easy to spin up additional EC2 instances long before the traffic actually hits. Cases like this aren’t a perfect fit for serverless.

But if you do have highly variable traffic with no way to predict when these spikes will occur, then serverless is a perfect fit for your operational requirements. Just be mindful that Lambda can’t immediately scale you from 0 to infinity, because of its burst concurrency quota.

Serving a request earns more than it costs

Here’s where things get really interesting, and it’s the big use case for serverless that doesn’t seem to get considered much. A lot of people have the two following thoughts in mind with respect to serverless technologies:

It’s great for prototyping and surviving the early stages. That’s correct!
It’s risky, because if you get a big spike in traffic you’ll wind up with a big bill. This isn’t *quite* correct.

A big concern you’ll see echoed online is that building on top of serverless is risky because of the extreme elasticity. If you deploy an app on three EC2 instances and you take a big traffic hit, then the worst case scenario is that your instances will get overloaded and crash. This is the “Slashdot effect” in action, and it happens all the time when sites go viral on Reddit, Hacker News, et al.

If you were to build on top of a serverless platform, on the other hand, chances are good that you’ll manage to withstand that big burst in traffic. The problem—so the story goes—is that once the dust has settled Amazon is going to serve you with a big fat invoice for all that serverless compute you unwittingly used up.

Word to the wise...

In practice, direct Lambda costs are usually a small part of your bill. What should really keep you up at night are CloudWatch costs! They are killer, and because the default Lambda execution role grants permissions to write to CloudWatch they are a bit unintuitive to avoid.

And look, in some cases this is a valid concern. If you’re in a line of business where you can serve $5,000 worth of traffic and see $0 in revenue then you probably shouldn’t bet the farm on serverless.

But what if you’re running an online store, and you use Lambda (or Algolia) to power your product search? How much does it cost you to serve one search, and how much revenue does each search lead to on average? If it costs one tenth of a cent to serve a search and the average search ends up generating $5 of revenue, then who cares what your Lambda bill is?

Seriously. In this case, a $1 million bill for Lambda compute means you earned $5 billion in revenue. If you’re in this situation and you’re upset about how much Lambda cost you, then you have missed the forest for the trees.

This isn’t to say that cost of revenue is never worth thinking about. If you can spend less than $1 million to make that $5 billion then that would, of course, be nice to do. But the idea of a “surprise” AWS bill in this situation is nonsense.
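To make the search example concrete, here is a small sketch that works backwards from a Lambda bill to the revenue it implies. The per-search cost and revenue are the illustrative figures from the example above, not real measurements, and the function name is my own.

```python
# Illustrative figures from the product search example (not real measurements).
COST_PER_SEARCH = 0.001      # USD, one tenth of a cent
REVENUE_PER_SEARCH = 5.00    # USD of revenue generated by the average search


def revenue_for_bill(bill: float, cost_per_request: float, revenue_per_request: float) -> float:
    """Revenue implied by a compute bill when every request earns money."""
    requests_served = bill / cost_per_request
    return requests_served * revenue_per_request


bill = 1_000_000.0
revenue = revenue_for_bill(bill, COST_PER_SEARCH, REVENUE_PER_SEARCH)

print(f"Bill:    ${bill:,.0f}")
print(f"Revenue: ${revenue:,.0f}")
print(f"Compute is {bill / revenue:.2%} of revenue")
```

The number worth watching is the last one: compute cost as a share of revenue, rather than the absolute size of the bill.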

There are actually a lot of businesses which fall into this category where serving a request makes more money than it costs. Think about it:

Any marketing website is used to convert leads into customers. That’s revenue. Assuming the pitch is on point, rendering those marketing pages is going to cost less than what those new customers pay.
A business like AdmitYogi Essays prices its services with a “pay-per-use” credits system anyway. If someone’s using an AdmitYogi feature, they have already paid the cost of serving their request upfront.
A big social media company like YouTube subsidizes their request costs by serving ads.

There are exceptions, of course. Unmetered SaaS apps like Slack need to handle a tremendous amount of load off of static per-seat prices. Some SaaS companies will be able to bake in enough margin to cover the more expensive pricing model of serverless technologies, but Slack is definitely better off buying their apples by the kilo.

Some practical examples

Serverless pricing models have been at the forefront of my mind as I work on building gethook.dev. We’re ultimately going to have a usage-based pricing model which means that we need to seriously think about our own per-request costs.

Simply put: if we charge you $1 for every million webhook events we process for you, then on our end we need to be able to do that with less than $1 of cost. The unit-level economics are really important for us, because unit-level economics are all we have on the revenue side.

If we were to build gethook.dev on top of Lambda we’d be screwed. I’ve focused predominantly on the cost of compute, but Lambda also charges a flat $0.20 fee for every million invocations of a Lambda function. That’s a big number in a world where you yourself are charging your customers $1 per million—especially when “processing a webhook” in our case might involve execution of multiple different backend processes.
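Here is a rough sketch of how quickly that invocation fee eats into a $1 per million price. The $0.20 fee and the $1 price come from the paragraphs above; the number of Lambda invocations per webhook event is a hypothetical input, since it depends entirely on how the backend is put together.

```python
LAMBDA_FEE_PER_MILLION_INVOCATIONS = 0.20  # USD, Lambda's flat request fee
PRICE_PER_MILLION_EVENTS = 1.00            # USD, the example price charged to customers


def request_fee_share(invocations_per_event: int) -> float:
    """Fraction of revenue consumed by invocation fees alone, before any compute cost."""
    fee_per_million_events = LAMBDA_FEE_PER_MILLION_INVOCATIONS * invocations_per_event
    return fee_per_million_events / PRICE_PER_MILLION_EVENTS


# Hypothetical fan-out: how many functions run for each webhook event processed.
for invocations in (1, 3, 5):
    print(f"{invocations} invocation(s) per event -> {request_fee_share(invocations):.0%} of revenue")
```

At five invocations per event the request fee alone matches the revenue, and that is before paying for a single millisecond of execution time.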

In other words, the nature of the gethook business means that we must adopt a “serverful” approach. Serverless pricing models don’t make sense for us, because if we were to use them we’d need to increase our prices so much that nobody would even consider using our service in the first place.

On the other hand, I have another side project in the works called “Noochly.” Noochly is a vegan recipe generator targeted towards people who are interested in trying out plant-based foods but are intimidated by the learning curve of using “difficult” or “exotic” ingredients such as tofu or nutritional yeast.

Noochly is a good fit for serverless technologies, because there are easy ways of boosting per-request revenue above our per-request cost. We can use AI on the generated recipe text to suggest relevant kitchen equipment on Amazon and clip the ticket through their affiliate program, or provide US-based users a one click button to purchase the recipe’s ingredients through a partnership with Instacart.

Generating a recipe costs fractions of a cent, and the work is very amenable to caching. While my webhook business might not be a good fit for serverless, Noochly absolutely is.

Conclusion

It’s important to always keep in mind the underlying business reality of your cloud provider. They’re running big servers at a fixed cost, carving them up, and then selling those smaller pieces at a price point that yields a profit. The smaller the piece you buy from them, the larger their margin is.

This isn’t a secret. All the pricing tables are public information and there’s nothing stopping you from starting up a spreadsheet and crunching the numbers. You’ll find that—assuming equal utilization—serverless technologies are many times more expensive than their “serverful” equivalents.

But those unit-level economics aren’t the entire picture. Assuming full utilization is about as realistic as ignoring friction in a high school physics exercise. There are some warts to the cost comparison that muddy the water, and the exact values you need to consider are going to vary wildly from one business to another.

Furthermore, we can zoom out even further and ask bigger-picture questions involving the business model. What does serving a request actually get you? If serving a request generates revenue, like it does in an ecommerce store or in Noochly, then the cost of using serverless technologies is a far smaller concern. So long as your cost of revenue is less than your revenue, you have the freedom to choose between optimizing cost or growing your customer base.

On the other end of the spectrum you have businesses which either don’t make money from serving requests at all, or which have margins so narrow that compute cost is a materially significant concern for their cashflow. In these cases (and in my experience, they’re actually kind of uncommon) you might want to avoid leaning too heavily on serverless.

¹ And if you think AWS’ pricing is bad, just wait until you do the math on something like Vercel. Vercel charges $40 per 100 GB-hours of serverless function execution which, when converted to the units we’ve been using, works out to a cost of $0.4000 per hour of execution. That is 8.3x more expensive than using Lambda directly, and a whopping 47.6x more expensive than EC2!
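Here is a minimal sketch of the footnote's unit conversion. It assumes a 1 GB function, so one GB-hour equals one hour of execution, and it uses the Vercel, Lambda, and EC2 prices exactly as quoted in this post; the variable names are my own.

```python
VERCEL_PRICE_PER_100_GB_HOURS = 40.0       # USD, as quoted in the footnote
LAMBDA_PER_HOUR = 0.0000133334 * 3_600     # USD per hour, 1 GB ARM function
EC2_PER_HOUR = 0.0084                      # USD per hour, t4g.micro

# With 1 GB of memory, one GB-hour is exactly one hour of execution.
vercel_per_hour = VERCEL_PRICE_PER_100_GB_HOURS / 100

vercel_vs_lambda = vercel_per_hour / LAMBDA_PER_HOUR
vercel_vs_ec2 = vercel_per_hour / EC2_PER_HOUR

print(f"Vercel: ${vercel_per_hour:.4f} per hour of execution")
print(f"{vercel_vs_lambda:.1f}x Lambda, {vercel_vs_ec2:.1f}x EC2")
```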
