Among the many, MANY announcements at re:Invent this year is one that settles a years-long concern for anyone considering AWS Lambda for serverless applications: cold starts.
AWS Lambda is now 5 years old, and for all of that time there has been a concern about latency the first time a function is called. Well, fret no more: with Provisioned Concurrency you can pay a small fee to keep your Lambda functions warm and ready to respond at all times.
AWS doesn’t offer its users fine-grained detail on the infrastructure that runs your serverless functions. This is by design: the whole point of serverless is that you’re not taking control of, or responsibility for, the server layer. But even from first principles one could surmise that cold starts would be a problem, and sure enough, in some contexts users could observe them.
The tiny virtual machines that run Lambda functions have to be started the first time a function is invoked. To offer the service so cheaply and efficiently (even when no requests hit your Lambda for hours at a stretch!), AWS stops idle instances. So when traffic to a Lambda spikes, there is a small but observable delay while the service starts enough virtual instances to handle the load.
Cold starts have not historically been a significant problem for most users, and several categories of users have always been largely immune to cold start pain.
Even for those affected, the added latency was usually measurable but not significant. I recall a presentation at our local serverless meetup where cold start time for a Node.js function was clocked at around 100ms. It’s unlikely that delay ever caused a noticeably poor user experience.
But cold starts fall more heavily on languages with longer start-up times. In Yan Cui’s excellent article on this topic (which I’ll link again at the end of this post since it’s so great), he notes the delay can become significant with Java and .NET.
Java and .NET functions often experience cold starts that last for several seconds! For user-facing APIs, that is clearly not acceptable.
Many developers took it upon themselves to send repeated requests to their Lambda functions so that the AWS platform would never shut down every instance.
Manually keeping your function ready by sending a steady trickle of otherwise-pointless requests is a process commonly known as ‘warming’.
To get a sense of how much people worry about cold starts, one npm package called lambda-warmer has over 5,000 downloads a week!
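The keep-warm trick requires cooperation from the function itself: the handler needs to recognize a warm-up ping and return early so the ping stays cheap. Here is a minimal sketch of that pattern in Python; the `"warmer"` event key is illustrative (it is not necessarily lambda-warmer’s actual event schema), and the real request handling is a placeholder.

```python
# Sketch of a handler that short-circuits scheduled warm-up pings.
# Assumption: a scheduled rule invokes the function with {"warmer": true}.
def handler(event, context):
    # A warm-up ping: do no real work, just keep this instance alive.
    if isinstance(event, dict) and event.get("warmer"):
        return {"warmed": True}

    # ...normal request handling goes here (placeholder logic)...
    name = event.get("name", "world")
    return {"statusCode": 200, "body": f"Hello, {name}!"}
```

In practice the ping would come from a CloudWatch Events (EventBridge) schedule firing every few minutes, which is exactly the plumbing packages like lambda-warmer wrap up for you.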
Rather than having to manually send your own Lambda functions repeated pokes, Provisioned Concurrency turns this process into a simple AWS configuration. You can even schedule increases and decreases in the number of instances that are warm and ready to go!
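As a sketch of what that configuration looks like, here is one way to set it up with boto3. The function name `my-api` and alias `live` are hypothetical; note that Provisioned Concurrency attaches to a published version or alias, not `$LATEST`, and scheduled changes go through Application Auto Scaling. Treat this as an illustration of the API shape, not a drop-in script.

```python
import boto3

lambda_client = boto3.client("lambda")

# Keep 10 execution environments initialized and ready at all times
# for the (hypothetical) alias "live" of function "my-api".
lambda_client.put_provisioned_concurrency_config(
    FunctionName="my-api",
    Qualifier="live",
    ProvisionedConcurrentExecutions=10,
)

# Scheduled increases/decreases use Application Auto Scaling:
# register the alias as a scalable target, then add a cron schedule.
autoscaling = boto3.client("application-autoscaling")
autoscaling.register_scalable_target(
    ServiceNamespace="lambda",
    ResourceId="function:my-api:live",
    ScalableDimension="lambda:function:ProvisionedConcurrency",
    MinCapacity=1,
    MaxCapacity=50,
)
autoscaling.put_scheduled_action(
    ServiceNamespace="lambda",
    ScheduledActionName="scale-up-business-hours",
    ResourceId="function:my-api:live",
    ScalableDimension="lambda:function:ProvisionedConcurrency",
    Schedule="cron(0 8 ? * MON-FRI *)",  # weekday mornings, UTC
    ScalableTargetAction={"MinCapacity": 50, "MaxCapacity": 50},
)
```

A matching scale-down action in the evening keeps you from paying for warm capacity overnight.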
For a truly deep dive, read Yan Cui’s piece on Provisioned Concurrency; in the meantime, go forth and adopt AWS Lambda knowing that its most notorious performance concern is now a thing of the past.
Stackery is the tool for teams to adopt serverless. Sticky problems, like authoring complex AWS CloudFormation templates or keeping your app in sync across multiple environments like Prod and Staging, are all made a lot simpler with Stackery.
Sign up for a Stackery account today, and see how we can improve your team’s process!