Cloud service providers, like Amazon AWS and Microsoft Azure, make it easy to get started provisioning resources every developer is familiar with: servers and databases. Unfortunately, it's just as easy to start building directly on these resources when what you really need to do is step back and architect a scalable solution.
And by scalable I don't just mean the scalability of your service; I also mean the scalability of your operations and defect resolution.
Let's look at a few common pitfalls businesses encounter when they build on top of cloud services.
The basic unit of compute resources is the virtual server. In one form or another, you will almost certainly be using them (unless you run an all-functions stack on AWS Lambda or Azure Functions, in which case you get to feel smug while everyone else figures out how to manage their complex servers). Even if you are a Docker fanatic, you will still need to provision virtual servers for your cluster.
But in this day and age you should seriously question your architecture if you are building on top of virtual servers directly. For example, auto-scaling virtual servers usually requires writing bespoke scripts to connect application-specific monitoring solutions to application-specific scaling policies. If instead your app is built from Docker containers running on a scalable cluster that schedules by compute and memory reservations, you will find that scaling compute resources becomes much easier.
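To make that concrete, here is a rough sketch of what replaces all those bespoke scripts: a single declarative target-tracking policy. This uses boto3 against AWS's Application Auto Scaling API; the cluster and service names are hypothetical.

```python
import boto3

autoscaling = boto3.client("application-autoscaling")

# Hypothetical ECS cluster "demo-cluster" running a service named "web".
resource_id = "service/demo-cluster/web"

# Tell Application Auto Scaling which resource it may scale, and within
# what bounds.
autoscaling.register_scalable_target(
    ServiceNamespace="ecs",
    ResourceId=resource_id,
    ScalableDimension="ecs:service:DesiredCount",
    MinCapacity=2,
    MaxCapacity=20,
)

# One declarative policy instead of bespoke glue scripts: keep average
# CPU near 70%, adjusting the task count up and down as needed.
autoscaling.put_scaling_policy(
    PolicyName="web-cpu-target-tracking",
    ServiceNamespace="ecs",
    ResourceId=resource_id,
    ScalableDimension="ecs:service:DesiredCount",
    PolicyType="TargetTrackingScaling",
    TargetTrackingScalingPolicyConfiguration={
        "TargetValue": 70.0,
        "PredefinedMetricSpecification": {
            "PredefinedMetricType": "ECSServiceAverageCPUUtilization"
        },
        "ScaleInCooldown": 120,
        "ScaleOutCooldown": 60,
    },
)
```

Notice there's no application-specific monitoring code here at all; the cluster already knows each container's reservations, so the scaling policy works entirely off metrics the platform collects for free.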
There are always exceptions to the rule. Bastion hosts are a perfect example: one-off virtual servers whose whole purpose is to provide access to resources inside private virtual networks. But in the vast majority of cases, using virtual servers directly these days is an anti-pattern.
Do you use the management console to provision your resources? Don't get me wrong, management consoles are essential. But they can also be a crutch. Beyond the fact that some features aren't available from the console at all, tasks performed there are not repeatable. If you build your app's infrastructure today, how easily will you be able to rebuild part of it tomorrow? Next month? Next year?
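Compare that to capturing your infrastructure as code. Here's a minimal sketch using boto3 and CloudFormation; the template is inline for brevity (normally it would live in a versioned file), and the stack and bucket names are hypothetical. The point is that rebuilding this tomorrow, next month, or next year is the same one call.

```python
import json
import boto3

# The infrastructure is data: a template you can review, diff, and
# check into source control alongside your app code.
template = {
    "AWSTemplateFormatVersion": "2010-09-09",
    "Resources": {
        "AppBucket": {
            "Type": "AWS::S3::Bucket",
        },
    },
}

cloudformation = boto3.client("cloudformation")

# Provisioning (or re-provisioning) the whole stack is one repeatable
# API call instead of an afternoon of clicking through the console.
cloudformation.create_stack(
    StackName="demo-app",
    TemplateBody=json.dumps(template),
)
```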
Services sometimes need to be rebuilt to resolve defects. Not only do you spend a bunch of time fixing the defect, you waste even more time releasing the fix if you don't have an easy, reliable, and repeatable way to ship changes to both the app's source code and its infrastructure and environment.
This hits on one of the core features of Stackery. Stackery builds change sets for your infrastructure that are repeatable and executable without downtime. But even if you don't use Stackery, you should find some mechanism that lets you release your changes to the world quickly and reliably.
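On AWS, CloudFormation change sets are one such mechanism. A rough sketch of the workflow, continuing the hypothetical "demo-app" stack from above: create a change set from a revised template, review the proposed diff, then execute it.

```python
import json
import boto3

cloudformation = boto3.client("cloudformation")

# A revised template adding a second bucket to the hypothetical stack.
updated_template = {
    "AWSTemplateFormatVersion": "2010-09-09",
    "Resources": {
        "AppBucket": {"Type": "AWS::S3::Bucket"},
        "LogBucket": {"Type": "AWS::S3::Bucket"},  # the new resource
    },
}

# A change set previews exactly what will change before anything is
# touched.
cloudformation.create_change_set(
    StackName="demo-app",
    ChangeSetName="add-log-bucket",
    TemplateBody=json.dumps(updated_template),
)
cloudformation.get_waiter("change_set_create_complete").wait(
    StackName="demo-app",
    ChangeSetName="add-log-bucket",
)

# Review the proposed diff...
details = cloudformation.describe_change_set(
    StackName="demo-app",
    ChangeSetName="add-log-bucket",
)
for change in details["Changes"]:
    resource = change["ResourceChange"]
    print(resource["Action"], resource["LogicalResourceId"])

# ...then apply it in place.
cloudformation.execute_change_set(
    StackName="demo-app",
    ChangeSetName="add-log-bucket",
)
```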
You can't be scalable if you don't know how to monitor the scalability of your app. The best cloud service providers include monitoring solutions out of the box. They may not be the prettiest solutions (I'm looking at you, AWS CloudWatch), but they are a critical tool for understanding the behavior of your app. Learn how your provider's monitoring works and what it gives you out of the box, then go further: brainstorm all the ways your app could fail that the built-in metrics would never reveal. It's practically guaranteed that, for any meaningful service, you will need custom metrics to monitor its health.
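Publishing a custom metric is cheap. As an illustrative sketch (the namespace, metric name, and dimension are hypothetical, and the value would come from your own bookkeeping), pushing one to CloudWatch from Python looks like this:

```python
import boto3

cloudwatch = boto3.client("cloudwatch")

# A custom metric the built-in ones can't give you: how much work is
# piling up inside the app itself.
cloudwatch.put_metric_data(
    Namespace="MyApp",
    MetricData=[
        {
            "MetricName": "PendingOrders",
            "Value": 42.0,
            "Unit": "Count",
            "Dimensions": [
                {"Name": "Environment", "Value": "production"},
            ],
        }
    ],
)
```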
Once you understand how to monitor your app, figure out how to use metrics to drive auto-scaling of the underlying resources. Then test the living daylights out of it to make sure scaling occurs as expected, both up and down.
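If your scaling is driven by CloudWatch alarms, one way to exercise the scale-out path without generating real load is to force the alarm state directly. The alarm name below is hypothetical, and CloudWatch will flip the state back as soon as it re-evaluates the real metric, so this only proves the wiring, not the thresholds.

```python
import boto3

cloudwatch = boto3.client("cloudwatch")

# Force the scale-out alarm into ALARM to verify the scaling action
# actually fires end to end.
cloudwatch.set_alarm_state(
    AlarmName="web-cpu-high",
    StateValue="ALARM",
    StateReason="Manually triggered to test the scale-out path",
)
```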
These common pitfalls are just a few of the ways your app could go down in a dumpster fire. But covering the basics of any endeavor tends to get you at least halfway to where you want to go. In the process, you will likely learn advanced techniques to make your app even more robust!