You’ve had a successful product launch and now you’re seeing explosive growth. All that growth is putting a strain on your technology. Recently it’s getting harder to add more users to the application. There have been a few small outages. Your tech team is throwing around the word “scaling”. What does that actually mean, and how can you make informed decisions around scaling a product?
Ask “Will this scale?” no matter what it is
It’s important to find out if things will scale no matter what it is you’re discussing. No one even really knows what that means, but it’s a good catch-all question that generally applies and drives engineers nuts.
– Sarah Cooper in 10 tricks to appear smart during meetings
“Scaling” is somewhat of a weasel word. It means everything and nothing at all. What are you trying to scale? The number of users the app can support? Data throughput? The hardware used by the application?
It’s also important to note the difference between performance and scaling. Performance is the amount of work you can do with the existing resources. Scaling itself has multiple meanings, sometimes referring to increasing the amount of work the application can do or increasing the amount of resources available to the app. Improving an application’s performance is one of several tools at your disposal to help solve scaling problems.
There are many different levers you can pull on to help meet your scaling goals, some of which are vastly more expensive than others.
Below is a list of actions you can take, ordered from cheapest to most expensive. I recommend following them in order. If you accomplish your goals before getting to the end then congratulations! You’ve just saved yourself work that’s expensive both in money and in terms of increased code complexity.
- Define Goals ($)
- Profile and identify bottlenecks ($)
- Hardware ($$)
- Performance optimizations ($$)
- Architecture changes ($$$$)
This list reflects the approach we take when working with our clients to solve scaling problems.
Before you do any work it’s important to know what you are working towards. Any talk of “scaling” an app is meaningless without any numbers attached. Contrast the following:
We need to scale this app
What changes do we need to make to be able to scale up to 100K daily users?
We’re onboarding a new supplier. We’ll need to be able to handle importing their catalogue of 10K parts into our system.
Tickets for a hotly anticipated event are going on sale next week. Our ticketing system needs to be able to support an expected 1200 transactions in the first five minutes.
Better performance and more resources might help you reach that goal. However, they might just add extra cost and complexity if they don’t address the application’s bottlenecks.
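A numeric goal can be turned into a throughput target with a line of arithmetic. The sketch below uses the ticketing example above; the `required_rate` helper and the peak multiplier are illustrative assumptions, not part of any real system.

```python
def required_rate(events: int, window_seconds: float) -> float:
    """Average events per second needed to handle `events` within the window."""
    if window_seconds <= 0:
        raise ValueError("window_seconds must be positive")
    return events / window_seconds


# Ticketing example: 1200 transactions in the first five minutes.
ticket_rate = required_rate(1200, 5 * 60)
print(f"{ticket_rate:.1f} transactions/second on average")  # 4.0

# Assumption: traffic is rarely spread evenly, so plan for a burst.
# The multiplier of 3 is a made-up placeholder; measure your own traffic.
PEAK_FACTOR = 3
print(f"{ticket_rate * PEAK_FACTOR:.1f} transactions/second at assumed peak")  # 12.0
```

A target like "4 transactions per second on average" is something you can test against, which "we need to scale" is not.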
The worst thing you can do is blindly make changes hoping for the best. It’s important to figure out what is causing the issues you’ve been seeing before making any changes.
Is your database overwhelmed? Is your web server running out of memory? Are you receiving too many simultaneous web requests? Are you getting rate-limited by a third-party service?
For the vast majority of web-based applications, the first bottleneck you see will be the database.
Once you know what part of your system is slow, then you need to understand why and how slow. And not just in a qualitative “this is slow” sense. You want actual measurements that you can use to compare before and after you make changes.
Let’s say you’ve identified the database is your bottleneck. Then you should start asking yourself questions like: Are you dealing with too many queries? How many? Are your queries taking a long time? How long?
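Answering "how many?" and "how long?" can start with something as small as a timing wrapper. This is a minimal sketch using Python's built-in `sqlite3` module and an in-memory database; the `parts` table, `QueryStats`, and `timed_query` are hypothetical names chosen for illustration. A real application would more likely lean on its database's slow-query log or an APM tool.

```python
import sqlite3
import time
from dataclasses import dataclass, field


@dataclass
class QueryStats:
    """Collects one duration (in seconds) per executed query."""
    durations: list = field(default_factory=list)

    @property
    def count(self) -> int:
        return len(self.durations)

    @property
    def total_seconds(self) -> float:
        return sum(self.durations)

    @property
    def slowest_seconds(self) -> float:
        return max(self.durations, default=0.0)


def timed_query(conn, stats, sql, params=()):
    """Run a query, record how long it took, and return its rows."""
    start = time.perf_counter()
    rows = conn.execute(sql, params).fetchall()
    stats.durations.append(time.perf_counter() - start)
    return rows


# Hypothetical data: a 10K-row parts catalogue in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE parts (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO parts (name) VALUES (?)",
    [(f"part-{i}",) for i in range(10_000)],
)

stats = QueryStats()
for part_id in (1, 500, 9_999):
    timed_query(conn, stats, "SELECT name FROM parts WHERE id = ?", (part_id,))

print(f"queries: {stats.count}")
print(f"total: {stats.total_seconds * 1000:.2f} ms")
print(f"slowest: {stats.slowest_seconds * 1000:.2f} ms")
```

Recording the same numbers before and after a change gives you the comparison the text asks for.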
If you keep adding users, there will come a time when you outgrow your hardware. Hardware is cheap, and cloud providers make it trivial to add more. Some places allow you to scale up hardware simply by dragging a slider.
So go ahead and upgrade to that database plan with a higher row limit. Increase your max RAM. Get a bigger connection pool.
Just make sure you identify your bottlenecks first. Putting a faster web framework in front of a slow database may not yield any measurable increase in speed or throughput. Some bottlenecks can’t easily be solved by throwing hardware at them. For these, you may want to look at doing some performance optimizations.
Once you understand the problem, it’s time to make some changes! The cheapest solution is often to do some performance tuning. These are usually small code changes that allow you to get more work done with your existing architecture and hardware.
For example, adding an index might dramatically improve the speed of a slow query. Eliminating an N+1 query loop could dramatically reduce the number of queries sent to the database. A little caching can do wonders! All of a sudden your app can deal with more users while using the same amount of resources.
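To make the N+1 case concrete, here is a rough sketch with Python's built-in `sqlite3` module. The `authors` and `posts` tables and the `CountingConnection` wrapper are invented for illustration; in a real app the loop is usually hidden behind an ORM's lazy-loaded association.

```python
import sqlite3


class CountingConnection:
    """Thin wrapper that counts how many queries are sent to the database."""

    def __init__(self, conn):
        self.conn = conn
        self.queries = 0

    def fetch(self, sql, params=()):
        self.queries += 1
        return self.conn.execute(sql, params).fetchall()


raw = sqlite3.connect(":memory:")
raw.executescript("""
    CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE posts (id INTEGER PRIMARY KEY, author_id INTEGER, title TEXT);
""")
raw.executemany(
    "INSERT INTO authors (id, name) VALUES (?, ?)",
    [(i, f"author-{i}") for i in range(1, 51)],
)
raw.executemany(
    "INSERT INTO posts (author_id, title) VALUES (?, ?)",
    [(i, f"post by author {i}") for i in range(1, 51)],
)
db = CountingConnection(raw)

# N+1: one query for the posts, then one more query per post for its author.
n_plus_one = []
for title, author_id in db.fetch("SELECT title, author_id FROM posts ORDER BY id"):
    author_rows = db.fetch("SELECT name FROM authors WHERE id = ?", (author_id,))
    n_plus_one.append((title, author_rows[0][0]))
n_plus_one_queries = db.queries

# Same data fetched with a single JOIN.
db.queries = 0
joined = db.fetch(
    "SELECT posts.title, authors.name FROM posts "
    "JOIN authors ON authors.id = posts.author_id ORDER BY posts.id"
)
join_queries = db.queries

assert n_plus_one == joined  # identical results, very different query counts
print(f"N+1 loop: {n_plus_one_queries} queries")  # 51
print(f"JOIN:     {join_queries} query")          # 1
```

With 50 posts the loop sends 51 queries where the join sends one, and the gap grows with every row you add.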
Performance tuning can also have some downsides. For example, introducing some caching can dramatically increase the performance of your application. That’s a win! It also opens you to the possibility of cache invalidation bugs, a notorious class of bugs often described as one of the hardest problems in computer science. The general advice around caches (and performance work in general) is don’t add one until you need it. Otherwise you pay their cost without getting any value in return.
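Both sides of that tradeoff fit in a few lines. This sketch uses `functools.lru_cache` from the Python standard library; the `PRICES` dictionary is a stand-in for a database table and `get_price` is a hypothetical lookup.

```python
from functools import lru_cache

PRICES = {"widget": 10}   # stand-in for a database table
lookups = {"count": 0}    # how many times we actually hit the "database"


@lru_cache(maxsize=128)
def get_price(sku: str) -> int:
    lookups["count"] += 1
    return PRICES[sku]


get_price("widget")       # cache miss: hits the "database"
get_price("widget")       # cache hit: no lookup
print(lookups["count"])   # 1

# The underlying data changes, but the cache does not know about it.
PRICES["widget"] = 12
stale = get_price("widget")
print(stale)              # 10, a stale value

# Explicit invalidation fixes it, but now you have to remember to do this
# everywhere the data can change.
get_price.cache_clear()
fresh = get_price("widget")
print(fresh)              # 12
```

The second lookup is free, which is the win. The stale `10` is the cost: every write path now has to know about the cache.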
Eventually you’ll reach the point where your code is as efficient as it can be and adding yet more hardware is getting prohibitively expensive. Now is the time to revisit the architecture of your application.
Re-architecting is expensive so you really don’t want to get this one wrong. Always profile first and make sure you don’t have a problem that can be solved with some smaller-scale performance tuning.
Adding an index to your database is far cheaper than introducing sharding and may solve your current issue.
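The index half of that comparison is often a single statement. Here is a minimal sketch using Python's built-in `sqlite3` module and SQLite's `EXPLAIN QUERY PLAN`; the `orders` table and index name are made up, and the exact wording of the plan output varies between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, total) VALUES (?, ?)",
    [(i % 1000, i * 1.5) for i in range(50_000)],
)

QUERY = "SELECT total FROM orders WHERE customer_id = ?"


def query_plan(sql, params):
    """Return SQLite's description of how it intends to run the query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The human-readable detail is the last column of each plan row.
    return " | ".join(row[-1] for row in rows)


plan_before = query_plan(QUERY, (42,))

# The one-line change.
conn.execute("CREATE INDEX idx_orders_customer_id ON orders (customer_id)")

plan_after = query_plan(QUERY, (42,))

print("before:", plan_before)  # a full table scan
print("after: ", plan_after)   # a search using idx_orders_customer_id
```

Before the index, SQLite reports a scan of the whole table; afterwards it reports a search using the new index. Other databases offer similar tools, such as `EXPLAIN` in PostgreSQL and MySQL.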
The real problem is that programmers have spent far too much time worrying about efficiency in the wrong places and at the wrong times; premature optimization is the root of all evil
– Donald Knuth in Computer Programming as an Art
Scaling is often a set of tradeoffs. In addition to actual dollar costs for development time and better hardware, performance and scaling changes often introduce code complexity, which slows future development and increases the rate of bugs introduced into the application.
This complexity means your code is now harder to understand for new hires, harder to extend, and more prone to bugs. Because the system now has more moving parts, there are more places where things could go wrong.
Greater capacity comes at a cost of greater program complexity and slower development speeds. Don’t pay these costs if you don’t need to.
Google and Netflix do X. Should we do the same?
No. Google and Netflix operate at unimaginably huge scales. Their problems are not your problems.
But we need to support a billion users on launch!
You almost certainly don’t have roughly one in eight people on the planet waiting to sign up for your app on day one. By targeting capacity for a reasonable amount of traffic and then scaling from there, you can iterate on your product faster for a lower cost.
Should we do a rewrite?
Probably not. Rewrites are expensive and often don’t solve your original problem or just give you a new set of problems. Make sure you understand your application’s bottlenecks and have hard numbers before even looking at a rewrite.
Scaling is an incredibly nuanced topic that involves a lot of tradeoffs. There are countless technical solutions that can help you scale, ranging from single-line code changes to massive re-architectures of your application. This article gave a high-level overview of different families of solutions.
If you are facing scaling challenges and have questions about specifics or maybe aren’t even sure where to start, talk to thoughtbot!