In this post, I’ll try to explain what application scaling is and how it works. Let’s begin with the Wikipedia definition.
Scalability is the property of a system to handle a growing amount of work by adding resources to the system.
https://en.wikipedia.org/wiki/Scalability
Let us understand scaling and scalability in the context of software. Imagine you plan on opening a grocery store. To fit in with the internet generation, you also want to allow customers to log in and place orders from a website designed for your store. You are starting on a limited budget, so you set up the store in a small space with a single billing counter and manual billing. You also employ a single person to help you run the store and take care of billing. Finally, you set up your online store. It runs on a single server that you have built from some old components.
Your stores, both physical and online, are up and running. Within a few months, your store has gained popularity and many new customers have started coming in regularly. Though you are very happy, you encounter a new issue: traffic. With the increased demand for your store, the waiting time at the billing counter has increased. Likewise, your online store has seen a spike in traffic and your poor old server isn’t able to handle it. The customers are getting increasingly frustrated and have started complaining.
Make the process more efficient and powerful – Vertical Scaling
You decide to make a few improvements to your stores. For the billing counter, you set up a billing management system and barcode readers. You also hire a dedicated employee to look after the billing counter and make the process efficient. For the online store, you upgrade the server by fitting it with newer, more powerful components. These changes deliver immediate results and reduce wait times both online and offline. The customers are happy with the improvements.
But the results turned out to be short-lived. As the number of customers grew, it became more and more difficult and expensive to make the billing counter more efficient and the servers more powerful. You realized that you might have to look at other options.
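To make this concrete, here is a minimal sketch of a single server that can finish a fixed number of requests per time step. The function name `backlog_after`, the traffic numbers, and the capacities are all made up for illustration; they are not measurements from a real system.

```python
def backlog_after(arrivals, capacity):
    """Return how many requests are still waiting after the last tick.

    arrivals: requests arriving at each tick (hypothetical numbers)
    capacity: requests the single server can finish per tick
    """
    waiting = 0
    for arrived in arrivals:
        waiting += arrived                 # new requests join the queue
        waiting -= min(waiting, capacity)  # the server clears what it can
    return waiting


# Made-up, steadily growing demand.
traffic = [5, 8, 12, 15, 20]

old_server = backlog_after(traffic, capacity=6)        # original hardware
upgraded_server = backlog_after(traffic, capacity=12)  # after the upgrade

print(old_server, upgraded_server)
```

In this toy model the upgraded server ends with a much smaller backlog, but it still falls behind once demand passes its capacity. Keeping up would mean buying an even bigger machine every time traffic grows.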
Distribute the work – Horizontal Scaling
Instead of buying increasingly costly equipment to make your billing counter more efficient, you hired several employees and set up multiple billing counters. This way you were able to serve multiple customers at a time and avoid long queues. A similar approach worked for your online store too. Instead of buying expensive, cutting-edge components for your server, you built several medium-priced servers. Each of these servers handles a portion of the total requests. In this setup, you can add capacity whenever needed (often called scaling out) by just adding a new billing counter or a new server.
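One common way to split requests across several servers is round-robin: hand each new request to the next server in line. The sketch below is only an illustration of that idea; the class `RoundRobinBalancer` and the server names are hypothetical, and real load balancers also handle health checks, failures, and uneven load.

```python
class RoundRobinBalancer:
    """Toy load balancer that hands requests to servers in turn."""

    def __init__(self, servers):
        self.servers = list(servers)
        self._next = 0  # counts how many requests have been routed so far

    def add_server(self, name):
        # Horizontal scaling: more capacity by adding another server.
        self.servers.append(name)

    def route(self, request):
        # Pick the next server in line, wrapping around at the end.
        server = self.servers[self._next % len(self.servers)]
        self._next += 1
        return server


balancer = RoundRobinBalancer(["server-1", "server-2"])
before = [balancer.route(f"order-{i}") for i in range(4)]

balancer.add_server("server-3")  # traffic grew, so add one more server
after = [balancer.route(f"order-{i}") for i in range(4, 7)]

print(before)
print(after)
```

Adding capacity here is a single `add_server` call, which mirrors opening one more billing counter. None of the existing servers needs to be upgraded.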
In real-world software architecture, application scaling is usually a combination of vertical and horizontal scaling. Vertical scaling is easier to achieve at first, but pushing a single machine to higher and higher levels gets difficult and expensive. Horizontal scaling, on the other hand, may not be worth the extra setup at smaller volumes, but it becomes more affordable and efficient at higher volumes. Striking a balance between the two is how you achieve the desired result.
This article is part of the ELI5 series. In this series, I try to explain various computer science and software architecture concepts with simple, easy-to-understand analogies. For more such topics, take a look at this page.