What is scalability?
Scalability is the property of a system to handle a growing amount of work by adding resources to the system. - Wikipedia
Scalability refers to the ability of a system to expand its capacity in specific areas. In simpler terms, scaling involves increasing the capabilities of a system in various dimensions, such as:
- Processing larger amounts of data
- Managing more simultaneous connections
- Maintaining consistent response times as API requests increase
Basic design principles
One of the main objectives in scaling a system is to enhance its throughput, which is measured by the number of requests it can handle within a given time frame. The fundamental approaches to achieving this goal are replication and optimization.
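As a minimal illustration of the throughput metric described above, the sketch below divides completed requests by the length of the measurement window. The function name and the request counts are hypothetical and chosen only for the arithmetic.

```python
def throughput(completed_requests: int, window_seconds: float) -> float:
    """Requests handled per second over a measurement window."""
    if window_seconds <= 0:
        raise ValueError("window_seconds must be positive")
    return completed_requests / window_seconds


# Hypothetical measurements of the same service before and after scaling.
before = throughput(completed_requests=12_000, window_seconds=60)
after = throughput(completed_requests=30_000, window_seconds=60)

print(f"before: {before:.0f} req/s, after: {after:.0f} req/s")
print(f"gain: {after / before:.1f}x")
```

With these made-up numbers the service goes from 200 to 500 requests per second, a 2.5x gain.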
Replication
Replication involves duplicating resources to augment the system's capacity to handle more requests, such as replicating servers. For example:
- Replicating a web server across multiple instances to handle increased web traffic
- Creating multiple replicas of a database to spread increased read traffic (scaling writes usually also requires partitioning the data)
- Replicating a message queue to distribute workload among multiple instances
- Replicating a cache to improve read performance and reduce load on backend services
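Replicas only add capacity if requests are actually spread across them. One simple way to do that is round-robin distribution, sketched below. The class name and the replica names are hypothetical, and real load balancers also account for health and load.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hands out replicas in turn so each gets an equal share of requests."""

    def __init__(self, replicas: list[str]) -> None:
        if not replicas:
            raise ValueError("at least one replica is required")
        self._next = cycle(replicas)

    def pick(self) -> str:
        """Return the replica that should serve the next request."""
        return next(self._next)


balancer = RoundRobinBalancer(["web-1", "web-2", "web-3"])
assignments = [balancer.pick() for _ in range(6)]
print(assignments)
```

Six requests land on web-1, web-2, web-3 and then repeat, so each replica serves a third of the traffic.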
Optimization
Optimization focuses on enhancing the system's capacity without adding resources. This can be achieved by methods such as adding indexes to databases, optimizing algorithms, or rewriting the server code in a more efficient language like C++. For example:
- Optimizing database queries to reduce response times and increase throughput
- Compressing data being transferred between systems to reduce network traffic and increase transfer speeds
- Implementing parallel processing algorithms to distribute workloads across multiple processors or nodes, increasing overall processing power and reducing response times
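To make the database example concrete, the sketch below uses Python's built-in sqlite3 module to compare the query plan before and after adding an index. The table, column, and index names are hypothetical, and the exact plan wording depends on the SQLite version.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, total) VALUES (?, ?)",
    [(i % 100, i * 1.5) for i in range(1_000)],
)

QUERY = "SELECT total FROM orders WHERE customer_id = ?"


def query_plan(connection: sqlite3.Connection) -> str:
    """Return SQLite's description of how it would execute QUERY."""
    rows = connection.execute("EXPLAIN QUERY PLAN " + QUERY, (42,)).fetchall()
    # The human-readable detail is the last column of each plan row.
    return " | ".join(row[-1] for row in rows)


plan_before = query_plan(conn)
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
plan_after = query_plan(conn)

print("before:", plan_before)
print("after: ", plan_after)
```

Without the index, the plan is a scan of the whole table. With it, SQLite searches the index instead, so the same hardware can answer more of these queries per second.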
Trade-Offs
Scalability vs. Architecture
In software development, scalability, availability, performance, security, manageability, and observability are essential non-functional requirements, also known as quality attributes. A design that prioritizes one quality attribute may impact others, resulting in architecture trade-offs.
Quality attributes have varying degrees of importance. During the design phase, the focus is on satisfying high-priority quality attributes while minimizing any negative impact on the others. When designing a software system for scalability, we need to consider how the design affects other quality attributes.
Performance
When optimizing a system's performance, we often aim to meet certain metrics for individual requests, such as an average response time. Improving the performance of individual requests generally enhances scalability, because each request holds resources for less time, which frees them up to handle more requests.
However, the process of improving performance is not always straightforward. It may involve rewriting algorithms, rewriting the server code in a different language, or finding faster libraries to perform specific tasks. Additionally, storing data in memory rather than a database can improve performance but may reduce scalability as more memory is required for each request.
In some cases, it may be beneficial to slow down individual requests slightly in order to increase the number of requests the system can handle overall. Balancing performance against scalability requires careful consideration of which metric matters more for the system.
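A back-of-the-envelope calculation can show how a design with slower individual requests may still serve more traffic. The sketch assumes that memory is the only bottleneck and that peak throughput equals concurrent requests divided by latency. All figures are hypothetical.

```python
def max_concurrent_requests(total_memory_mb: int, memory_per_request_mb: int) -> int:
    """How many requests fit in memory at once."""
    return total_memory_mb // memory_per_request_mb


def peak_throughput(concurrent_requests: int, latency_ms: int) -> float:
    """Requests per second when every slot is busy (concurrency / latency)."""
    return concurrent_requests * 1000 / latency_ms


SERVER_MEMORY_MB = 8192  # hypothetical server size

# Design A keeps data in memory: fast per request, but memory hungry.
in_memory = peak_throughput(max_concurrent_requests(SERVER_MEMORY_MB, 64), latency_ms=50)

# Design B reads from a database: slower per request, but lightweight.
from_database = peak_throughput(max_concurrent_requests(SERVER_MEMORY_MB, 4), latency_ms=200)

print(f"in-memory design:  {in_memory:.0f} req/s at 50 ms per request")
print(f"database design:   {from_database:.0f} req/s at 200 ms per request")
```

Under these assumptions the in-memory design peaks at 2560 requests per second, while the slower database design peaks at 10240, because sixteen times as many requests fit in memory at once. In practice the database would become a bottleneck of its own, so the real gap would be smaller.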
Availability
When a system is scaled through replication, multiple instances of the server are created, so that if one instance fails, the others remain available.
However, if the server is stateful, such as a database server, we must decide how to keep its state in sync across the replicated instances. This is where the issue of replica consistency arises. Consistency is a critical aspect to consider when replicating state for scalability and availability.
Manageability
As we increase the number of instances in a system, their interaction becomes increasingly complex. It is crucial to ensure that the instances and their interaction function as expected and that the performance meets expectations. To achieve this, we require monitoring platforms to check the instances' health and behavior, which is known as observability.
However, meeting the monitoring requirement necessitates adding a significant amount of monitoring code, which involves developing and evolving an observability platform.
To manage the complexity and costs of scalability, automation (DevOps) is essential. DevOps is a collection of tools and practices that facilitate the rapid development and delivery of software systems by automating operations such as testing, deployment, management, upgrades, and system monitoring. However, this tooling must itself be built and maintained, which adds to the system's complexity.
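A small part of such monitoring is a health check that flags instances whose heartbeat is overdue. The sketch below is illustrative only. The Instance type, the instance names, and the 10-second timeout are assumptions, and the current time is passed in explicitly so that the example is deterministic.

```python
from dataclasses import dataclass


@dataclass
class Instance:
    name: str
    last_heartbeat: float  # seconds on a shared clock


def unhealthy_instances(
    instances: list[Instance], now: float, timeout: float = 10.0
) -> list[str]:
    """Names of instances that have not reported within `timeout` seconds."""
    return [i.name for i in instances if now - i.last_heartbeat > timeout]


fleet = [
    Instance("web-1", last_heartbeat=95.0),
    Instance("web-2", last_heartbeat=80.0),  # silent for 20 seconds
    Instance("web-3", last_heartbeat=99.0),
]

down = unhealthy_instances(fleet, now=100.0)
print(down)
```

Here only web-2 is reported. An automated pipeline could then restart or replace it without manual intervention.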
Security
A secure system comprises three fundamental elements: authentication, authorization, and integrity. Together with encryption, these elements help ensure that data in transit over networks cannot be read or altered if it is intercepted, and that data at rest (in a persistent store) cannot be accessed without authorization.
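Integrity, for example, can be checked by attaching a keyed hash (HMAC) to each message, using Python's standard hmac and hashlib modules. This is one common technique, not necessarily the one a given system uses. The key and payloads below are hypothetical.

```python
import hashlib
import hmac

SECRET_KEY = b"demo-key-not-for-production"  # hypothetical shared secret


def sign(message: bytes) -> str:
    """Compute an HMAC-SHA256 tag for the message."""
    return hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()


def verify(message: bytes, signature: str) -> bool:
    """Check the tag in constant time to avoid leaking timing information."""
    return hmac.compare_digest(sign(message), signature)


payload = b'{"amount": 100}'
tag = sign(payload)

untampered_ok = verify(payload, tag)
tampered_ok = verify(b'{"amount": 900}', tag)
print(untampered_ok, tampered_ok)
```

The original payload verifies and the altered one does not. Every message now costs an extra hash computation on both sides, which is a small instance of the resource cost that security layers add.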
Security and scalability are often in tension. The more security layers a system has, the more resources are allocated to those layers instead of to handling requests.
In the network layer, our system employs the Transport Layer Security (TLS) protocol. TLS uses asymmetric cryptography during the handshake to authenticate the peers and agree on keys, and symmetric cryptography afterwards to encrypt and integrity-protect the data in flight. Both steps have a performance cost: the handshake adds network round trips and CPU work on both the client and the server, and all in-flight data must be encrypted and decrypted.
In the data storage layer, we must encrypt the data again to protect it at rest. This, too, reduces system performance and negatively impacts system scalability.
Scalability vs. Costs
What is the required effort and resources for scaling a system?
Scaling a system can be as simple as running the server on a more powerful virtual machine or replicating it to run more instances. However, in more complicated cases, scaling may require code changes and redesigning the database schema. These trade-offs need to be considered when scaling the system.
In more complex scenarios, scaling a system may require addressing specific issues, such as:
- The server generates excessive amounts of data for each request, causing a decrease in response time. This requires optimizing the code responsible for generating data
- Concurrent read and write requests to the same records may cause conflicts. This requires redesigning the database schema and modifying the code in the data access layer
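One possible way to handle conflicting writes to the same record is optimistic locking, which adds a version column to the schema and makes the data access layer reject updates based on a stale version. The sketch below uses sqlite3. The table and function names are hypothetical, and this is only one of several techniques that could apply.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER, version INTEGER)"
)
conn.execute("INSERT INTO accounts VALUES (1, 100, 0)")


def read_account(account_id: int) -> tuple[int, int]:
    """Return (balance, version) for the account."""
    return conn.execute(
        "SELECT balance, version FROM accounts WHERE id = ?", (account_id,)
    ).fetchone()


def update_balance(account_id: int, new_balance: int, expected_version: int) -> bool:
    """Write only if nobody else has updated the row since it was read."""
    cursor = conn.execute(
        "UPDATE accounts SET balance = ?, version = version + 1 "
        "WHERE id = ? AND version = ?",
        (new_balance, account_id, expected_version),
    )
    return cursor.rowcount == 1


# Two clients read the same record, then both try to write.
balance_a, version_a = read_account(1)
balance_b, version_b = read_account(1)

first = update_balance(1, balance_a - 30, version_a)   # succeeds
second = update_balance(1, balance_b - 50, version_b)  # rejected: stale version

print(first, second, read_account(1))
```

The second client's update is rejected instead of silently overwriting the first, so it must re-read the record and retry. The change touches both the schema and the data access code, which is why this kind of fix costs more than adding hardware.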
However, each way of scaling a system comes with costs, for example:
- Upgrading the database server, which may require 15 hours of effort and cost $5,000/month for cloud services. This option may be expensive and could potentially drain the company's finances
- Rewriting the web server, which may require 10,000 hours of development effort. This option may take too much time, and worse, it may lead to losing customers who are not satisfied with the system's performance
To be prepared for such scenarios, it is crucial to build the foundation of the system to scale from the beginning. The aim should be to develop software systems whose capacity can grow exponentially while their cost grows only linearly. In other words, hyperscale systems.
Key takeaways
- To scale a system is to expand its capacity in some aspects
- A common goal when scaling a system is to increase its throughput, using two fundamental principles: replication and optimization
- When designing scalable systems, we need to carefully consider the trade-offs between scalability and architecture, as well as between scalability and costs