One important high availability principle is concurrency control. The idea is to allow only as much traffic into your system as it can handle successfully. For example: if your system is certified to handle a concurrency of 100, then the 101st request should either time out, be asked to try again later, or wait until one of the previous 100 requests finishes. The 101st request should not be allowed to negatively impact the experience of the other 100 users. Only the 101st request should be impacted.
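The idea above can be sketched with a semaphore that hands out a fixed number of slots. This is a minimal illustration, not a production limiter; the names `ConcurrencyLimiter`, `handle_request`, and `work`, and the limit of 100, are assumptions chosen to mirror the example:

```python
import threading

class ConcurrencyLimiter:
    """Admit at most `limit` concurrent requests; turn the rest away immediately."""

    def __init__(self, limit):
        # BoundedSemaphore guards against accidental extra releases.
        self._slots = threading.BoundedSemaphore(limit)

    def try_acquire(self):
        # Non-blocking: returns True only if a slot is free right now.
        return self._slots.acquire(timeout=0)

    def release(self):
        self._slots.release()

limiter = ConcurrencyLimiter(100)

def handle_request(work):
    if not limiter.try_acquire():
        # The 101st request is rejected instead of degrading the other 100.
        return "503 Service Unavailable"
    try:
        return work()
    finally:
        limiter.release()
```

Instead of rejecting outright, `try_acquire` could pass a positive timeout to `acquire`, which gives you the "wait until one of the previous requests finishes" variant.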
Why Concurrency Control?
Most systems exhibit response times that rise exponentially once some critical concurrency limit (the escape concurrency) has been reached, as shown in the diagram. With concurrency control you can keep the system out of the critical zone where this exponential response-time behavior kicks in.
Benefits of Concurrency Control
- Avoiding cascading failures. Imagine a sudden surge in traffic that goes above your system limits. Let's say your database becomes the bottleneck. As the database slows down, threads on the application servers will begin to pile up. Then the layer above your application servers (your web servers/load balancers) will start to pile up threads as well. Very soon every layer of the architecture has been compromised – the system has reached its escape concurrency – and you have lost control of it. By controlling the amount of traffic that you take in, you avoid this situation.
- Flexibility. The ability to control the amount of traffic that your system receives is powerful. For example: after a release you realize that the scalability profile of your system in production has changed for the worse. You were initially certified for a certain concurrency, but all metrics point to the system not handling that load. Now you can reduce the concurrency allowed. While this is not an ideal situation, it is much preferable to a system meltdown.
- Guarantee of service quality. You can be more confident in the quality of service provided to the requests that are allowed through.
HAProxy is a good place to start investigating. It offers a myriad of features such as load balancing and content switching in addition to concurrency control. Concurrency limits can be set at a global level and/or at a per-server level. Importantly, it also offers request queuing, which is extremely powerful when used in concert with concurrency control. More on this in a later post. I recommend that you look at HAProxy when designing high availability systems.
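As a rough illustration of those knobs, here is a hedged sketch of an HAProxy configuration; the backend name, server names, and addresses are made up, and the numbers are placeholders you would derive from your own capacity testing:

```
global
    maxconn 200            # global concurrency cap for the whole process

defaults
    mode http
    timeout connect 5s
    timeout client  30s
    timeout server  30s
    timeout queue   10s    # how long a request may wait in the queue before a 503

frontend www
    bind *:80
    default_backend app

backend app
    # Each server accepts at most 100 concurrent connections; requests
    # beyond that are queued rather than passed through.
    server app1 10.0.0.1:8080 maxconn 100
    server app2 10.0.0.2:8080 maxconn 100
```

The per-server `maxconn` enforces the concurrency limit, and requests above it sit in the backend queue until a slot frees up or `timeout queue` expires.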
Concurrency controls should be put in place in concert with a capacity planning program. If system concurrency limits are indeed being approached, you should scale the system first and only then increase the concurrency allowed.