Transparen Corporation lends its expertise to clients experiencing sudden, rapid growth in traffic or server utilization, bottlenecks, system instability, or downtime during peak traffic, and to clients who want to plan ahead to avoid such issues.
Here are some of the features of a highly scalable system:
- Avoidance, but sometimes optimization, of single points (AKA "single points of failure" or "bottlenecks"). Some groups go too far in one direction or the other: they try to avoid ALL single points, even where that is not practical, or they prematurely optimize single points at the expense of being able to increase capacity by purchasing more hardware.
- Mistake 1: Adding more single points of failure instead of avoiding or optimizing existing single points
- Mistake 2: Not adding a single point which is really needed out of a dogmatic application of "avoid single points of failure"
- Low latency between parts, as low as possible. Some groups ignore the latency between parts and, in pursuit of "scalability", add too many servers for some back-end system. Paradoxically, this increases the overall time needed to process each request and reduces the overall throughput that is possible through the front-end systems.
- Mistake 3: Pushing the system out of balance by adding a slow (or high latency) back-end system, without adding corresponding bandwidth on the front-end systems.
- Mistake 4: Adding more servers when the problem might be a slow networking layer. Diagnose first, fix second.
- Mistake 5: Using more CPUs with lower clock speeds. The same principle applies here, and it can also apply within a single server, especially one with 96 or more cores.
- Efficiency - not wasting effort on any unnecessary actions. Each unnecessary step slows the system down.
- Avoidance of Magic - Magic doesn't scale reliably. Magic includes all "Black Boxes", "Website Optimizer Programs", "Security Appliances", etc., unless someone who knows the whole system understands how the devices work and how much capacity they can reliably serve. (Selective testing of key attributes can help with this.)
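The latency points above can be made concrete with a little arithmetic. The following is a minimal sketch, not a capacity-planning tool: it assumes synchronous front-end workers that each hold one request for its full duration (so Little's law bounds throughput), and a simple Amdahl-style split of each request into a serial part and a perfectly parallel part. The function names and all of the numbers are hypothetical, chosen only for illustration.

```python
def max_throughput(workers: int, latency_ms: float) -> float:
    """Upper bound on requests per second for a fixed pool of synchronous workers.

    Little's law: requests in flight = throughput * latency. With at most
    `workers` requests in flight, throughput cannot exceed workers / latency.
    """
    if workers <= 0 or latency_ms <= 0:
        raise ValueError("workers and latency_ms must be positive")
    return workers * 1000 / latency_ms


def relative_request_time(serial_fraction: float, cores: int, clock_ratio: float) -> float:
    """Time to process one request, relative to one core at full clock speed.

    Assumed model: the serial part of the work cannot use extra cores, the
    rest divides evenly across cores, and everything slows down in
    proportion to the clock ratio (1.0 = full speed, 0.5 = half speed).
    """
    if not 0.0 <= serial_fraction <= 1.0:
        raise ValueError("serial_fraction must be between 0 and 1")
    if cores <= 0 or clock_ratio <= 0:
        raise ValueError("cores and clock_ratio must be positive")
    parallel_fraction = 1.0 - serial_fraction
    return (serial_fraction + parallel_fraction / cores) / clock_ratio


if __name__ == "__main__":
    # Mistake 3: a new back-end hop raises request latency from 20 ms to 50 ms,
    # but the front end still has the same 64 workers.
    print(max_throughput(64, 20))  # 3200.0 requests/second
    print(max_throughput(64, 50))  # 1280.0 requests/second

    # Mistake 5: twice as many cores at half the clock speed, for work that
    # is assumed to be 50% serial.
    print(relative_request_time(0.5, 8, 1.0))   # 0.5625
    print(relative_request_time(0.5, 16, 0.5))  # 1.0625
```

Under these assumptions, the extra back-end latency cuts the front-end ceiling by more than half unless front-end capacity grows with it, and doubling the core count at half the clock speed makes each request slower, not faster. Real systems will differ, which is why the advice is to diagnose first and fix second.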
All of the above is easy to say but hard to do, for many reasons, including:
- Available technical expertise may be sufficient to understand what needs to be done but not necessarily how to do it.
- Large organizations have a strong tendency to make mistakes that inhibit scalability, despite the recommendations of in-house technical experts. The only remedy is to bring in a team with the experience, perspective and position to recognise and correct any such mistakes in a way that benefits the enterprise.
- Highly scalable systems can be very complex and involve a wide variety of technologies.
- Oversimplification of the principles of scalability can lead to incorrect assumptions about how to design or optimize a scalable system.
