According to Gartner, “an average organization expects to triple its allocated share of capacity to public cloud services across the next two or three years”. With that increase in capacity comes the need to scale cloud infrastructure. A scalable infrastructure supports continuous growth in traffic and data volume, and delivers consistent performance even as more resources are added to the cloud environment.
The stages of scaling your infrastructure are outlined below.
- Audit of the infrastructure
It is important to measure the performance of the systems in the cloud and the current load on them. Next, estimate how much that load can grow: analyse historical data and apply forecasting techniques to predict workload or data growth over the coming years. The real value of auditing the infrastructure for load, cost, and performance efficiency, and of analysing historical data and metrics, is that it helps you recognise the blockers that impede scaling.
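Forecasting from historical metrics can start very simply. As an illustration (the figures below are made up), a least-squares linear trend fitted to monthly request counts gives a rough first projection of load growth:

```python
# Fit a linear trend (ordinary least squares) to monthly request counts
# and project the load 12 months ahead. All numbers are illustrative.
def linear_forecast(history, months_ahead):
    n = len(history)
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(history) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, history)) \
            / sum((x - mean_x) ** 2 for x in xs)
    intercept = mean_y - slope * mean_x
    return intercept + slope * (n - 1 + months_ahead)

# Monthly requests (millions) from a hypothetical audit
monthly_requests = [10, 12, 13, 15, 18, 20]
projected = linear_forecast(monthly_requests, 12)
print(f"Projected load in 12 months: {projected:.1f}M requests")
# prints: Projected load in 12 months: 43.7M requests
```

In practice you would feed this with metrics exported from your monitoring system, and a seasonal model may fit better than a straight line; the point is that the audit data drives the projection.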
- Designing the scaling architecture
Once you understand the infrastructure and its traffic patterns (demand spikes and troughs), your team can devise scaling at various levels:
(a) Determining the type of scaling
Horizontal scaling: provision additional servers to meet demand. This splits the load between servers and reduces the number of requests each server handles.
Vertical scaling: add more CPUs, memory, or I/O resources to your existing server, or replace it with a more powerful, higher-performance one.
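To make the horizontal case concrete, the server count needed for a given load can be estimated from per-server capacity. A minimal sketch, where the request rates and headroom figure are hypothetical:

```python
import math

def servers_needed(peak_requests_per_sec, per_server_capacity, headroom=0.2):
    """Servers required to absorb peak load, with some spare headroom."""
    required = peak_requests_per_sec * (1 + headroom) / per_server_capacity
    return math.ceil(required)

# A hypothetical service: 4,500 req/s at peak, each server handles 500 req/s
print(servers_needed(4500, 500))  # 11 servers with 20% headroom
```

The same arithmetic, run against your audited peak load, tells you how far horizontal scaling must go before vertical scaling (bigger servers, hence higher `per_server_capacity`) becomes the cheaper lever.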
(b) Determining the mode of scaling
Scaling in the cloud can be done in three ways:
- Manual scaling: performed by engineers or operations teams. Manual scaling may not keep pace with continuously fluctuating application demand, and if the team misses a scale-down when it is required, the organisation can incur losses.
- Scheduled scaling: based on your demand curve, it provisions capacity optimised for usage. An organisation may have peak demand at a particular time of day and minimal demand at other times. By scheduling scale-ups and scale-downs at those times, the organisation tailors scaling to usage without daily manual effort.
- Automatic scaling: compute, storage, and database capacity scale automatically based on pre-defined rules. It keeps your applications continuously available and minimises performance issues and outages.
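A minimal sketch of such pre-defined rules, assuming CPU utilisation is the scaling metric and using illustrative thresholds and server limits:

```python
def autoscale(current_servers, cpu_utilisation,
              scale_out_at=0.75, scale_in_at=0.30,
              min_servers=2, max_servers=20):
    """Return the new server count under simple threshold rules:
    add a server when CPU is hot, remove one when it is idle,
    and always stay within the configured fleet limits."""
    if cpu_utilisation > scale_out_at:
        return min(current_servers + 1, max_servers)
    if cpu_utilisation < scale_in_at:
        return max(current_servers - 1, min_servers)
    return current_servers

print(autoscale(4, 0.85))  # high CPU -> scale out to 5
print(autoscale(4, 0.20))  # low CPU  -> scale in to 3
```

Real autoscalers (cloud-provider auto scaling groups, Kubernetes autoscalers) evaluate rules like these continuously against live metrics, plus cooldown periods to avoid oscillation.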
(c) Determining the types of instances
You can opt for spot instances (no commitment required; you purchase compute or storage capacity by the hour) or reserved instances (you commit to running them for one to three years).
(d) Determining the cloud availability zones (the geographic locations of the cloud data centres from which your services are accessed and run)
(e) Redesigning your IT infrastructure to remove interoperability problems if you want a hybrid on-premise and cloud setup
(f) Using load balancers
Use load balancers to distribute workloads among multiple nodes and keep the number of requests per node within a feasible range for peak performance.
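The distribution a load balancer performs can be as simple as round-robin. A toy sketch (the node names are made up, and a real balancer would forward traffic rather than just pick a node):

```python
import itertools

class RoundRobinBalancer:
    """Cycle requests across nodes so each gets roughly the same share."""

    def __init__(self, nodes):
        self._cycle = itertools.cycle(nodes)

    def route(self, request):
        # A real load balancer would forward `request` to the chosen node,
        # handle health checks, and retry on failure.
        return next(self._cycle)

lb = RoundRobinBalancer(["node-a", "node-b", "node-c"])
print([lb.route(f"req-{i}") for i in range(6)])
# ['node-a', 'node-b', 'node-c', 'node-a', 'node-b', 'node-c']
```

Production balancers add health checking, weighting, and session affinity on top of a base strategy like this.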
(g) Employing containers
Containers scale well because they run in isolation even while sharing a kernel, which confines inconsistencies or problems during scaling to a single container rather than the entire machine. Containers also allow deploying a large number of identical application instances at low resource cost, which makes them well suited to scaling microservices.
- Pilot project
Set up a pilot project in which you load-test the system and document inconsistencies and other issues.
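A pilot load test can be sketched with the standard library alone: fire concurrent requests at a handler and record latency percentiles. The handler below is a stand-in for your real endpoint, and the service time is simulated:

```python
import time
import statistics
from concurrent.futures import ThreadPoolExecutor

def handler(request_id):
    """Stand-in for a real endpoint; replace with an HTTP call in practice."""
    time.sleep(0.01)  # simulated service time
    return request_id

def load_test(n_requests, concurrency):
    """Run n_requests through the handler at the given concurrency
    and report median and 95th-percentile latency in seconds."""
    latencies = []
    def timed_call(i):
        start = time.perf_counter()
        handler(i)
        latencies.append(time.perf_counter() - start)
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        list(pool.map(timed_call, range(n_requests)))
    latencies.sort()
    return {
        "median_s": statistics.median(latencies),
        "p95_s": latencies[int(0.95 * len(latencies)) - 1],
    }

print(load_test(n_requests=100, concurrency=10))
```

Dedicated tools (JMeter, Locust, k6, and similar) do this at scale; the point of the pilot is to compare such numbers before and after each scaling change and record any regressions.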
- Production and Support
Here the system goes live and is continually supported by teams that optimise performance and resolve minor inconsistencies that may arise while scaling.