Concurrency is the number of queries that run in parallel on a system. When a database exhausts its resources (CPU, memory, and storage), it has reached maximum concurrency; beyond that point we must either scale the system or limit the number of requests being served.
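To make the idea of a concurrency limit concrete, here is a minimal Python sketch. It is only an illustration: `QueryGate`, `fake_query`, `run_demo`, and the limit of 4 are hypothetical names and values, not part of any particular warehouse product. The gate lets a fixed number of queries run at once, makes the rest wait, and records the highest number that were in flight together.

```python
import threading
import time

MAX_CONCURRENCY = 4  # hypothetical capacity limit chosen for illustration


class QueryGate:
    """Caps how many queries run at once and tracks the observed peak."""

    def __init__(self, limit):
        self._slots = threading.Semaphore(limit)
        self._lock = threading.Lock()
        self.running = 0
        self.peak = 0

    def run(self, query_fn):
        # Block here when all slots are taken (maximum concurrency reached).
        with self._slots:
            with self._lock:
                self.running += 1
                self.peak = max(self.peak, self.running)
            try:
                return query_fn()
            finally:
                with self._lock:
                    self.running -= 1


def fake_query():
    # Stand-in for a real query; sleeps to simulate work.
    time.sleep(0.05)
    return "ok"


def run_demo(num_requests=12, limit=MAX_CONCURRENCY):
    """Submit num_requests queries; return (peak concurrency, completed count)."""
    gate = QueryGate(limit)
    results = []
    results_lock = threading.Lock()

    def worker():
        outcome = gate.run(fake_query)
        with results_lock:
            results.append(outcome)

    threads = [threading.Thread(target=worker) for _ in range(num_requests)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return gate.peak, len(results)
```

Calling `run_demo()` submits 12 requests, yet the recorded peak never exceeds the limit; the remaining requests simply wait for a free slot. Raising the limit corresponds to scaling, and lowering `num_requests` corresponds to reducing the number of requests served.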
The common data warehouse cycle involves loading the day's transactions at night and querying the data during the day. In that cycle there is no requirement for concurrent querying and updates. Increasing demands on data warehousing create situations where concurrency is needed. For example, an application may require a small amount of critical data to be loaded continuously during the day. This is common in financial applications, where stock prices must be loaded continuously as they change.
When companies are spread across many time zones, with offices operating late into the night, business volumes increase. This shrinks the nightly load window and hence creates the need for concurrency. Concurrency enables teams to work on the same real-time data warehouse without one team's work negatively impacting the others. It allows faster innovation and helps ensure that the data being used is accurate.
The features of a high-concurrency environment include:
- Impressive relational performance across a wide range of data types
A high-concurrency environment allows fast querying across a wide range of data, including semi-structured data. Non-traditional data types from different online teams and product engineering teams can also be queried quickly.
- Automatic scaling of your warehouse
Even if your data warehouse can handle a large number of concurrent accesses, it may still go down when demand spikes. When that happens, you may need to move users, schedule jobs outside business hours, add nodes, and more. Automatic load balancing and scaling in the data warehouse architecture can mitigate these disruptions.
- ACID compliance
An ACID-compliant data warehouse ensures the integrity and consistency of data without requiring you to write scripts or manage it manually.
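As a small illustration of what ACID compliance buys you, the sketch below uses Python's built-in `sqlite3` module as a stand-in for a warehouse engine. The `accounts` table and the `make_db`, `transfer`, and `balances` helpers are hypothetical and exist only for this example. The point is that when one statement in a transaction fails, the engine rolls back the whole transaction, so no cleanup script is needed.

```python
import sqlite3


def make_db():
    """Create an in-memory database with a constraint that forbids negative balances."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE accounts ("
        "id INTEGER PRIMARY KEY, "
        "balance INTEGER NOT NULL CHECK (balance >= 0))"
    )
    conn.executemany(
        "INSERT INTO accounts (id, balance) VALUES (?, ?)",
        [(1, 100), (2, 50)],
    )
    conn.commit()
    return conn


def transfer(conn, src, dst, amount):
    """Move amount from src to dst atomically; return True on success."""
    try:
        # 'with conn' commits if the block succeeds and rolls back on an exception.
        with conn:
            # Credit first, so a failed debit leaves a partial change to undo.
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE id = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint rejected the debit; the credit was rolled back too.
        return False


def balances(conn):
    """Return {account id: balance} for all accounts."""
    return dict(conn.execute("SELECT id, balance FROM accounts ORDER BY id").fetchall())
```

A transfer of 30 from account 1 to account 2 succeeds and both rows change together. A transfer of 500 fails on the debit, and the credit that had already been applied is undone, leaving both balances exactly as they were.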
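Concurrency as a metric is closely tied to latency. The faster each query finishes, the sooner its resources are freed for the next one, so decreasing latency increases the concurrency the system can support, and vice versa.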
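One common way to reason about this relationship is Little's Law, which states that the average number of requests in flight equals throughput multiplied by average latency. The text above does not name this law, so treat the following as an assumed model rather than the author's definition; the function names and the sample numbers are hypothetical.

```python
def in_flight_queries(throughput_qps, latency_s):
    """Little's Law: average queries in flight = throughput * average latency."""
    return throughput_qps * latency_s


def max_throughput_qps(slots, latency_s):
    """Queries per second a fixed number of execution slots can sustain."""
    if latency_s <= 0:
        raise ValueError("latency_s must be positive")
    return slots / latency_s


# With 8 slots, halving latency from 0.5 s to 0.25 s doubles sustainable throughput.
before = max_throughput_qps(8, 0.5)
after = max_throughput_qps(8, 0.25)
```

Under this model, a system with a fixed number of execution slots serves twice as many queries per second when average latency is halved, which is the sense in which lower latency raises the load the system can carry.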