Traditional Data Warehouse vs. Data Lake

Data Lake Architecture vs. Traditional Data Warehouse Architecture

Data lake architecture

A data lake is a repository for storing large volumes of structured, semi-structured and unstructured data. Data is kept in its native format, and the lake itself places no restriction on the size or format of the files it stores.

Data lake architectural components 

  • Data ingestion - Provides connectors that extract data from multiple sources (databases, servers, emails, etc.) in a variety of formats (structured, semi-structured and unstructured). It also offers data curation options.
  • Data storage - Stores raw and curated data in any format, and supports compression and encryption of that data.
  • Security - Applies at every stage of the information flow in the data lake: data ingestion, data storage, data consumption and data discovery.
  • Data quality management - Lets you define data quality rules, report on data quality and remediate the issues found.
  • Metadata management - Provides mechanisms for data audits, data lineage checks, data lifecycle management and policy enforcement.
  • Data auditing - Records data transformations for risk and compliance purposes, so that you can audit who changed a data element, how, and when.
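
As a rough illustration of how the storage, metadata, data quality and auditing components fit together, here is a minimal in-memory sketch. The names `MiniLake`, `AuditEntry` and `check_required_fields`, along with the sample order record, are hypothetical and stand in for whatever platform is actually used. A real data lake would sit on distributed or object storage rather than Python dictionaries.

```python
import hashlib
import json
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class AuditEntry:
    """One audit record: who touched which object, how, and when."""
    user: str
    action: str
    object_key: str
    timestamp: str


@dataclass
class MiniLake:
    """Hypothetical in-memory stand-in for a lake's storage, metadata and audit parts."""
    objects: dict = field(default_factory=dict)    # key -> raw bytes, any format
    metadata: dict = field(default_factory=dict)   # key -> descriptive metadata
    audit_log: list = field(default_factory=list)  # chronological AuditEntry list

    def ingest(self, user: str, key: str, payload: bytes, source: str) -> None:
        # Store the payload unchanged; no schema is imposed on write.
        self.objects[key] = payload
        # Record basic metadata that lineage and lifecycle checks could use.
        self.metadata[key] = {
            "source": source,
            "size_bytes": len(payload),
            "sha256": hashlib.sha256(payload).hexdigest(),
        }
        # Append an audit entry so the change can be traced later.
        self.audit_log.append(
            AuditEntry(user, "ingest", key, datetime.now(timezone.utc).isoformat())
        )


def check_required_fields(record: dict, required: list) -> list:
    """Simple data quality rule: return required fields that are missing or empty."""
    return [name for name in required if record.get(name) in (None, "")]


lake = MiniLake()
raw = json.dumps({"order_id": 17, "customer": ""}).encode("utf-8")
lake.ingest(user="etl_service", key="orders/17.json", payload=raw, source="orders_db")

stored = json.loads(lake.objects["orders/17.json"])
violations = check_required_fields(stored, ["order_id", "customer"])
print(violations)                # ['customer']
print(lake.audit_log[0].action)  # ingest
```

The raw payload is stored even though it fails the quality rule. The violation is reported separately, so remediation can happen later without losing the original data.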

Flow of information in a data lake 

Data moves through several tiers in this architecture:

  • Ingestion tier - Ingests data in various formats
  • Storage tier - Stores the raw data
  • Insights tier - Derives insights from the ingested data
  • Distillation tier - Takes data from storage and converts it into a structured format for easier analysis
  • Processing tier - Runs algorithms and processes user queries
  • Presentation layer - Presents the results and analysis
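
The flow through these tiers can be sketched as a small chain of functions. This is only a toy model under stated assumptions: the function names (`ingest`, `store`, `distill`, `process`, `present`) and the sample sales records are invented for illustration, and the insights step is folded into the query answered by `process`.

```python
import csv
import io
import json


def ingest(sources):
    """Ingestion tier: accept payloads in different formats, untouched."""
    return [{"format": fmt, "payload": payload} for fmt, payload in sources]


def store(raw_zone, items):
    """Storage tier: keep the raw data exactly as received."""
    raw_zone.extend(items)
    return raw_zone


def distill(raw_zone):
    """Distillation tier: convert raw items into structured rows."""
    rows = []
    for item in raw_zone:
        if item["format"] == "json":
            rows.append(json.loads(item["payload"]))
        elif item["format"] == "csv":
            rows.extend(csv.DictReader(io.StringIO(item["payload"])))
    # Normalise types so that later tiers can compute on the rows.
    return [{"region": r["region"], "amount": float(r["amount"])} for r in rows]


def process(rows, region):
    """Processing tier: answer a user query (total amount for one region)."""
    return sum(r["amount"] for r in rows if r["region"] == region)


def present(region, total):
    """Presentation layer: format the result for the reader."""
    return f"Total for {region}: {total:.2f}"


# Hypothetical inputs: one JSON document and one CSV extract.
sources = [
    ("json", '{"region": "north", "amount": 120.5}'),
    ("csv", "region,amount\nnorth,30\nsouth,45.25\n"),
]

raw_zone = store([], ingest(sources))
rows = distill(raw_zone)
report = present("north", process(rows, "north"))
print(report)  # Total for north: 150.50
```

Structure is applied only in `distill`, after the raw data has already been stored. This ordering is what distinguishes the flow from a warehouse pipeline, where data is transformed before it is loaded.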

Traditional data warehouse architecture 

A traditional data warehouse consists of three tiers:

  • 1st tier (bottom tier) - Contains the database server, which pulls data from the source systems
  • 2nd tier (ETL tier or middle tier) - Data is extracted, transformed and loaded into the enterprise data warehouse, and from there into data marts
  • 3rd tier (client layer) - The prepared data is analysed with data analytics tools and presented as reports
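
The three tiers can be sketched with Python's built-in `sqlite3` module standing in for the warehouse database. The table names (`sales`, `mart_sales_by_region`), the `transform` helper and the sample rows are assumptions made for this example. A production warehouse would use a dedicated database and ETL tooling.

```python
import sqlite3

# Bottom tier: hypothetical rows as an operational source system might export them.
source_rows = [
    ("2023-01-05", " North ", "100.00"),
    ("2023-01-06", "south", "40.50"),
    ("2023-01-07", "NORTH", "59.50"),
]


def transform(row):
    """Clean one source row: trim and lower-case the region, parse the amount."""
    sale_date, region, amount = row
    return sale_date, region.strip().lower(), float(amount)


# Middle tier: load the transformed rows into the warehouse table.
warehouse = sqlite3.connect(":memory:")
warehouse.execute("CREATE TABLE sales (sale_date TEXT, region TEXT, amount REAL)")
warehouse.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [transform(r) for r in source_rows],
)

# Middle tier, continued: derive a subject-specific data mart from the warehouse.
warehouse.execute(
    "CREATE TABLE mart_sales_by_region AS "
    "SELECT region, SUM(amount) AS total FROM sales GROUP BY region"
)

# Client layer: a reporting tool queries the data mart.
report = warehouse.execute(
    "SELECT region, total FROM mart_sales_by_region ORDER BY region"
).fetchall()
print(report)  # [('north', 159.5), ('south', 40.5)]
```

Here the data is cleaned and given a fixed schema before it is loaded, so the client layer only ever sees structured, prepared tables.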

A more detailed representation of the traditional data warehouse architecture is provided below.
