Traditional Data Warehouse vs. Data Lake

Data Lake Architecture vs. Traditional Data Warehouse Architecture

Data lake architecture

A data lake is a repository for storing huge volumes of structured, semi-structured and unstructured data. It does not impose a fixed schema or format on what is stored, and file sizes are limited only by the underlying storage platform.

Data lake architectural components 

  • Data ingestion- This component contains connectors that extract data from multiple sources (databases, servers, emails, etc.) in a variety of formats (structured, semi-structured and unstructured). It also provides data curation options.
  • Data storage- This component stores raw and curated data in any format, and supports compression and encryption of that data.
  • Security- Security is enforced at every stage of the information flow in a data lake: data ingestion, data storage, data discovery and data consumption.
  • Data quality management- A data lake implementation offers options for setting data quality rules, reporting on data quality and remediating problems.
  • Metadata management- A data lake has mechanisms for data audits, data lineage checks, data lifecycle management and policy enforcement.
  • Data auditing- A data lake provides options for complete data auditing and for recording data transformations for risk and compliance purposes. This makes it possible to audit who changed a data element, how, and when.
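To make these components more concrete, here is a minimal in-memory sketch of how ingestion, storage, metadata management, auditing and a data quality rule might fit together. Everything in it is hypothetical and for illustration only: the `MiniDataLake` class, the `check_quality` helper, the object keys and the sample data are assumptions, not part of any real data lake product, and security, compression and encryption are left out.

```python
import hashlib
import json
from datetime import datetime, timezone


class MiniDataLake:
    """Toy in-memory data lake: raw storage, a metadata catalog and an audit log."""

    def __init__(self):
        self.objects = {}    # storage: key -> raw bytes, in any format
        self.catalog = {}    # metadata management: key -> descriptive metadata
        self.audit_log = []  # data auditing: who did what, and when

    def _audit(self, user, action, key):
        self.audit_log.append({
            "user": user,
            "action": action,
            "key": key,
            "at": datetime.now(timezone.utc).isoformat(),
        })

    def ingest(self, key, payload, source, fmt, user):
        """Ingestion: accept raw bytes and record where they came from."""
        self.objects[key] = payload
        self.catalog[key] = {
            "source": source,
            "format": fmt,
            "size_bytes": len(payload),
            "sha256": hashlib.sha256(payload).hexdigest(),
        }
        self._audit(user, "ingest", key)

    def read(self, key, user):
        """Consumption: return the raw bytes and log the access."""
        self._audit(user, "read", key)
        return self.objects[key]


def check_quality(records, required_fields):
    """Data quality rule: return the records with a missing or empty required field."""
    return [
        r for r in records
        if any(f not in r or r[f] in ("", None) for f in required_fields)
    ]


lake = MiniDataLake()
customers = [{"id": 1, "email": "a@example.com"}, {"id": 2, "email": ""}]
lake.ingest("crm/customers.json", json.dumps(customers).encode(),
            source="crm-db", fmt="json", user="etl-bot")
lake.ingest("mail/msg1.txt", b"Hello",
            source="mail-server", fmt="text", user="etl-bot")

raw = lake.read("crm/customers.json", user="analyst")
bad = check_quality(json.loads(raw), ["id", "email"])
print(bad)                                      # records that fail the rule
print([e["action"] for e in lake.audit_log])    # audit trail of actions
```

The structured JSON file and the unstructured text message sit side by side in the same store, while the catalog and audit log are kept separately from the data itself.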

Flow of information in a data lake 

Data moves through several tiers in this architecture:

  • Ingestion Tier- This layer ingests data in various formats
  • Storage Tier- This layer stores the raw data
  • Insights Tier- This layer derives insights from the input data
  • Distillation Tier- This layer takes data from storage and converts it into a structured format for easier analysis
  • Processing Tier- This layer runs analytical algorithms and processes user queries
  • Presentation Tier- This layer presents the results and analysis
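The flow through these tiers can be sketched as a chain of small functions. This is a simplified illustration under stated assumptions, not a reference implementation: the function names, the `region`/`sales` fields and the sample data are all hypothetical, and the Insights Tier is omitted because the text does not describe its behaviour in detail.

```python
import csv
import io
import json


def ingest(sources):
    """Ingestion tier: accept (format, text) pairs as they arrive."""
    return list(sources)


def store(raw_items):
    """Storage tier: keep the raw data unchanged, keyed by arrival order."""
    return {f"raw/{i}": item for i, item in enumerate(raw_items)}


def distill(storage):
    """Distillation tier: convert raw JSON and CSV into uniform structured rows."""
    rows = []
    for fmt, text in storage.values():
        if fmt == "json":
            rows.extend(json.loads(text))
        elif fmt == "csv":
            rows.extend(dict(r) for r in csv.DictReader(io.StringIO(text)))
    # Normalise types so that later tiers see a single schema.
    return [{"region": r["region"], "sales": float(r["sales"])} for r in rows]


def process(rows):
    """Processing tier: answer a query, here total sales per region."""
    totals = {}
    for row in rows:
        totals[row["region"]] = totals.get(row["region"], 0.0) + row["sales"]
    return totals


def present(totals):
    """Presentation tier: render the result as a small text report."""
    return "\n".join(
        f"{region}: {total:.2f}" for region, total in sorted(totals.items())
    )


sources = [
    ("json", '[{"region": "EU", "sales": 120.5}, {"region": "US", "sales": 80}]'),
    ("csv", "region,sales\nEU,30\nUS,20.5\n"),
]
report = present(process(distill(store(ingest(sources)))))
print(report)
```

Note that the raw JSON and CSV inputs are stored as they arrive; structure is applied only in the distillation step, after storage.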

Traditional data warehouse architecture 

A traditional data warehouse architecture consists of three tiers:

  • 1st Tier (Bottom Tier)- It contains the database server, which extracts data from the data sources
  • 2nd Tier (ETL Tier or Middle Tier)- The data is extracted, transformed and loaded into the enterprise data warehouse and then into data marts
  • 3rd Tier (Client Layer)- The data prepared for analysis is analysed with high-level data analytics tools and presented as reports
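As a rough illustration of the three tiers, the sketch below uses Python's built-in `sqlite3` module with in-memory databases standing in for the source system and the warehouse. The table names (`orders`, `fact_orders`, `mart_sales_by_region`), the columns and the sample rows are assumptions made for this example; a real warehouse would use a dedicated database server and ETL tooling.

```python
import sqlite3

# Bottom tier: an operational source database (in-memory SQLite, illustrative only).
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE orders (id INTEGER, region TEXT, amount_cents INTEGER)")
source.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, "eu", 1250), (2, "us", 900), (3, "eu", 750)],
)

# Middle tier: extract from the source, transform, then load into the warehouse.
extracted = source.execute("SELECT id, region, amount_cents FROM orders").fetchall()
transformed = [
    (order_id, region.upper(), cents / 100)  # normalise region codes and units
    for order_id, region, cents in extracted
]

warehouse = sqlite3.connect(":memory:")
warehouse.execute("CREATE TABLE fact_orders (id INTEGER, region TEXT, amount REAL)")
warehouse.executemany("INSERT INTO fact_orders VALUES (?, ?, ?)", transformed)

# A data mart: a subject-specific slice derived from the warehouse table.
warehouse.execute(
    "CREATE TABLE mart_sales_by_region AS "
    "SELECT region, SUM(amount) AS total FROM fact_orders GROUP BY region"
)

# Client layer: a reporting query against the data mart.
report = warehouse.execute(
    "SELECT region, total FROM mart_sales_by_region ORDER BY region"
).fetchall()
print(report)
```

In contrast to a data lake, the data here is transformed into a fixed schema before it is loaded, so the client layer only ever queries structured tables.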

 
