Search This Blog

Wednesday, February 23, 2011

DWH Concept

ETL (Extract Transformation Load)

ETL is a process in datawarehousing that involve :

- Extraction data from outsiding sources
- Tranforming it to fit business need
- Loading it into end target (i.e Datawarehouse)

E (Extract) - The first part of ETL is to extract the data from the source systems.

T (Transform) - The transform stage applies a series of rules to the extracted data from the source to derived the data to be loaded to the end target.

L (Load) - The load phase load the data into end target.

OLAP (On Line Analytical Processing)

OLAP is a database technology which are specially designed to deal with the high volumn of data. OLAP provide you a very good view that what is happening but it does not provide that why is happening and what will be happened in the future.

DataMining

Datamining is a technology which perform a process of analyzing the data from different prospective & summurized is into a useful information that information can be used to increase the revenue, cut the cost etc.
Datamining provide you that why is happening and what will be happened in the future.

Data Analysis

Data Analysis is an approach in which row data is ordered and organized in such a way that useful information can be extracted from it.

Data Analysis Phases

1 - Data Cleansing
2 - Transforming
3 - Data Modeling 

1 - Data Cleansing

Data cleansing is a process of detecting & removing the error from a record set, table and database. This process is also known as Data Scrubbing. During data cleansing records are checked & either corrected or deleted as per as requirment.

2 - Transforming

In the Transforming phase, the source data is transformed as per as DWH tables.

3 - Data Modeling 

Data Modeling is a process in a Data Model is build of data structure (Tables) as per as requirement of the Database.A Data Model represent the nature of data, business rules govering the data & how it will be organized in the DB.

A Data Model is comprised of two parts :

- Logical Design
- Physical Design

Data model helps functional & technical team in designing the DB.

The Functional Team (that create the logical design) refers to one or more Business Analyst, Business Managers, End Users. And Technical Team (that create the physical design) refers to the one or more programmers & DBAs.

Data Modelers are responsible for designing the Data Model & they communicate with the functional team to get the business requirments & technical team to implement the DB.

EX :
------
Suppose a company (XYZ) is planning to build a Guest House (DB) & it call the Building Architect (Data Modeler) & Building Architect (Data Modeler) take the Guest House (DB) requirment (Business requirmnet) from the function team (Business Analyst, Business Managers) and develop a plan (Data Model) & give it to the Company (XYZ). And finally XYZ company call the civil engineer (Programmer, DBA) to construct the Guest House (DB). This concept is called Data Modelling.

Data Integration

Data integration is a process of combining the data from different sources & providing users with a unified view of these data.

Data Agreegation

Data Agreegation is a process in which the information is gathered & expressed in a summary form. The purpose of agreegation  is to get more information about the particular group.

Business Intelligence (BI)

Business Intelligence (BI) is a technology which is used to gethering, storing & analysing the data to make a better decision.

BI solution help a company in making the business decision faster, accurate & market oriented.

Technical DW-MSBI work flow

[Sources] -> [ETL(Extract tha data using SSIS)] -> [Data Cleansing & Tranfoming of data] -> [DWH(stored the data in DWH (Fact & Dimesion))] -> [Create the OLAP cube(using SSAS)] -> [Analyse the data & create the Reports] -> [Implement Data Mining structure(Data Mining)] -> BI (make a better decision)

Regards
Manish

2 comments: