Extract Transform Load (ETL) Process in Distributed Database Academic Data Warehouse

Ardhian Agung Yulianto

Abstract


While a data warehouse is designed to support the decision-making function, the most time-consuming part of building one is the Extract Transform Load (ETL) process. In the case of an academic data warehouse, where the data sources are the faculties' distributed databases, integration is not straightforward even though the databases share a typical structure. This paper presents the ETL process in detail, following the Data Flow Thread in the data staging area: identifying and profiling the data sources and analyzing the content of all their tables, then cleaning the data, conforming dimensions, and delivering the data to the data warehouse. These processes run gradually over each distributed database source until the data are merged. Dimension tables and fact tables are generated in a multidimensional model. The ETL tool used is Pentaho Data Integration 6.1. ETL testing is performed by comparing the data source with the data target, and data warehouse testing is conducted by comparing analysis results between SQL queries and the Saiku Analytics plugin in Pentaho Business Analytics Server.
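
As an illustration of the ETL testing step mentioned above (comparing the data source with the data target), the following is a minimal SQL sketch. The table and column names (faculty_db.student as a source table, dw.dim_student as a warehouse dimension, student_id/student_nk as the natural key) are hypothetical assumptions for illustration, not the schema used in the paper.

    -- Compare row counts between a source table and its warehouse target.
    -- faculty_db.student and dw.dim_student are illustrative names only.
    SELECT
      (SELECT COUNT(*) FROM faculty_db.student) AS source_rows,
      (SELECT COUNT(*) FROM dw.dim_student)     AS target_rows;

    -- List records present in the source but missing from the target
    -- after the load (anti-join on the assumed natural key).
    SELECT s.student_id
    FROM faculty_db.student s
    LEFT JOIN dw.dim_student d ON d.student_nk = s.student_id
    WHERE d.student_nk IS NULL;

A matching count and an empty anti-join result indicate the load delivered every source record; the same pattern can be repeated per faculty database before and after the merge.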



DOI: https://doi.org/10.11591/APTIKOM.J.CSIT.36

