An hybrid approach to quality evaluation across big data value chain

Mohamed Adel Serhani, Hadeel T. El Kassabi, Ikbal Taleb, Alramzana Nujum

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

19 Citations (Scopus)

Abstract

While the potential benefits of Big Data adoption are significant, and some initial successes have already been realized, many research and technical challenges must still be addressed to fully realize this potential. Big Data processing, storage, and analytics are the most readily recognized challenges. However, there are additional challenges related, for instance, to Big Data collection, integration, and quality enforcement. This paper proposes a hybrid approach to Big Data quality evaluation across the Big Data value chain. It consists of first assessing the quality of the Big Data itself, which involves processes such as cleansing, filtering, and approximation, and then assessing the quality of the processes that handle this Big Data, such as processing and analytics. We conduct a set of experiments on a large dataset to evaluate the Quality of Data before and after its pre-processing, and the Quality of the pre-processing and processing. Quality metrics were measured to assess three Big Data quality dimensions: accuracy, completeness, and consistency. The results show that combining data-driven and process-driven quality evaluation leads to improved quality enforcement across the Big Data value chain. Accordingly, we recorded high prediction accuracy and low processing time when evaluating six well-known classification algorithms as part of the processing and analytics phase of the Big Data value chain.
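
The abstract names three data quality dimensions but does not give their formulas. As a rough illustration only, the sketch below computes one simple ratio per dimension on a toy record set. The definitions used here (share of non-missing cells, share of rows passing validation rules, share of non-missing cells matching a trusted reference) are common textbook choices and are assumptions, not the metrics defined in the paper. All function names, field names, and sample values are hypothetical.

```python
from typing import Any, Callable, Dict, List, Optional

Row = Dict[str, Optional[Any]]


def completeness(rows: List[Row], fields: List[str]) -> float:
    """Share of (row, field) cells that are present and not None."""
    total = len(rows) * len(fields)
    if total == 0:
        return 1.0
    filled = sum(1 for r in rows for f in fields if r.get(f) is not None)
    return filled / total


def consistency(rows: List[Row], rules: List[Callable[[Row], bool]]) -> float:
    """Share of rows that satisfy every validation rule."""
    if not rows:
        return 1.0
    passing = sum(1 for r in rows if all(rule(r) for rule in rules))
    return passing / len(rows)


def accuracy(rows: List[Row], reference: List[Row], fields: List[str]) -> float:
    """Share of non-missing cells equal to the value in a trusted reference."""
    pairs = [
        (r.get(f), ref.get(f))
        for r, ref in zip(rows, reference)
        for f in fields
        if r.get(f) is not None
    ]
    if not pairs:
        return 1.0
    return sum(1 for value, truth in pairs if value == truth) / len(pairs)


# Hypothetical toy data: observed records and a trusted reference copy.
FIELDS = ["age", "country"]
ROWS: List[Row] = [
    {"age": 34, "country": "AE"},
    {"age": None, "country": "US"},
    {"age": -5, "country": "US"},
    {"age": 51, "country": None},
]
REFERENCE: List[Row] = [
    {"age": 34, "country": "AE"},
    {"age": 40, "country": "US"},
    {"age": 5, "country": "US"},
    {"age": 51, "country": "FR"},
]
# Hypothetical rule: a present age must fall in a plausible range.
RULES: List[Callable[[Row], bool]] = [
    lambda r: r["age"] is None or 0 <= r["age"] <= 120,
]

if __name__ == "__main__":
    print("completeness:", completeness(ROWS, FIELDS))
    print("consistency:", consistency(ROWS, RULES))
    print("accuracy:", accuracy(ROWS, REFERENCE, FIELDS))
```

Running the same three functions before and after a cleansing step would give a before/after comparison in the spirit of the experiments the abstract describes, though the paper's own metric definitions may differ.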

Original language: English
Title of host publication: Proceedings - 2016 IEEE International Congress on Big Data, BigData Congress 2016
Editors: Calton Pu, Geoffrey Fox, Ernesto Damiani
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 418-425
Number of pages: 8
ISBN (Electronic): 9781509026227
Publication status: Published - Oct 5 2016
Event: 5th IEEE International Congress on Big Data, BigData Congress 2016 - San Francisco, United States
Duration: Jun 27 2016 - Jul 2 2016

Publication series

Name: Proceedings - 2016 IEEE International Congress on Big Data, BigData Congress 2016

Other

Other: 5th IEEE International Congress on Big Data, BigData Congress 2016
Country/Territory: United States
City: San Francisco
Period: 6/27/16 - 7/2/16

Keywords

  • Big Data
  • Hybrid quality assessment
  • Metadata
  • Quality Metadata
  • Quality assessment
  • Quality metrics
  • Quality of process

ASJC Scopus subject areas

  • Information Systems
  • Computer Science Applications
  • Information Systems and Management
