Hadoop and what it is good for

Christos Chrysoulas

Research output: Chapter in Book/Report/Conference proceeding › Chapter › peer-review

1 Citation (Scopus)

Abstract

© 2017 Nova Science Publishers, Inc. All rights reserved. Organizations are flooded with data. Moreover, in an era of incredibly cheap storage where everyone and everything are interconnected, the nature of the data we are collecting is also changing. For many businesses, critical data used to be limited to their transactional databases and data warehouses. Nowadays, the variety of data available to organizations is tremendous. The challenge is that traditional tools are poorly equipped to deal with the scale and complexity of much of this data. This is where Hadoop comes in. Hadoop is built to deal with all sorts of messiness. Apache Hadoop is an open-source software framework used for distributed storage and processing of big data sets using the MapReduce programming model. It runs on computer clusters built from commodity hardware. Above all, the modules in Hadoop are designed with the fundamental assumption that hardware failures are common occurrences and should be automatically handled by the framework. This chapter is a welcome to the world of Hadoop.
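The MapReduce programming model mentioned in the abstract can be illustrated with a minimal sketch. The following is not Hadoop code; it is a hypothetical stand-alone Python simulation of the three conceptual phases (map, shuffle, reduce) using the classic word-count example, with two in-memory strings standing in for file splits that Hadoop would read from HDFS.

```python
from collections import defaultdict

def map_phase(split):
    """Map: emit a (word, 1) pair for every word in an input split."""
    return [(word, 1) for word in split.split()]

def shuffle_phase(pairs):
    """Shuffle: group all emitted values by key, as the framework
    does automatically between the map and reduce phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(grouped):
    """Reduce: aggregate the grouped values; here, sum the counts."""
    return {word: sum(counts) for word, counts in grouped.items()}

# Two "splits" standing in for blocks of a large file; in a real
# Hadoop job each split would be processed by a separate mapper,
# possibly on a different machine in the cluster.
splits = ["big data big clusters", "big data"]
pairs = [pair for split in splits for pair in map_phase(split)]
counts = reduce_phase(shuffle_phase(pairs))
print(counts)  # {'big': 3, 'data': 2, 'clusters': 1}
```

In an actual Hadoop job the same logic would be expressed as a Mapper and Reducer class, and the framework, not the program, would handle partitioning the input, scheduling tasks across the cluster, and recovering from machine failures.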
Original language: English
Title of host publication: Machine Learning: Advances in Research and Applications
Publisher: Nova Science Publishers, Inc.
Publication status: Published - 1 Oct 2017
Externally published: Yes

Keywords

  • Big Data
  • Hadoop
  • HDFS
  • Statistical Analysis
  • Machine Learning

