
HOW BIG DATA AND HADOOP RELATE

What is Big Data?

We are living in an era in which a huge volume of data is being generated. The points below indicate the rate at which data is being generated worldwide.
  • Every day, we create 2.5 quintillion bytes of data — so much that 90% of the data in the world today has been created in the last two years alone.
  • According to IBM, 80% of data captured today is unstructured, from sensors used to gather climate information, posts to social media sites, digital pictures and videos, purchase transaction records, and cell phone GPS signals, to name a few. All of this unstructured data is Big Data.
Definition: Gartner defines Big Data as high-volume, high-velocity, and high-variety information assets that demand cost-effective, innovative forms of information processing for enhanced insight and decision making.

If your company is dealing with a data set that is difficult to process and extract information from, then you are dealing with big data.

Traditional data storage systems exist, but they are quickly becoming obsolete for this kind of workload. Today Hadoop is an integral part of many companies that deal with big data, where it is used to store and process huge volumes of structured and unstructured data.

Role of Hadoop

What is Hadoop?

Hadoop is an open-source project from the Apache Software Foundation that provides a software framework for distributed applications, used to store and process data on clusters of servers. It is modeled after Google's MapReduce programming model and the Google File System.
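
To make the MapReduce idea more concrete, here is a minimal sketch of a word count written in plain Python. It does not use Hadoop or any Hadoop API; the function names (map_phase, shuffle_phase, reduce_phase, word_count) are made up for illustration, and everything runs in a single process rather than on a cluster. It only shows the three logical steps that a MapReduce-style job goes through.

```python
from collections import defaultdict


def map_phase(line):
    """Emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_phase(pairs):
    """Group all values by key, as a framework would between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Combine all counts for one word into a single total."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle, and reduce in sequence on a list of text lines."""
    mapped = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle_phase(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    sample = ["big data needs hadoop", "hadoop stores big data"]
    print(word_count(sample))
```

In a real cluster, the map and reduce steps would run in parallel on many servers and the input would be read from a distributed file system, but the division of work into these phases stays the same.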

Yahoo! launched the Yahoo! Search Webmap in 2008 and described it at the time as the world's largest Hadoop production application. It ran on a Linux cluster with more than 10,000 cores.

What does Hadoop solve?

  • Organizations are discovering that important predictions can be made by sorting through and analyzing Big Data.
  • However, since 80% of this data is "unstructured", it must be formatted (or structured) in a way that makes it suitable for data mining and subsequent analysis.
  • Hadoop is the core platform for structuring Big Data and solves the problem of making it useful for analytics purposes.
  • Hadoop is also well suited to processing unstructured data directly.
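
As a small illustration of what "structuring" raw data can mean, the sketch below turns free-form text lines into records with named fields. It is plain Python, not Hadoop code, and the log format, the pattern LINE_PATTERN, and the function structure are all hypothetical, chosen only for this example. In practice, logic like this would typically run inside the map step of a job.

```python
import re

# Hypothetical line format, assumed for illustration only:
# "<date> <user> bought <item> for $<price>"
LINE_PATTERN = re.compile(
    r"(?P<date>\d{4}-\d{2}-\d{2}) (?P<user>\w+) bought "
    r"(?P<item>.+) for \$(?P<price>\d+(?:\.\d+)?)"
)


def structure(line):
    """Return a dict of named fields for a matching line, or None otherwise."""
    match = LINE_PATTERN.search(line)
    if match is None:
        return None  # The line does not follow the assumed format.
    record = match.groupdict()
    record["price"] = float(record["price"])  # Convert the price to a number.
    return record


if __name__ == "__main__":
    raw_lines = [
        "2013-05-01 alice bought a phone case for $12.50",
        "some free text that does not follow the format",
    ]
    for raw in raw_lines:
        print(structure(raw))
```

Lines that do not match the assumed format are returned as None, so they can be counted or set aside instead of silently lost.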

Features of Hadoop's Computing Solution
