• 30 May

    What is Yarn in Hadoop?

    The Hadoop ecosystem is going through the continuous evolution. Its processing frameworks are also evolving at full speed with the time. Hadoop 1.0 has passed the limitation of the batch-oriented MapReduce processing framework for the development of specialized and interactive processing model which is Hadoop 2.0. Apache Hadoop was introduced in 2005 and taken over […]

  • 28 May

    What is MapReduce in Hadoop?

    The heart of Apache Hadoop is Hadoop MapReduce. It’s a programming model used for processing large datasets in parallel across hundreds or thousands of Hadoop clusters on commodity hardware. The framework does all the works; you just need to put the business logic into the MapReduce. All the work is divided into the small works […]

  • 11 May

    What is HBase in Hadoop?

    Hadoop HBase is based on the Google Bigtable (a distributed database used for structured data) which is written in Java. Hadoop HBase was developed by the Apache Software Foundation in 2007; it was just a prototype then. Hadoop HBase is an open-source, multi-dimensional, column-oriented distributed database which was built on the top of the HDFS. […]

  • 11 May

    What is Architecture of Hadoop?

    Hadoop is the open-source framework of Apache Software Foundation, which is used to store and process large unstructured datasets in the distributed environment. Data is first distributed among different available clusters then it is processed. Hadoop biggest strength is that it is scalable in nature means it can work on a single node to thousands […]

  • 21 April

    Introduction to Big Data

    The term “Big Data” have been around for few years only, but there is a buzz all around about Big Data. It has become an essential part of our daily life, just like the Internet. Big Data has always been behind the scene, from an internet search to video on demand and online shopping to […]