• 30 May

    What is YARN in Hadoop?

    The Hadoop ecosystem is in continuous evolution, and its processing frameworks are evolving just as quickly. Hadoop 2.0 moves past the limitations of Hadoop 1.0's batch-oriented MapReduce processing framework toward specialized and interactive processing models. Apache Hadoop was introduced in 2005 and taken over […]

  • 28 May

    What is MapReduce in Hadoop?

    Hadoop MapReduce is the heart of Apache Hadoop. It is a programming model for processing large datasets in parallel across hundreds or thousands of commodity machines in a Hadoop cluster. The framework does all the heavy lifting; you only need to supply the business logic. All the work is divided into small tasks […]
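
    To make that division of labour concrete, below is a minimal sketch of the classic word-count job written against Hadoop's Java MapReduce API: the mapper and reducer carry the business logic, while the framework handles input splitting, scheduling, and shuffling. The class name, job name, and command-line input/output paths are illustrative assumptions, not details from the post.

    ```java
    // A minimal word-count sketch against the Hadoop MapReduce Java API.
    // Class/job names and the command-line paths are illustrative assumptions.
    import java.io.IOException;
    import java.util.StringTokenizer;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Reducer;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class WordCount {

      // Map step: split each input line into words and emit (word, 1).
      public static class TokenizerMapper
          extends Mapper<Object, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);
        private final Text word = new Text();

        @Override
        public void map(Object key, Text value, Context context)
            throws IOException, InterruptedException {
          StringTokenizer itr = new StringTokenizer(value.toString());
          while (itr.hasMoreTokens()) {
            word.set(itr.nextToken());
            context.write(word, ONE);
          }
        }
      }

      // Reduce step: sum the counts emitted for each word.
      public static class IntSumReducer
          extends Reducer<Text, IntWritable, Text, IntWritable> {
        private final IntWritable result = new IntWritable();

        @Override
        public void reduce(Text key, Iterable<IntWritable> values, Context context)
            throws IOException, InterruptedException {
          int sum = 0;
          for (IntWritable val : values) {
            sum += val.get();
          }
          result.set(sum);
          context.write(key, result);
        }
      }

      public static void main(String[] args) throws Exception {
        // The framework handles input splitting, scheduling, and the shuffle;
        // this driver only wires the business logic together.
        Job job = Job.getInstance(new Configuration(), "word count");
        job.setJarByClass(WordCount.class);
        job.setMapperClass(TokenizerMapper.class);
        job.setCombinerClass(IntSumReducer.class);
        job.setReducerClass(IntSumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));   // input directory
        FileOutputFormat.setOutputPath(job, new Path(args[1])); // output directory
        System.exit(job.waitForCompletion(true) ? 0 : 1);
      }
    }
    ```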

  • 15 May

    What is HDFS in Hadoop?

    The Hadoop Distributed File System (HDFS) is a Java-based file system developed by the Apache Software Foundation to provide a versatile, resilient, and clustered approach to managing files in a Big Data environment on commodity servers. HDFS is used to store large amounts of data by placing it on multiple machines, as there are hundreds […]
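
    As a small illustration of working with files stored this way, the sketch below writes and then reads a file through HDFS's Java FileSystem API. The NameNode address (hdfs://localhost:9000) and the file path are assumptions made for the example, not values from the post.

    ```java
    // A minimal sketch of writing and reading a file with the HDFS Java API.
    // The NameNode address and path below are assumptions for the example.
    import java.io.BufferedReader;
    import java.io.InputStreamReader;
    import java.nio.charset.StandardCharsets;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataInputStream;
    import org.apache.hadoop.fs.FSDataOutputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class HdfsExample {
      public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        conf.set("fs.defaultFS", "hdfs://localhost:9000"); // assumed NameNode address

        try (FileSystem fs = FileSystem.get(conf)) {
          Path path = new Path("/user/demo/hello.txt"); // assumed example path

          // Write a small file; HDFS splits large files into blocks and
          // replicates each block across several DataNodes.
          try (FSDataOutputStream out = fs.create(path, true)) {
            out.write("Hello, HDFS!".getBytes(StandardCharsets.UTF_8));
          }

          // Read the same file back.
          try (FSDataInputStream in = fs.open(path);
               BufferedReader reader = new BufferedReader(
                   new InputStreamReader(in, StandardCharsets.UTF_8))) {
            System.out.println(reader.readLine());
          }
        }
      }
    }
    ```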

  • 11 May

    What is HBase in Hadoop?

    Hadoop HBase, written in Java, is modeled on Google Bigtable (a distributed database for structured data). Its first prototype appeared under the Apache Software Foundation in 2007. HBase is an open-source, multi-dimensional, column-oriented distributed database built on top of HDFS. […]
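
    To show what a column-oriented store on top of HDFS looks like from client code, here is a minimal sketch using the HBase Java client: it writes one cell (row key, column family, qualifier, value) and reads it back. The table name "demo" and column family "cf" are assumptions for the example and would already need to exist in the cluster.

    ```java
    // A minimal sketch of writing and reading one cell with the HBase Java client.
    // Table "demo" and column family "cf" are assumptions for the example.
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.TableName;
    import org.apache.hadoop.hbase.client.Connection;
    import org.apache.hadoop.hbase.client.ConnectionFactory;
    import org.apache.hadoop.hbase.client.Get;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.client.Table;
    import org.apache.hadoop.hbase.util.Bytes;

    public class HBaseExample {
      public static void main(String[] args) throws Exception {
        // Picks up hbase-site.xml (ZooKeeper quorum, etc.) from the classpath.
        Configuration conf = HBaseConfiguration.create();

        try (Connection connection = ConnectionFactory.createConnection(conf);
             Table table = connection.getTable(TableName.valueOf("demo"))) {

          // Cells are addressed by row key, column family, and qualifier;
          // the values are stored as raw bytes on top of HDFS.
          Put put = new Put(Bytes.toBytes("row1"));
          put.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("greeting"),
                        Bytes.toBytes("Hello, HBase!"));
          table.put(put);

          // Read the same cell back.
          Result result = table.get(new Get(Bytes.toBytes("row1")));
          byte[] value = result.getValue(Bytes.toBytes("cf"), Bytes.toBytes("greeting"));
          System.out.println(Bytes.toString(value));
        }
      }
    }
    ```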

  • 11 May

    What is the Architecture of Hadoop?

    Hadoop is an open-source framework from the Apache Software Foundation used to store and process large unstructured datasets in a distributed environment. Data is first distributed across the available nodes and then processed where it resides. Hadoop's biggest strength is its scalability: it can scale from a single node to thousands […]

  • 08 May

    Applications of Big Data

    Data is omnipresent. It existed in the past, it exists now, and it will exist in the future. What has changed is that industries have realized its importance. They now understand Big Data and are reaping the benefits of Big Data applications. […]

  • 28 April

    What are the Big Data Technologies?

    Getting to the top of your field is one thing; staying there is another. The same applies to the IT industry, and Big Data technologies are doing the latter remarkably well. Data management decides who stays on top. If an organization doesn't know how to handle the tons of […]

  • 28 April

    How to Install Hadoop?

    Hadoop is an open-source, Java-based framework that runs on the Linux operating system. It is used for Big Data processing, and many companies rely on it to maintain their large datasets. Hadoop is a project of the Apache Software Foundation and has undergone a number of changes since […]

  • 22 April

    Introduction to Apache Hadoop

    With continuous business growth and start-ups flourishing, the need to store large amounts of data has increased rapidly. Companies started looking for tools to analyze this Big Data and uncover market trends, hidden patterns, customer requirements, and other useful business information to help them make effective business decisions and […]