Hadoop Real Time Usecases with Solutions 1

Below are a few Hadoop real-time use cases with solutions.

Usecase 1 Problem:-

Data Description:

The input file describes the markets and the products available in different regions, based on the seasons.

You will find the below fields listed in that file.

Problem Statement:

  • Select any one county and calculate the percentage of products produced by each market in that county.

Note: The file has 24 product columns in total, each holding the value Y or N. For each market, count the products marked Y and calculate the percentage as (count / 24) × 100. Then divide the markets into three categories: High, Medium and Low.

High – above 60%

Medium – less than or equal to 60% and greater than 40%

Low – less than or equal to 40%

  • Find the count of the markets that come under the category HIGH.
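The two tasks above can be sketched in plain Python before moving to a Hadoop job. This is only an illustration of the arithmetic and the thresholds, not the solution used in the steps below. It assumes the percentage is the number of Y values divided by the 24 products, and it assumes each market has already been reduced to a name plus its list of Y/N flags. The function names and the sample markets are hypothetical.

```python
TOTAL_PRODUCTS = 24  # number of Y/N product columns stated in the problem


def product_percentage(flags):
    """Percentage of the 24 products that a market marks as 'Y'."""
    count = sum(1 for flag in flags if flag.strip().upper() == "Y")
    return count * 100.0 / TOTAL_PRODUCTS


def category(percentage):
    """Map a percentage to High (> 60), Medium (> 40 and <= 60) or Low (<= 40)."""
    if percentage > 60:
        return "High"
    if percentage > 40:
        return "Medium"
    return "Low"


def categorize_markets(markets):
    """markets: iterable of (market_name, flags) pairs for one county."""
    return [(name, category(product_percentage(flags))) for name, flags in markets]


def count_high(markets):
    """Number of markets whose category is High."""
    return sum(1 for _, cat in categorize_markets(markets) if cat == "High")


# Hypothetical sample: three markets from one county.
sample = [
    ("Market A", ["Y"] * 15 + ["N"] * 9),   # 15 of 24 products
    ("Market B", ["Y"] * 12 + ["N"] * 12),  # 12 of 24 products
    ("Market C", ["Y"] * 5 + ["N"] * 19),   # 5 of 24 products
]

if __name__ == "__main__":
    for name, cat in categorize_markets(sample):
        print(name, cat)
    print("High markets:", count_high(sample))
```

In a MapReduce version, the per-market work in `product_percentage` and `category` would naturally sit in the mapper, and the counting in `count_high` in the reducer.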

Usecase 1 Solution:-


Before going ahead, copy your input file (DATA_GOV_US_Farmers_Market_DataSet.csv) into HDFS.

Step 1:

Step 2:

Step 3:

Step 4:

Step 5: