Below are a few Hadoop real-time use cases with solutions.
Use Case 1 Problem:
The dataset gives information about the markets and the products available in different regions, based on the season.
You will find the below fields listed in that file.
- Select any particular county and calculate the percentage of different products produced by each market in that county.
Note: There are 24 products in total, each holding the value Y or N. For each market, count the products whose value is Y and calculate the percentage as count%25. Based on that percentage, divide the markets into three categories: High, Medium and Low.
High – above 60%
Medium – less than or equal to 60% and greater than 40%
Low – less than or equal to 40%
- Find the count of the markets that fall under the High category.
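The per-market logic described above can be sketched in plain Python before it is written as a Hadoop job. This is only an illustration under stated assumptions: the percentage is taken here as the number of Y values out of the 24 products (the "count%25" in the note may intend a different formula), and the record layout, function names and sample markets are all made up for the example, not taken from the real file.

```python
TOTAL_PRODUCTS = 24  # from the note: 24 product fields, each Y or N


def product_percentage(flags):
    """Percentage of products a market offers.

    ASSUMPTION: percentage = (number of Y values) * 100 / 24.
    """
    count = sum(1 for flag in flags if flag.strip().upper() == "Y")
    return count * 100.0 / TOTAL_PRODUCTS


def categorize(pct):
    """Map a percentage to High / Medium / Low using the thresholds above."""
    if pct > 60:
        return "High"
    if pct > 40:
        return "Medium"
    return "Low"


def summarize_county(records, county):
    """Return (market, percentage, category) for every market in one county.

    `records` is a hypothetical layout: (market_name, county, flags),
    where flags is the list of 24 Y/N values for that market.
    """
    wanted = county.strip().lower()
    summary = []
    for name, rec_county, flags in records:
        if rec_county.strip().lower() != wanted:
            continue  # keep only the selected county
        pct = product_percentage(flags)
        summary.append((name, pct, categorize(pct)))
    return summary


def count_high(summary):
    """Count the markets whose category is High."""
    return sum(1 for _, _, category in summary if category == "High")


# Made-up sample records, for illustration only.
SAMPLE = [
    ("Market A", "Kent", ["Y"] * 18 + ["N"] * 6),
    ("Market B", "Kent", ["Y"] * 12 + ["N"] * 12),
    ("Market C", "Kent", ["Y"] * 6 + ["N"] * 18),
    ("Market D", "Essex", ["Y"] * 24),
]

kent = summarize_county(SAMPLE, "Kent")
for row in kent:
    print(row)
print("High markets:", count_high(kent))
```

In a MapReduce version, the filtering and categorizing would typically sit in the mapper and the counting of High markets in the reducer, but that split is a design choice rather than something the problem statement fixes.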
Before going ahead, copy your input file (DATA_GOV_US_Farmers_Market_DataSet.csv) into HDFS.
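One way to do the copy is with the `hdfs dfs -put` command, wrapped here in a small Python helper. This is a sketch, not the only approach: the helper name `copy_to_hdfs` and the target directory `/user/hadoop/input` are assumptions for the example, so substitute the HDFS path used on your cluster.

```python
import subprocess


def copy_to_hdfs(local_path, hdfs_dir, dry_run=True):
    """Build (and optionally run) the `hdfs dfs -put` command.

    With dry_run=True the command is only returned, not executed,
    so the sketch can be tried on a machine without Hadoop installed.
    """
    cmd = ["hdfs", "dfs", "-put", local_path, hdfs_dir]
    if not dry_run:
        # Raises CalledProcessError if the copy fails.
        subprocess.run(cmd, check=True)
    return cmd


# ASSUMPTION: /user/hadoop/input is a hypothetical HDFS directory.
command = copy_to_hdfs(
    "DATA_GOV_US_Farmers_Market_DataSet.csv", "/user/hadoop/input"
)
print(" ".join(command))
```

Calling the helper with `dry_run=False` on a node where the `hdfs` client is on the PATH performs the actual copy.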