📁 Map Reduce


Predefined Mapper and Reducer Classes

Hadoop provides some predefined Mapper and Reducer classes in its Java API, which are helpful for writing simple or default MapReduce jobs. A few of these predefined classes are described below. Identity Mapper: Identity Mapper is the default Mapper class provided by Hadoop; it is picked automatically when no mapper is specified in the MapReduce driver class. The Identity Mapper class implements […]
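To make the idea of an identity map step concrete, here is a minimal plain-Java sketch. It does not use the Hadoop API: the class `IdentityMapSketch` and the method `identityMap` are hypothetical names chosen for illustration, and the sample keys only mimic the byte offsets that text input usually supplies.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for an identity map step; this is NOT the Hadoop class.
final class IdentityMapSketch {

    // Emits every input (key, value) pair unchanged, which is all an identity mapper does.
    static <K, V> List<Map.Entry<K, V>> identityMap(List<Map.Entry<K, V>> input) {
        List<Map.Entry<K, V>> output = new ArrayList<>();
        for (Map.Entry<K, V> record : input) {
            output.add(Map.entry(record.getKey(), record.getValue()));
        }
        return output;
    }

    public static void main(String[] args) {
        // Sample records: key = assumed byte offset of the line, value = the line text.
        List<Map.Entry<Long, String>> input = List.of(
                Map.entry(0L, "hello hadoop"),
                Map.entry(13L, "hello mapreduce"));

        for (Map.Entry<Long, String> record : identityMap(input)) {
            System.out.println(record.getKey() + "\t" + record.getValue());
        }
    }
}
```

Because the map step changes nothing, a job that relies on it simply passes its input records through to the sort and reduce stages.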


MapReduce Job Flow

MapReduce Job Flow Through the YARN Implementation: This post describes the MapReduce job flow behind the scenes when a job is submitted to Hadoop through the submit() or waitForCompletion() method on a Job object. The flow is explained with the help of the Word Count MapReduce program described in our previous post, and it follows the YARN (MapReduce 2) implementation. The submit() method submits the job to the Hadoop […]
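The difference between the two entry points can be pictured with a small plain-Java analogy: submit() starts the work and returns, while waitForCompletion() submits if needed and then blocks until the job finishes. The class `JobSubmitSketch` below is a hypothetical stand-in built on `CompletableFuture`; it is not the Hadoop `Job` class and it says nothing about what YARN does internally.

```java
import java.util.concurrent.CompletableFuture;

// Hypothetical stand-in for a job handle; this is NOT org.apache.hadoop.mapreduce.Job.
final class JobSubmitSketch {

    private CompletableFuture<Boolean> running;

    // Non-blocking: starts the (pretend) job once and returns immediately.
    void submit() {
        if (running == null) {
            running = CompletableFuture.supplyAsync(() -> {
                // Placeholder for the real work; always reports success here.
                return true;
            });
        }
    }

    // Blocking: submits the job if that has not happened yet, then waits for the result.
    boolean waitForCompletion() {
        submit();
        return running.join();
    }

    public static void main(String[] args) {
        JobSubmitSketch job = new JobSubmitSketch();
        boolean succeeded = job.waitForCompletion();
        System.out.println("job succeeded: " + succeeded);
    }
}
```

A driver that only needs a success flag for its exit code would typically use the blocking call; the non-blocking one suits callers that want to do other work while the job runs.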


MapReduce Programming Model

In this post, we review the building blocks and programming model of the example Word Count MapReduce program that was run in the previous post in this MapReduce category. We will not go too deep into the code; our focus is mainly on the structure of a MapReduce program written in Java, and at the end of the post we submit the MapReduce job to execute this program. Before starting with word […]
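As a rough picture of that structure, the sketch below walks word count through map, shuffle, and reduce steps in plain Java. It is an illustration under simplifying assumptions (everything runs in one process, tokens are split on whitespace), and `WordCountSketch` and its method names are hypothetical, not the Hadoop Mapper and Reducer classes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical single-process illustration of the map/shuffle/reduce structure.
final class WordCountSketch {

    // Map step: emit (word, 1) for every whitespace-separated token in a line.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String token : line.trim().split("\\s+")) {
            if (!token.isEmpty()) {
                pairs.add(Map.entry(token, 1));
            }
        }
        return pairs;
    }

    // Shuffle step: group all emitted values by key, with keys kept in sorted order.
    static Map<String, List<Integer>> shuffle(List<Map.Entry<String, Integer>> pairs) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>()).add(pair.getValue());
        }
        return grouped;
    }

    // Reduce step: sum the counts collected for one word.
    static int reduce(List<Integer> counts) {
        int sum = 0;
        for (int count : counts) {
            sum += count;
        }
        return sum;
    }

    // Runs the three steps in order over all input lines.
    static Map<String, Integer> countWords(List<String> lines) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String line : lines) {
            pairs.addAll(map(line));
        }
        Map<String, Integer> result = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> group : shuffle(pairs).entrySet()) {
            result.put(group.getKey(), reduce(group.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = countWords(List.of("hello hadoop", "hello mapreduce"));
        counts.forEach((word, count) -> System.out.println(word + "\t" + count));
    }
}
```

In a real job the framework performs the shuffle between the map and reduce tasks; it is written out here only so the data flow is visible end to end.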


Run Example MapReduce Program

To test a YARN/MapReduce installation, we can run an example MapReduce program (the word count job) from the Hadoop download directory. A Hadoop release contains MapReduce examples in the share/hadoop/mapreduce/hadoop-mapreduce-examples-x.y.z.jar file. In this demonstration, we use the wordcount program from that jar to count the occurrences of each word in an input file and write the counts to an output file. 1. Create an input test file in the local file system and copy it […]

