Hadoop Integration – Avro Errors


In this post we will discuss some of the errors and exceptions that can occur when there is a version mismatch between Avro and the Hadoop distribution.

If we do not use the correct Avro release, we will run into many errors and exceptions. Here we will consider version compatibility for the Hadoop-2.3.0 release. Since this is Hadoop 2, we need to use the matching avro-mapred-*.jar file (the hadoop2 build).

When we run Avro MapReduce jobs in local mode through Eclipse, we may not see these errors, but when we run them in a Hadoop 2 environment we must maintain version compatibility.

As Avro is used in IPC and RPC communication (the communication mode between DataNodes and the NameNode in HDFS), the first Avro error we receive when there is a version mismatch in the avro-x.y.z.jar file in $HADOOP_HOME/share/hadoop/common/lib is:

or

These error messages appear whenever we try to run any hadoop fs command.

When we submit MapReduce jobs, we will receive the exceptions shown below.

For the Hadoop-2.3.0 release, the suitable Avro version is 1.7.4. If we copy the correct avro-1.7.4.jar file into the $HADOOP_HOME/share/hadoop/common/lib directory, we will not receive the above kinds of errors and exceptions.

And if we do not have the right version of the avro-mapred-1.7.4.jar file in the Hadoop library mentioned above, we will receive the following types of error messages when we submit MapReduce jobs.

So, we need to be very careful when copying avro-*.jar files into the Hadoop distribution folders. We can determine the correct Avro release from the avro-x.y.z.jar file present in the $HADOOP_HOME/share/hadoop/common/lib folder; we then need to copy the same version of the avro-mapred-x.y.z-hadoop2.jar and avro-tools-x.y.z.jar files into the Hadoop distribution folder $HADOOP_HOME/share/hadoop/tools/lib.

As the Hadoop-2.3.0 release ships avro-1.7.4.jar in the common/lib directory, we must copy the same version of the avro-mapred-1.7.4-hadoop2.jar and avro-tools-1.7.4.jar files into the $HADOOP_HOME/share/hadoop/tools/lib directory; otherwise we will run into many exceptions and errors while accessing the HDFS file system or submitting MapReduce jobs.

Note: Do not copy the avro-tools-1.7.4.jar file into the $HADOOP_HOME/share/hadoop/common/lib directory; it belongs only in $HADOOP_HOME/share/hadoop/tools/lib. Even if we copy the matching version of the avro-tools-1.7.4.jar file into the */common/lib directory, we will receive the error message below.

So, the avro-tools-1.7.4.jar file should be copied only into the $HADOOP_HOME/share/hadoop/tools/lib directory, never into the $HADOOP_HOME/share/hadoop/common/lib directory.


About Siva

Senior Hadoop developer with 4 years of experience in designing and architecting solutions for the Big Data domain, involved in several complex engagements. Technical strengths include Hadoop, YARN, MapReduce, Hive, Sqoop, Flume, Pig, HBase, Phoenix, Oozie, Falcon, Kafka, Storm, Spark, MySQL and Java.

