Creating Custom InputFormat and RecordReader Example


Merging Small Files into SequenceFile

In this post, we will discuss one of the best-known use cases of SequenceFiles: merging a large number of small files into a single SequenceFile. This requirement arises mainly because Hadoop and MapReduce do not process large numbers of small files efficiently. Need for Merging Small Files: As Hadoop stores the metadata of all HDFS files in the namenode's main memory (which is limited) for […]
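The merging idea in the excerpt above can be illustrated with a minimal sketch. It uses only the Java standard library and is not the Hadoop SequenceFile API: it packs each small file into one container file as a (file name, raw bytes) record, which is the same key/value layout a SequenceFile of file names and contents would hold. The class name `SmallFileMerger`, the methods `merge` and `readAll`, and the length-prefixed record format are assumptions made for illustration only.

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a SequenceFile: one container holding many
// (file name -> file contents) records, written with plain java.io streams.
public class SmallFileMerger {

    // Appends every small file to the container as: name, length, bytes.
    static int merge(List<Path> smallFiles, Path container) throws IOException {
        int count = 0;
        try (DataOutputStream out = new DataOutputStream(
                new BufferedOutputStream(Files.newOutputStream(container)))) {
            for (Path file : smallFiles) {
                byte[] data = Files.readAllBytes(file);
                out.writeUTF(file.getFileName().toString()); // key
                out.writeInt(data.length);                   // value length
                out.write(data);                             // value
                count++;
            }
        }
        return count;
    }

    // Reads the records back in the order they were written.
    static Map<String, byte[]> readAll(Path container) throws IOException {
        Map<String, byte[]> records = new LinkedHashMap<>();
        try (DataInputStream in = new DataInputStream(
                new BufferedInputStream(Files.newInputStream(container)))) {
            while (true) {
                String name;
                try {
                    name = in.readUTF();
                } catch (EOFException endOfContainer) {
                    break; // no more records
                }
                byte[] data = new byte[in.readInt()];
                in.readFully(data);
                records.put(name, data);
            }
        }
        return records;
    }

    public static void main(String[] args) throws IOException {
        // Create a few tiny input files in a temporary directory.
        Path dir = Files.createTempDirectory("smallfiles");
        List<Path> inputs = new ArrayList<>();
        for (int i = 0; i < 3; i++) {
            Path p = dir.resolve("part-" + i + ".txt");
            Files.write(p, ("record " + i).getBytes(StandardCharsets.UTF_8));
            inputs.add(p);
        }
        Path container = dir.resolve("merged.bin");
        int merged = merge(inputs, container);
        Map<String, byte[]> back = readAll(container);
        System.out.println(merged + " files merged, " + back.size() + " records read back");
    }
}
```

In a real Hadoop job the container would be a SequenceFile on HDFS, so the namenode tracks one large file instead of one metadata entry per small file.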


