bulk data copy between hadoop clusters


HDFS Distributed File Copy Tool – distcp 7

HDFS Distributed File copy Hadoop provides HDFS Distributed File copy (distcp) tool for copying large amounts of HDFS files within or in between HDFS clusters. It is implemented based on Mapreduce framework and thus it submits a map-only mapreduce job to parallelize the copy process. Usually this tool is useful for copying files between clusters from production to development environments. It supports some advanced command options while copying files. Below are the […]


Review Comments
default image

I have attended Siva’s Spark and Scala training. He is good in presentation skills and explaining technical concepts easily to everyone in the group. He is having excellent real time experience and provided enough use cases to understand each concepts. Duration of the course and time management is awesome. Happy that I found a right person on time to learn Spark. Thanks Siva!!!

Dharmeswaran ETL / Hadoop Developer Spark Nov 2016 September 21, 2017

.