Daily Archives: February 26, 2015

Formula to Calculate HDFS Nodes Storage

The formula below estimates the HDFS storage size (H) required when building a new Hadoop cluster.

H = C*R*S/(1-i) * 120%

Where:
C = Compression ratio. It depends on the type of compression used (Snappy, LZOP, …) and the size of the data. When no compression is used, C = 1.
R = Replication factor. It is usually 3 in a production cluster.
S = Initial size of […]
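As a rough illustration, the formula H = C*R*S/(1-i) * 120% can be evaluated with a few lines of Python. This is only a sketch: the excerpt is cut off before S and i are fully defined, so the code assumes S is the initial data size (in any unit, here TB) and i is a fraction between 0 and 1. The function name `hdfs_storage` and the sample values are hypothetical and not taken from the post.

```python
def hdfs_storage(c, r, s, i, overhead=1.2):
    """Estimate HDFS storage H = C*R*S/(1-i) * 120%.

    c: compression ratio (1 when no compression is used)
    r: replication factor (usually 3 in production)
    s: initial data size (ASSUMPTION: excerpt is truncated here)
    i: fraction between 0 and 1 (ASSUMPTION: not defined in the excerpt)
    overhead: the 120% multiplier from the formula
    """
    if not 0 <= i < 1:
        raise ValueError("i must be in the range [0, 1)")
    return c * r * s / (1 - i) * overhead


if __name__ == "__main__":
    # Hypothetical inputs: no compression, replication 3, 100 TB, i = 0.25
    h = hdfs_storage(c=1, r=3, s=100, i=0.25)
    print(f"Estimated HDFS storage: {h:.1f} TB")
```

With these sample inputs the estimate is about 480 TB: 3 * 100 / 0.75 = 400, then 400 * 1.2 = 480. The result is in the same unit as S.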

RHadoop Installation on Ubuntu

In this post, we briefly go through the steps for installing RHadoop on an Ubuntu 14.04 machine running Hadoop-2.6.0. We also cover the procedure for installing R and RStudio on Ubuntu. All of these installations are done on a single-node Hadoop machine.

RStudio Installation on Hadoop Machine

Before proceeding with the steps detailed below, the Hadoop machine setup should be complete. Please refer to “install-hadoop-on-single-node-cluster” in this blog for the Hadoop installation. Install […]

Review Comments

I attended Siva’s Spark and Scala training. He has good presentation skills and explains technical concepts clearly to everyone in the group. He has excellent real-world experience and provided enough use cases to understand each concept. The course duration and time management were excellent. I am happy that I found the right person at the right time to learn Spark. Thanks, Siva!

Dharmeswaran, ETL / Hadoop Developer
Spark, Nov 2016
September 21, 2017