Hadoop Best Practices

Avoid small files (smaller than one HDFS block, typically 128 MB), because each small file ends up being processed by its own map task. Maintain an optimal HDFS block size, generally >= 128 MB, to avoid tens of thousands of map tasks when processing large data sets. Use combiners wherever applicable to reduce network traffic from mapper nodes to reducer nodes. Applications processing large data-sets with optimal number of reducers and […]
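
To make the small-files point concrete, here is a rough back-of-the-envelope sketch in Python. It assumes one map task per block of each file and that a split never spans two files; the function name `estimated_map_tasks` and the sample file sizes are illustrative only, and real split calculation in Hadoop has additional details that this ignores.

```python
import math

BLOCK_SIZE_MB = 128  # typical HDFS block size mentioned above


def estimated_map_tasks(file_sizes_mb, block_size_mb=BLOCK_SIZE_MB):
    """Rough estimate: one map task per block of each non-empty file.

    Assumption: a split never spans two files, so every non-empty file
    costs at least one map task, however small it is.
    """
    return sum(math.ceil(size / block_size_mb)
               for size in file_sizes_mb if size > 0)


# 10,000 files of 1 MB each (hypothetical data set).
small_files = [1] * 10_000

# The same 10,000 MB repacked into files of at most one block each.
total_mb = sum(small_files)
full, rest = divmod(total_mb, BLOCK_SIZE_MB)
packed_files = [BLOCK_SIZE_MB] * full + ([rest] if rest else [])

print(estimated_map_tasks(small_files))   # 10000
print(estimated_map_tasks(packed_files))  # 79
```

Under these assumptions the same volume of data needs about 79 map tasks instead of 10,000 once the small files are consolidated, which is the overhead the first two recommendations aim to avoid.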

Review Comments
I am a PL/SQL developer, interested in moving into big data.

Neetika Singh ITA