How to manage huge numbers of small files in Hadoop MapReduce