Querying HDFS File System


FileStatus:

The Java API for the Hadoop Distributed File System (HDFS) provides an important class, org.apache.hadoop.fs.FileStatus, for querying the file system. This class encapsulates file system metadata. Through it we can obtain metadata about files and directories, including file length, block size, replication, modification time, ownership, and permission information.

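We can get a FileStatus instance by calling the getFileStatus() method on a FileSystem object, passing the Path of the file or directory. If nothing exists at that path, the call throws a FileNotFoundException.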

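Some of the important methods of the FileStatus class are getPath(), getLen() (file length in bytes), getBlockSize(), getReplication(), getModificationTime(), getOwner(), getGroup(), getPermission() and isDirectory().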

Below is a sample program illustrating the above methods of the FileStatus class.
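A minimal sketch of such a program might look like the following. The class name FileStatusSketch and the describe() helper are made up for illustration; only the FileSystem and FileStatus calls come from the Hadoop API. It assumes the file URI is passed as the first command-line argument and that the Hadoop client libraries are on the classpath.

```java
import java.io.IOException;
import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Illustrative sketch only; the class and helper names are hypothetical.
class FileStatusSketch {

    // Formats one "label : value" line of the report.
    private static String line(String label, Object value) {
        return String.format("%-18s: %s%n", label, value);
    }

    // Collects commonly used FileStatus getters into one printable report.
    static String describe(FileStatus status) {
        StringBuilder sb = new StringBuilder();
        sb.append(line("Path", status.getPath()));
        sb.append(line("Is directory", status.isDirectory()));
        sb.append(line("Length (bytes)", status.getLen()));
        sb.append(line("Block size", status.getBlockSize()));
        sb.append(line("Replication", status.getReplication()));
        sb.append(line("Modification time", status.getModificationTime()));
        sb.append(line("Owner", status.getOwner()));
        sb.append(line("Group", status.getGroup()));
        sb.append(line("Permission", status.getPermission()));
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 1) {
            System.err.println("Usage: FileStatusSketch <file-uri>");
            return;
        }
        // Assumption: args[0] is a full URI such as
        // hdfs://localhost:9000/input/word_input.txt
        String uri = args[0];
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(URI.create(uri), conf);

        // Throws FileNotFoundException if the path does not exist.
        FileStatus status = fs.getFileStatus(new Path(uri));
        System.out.print(describe(status));
    }
}
```

Once compiled and visible through HADOOP_CLASSPATH, a class like this could be started with the hadoop launcher, for example: hadoop FileStatusSketch hdfs://localhost:9000/input/word_input.txt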

After compiling the above program and adding the class file location to the HADOOP_CLASSPATH environment variable, we can run the program and check its output.

The output (screenshot FileStatus1 in the original post) shows all the metadata (file length, modification time, replication factor, block size, etc.) for the hdfs://localhost:9000/input/word_input.txt file.
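The raw numbers are easier to read with a little arithmetic: getModificationTime() returns milliseconds since the Unix epoch, and a file of a given length occupies the ceiling of length / block size blocks. The helper below is a rough sketch with hypothetical names, and the inputs in main() are made-up values, not the ones from the screenshot.

```java
import java.time.Instant;

// Illustrative helper; names and sample values are hypothetical.
class FileMetadataMath {

    // Number of blocks a file occupies: ceiling of length / blockSize.
    static long blockCount(long lengthBytes, long blockSizeBytes) {
        if (lengthBytes == 0) {
            return 0;
        }
        return (lengthBytes + blockSizeBytes - 1) / blockSizeBytes;
    }

    // Approximate bytes stored across the cluster, counting every replica.
    static long rawBytesStored(long lengthBytes, short replication) {
        return lengthBytes * replication;
    }

    // Converts epoch milliseconds to an ISO-8601 UTC timestamp.
    static String readableTime(long epochMillis) {
        return Instant.ofEpochMilli(epochMillis).toString();
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        // Made-up example: a 200 MB file, 128 MB blocks, replication 3.
        System.out.println("Blocks    : " + blockCount(200 * mb, 128 * mb));
        System.out.println("Raw bytes : " + rawBytesStored(200 * mb, (short) 3));
        System.out.println("Modified  : " + readableTime(0L));
    }
}
```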

Browse the FileSystem with listStatus() methods
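As a rough sketch of that idea, FileSystem.listStatus(Path) returns a FileStatus array with one entry per direct child of a directory, so the same getters can be applied to each entry. The class name ListStatusSketch and the listChildren() helper are hypothetical, and the directory URI is assumed to be passed as the first argument.

```java
import java.io.IOException;
import java.net.URI;
import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Illustrative sketch only; the class and helper names are hypothetical.
class ListStatusSketch {

    // Returns one line per direct child of dir:
    // "d" or "-" marker, length in bytes, full path.
    static List<String> listChildren(FileSystem fs, Path dir) throws IOException {
        List<String> lines = new ArrayList<>();
        for (FileStatus child : fs.listStatus(dir)) {
            String kind = child.isDirectory() ? "d" : "-";
            lines.add(kind + " " + child.getLen() + " " + child.getPath());
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 1) {
            System.err.println("Usage: ListStatusSketch <directory-uri>");
            return;
        }
        // Assumption: args[0] is a directory URI such as
        // hdfs://localhost:9000/input
        String uri = args[0];
        FileSystem fs = FileSystem.get(URI.create(uri), new Configuration());
        for (String entry : listChildren(fs, new Path(uri))) {
            System.out.println(entry);
        }
    }
}
```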


About Siva

Senior Hadoop developer with 4 years of experience in designing and architecting solutions for the Big Data domain, who has been involved in several complex engagements. Technical strengths include Hadoop, YARN, MapReduce, Hive, Sqoop, Flume, Pig, HBase, Phoenix, Oozie, Falcon, Kafka, Storm, Spark, MySQL and Java.
