dfsadmin – HDFS Administration Command


The syntax for Hadoop commands is:

$ hadoop [--config confdir]  [Command]  [Generic_Options]  [Command_Options]

Here, the --config parameter overrides the default configuration directory. Commands can be either user commands or administrator commands.
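
As a rough illustration of this syntax, the sketch below wraps a dfsadmin call so that it reads its settings from an alternate configuration directory. The helper name dfsadmin_with_conf is hypothetical and not part of Hadoop; newer releases also expose the same subcommand as hdfs dfsadmin.

```shell
# Hypothetical helper (not part of Hadoop): run a dfsadmin option against
# the cluster described by the given configuration directory.
dfsadmin_with_conf() {
    conf_dir=$1   # directory holding core-site.xml, hdfs-site.xml, ...
    shift         # the remaining arguments are dfsadmin command options
    hadoop --config "$conf_dir" dfsadmin "$@"
}
```

For example, dfsadmin_with_conf /path/to/test-conf -report would print the report for whichever cluster that directory describes (the path here is only a placeholder).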

Below are the details of the useful administrator command dfsadmin.

dfsadmin:

The dfsadmin (distributed file system administration) command is used for file system administration activities such as getting a file system report, entering or leaving safe mode, refreshing the nodes in the cluster, and finalizing HDFS upgrades.

dfsadmin supports many command options to perform these tasks. Below are a few useful command options.

1.  -report:  Reports basic file system information and statistics.
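
A minimal sketch of how the report might be kept for later comparison, assuming the hadoop launcher is on the PATH. The helper name save_dfs_report is hypothetical.

```shell
# Hypothetical helper: save the current dfsadmin report to a file so that
# capacity and datanode status can be compared over time.
save_dfs_report() {
    report_file=$1
    hadoop dfsadmin -report > "$report_file"
}
```

Running hadoop dfsadmin -report directly prints the same information to the terminal.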

2.  -safemode <enter|leave|get|wait>:  This is the safe mode maintenance command.
Safe mode is a Namenode state in which it
    a. does not accept changes to the name space (read-only)
    b. does not replicate or delete blocks.
For further details and examples on safemode, please refer to the post here.
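
The four safemode arguments can be combined in small scripts. The sketch below is only one possible arrangement; both helper names are hypothetical, and the functions do nothing beyond calling the -safemode option described above.

```shell
# Hypothetical helper: run a maintenance command while the namenode is
# held in safe mode, then leave safe mode again.
run_in_safemode() {
    hadoop dfsadmin -safemode enter || return 1
    status=0
    "$@" || status=$?                  # remember the command's exit status
    hadoop dfsadmin -safemode leave
    return "$status"
}

# Hypothetical helper: block until the namenode has left safe mode, then
# print the current state (useful at the start of a job script that runs
# right after a cluster restart).
wait_for_namenode() {
    hadoop dfsadmin -safemode wait && hadoop dfsadmin -safemode get
}
```

Safe mode that was entered manually with enter stays on until leave is issued, which is why run_in_safemode always calls leave before returning.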

3.  -refreshNodes: This command option updates the namenode with the set of datanodes allowed to connect to the namenode.

The Namenode re-reads datanode hostnames from the files defined by the dfs.hosts and dfs.hosts.exclude configuration parameters. If there are entries in dfs.hosts, only the hosts listed in it are allowed to register with the namenode.

The datanode entries in dfs.hosts.exclude are decommissioned, i.e. removed from the cluster once their blocks have been replicated to other datanodes.
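
A decommissioning step along these lines could be scripted as follows. This is a sketch under stated assumptions: the helper name decommission_datanode is hypothetical, and the first argument must be the file that dfs.hosts.exclude points to on the namenode host.

```shell
# Hypothetical helper: mark a datanode for decommissioning by adding it to
# the exclude file and asking the namenode to re-read its host lists.
decommission_datanode() {
    exclude_file=$1    # the file named by dfs.hosts.exclude
    datanode_host=$2   # hostname of the datanode to retire
    # Append the host only if it is not already listed (exact line match).
    grep -qxF "$datanode_host" "$exclude_file" 2>/dev/null ||
        echo "$datanode_host" >> "$exclude_file"
    hadoop dfsadmin -refreshNodes
}
```

The progress of the decommissioning can then be followed in the output of the -report option.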

4.  -finalizeUpgrade:  This command is useful when upgrading the Hadoop version on all the machines of a cluster. With this command, datanodes delete their previous-version working directories, followed by the namenode doing the same. This completes the upgrade process, after which rolling back to the previous version is no longer possible.

5.  -metasave <filename>:  This command saves the Namenode's primary data structures to <filename> in the directory specified by the hadoop.log.dir property.
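
As a small sketch, the dump can be given a timestamped name so that repeated runs do not collide. The helper name dump_namenode_meta and the naming scheme are assumptions for illustration.

```shell
# Hypothetical helper: ask the namenode to dump its primary data
# structures into a timestamped file and print the chosen file name.
# The file itself is written on the namenode host, under hadoop.log.dir.
dump_namenode_meta() {
    meta_name="metasave-$(date +%Y%m%d-%H%M%S).txt"
    hadoop dfsadmin -metasave "$meta_name" && echo "$meta_name"
}
```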

6.  -fetchImage <local file>:  This command downloads the latest fsimage file from the NameNode and stores it in the specified location on the local file system.

The content of the fsimage file is not in a human-readable format. To learn how to view fsimage files in a human-readable format, please refer to the post here.
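
One way this option might be used for backups is sketched below. It assumes the target may be a local directory, which is how Hadoop 2.x documents the argument; the helper name backup_fsimage is hypothetical.

```shell
# Hypothetical helper: download the latest fsimage from the namenode into
# a local backup directory, creating the directory if needed.
backup_fsimage() {
    backup_dir=$1
    mkdir -p "$backup_dir" || return 1
    hadoop dfsadmin -fetchImage "$backup_dir"
}
```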

7.  -printTopology:  This command is used to get rack information about the datanodes. It prints a tree of the racks and their nodes as reported by the Namenode.
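
The tree can also be post-processed with standard text tools. The sketch below counts the racks, assuming each rack is printed on a line beginning with "Rack:"; the exact layout may differ between Hadoop releases, and the helper name count_racks is hypothetical.

```shell
# Hypothetical helper: count the racks known to the namenode.
# Assumes every rack in the -printTopology output starts a line with "Rack:".
count_racks() {
    hadoop dfsadmin -printTopology | grep -c '^Rack:'
}
```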

There are a few more command options which are not listed in this post; they can be explored with the details printed by the dfsadmin -help command.

To Read the existing configuration about the cluster

 



About Siva

Senior Hadoop developer with 4 years of experience in designing and architecting solutions for the Big Data domain, involved in several complex engagements. Technical strengths include Hadoop, YARN, MapReduce, Hive, Sqoop, Flume, Pig, HBase, Phoenix, Oozie, Falcon, Kafka, Storm, Spark, MySQL and Java.


