
Friday, June 23, 2017

Hadoop DFSAdmin Commands

The dfsadmin tools are a specific set of tools designed to help you root out information about your Hadoop Distributed File System (HDFS). As an added bonus, you can use them to perform some administration operations on HDFS as well.
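A quick way to see every subcommand your release supports is to ask the tool itself (the exact list varies between Hadoop versions):

[hdpsysuser@nn01 ~]$ hdfs dfsadmin -help

The -report option prints the overall filesystem status plus per-datanode details; the -live, -dead, and -decommissioning flags restrict the node list to nodes in that state.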


[hdpsysuser@nn01 ~]$ hdfs dfsadmin -report -decommissioning
[hdpsysuser@nn01 ~]$ hdfs dfsadmin -report -dead
[hdpsysuser@nn01 ~]$ hdfs dfsadmin -report -live
Configured Capacity: 4497804386304 (4.09 TB)
Present Capacity: 4497702404096 (4.09 TB)
DFS Remaining: 4434167631872 (4.03 TB)
DFS Used: 63534772224 (59.17 GB)
DFS Used%: 1.41%
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0
Missing blocks (with replication factor 1): 0

-------------------------------------------------
Live datanodes (3):

Name: 192.168.49.137:50010 (dn03)
Hostname: dn03
Decommission Status : Normal
Configured Capacity: 1499268128768 (1.36 TB)
DFS Used: 21177212928 (19.72 GB)
Non DFS Used: 33988608 (32.41 MB)
DFS Remaining: 1478056927232 (1.34 TB)
DFS Used%: 1.41%
DFS Remaining%: 98.59%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Sun May 07 14:28:17 AST 2017


Name: 192.168.49.135:50010 (dn01)
Hostname: dn01
Decommission Status : Normal
Configured Capacity: 1499268128768 (1.36 TB)
DFS Used: 21178548224 (19.72 GB)
Non DFS Used: 33988608 (32.41 MB)
DFS Remaining: 1478055591936 (1.34 TB)
DFS Used%: 1.41%
DFS Remaining%: 98.59%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Sun May 07 14:28:17 AST 2017


Name: 192.168.49.136:50010 (dn02)
Hostname: dn02
Decommission Status : Normal
Configured Capacity: 1499268128768 (1.36 TB)
DFS Used: 21179011072 (19.72 GB)
Non DFS Used: 34004992 (32.43 MB)
DFS Remaining: 1478055112704 (1.34 TB)
DFS Used%: 1.41%
DFS Remaining%: 98.59%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Sun May 07 14:28:17 AST 2017
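If you only need the headline figures for monitoring or scripting, you can filter the report with standard shell tools. This is just one sketch; the exact labels may differ between Hadoop releases.

[hdpsysuser@nn01 ~]$ hdfs dfsadmin -report | grep -E 'Live datanodes|DFS Used%'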


Safe Mode

Safe mode is a Namenode state in which it:
1. does not accept changes to the namespace (read-only)
2. does not replicate or delete blocks.



Safe mode is entered automatically at Namenode startup, and the Namenode leaves safe mode automatically once the configured minimum percentage of blocks satisfies the minimum replication condition. Safe mode can also be entered manually, but then it can only be turned off manually as well.


[hdpsysuser@nn01 ~]$ hdfs dfsadmin -safemode get
Safe mode is OFF

[hdpsysuser@nn01 ~]$ hdfs dfsadmin -safemode enter
Safe mode is ON
[hdpsysuser@nn01 ~]$ hdfs dfsadmin -safemode leave
Safe mode is OFF
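There is also a wait option, which blocks until the Namenode has left safe mode and then returns; this is handy in startup scripts that must not run until HDFS is writable.

[hdpsysuser@nn01 ~]$ hdfs dfsadmin -safemode wait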

While the Namenode is in safe mode, you can save the current namespace into the storage directories and reset the edits log:

[hdpsysuser@nn01 ~]$ hdfs dfsadmin -saveNamespace
Save namespace successful
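Note that -saveNamespace requires the Namenode to be in safe mode and will refuse to run otherwise, so on a test cluster the full sequence looks like this:

[hdpsysuser@nn01 ~]$ hdfs dfsadmin -safemode enter
[hdpsysuser@nn01 ~]$ hdfs dfsadmin -saveNamespace
[hdpsysuser@nn01 ~]$ hdfs dfsadmin -safemode leave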

The -metasave command saves the Namenode's primary data structures to a file in the directory specified by the hadoop.log.dir property. The file is overwritten if it exists, and it will contain one line for each of the following:

1. Datanodes heart beating with the Namenode
2. Blocks waiting to be replicated
3. Blocks currently being replicated
4. Blocks waiting to be deleted



[hdpsysuser@nn01 ~]$ hdfs dfsadmin -metasave metasaveNameNode.txt
Created metasave file metasaveNameNode.txt in the log directory of namenode hdfs://nn01:9000

[hdpsysuser@nn01 ~]$ cat /usr/hadoopsw/hadoop-2.7.3/logs/metasaveNameNode.txt
384 files and directories, 440 blocks = 824 total
Live Datanodes: 3
Dead Datanodes: 0
Metasave: Blocks waiting for replication: 0
Mis-replicated blocks that have been postponed:
Metasave: Blocks being replicated: 0
Metasave: Blocks 0 waiting deletion from 0 datanodes.
Metasave: Number of datanodes: 3
192.168.44.137:50010 IN 1499268128768(1.36 TB) 21177212928(19.72 GB) 1.41% 1478056927232(1.34 TB) 0(0 B) 0(0 B) 100.00% 0(0 B) Sun May 07 14:40:05 AST 2017
192.168.44.135:50010 IN 1499268128768(1.36 TB) 21178548224(19.72 GB) 1.41% 1478055591936(1.34 TB) 0(0 B) 0(0 B) 100.00% 0(0 B) Sun May 07 14:40:05 AST 2017
192.168.44.136:50010 IN 1499268128768(1.36 TB) 21179011072(19.72 GB) 1.41% 1478055112704(1.34 TB) 0(0 B) 0(0 B) 100.00% 0(0 B) Sun May 07 14:40:05 AST 2017



The default value for the environment variable HADOOP_LOG_DIR is ${HADOOP_HOME}/logs. You will find the correct value by checking the file hadoop-env.sh in HADOOP_CONF_DIR (/etc/hadoop/hadoop-env.sh).
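For example, you could confirm it like this (adjust the path to your own HADOOP_CONF_DIR):

grep HADOOP_LOG_DIR /etc/hadoop/hadoop-env.sh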


Save the description of each DataNode in the HDFS cluster


hdfs dfsadmin -report > dfs.datanodes.report.backup
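If you run this regularly, a dated file name keeps a history of reports (the naming scheme here is just an example):

hdfs dfsadmin -report > dfs.report.$(date +%Y%m%d).backup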

Report the status of each slave node

hdfs dfsadmin -report

Refresh all the DataNodes

hdfs dfsadmin -refreshNodes
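This makes the Namenode re-read the files named by its dfs.hosts and dfs.hosts.exclude properties, which is how datanodes are normally decommissioned. A minimal sketch, assuming dfs.hosts.exclude in hdfs-site.xml points to /etc/hadoop/dfs.exclude (a hypothetical path; adjust to your configuration):

echo "dn03" >> /etc/hadoop/dfs.exclude
hdfs dfsadmin -refreshNodes
hdfs dfsadmin -report -decommissioning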

Save the metadata of the HDFS filesystem

hdfs dfsadmin -metasave meta.log

The meta.log file will be created under the directory $HADOOP_HOME/logs.

