
Hadoop Admin:

Index:

Responsibilities of Hadoop Admin.


Building Single Node Cluster.
Building Multi Node Cluster.
Commissioning and Decommissioning Nodes in Cluster.
Challenges of running Hadoop Cluster.
System Log Files.
Hadoop Admin Commands.

Hadoop Admin Responsibilities:


Responsible for implementation and ongoing administration of Hadoop
infrastructure.

Aligning with the systems engineering team to propose and deploy new hardware
and software environments required for Hadoop and to expand existing
environments.

Working with data delivery teams to setup new Hadoop users. This job includes
setting up Linux users, setting up Kerberos principals and testing HDFS, Hive, Pig
and MapReduce access for the new users.

Cluster maintenance as well as creation and removal of nodes using tools like
Ganglia, Nagios, Cloudera Manager Enterprise, Dell Open Manage and other tools.

Performance tuning of Hadoop clusters and Hadoop MapReduce routines.

Screening Hadoop cluster job performance and capacity planning.

Monitoring Hadoop cluster connectivity and security.

Manage and review Hadoop log files.

File system management and monitoring.

HDFS support and maintenance.

Diligently teaming with the infrastructure, network, database, application and business intelligence teams to guarantee high data quality and availability.

Collaborating with application teams to install operating system and Hadoop updates, patches, and version upgrades when required.

Point of Contact for Vendor escalation.

The distributed computation

At its heart, Hadoop is a distributed computation platform whose programming model is MapReduce. In order to be efficient, MapReduce has two prerequisites:
1) Datasets must be splittable into smaller, independent blocks.
2) Data locality: the code must be moved to where the data lies, not the opposite.
The first prerequisite depends on both the type of input data that feeds the cluster and on what we want to do with it.
The second prerequisite requires a distributed storage system that exposes exactly where data is stored and allows code to be executed on any storage node.
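For example, HDFS exposes the location of every block of a file through the fsck tool. A minimal illustration, assuming a file named /user/Srinu/input.txt already exists in HDFS (the path is just a placeholder):

$ hadoop fsck /user/Srinu/input.txt -files -blocks -locations

The output lists each block of the file together with the DataNodes holding its replicas, which is exactly the information the scheduler uses to run map tasks close to the data.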
Hadoop is a Master / Slave architecture:
1) The JobTracker (ResourceManager in Hadoop 2), which monitors the jobs that are running on the cluster. It needs a lot of memory and CPU (memory bound and CPU bound).
2) The TaskTracker (NodeManager + ApplicationMaster in Hadoop 2), which runs the tasks of jobs, i.e. the maps and reduces, on each node of the cluster. Its tasks need a lot of memory and CPU (memory bound and CPU bound).
The critical component in this architecture is the JobTracker/ResourceManager.

The distributed storage

HDFS is a distributed storage filesystem. It runs on top of another filesystem such as ext3 or ext4. In order to be efficient, HDFS must satisfy the following prerequisites:

Hard drives with a high throughput.
An underlying filesystem that supports the HDFS read and write pattern: one big read or write at a time (64 MB, 128 MB or 256 MB).
A network fast enough to cope with intermediate data transfers and block replication.

HDFS is a Master / Slave architecture:

The NameNode and the Secondary NameNode
Store the filesystem metadata (directory structure, names, attributes and the location of each file's blocks) and ensure that blocks are properly replicated in the cluster.
They need a lot of memory (memory bound).

The DataNode
Manages the state of an HDFS node and serves its blocks.
Needs a lot of I/O for processing and data transfer (I/O bound).

The critical components in this architecture are the NameNode and the Secondary NameNode.
How HDFS manages its files

HDFS is optimized for the storage of large files: you write a file once and read it many times. In HDFS, a file is split into several blocks. Each block is asynchronously replicated in the cluster: the client sends its file once and the cluster takes care of replicating its blocks in the background.
A block is a contiguous area, a blob of data on the underlying filesystem. Its default size is 64 MB, but it can be raised to 128 MB or even 256 MB depending on your needs. Block replication, which has a default factor of 3, is useful for two reasons:

Ensuring data recovery after the failure of a node. Hard drives used for HDFS should be configured as JBOD, not RAID.
Increasing the number of maps that can work on a block during a MapReduce job, and therefore speeding up processing.
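As a sketch, both settings can be changed cluster-wide in conf/hdfs-site.xml. The property names below are the Hadoop 1.x ones used elsewhere in this guide, and the 128 MB block size is only an example value:

<property>
<name>dfs.block.size</name>
<value>134217728</value>
<description>The default block size for new files, in bytes (here 128 MB).</description>
</property>
<property>
<name>dfs.replication</name>
<value>3</value>
<description>Default block replication factor.</description>
</property>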

From a network standpoint, bandwidth is used at two moments:

During the replication that follows a file write.
During the re-replication of blocks when a node fails.

How the NameNode manages the HDFS cluster

The NameNode manages the metadata of the HDFS cluster. This includes the filesystem metadata (filenames, directories, attributes) and the location of the blocks of each file. The filesystem structure is entirely mapped into memory.
In order to persist over restarts, two files are also used:

An fsimage file, which contains the filesystem metadata.
An edits file, which contains the list of modifications performed on the content of fsimage.

The in-memory image is the merge of those two files. When the NameNode starts, it first loads fsimage and then applies the content of edits on top of it to recover the latest state of the filesystem.
The issue is that, over time, the edits file keeps growing indefinitely and ends up:

consuming all disk space
slowing down restarts

The Secondary NameNode's role is to avoid this issue by regularly merging edits into fsimage, thus producing a new fsimage and resetting the content of edits. The trigger for this compaction process is configurable; it can be:

The number of transactions performed on the cluster
The size of the edits file
The elapsed time since the last compaction
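In Hadoop 1.x, the two triggers that are usually tuned live in conf/core-site.xml. A minimal sketch with the common default values (adjust to your needs):

<property>
<name>fs.checkpoint.period</name>
<value>3600</value>
<description>Number of seconds between two checkpoints (compactions).</description>
</property>
<property>
<name>fs.checkpoint.size</name>
<value>67108864</value>
<description>Size of the edits file, in bytes, that triggers a checkpoint even if the period has not elapsed (here 64 MB).</description>
</property>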

The following formula can be used to estimate how much memory a NameNode needs:

<Needed memory in GB> = <total storage size in the cluster in MB> / <size of a block in MB> / 1,000,000

In other words, a rule of thumb is that a NameNode needs about 1 GB of memory per 1 million blocks.
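As a hypothetical worked example, for a cluster holding 300 TB of data stored as 128 MB blocks:

<Needed memory in GB> = (300 * 1024 * 1024 MB) / 128 MB / 1,000,000
                      = 2,457,600 blocks / 1,000,000
                      ≈ 2.5 GB of NameNode heap

The real requirement also depends on the number of files and directories, so treat this only as a sizing starting point.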

Building Single Node Cluster:

[Figure: cluster of machines at Yahoo!]

Prerequisites
Install Java (Java 1.6 or later).

Adding a dedicated Hadoop system user

We will use a dedicated Hadoop user account for running Hadoop. While that's not required, it is recommended because it helps to separate the Hadoop installation from other software applications.
$ sudo addgroup Naresh
$ sudo adduser --ingroup Naresh Srinu

This will add the user Srinu and the group Naresh to your local machine.
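To double-check that the account and group were created as expected, you can inspect the user's group membership (the numeric uid/gid will differ on your machine):

$ id Srinu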

Configuring SSH (Configuring Key Based Login)

Hadoop requires SSH access to manage its nodes, i.e. remote machines plus your local machine if you want to use Hadoop on it. For our single-node setup of Hadoop, we therefore need to configure SSH access to localhost for the Srinu user.
1) We have to generate an SSH key for the Srinu user.

user@ubuntu:~$ su - Srinu
Srinu@ubuntu:~$ ssh-keygen -t rsa -P ""
Generating public/private rsa key pair.
Enter file in which to save the key (/home/Srinu/.ssh/id_rsa):
Created directory '/home/Srinu/.ssh'.
Your identification has been saved in /home/Srinu/.ssh/id_rsa.
Your public key has been saved in /home/Srinu/.ssh/id_rsa.pub.
The key fingerprint is:
9b:82:ea:58:b4:e0:35:d7:ff:19:66:a6:ef:ae:0e:d2 Srinu@ubuntu
The key's randomart image is:
[...snipp...]
Srinu@ubuntu:~$

The second command creates an RSA key pair with an empty password. Generally, using an empty password is not recommended, but in this case it is needed to unlock the key without your interaction (you don't want to enter the passphrase every time Hadoop interacts with its nodes).
2) We have to enable SSH access to your local machine with this newly
created key.
Srinu@ubuntu:~$ cat $HOME/.ssh/id_rsa.pub >> $HOME/.ssh/authorized_keys

The final step is to test the SSH setup by connecting to your local machine with the Srinu user. This step is also needed to save your local machine's host key fingerprint to the Srinu user's known_hosts file.
Srinu@ubuntu:~$ ssh localhost
The authenticity of host 'localhost (::1)' can't be established.
RSA key fingerprint is d7:87:25:47:ae:02:00:eb:1d:75:4f:bb:44:f9:36:26.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'localhost' (RSA) to the list of known hosts.
Linux ubuntu 2.6.32-22-generic #33-Ubuntu SMP Wed Apr 28 13:27:30 UTC 2010 i686
GNU/Linux
Ubuntu 12.04 LTS
[...snipp...]
Srinu@ubuntu:~$

Disabling IPv6
One problem with IPv6 on Ubuntu is that using 0.0.0.0 for the various networking-related Hadoop configuration options will result in Hadoop binding to the IPv6 addresses of the Ubuntu box. In my case, I realized that there's no practical point in enabling IPv6 on a box that is not connected to any IPv6 network. Hence, I simply disabled IPv6 on my Ubuntu machine. Your mileage may vary.
To disable IPv6 on Ubuntu, open /etc/sysctl.conf in the editor of your choice and add the following lines to the end of the file:

# disable ipv6
net.ipv6.conf.all.disable_ipv6 = 1
net.ipv6.conf.default.disable_ipv6 = 1
net.ipv6.conf.lo.disable_ipv6 = 1

You have to reboot your machine in order to make the changes take effect.
You can check whether IPv6 is enabled on your machine with the following command:
$ cat /proc/sys/net/ipv6/conf/all/disable_ipv6

A return value of 0 means IPv6 is enabled; a value of 1 means it is disabled (that's what we want).
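As an alternative to a full reboot, you can usually apply the new settings immediately by reloading /etc/sysctl.conf:

$ sudo sysctl -p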

Hadoop Installation
Download Hadoop from the Apache download mirrors and extract the contents of the Hadoop package to a location of your choice. I picked /usr/local/hadoop. Make sure to change the owner of all the files to the Srinu user and the Naresh group, for example:
$ cd /usr/local
$ sudo tar xzf hadoop-1.0.3.tar.gz
$ sudo mv hadoop-1.0.3 hadoop
$ sudo chown -R Srinu:Naresh hadoop

Update $HOME/.bashrc
Add the following lines to the end of the $HOME/.bashrc file of user Srinu. If you use a
shell other than bash, you should of course update its appropriate configuration files
instead of .bashrc.
# Set Hadoop-related environment variables
export HADOOP_HOME=/usr/local/hadoop
# Set JAVA_HOME (we will also configure JAVA_HOME directly for Hadoop later on)
export JAVA_HOME=/usr/lib/jvm/java-6-sun
# Some convenient aliases and functions for running Hadoop-related commands
unalias fs &> /dev/null
alias fs="hadoop fs"
unalias hls &> /dev/null

alias hls="fs -ls"


# If you have LZO compression enabled in your Hadoop cluster and
# compress job outputs with LZOP (not covered in this tutorial):
# Conveniently inspect an LZOP compressed file from the command
# line; run via:
#
# $ lzohead /hdfs/path/to/lzop/compressed/file.lzo
#
# Requires installed 'lzop' command.
#
lzohead () {
hadoop fs -cat $1 | lzop -dc | head -1000 | less
}
# Add Hadoop bin/ directory to PATH
export PATH=$PATH:$HADOOP_HOME/bin
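To make the new environment variables and aliases available in your current shell (instead of opening a new terminal), you can reload .bashrc and run a quick sanity check; hadoop version simply prints the installed Hadoop release:

Srinu@ubuntu:~$ source $HOME/.bashrc
Srinu@ubuntu:~$ hadoop version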

Configuration

hadoop-env.sh

The only required environment variable we have to configure for Hadoop in this tutorial is JAVA_HOME. Open conf/hadoop-env.sh in the editor of your choice (if you used the installation path in this tutorial, the full path is /usr/local/hadoop/conf/hadoop-env.sh) and set the JAVA_HOME environment variable to the Sun JDK/JRE 6 directory.

Change

conf/hadoop-env.sh
# The java implementation to use. Required.
# export JAVA_HOME=/usr/lib/j2sdk1.5-sun

to

conf/hadoop-env.sh
# The java implementation to use. Required.
export JAVA_HOME=/usr/lib/jvm/java-6-sun

conf/*-site.xml
In this section, we will configure the directory where Hadoop stores its data files, the network ports it listens on, and so on. Our setup will use Hadoop's Distributed File System, HDFS, even though our little cluster only contains our single local machine.
You can leave the settings below as is, with the exception of the hadoop.tmp.dir parameter, which you must change to a directory of your choice. We will use the directory /app/hadoop/tmp.

Now we create the directory and set the required ownership and permissions:
$ sudo mkdir -p /app/hadoop/tmp
$ sudo chown Srinu:Naresh /app/hadoop/tmp
# ...and if you want to tighten up security, chmod from 755 to 750...
$ sudo chmod 750 /app/hadoop/tmp

Add the following snippets between the <configuration> ... </configuration> tags in the
respective configuration XML file.
In file conf/core-site.xml:
conf/core-site.xml

<property>
<name>hadoop.tmp.dir</name>
<value>/app/hadoop/tmp</value>
<description>A base for other temporary directories.</description>
</property>
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:54310</value>
<description>The name of the default file system. A URI whose
scheme and authority determine the FileSystem implementation. The
uri's scheme determines the config property (fs.SCHEME.impl) naming
the FileSystem implementation class. The uri's authority is used to
determine the host, port, etc. for a filesystem.</description>
</property>

conf/mapred-site.xml
<property>
<name>mapred.job.tracker</name>
<value>localhost:54311</value>
<description>The host and port that the MapReduce job tracker runs
at. If "local", then jobs are run in-process as a single map
and reduce task.
</description>
</property>

conf/hdfs-site.xml
<property>
<name>dfs.replication</name>
<value>1</value>
<description>Default block replication.
The actual number of replications can be specified when the file is created. The default is used if replication is not specified at create time.
</description>

</property>

Formatting HDFS via the NameNode

To format the filesystem, run the command:

Srinu@ubuntu:~$ /usr/local/hadoop/bin/hadoop namenode -format

The output will look like this:

Srinu@ubuntu:/usr/local/hadoop$ bin/hadoop namenode -format


10/05/08 16:59:56 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG: host = ubuntu/127.0.1.1
STARTUP_MSG: args = [-format]
STARTUP_MSG: version = 0.20.2
STARTUP_MSG: build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.20 -r 911707; compiled by 'chrisdo' on Fri Feb 19 08:07:34 UTC 2010
************************************************************/
10/05/08 16:59:56 INFO namenode.FSNamesystem: fsOwner=Srinu,hadoop
10/05/08 16:59:56 INFO namenode.FSNamesystem: supergroup=supergroup
10/05/08 16:59:56 INFO namenode.FSNamesystem: isPermissionEnabled=true
10/05/08 16:59:56 INFO common.Storage: Image file of size 96 saved in 0 seconds.
10/05/08 16:59:57 INFO common.Storage: Storage directory .../hadoop-Srinu/dfs/name has
been successfully formatted.
10/05/08 16:59:57 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at ubuntu/127.0.1.1
************************************************************/
Srinu@ubuntu:/usr/local/hadoop$

Starting the Single Node Cluster

Srinu@ubuntu:~$ /usr/local/hadoop/bin/start-all.sh

This will start up a NameNode, DataNode, SecondaryNameNode, JobTracker and TaskTracker on your machine. The output will look like this:

Srinu@ubuntu:/usr/local/hadoop$ bin/start-all.sh
starting namenode, logging to /usr/local/hadoop/bin/../logs/hadoop-Srinu-namenode-ubuntu.out
localhost: starting datanode, logging to /usr/local/hadoop/bin/../logs/hadoop-Srinu-datanode-ubuntu.out
localhost: starting secondarynamenode, logging to /usr/local/hadoop/bin/../logs/hadoop-Srinu-secondarynamenode-ubuntu.out
starting jobtracker, logging to /usr/local/hadoop/bin/../logs/hadoop-Srinu-jobtracker-ubuntu.out
localhost: starting tasktracker, logging to /usr/local/hadoop/bin/../logs/hadoop-Srinu-tasktracker-ubuntu.out
Srinu@ubuntu:/usr/local/hadoop$
Check the running Java processes with jps:
Srinu@ubuntu:/usr/local/hadoop$ jps
2287 TaskTracker
2149 JobTracker
1938 DataNode
2085 SecondaryNameNode
2349 Jps
1788 NameNode

If there are any errors, examine the log files in the /usr/local/hadoop/logs/ directory.
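Optionally, you can run one of the bundled example jobs as a smoke test before moving on. A sketch, assuming the examples jar shipped with your Hadoop release sits in the installation directory (the exact jar name depends on the version):

Srinu@ubuntu:/usr/local/hadoop$ bin/hadoop jar hadoop-examples-*.jar pi 2 10

The job should be submitted to the local JobTracker, run two map tasks and print an estimate of pi.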
Stopping the Single Node Cluster
Srinu@ubuntu:/usr/local/hadoop$ bin/stop-all.sh
stopping jobtracker
localhost: stopping tasktracker
stopping namenode
localhost: stopping datanode
localhost: stopping secondarynamenode
Srinu@ubuntu:/usr/local/hadoop$

Building Multi Node Cluster:


From two single-node clusters to a multi-node cluster: We will build a
multi-node cluster using two Ubuntu boxes. The best way to do this for starters is to
install, configure and test a local Hadoop setup for each of the two Ubuntu boxes, and
in a second step to merge these two single-node clusters into one multi-node cluster in
which one Ubuntu box will become the designated master (but also act as a slave with
regard to data storage and processing), and the other box will become only a slave.

Prerequisites:
Configuring single-node clusters first:

It is recommended that you use the same settings (e.g., installation locations and paths) on both machines; otherwise you might run into problems later when we migrate the two machines to the final multi-node cluster setup. Now that you have two single-node clusters up and running, we will modify the Hadoop configuration to make one Ubuntu box the master (which will also act as a slave) and the other Ubuntu box a slave.

Mapping the nodes:

The easiest way is to put both machines in the same network with regard to hardware
and software configuration, for example connect both machines via a single hub or
switch and configure the network interfaces to use a common network such
as 192.168.0.x/24.

To make it simple, we will assign the IP address 192.168.0.1 to the master machine
and 192.168.0.2 to the slave machine. Update /etc/hosts on both machines with the
following line:
/etc/hosts (for both master and slave)
192.168.0.1    master
192.168.0.2    slave

SSH Access:
The Srinu user on the master (aka Srinu@master) must be able to connect:
1) To its own user account on the master, i.e. ssh master in this context and not necessarily ssh localhost.
2) To the Srinu user account on the slave (aka Srinu@slave) via a password-less SSH login.
You just have to add Srinu@master's public SSH key (which should be in $HOME/.ssh/id_rsa.pub) to the authorized_keys file of Srinu@slave (in this user's $HOME/.ssh/authorized_keys). You can do this manually or use
Srinu@master:~$ ssh-copy-id -i $HOME/.ssh/id_rsa.pub Srinu@slave

This command will prompt you for the login password for user Srinu on slave, then
copy the public SSH key for you, creating the correct directory and fixing the
permissions as necessary.
The final step is to test the SSH setup by connecting with user Srinu from the master to the user account Srinu on the slave. This step is also needed to save the slave's host key fingerprint to Srinu@master's known_hosts file.

So, connecting from master to master

Srinu@master:~$ ssh master


The authenticity of host 'master (192.168.0.1)' can't be established.
RSA key fingerprint is 3b:21:b3:c0:21:5c:7c:54:2f:1e:2d:96:79:eb:7f:95.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'master' (RSA) to the list of known hosts.
Linux master 2.6.20-16-386 #2 Thu Jun 7 20:16:13 UTC 2007 i686
...
Srinu@master:~$

And from master to slave.


Srinu@master:~$ ssh slave
The authenticity of host 'slave (192.168.0.2)' can't be established.
RSA key fingerprint is 74:d7:61:86:db:86:8f:31:90:9c:68:b0:13:88:52:72.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'slave' (RSA) to the list of known hosts.
Ubuntu 10.04
...
Srinu@slave:~$

Hadoop Cluster
We will see how to configure one Ubuntu box as a master node and the other Ubuntu
box as a slave node. The master node will also act as a slave because we only have two
machines available in our cluster but still want to spread data storage and processing to
multiple machines.

The master node will run the master daemons for each layer: the NameNode for the HDFS storage layer and the JobTracker for the MapReduce processing layer. Both machines will run the slave daemons: a DataNode for the HDFS layer and a TaskTracker for the MapReduce processing layer. Basically, the master daemons are responsible for coordination and management of the slave daemons, while the latter do the actual data storage and data processing work.

Configuration

conf/masters (master only)

Despite its name, the conf/masters file defines on which machines Hadoop will start Secondary NameNodes in our multi-node cluster. In our case, this is just the master machine. The primary NameNode and the JobTracker will always be the machines on which you run the bin/start-dfs.sh and bin/start-mapred.sh scripts, respectively (the primary NameNode and the JobTracker will be started on the same machine if you run bin/start-all.sh).
To start a daemon individually:
bin/hadoop-daemon.sh start [namenode | secondarynamenode | datanode | jobtracker | tasktracker]

Again, the machine on which bin/start-dfs.sh is run will become the primary NameNode.
On master, update conf/masters so that it looks like this:

conf/masters (on master)

master

conf/slaves (master only)

The conf/slaves file lists the hosts, one per line, where the Hadoop slave daemons (DataNodes and TaskTrackers) will run. We want both the master box and the slave box to act as Hadoop slaves because we want both of them to store and process data.
On master, update conf/slaves so that it looks like this:

conf/slaves (on master)


master
slave

The conf/slaves file on master is used only by scripts such as bin/start-dfs.sh or bin/stop-dfs.sh. For example, if you want to add DataNodes on the fly, you can manually start the DataNode daemon on a new slave machine via bin/hadoop-daemon.sh start datanode (see the sketch below). Using the conf/slaves file on the master simply helps you to make full cluster restarts easier.
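A minimal sketch of adding a new slave on the fly, run on the new machine itself (the hostname newslave is just a placeholder, and Hadoop is assumed to be installed and configured there already):

Srinu@newslave:/usr/local/hadoop$ bin/hadoop-daemon.sh start datanode
Srinu@newslave:/usr/local/hadoop$ bin/hadoop-daemon.sh start tasktracker

Adding the new hostname to conf/slaves on the master is then only needed so that future start-dfs.sh / start-mapred.sh runs include it.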

conf/*-site.xml (all machines)

We must change the configuration files conf/core-site.xml, conf/mapred-site.xml and conf/hdfs-site.xml on ALL machines as follows.
First, we have to change the fs.default.name parameter (in conf/core-site.xml), which specifies the NameNode (the HDFS master) host and port. In our case, this is the master machine.

conf/core-site.xml (on all machines)

<property>

<name>fs.default.name</name>
<value>hdfs://master:54310</value>
<description>The name of the default file system. A URI whose
scheme and authority determine the FileSystem implementation. The
uri's scheme determines the config property (fs.SCHEME.impl) naming
the FileSystem implementation class. The uri's authority is used to
determine the host, port, etc. for a filesystem.</description>
</property>

Second, we have to change the mapred.job.tracker parameter (in conf/mapred-site.xml), which specifies the JobTracker (MapReduce master) host and port. Again, this is the master in our case.

conf/mapred-site.xml (on all machines)

<property>
<name>mapred.job.tracker</name>
<value>master:54311</value>
<description>The host and port that the MapReduce job tracker runs
at. If "local", then jobs are run in-process as a single map
and reduce task.
</description>
</property>

Third, we change the dfs.replication parameter (in conf/hdfs-site.xml), which specifies the default block replication. It defines how many machines a single file should be replicated to before it becomes available. If you set this to a value higher than the number of available slave nodes (more precisely, the number of DataNodes), you will start seeing a lot of "(Zero targets found, forbidden1.size=1)" type errors in the log files.

The default value of dfs.replication is 3. However, we have only two nodes available, so
we set dfs.replication to 2.

conf/hdfs-site.xml (on all machines)

<property>
<name>dfs.replication</name>
<value>2</value>
<description>Default block replication.
The actual number of replications can be specified when the file is created.
The default is used if replication is not specified in create time.
</description>
</property>

Formatting the NameNode

To format the filesystem (which simply initializes the directory specified by the dfs.name.dir variable on the NameNode), run the command:

Srinu@master:/usr/local/hadoop$ bin/hadoop namenode -format


... INFO dfs.Storage: Storage directory /app/hadoop/tmp/dfs/name has been successfully
formatted.
Srinu@master:/usr/local/hadoop$

The HDFS name table is stored on the NameNode's (here: master) local filesystem in the directory specified by dfs.name.dir. The name table is used by the NameNode to store tracking and coordination information for the DataNodes.

Starting the multi-node cluster

Starting the cluster is performed in two steps.
1. We begin by starting the HDFS daemons: the NameNode daemon is started on master, and DataNode daemons are started on all slaves (here: master and slave).
2. Then we start the MapReduce daemons: the JobTracker is started on master, and TaskTracker daemons are started on all slaves (here: master and slave).

HDFS daemons
Run the command bin/start-dfs.sh on the machine you want the (primary) NameNode to run on. This will bring up HDFS with the NameNode running on the machine you ran the command on, and DataNodes on the machines listed in the conf/slaves file. In our case, we will run bin/start-dfs.sh on master:
Java processes running on master after bin/start-dfs.sh
Srinu@master:/usr/local/hadoop$ jps
14799 NameNode
15314 Jps
14880 DataNode
14977 SecondaryNameNode
Srinu@master:/usr/local/hadoop$

Java processes running on the slave after bin/start-dfs.sh


Srinu@slave:/usr/local/hadoop$ jps
15183 DataNode
15616 Jps
Srinu@slave:/usr/local/hadoop$

MapReduce daemons
Run the command bin/start-mapred.sh on the machine you want the JobTracker to run on (in our case, master). This will bring up the MapReduce layer with the JobTracker running on that machine and TaskTrackers on the machines listed in the conf/slaves file.

Java processes running on master after bin/start-mapred.sh


Srinu@master:/usr/local/hadoop$ jps
16017 Jps
14799 NameNode
15686 TaskTracker
14880 DataNode
15596 JobTracker
14977 SecondaryNameNode
Srinu@master:/usr/local/hadoop$

Java processes running on the slave after bin/start-mapred.sh

Srinu@slave:/usr/local/hadoop$ jps
15183 DataNode
15897 TaskTracker
16284 Jps
Srinu@slave:/usr/local/hadoop$

Commissioning and Decommissioning Nodes in a Hadoop Cluster:

One of the most attractive features of the Hadoop framework is its use of commodity hardware. However, this leads to frequent DataNode crashes in a Hadoop cluster. Another striking feature of the Hadoop framework is the ease with which it scales with the rapid growth in data volume. Because of these two reasons, one of the most common tasks of a Hadoop administrator is to commission (add) and decommission (remove) DataNodes in a Hadoop cluster.

The following sections walk through the step-by-step process to decommission a DataNode in the cluster.

The first task is to update the exclude files for both HDFS (hdfs-site.xml) and Map
Reduce (mapred-site.xml).
The exclude file:

For the JobTracker, it contains the list of hosts that should be excluded by the JobTracker. If the value is empty, no hosts are excluded.
For the NameNode, it contains the list of hosts that are not permitted to connect to the NameNode.

Here is the sample configuration for the exclude file in hdfs-site.xml and
mapred-site.xml:
hdfs-site.xml
<property>
<name>dfs.hosts.exclude</name>
<value>/home/hadoop/excludes</value>
<final>true</final>
</property>

mapred-site.xml
<property>
<name>mapred.hosts.exclude</name>
<value>/home/hadoop/excludes</value>
<final>true</final>
</property>
Note: The full pathname of the files must be specified.
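After editing the exclude file, the masters only pick up the change when they are told to re-read their host lists. A sketch of the usual Hadoop 1.x sequence, using the exclude path configured above and slave2.in as the host to remove:

$ echo "slave2.in" >> /home/hadoop/excludes
$ hadoop dfsadmin -refreshNodes
$ hadoop mradmin -refreshNodes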

Removing a DataNode from the Hadoop Cluster


We can remove a node from a cluster on the fly, while it is running, without any data
loss. HDFS provides a decommissioning feature, which ensures that removing a node
is performed safely. To use it, follow the steps as given below:

Step 1: Login to master


Login to master machine user where Hadoop is installed.
$ su hadoop

Step 2: Change cluster configuration


An exclude file must be configured before starting the cluster. Add a key named
dfs.hosts.exclude to our $HADOOP_HOME/etc/hadoop/hdfs-site.xml file. The
value associated with this key provides the full path to a file on the NameNode's local
file system which contains a list of machines which are not permitted to connect to
HDFS.
For example, add these lines to etc/hadoop/hdfs-site.xml file.
<property>
<name>dfs.hosts.exclude</name>
<value>/home/hadoop/hadoop-1.2.1/hdfs_exclude.txt</value>
<description>DFS exclude</description>
</property>

Step 3: Determine hosts to decommission


Each machine to be decommissioned should be added to the file identified by dfs.hosts.exclude (here hdfs_exclude.txt), one domain name per line. This will prevent them from connecting to the NameNode. The content of the /home/hadoop/hadoop-1.2.1/hdfs_exclude.txt file is shown below.
slave2.in

Step 4: Force configuration reload


Run the command $HADOOP_HOME/bin/hadoop dfsadmin -refreshNodes:

$ $HADOOP_HOME/bin/hadoop dfsadmin -refreshNodes

This will force the NameNode to re-read its configuration, including the newly updated excludes file. It will decommission the nodes over a period of time, allowing time for each node's blocks to be replicated onto machines which are scheduled to remain active.
On slave2.in, check the jps command output. After some time, you will see that the DataNode process has shut down automatically.

Step 5: Shutdown nodes


After the decommission process has been completed, the decommissioned hardware can be safely shut down for maintenance. Run the dfsadmin report command to check the status of the decommission. The following command will describe the status of the decommissioned node and of the nodes connected to the cluster.

$ $HADOOP_HOME/bin/hadoop dfsadmin -report
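The per-node section of the report contains a Decommission Status field. A quick way to watch just that field (a sketch; the surrounding lines show the node name):

$ $HADOOP_HOME/bin/hadoop dfsadmin -report | grep -B 2 "Decommission Status"

A node moves from "Decommission in progress" to "Decommissioned" once all of its blocks have been re-replicated elsewhere.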

Step 6: Edit the excludes file again

Once the machines have been decommissioned, they can be removed from the excludes file. Running $HADOOP_HOME/bin/hadoop dfsadmin -refreshNodes again will read the excludes file back into the NameNode, allowing the DataNodes to rejoin the cluster after the maintenance has been completed, or when additional capacity is needed in the cluster again.
Special Note: If the above process is followed and the TaskTracker process is still running on the node, it needs to be shut down. One way is to disconnect the machine as we did in the above steps; the master will recognize it automatically and declare the process dead. There is no need to follow the same decommission process for the TaskTracker, because it is not as crucial as the DataNode: the DataNode holds the data that you want to remove safely without any loss.
The TaskTracker can be stopped or started on the fly with the following command at any point of time.
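A sketch of the usual way to do this with the Hadoop 1.x daemon scripts, run on the node in question:

$ $HADOOP_HOME/bin/hadoop-daemon.sh stop tasktracker
$ $HADOOP_HOME/bin/hadoop-daemon.sh start tasktracker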

Challenges of Running a Hadoop Cluster:

The main challenge in running a Hadoop cluster comes from maintenance itself. Below are some of the common problems faced every day.
1. Replacing/upgrading hard drives. You have to be careful while scaling.
2. Commissioning and decommissioning nodes is fairly simple, but still keep an eye on the *-site.xml files.
3. Performance issues appear as your data grows. They can be at the network, I/O, disk or application level. It is better to have monitoring tools over your cluster to detect any upcoming issues.
4. Memory: it is always better to estimate the scope of your data and choose an appropriate replication factor. It is not pleasant when MapReduce jobs fail simply because of insufficient memory.
5. There are open-source tools like Nagios and Ganglia, or enterprise solutions like Bright Computing, that provide better monitoring capabilities. You have to research them.
6. Once again, it is better to have a clear understanding of each and every parameter of mapred-site.xml and hdfs-site.xml. If you want to performance-tune and configure the cluster, understanding how each parameter in mapred-default.xml and core-default.xml impacts your jobs is critical. There have been changes in the names of properties across releases, and not taking them into account is a common mistake for people new to the domain. Failing hardware (mainly disks) is very common.
7. Typically, in a large Hadoop cluster the main maintenance job is replacing hardware, especially hard drives. Software management is not much work after the initial setup; a reboot resolves most of the problems.

Largest Clusters in the World

The largest publicly known Hadoop clusters are Yahoo!'s 4000-node cluster, followed by Facebook's 2300-node cluster [1].
Yahoo! has lots of Hadoop nodes, but they are organized under different clusters and are used for different purposes (a significant number of these clusters are research clusters). The current JobTracker and NameNode actually don't scale that well to that many nodes (they have lots of concurrency issues).
eBay apparently has a large cluster: "Amr Awadallah said eBay have the third largest Hadoop cluster in existence, holding a few petabytes of data, and move data between it and a traditional data warehouse."

System Log Files.


Apache Hadoop's jobtracker, namenode, secondary namenode, datanode, and tasktracker all generate logs. That includes logs from each of the daemons under normal operation, as well as configuration logs, statistics, standard error, standard out, and internal diagnostic information. Many users aren't entirely sure what the differences are among these logs, how to analyze them, or even how to handle simple administrative tasks like log rotation.
The log categories are:

Hadoop Daemon Logs

These logs are created by the Hadoop daemons and exist on all machines running at least one Hadoop daemon. Some of the files end with .log, and others end with .out. The .out files are only written to when daemons are starting. After the daemons have started successfully, the .out files are truncated. By contrast, all log messages can be found in the .log files, including the daemon start-up messages that are sent to the .out files.
There is a .log and a .out file for each daemon running on a machine. When the namenode, jobtracker, and secondary namenode are running on the same machine, there are six daemon log files: a .log and a .out for each of the three daemons.
The .log and .out file names are constructed as follows:
hadoop-<user-running-hadoop>-<daemon>-<hostname>.log
where <user-running-hadoop> is the user running the Hadoop daemons, <daemon> is the daemon these logs are associated with (for example, namenode or jobtracker), and <hostname> is the hostname of the machine on which the daemons are running.
For example:
hadoop-hadoop-datanode-ip-10-251-30-53.log
By default, the .log files are rotated daily by log4j. This is configurable with /etc/hadoop/conf/log4j.properties. Administrators of a Hadoop cluster should review these logs regularly to look for cluster-specific errors and warnings that might indicate daemons running incorrectly. Note that the namenode and secondarynamenode logs should not be deleted more frequently than fs.checkpoint.period, so that in the event of a secondarynamenode edits log compaction failure, logs from the namenode and secondarynamenode are still available for diagnostics.
These logs grow slowly when the cluster is idle. When jobs are running, they grow very rapidly. Some problems create considerably more log entries, while other problems create only a few infrequent messages. For example, if the jobtracker can't connect to the namenode, the jobtracker daemon logs explode with the same error (something like "Retrying connecting to namenode [...]"). Lots of log entries do not necessarily mean that there is a problem: you have to search through these logs to look for one.
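A minimal sketch of the relevant rotation settings in /etc/hadoop/conf/log4j.properties; these appender lines follow the stock Hadoop layout, and DatePattern controls how often a new file is rolled:

# Daily Rolling File Appender used for the daemon .log files
log4j.appender.DRFA=org.apache.log4j.DailyRollingFileAppender
log4j.appender.DRFA.File=${hadoop.log.dir}/${hadoop.log.file}
# Roll the log over at midnight every day
log4j.appender.DRFA.DatePattern=.yyyy-MM-dd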

Job Configuration XML

The job configuration XML logs are created by the jobtracker. The jobtracker creates an .xml file for every job that runs on the cluster. These logs are stored in two places: /var/log/hadoop and /var/log/hadoop/history. The XML file describes the job configuration.
The file names are constructed as follows:
job_<job-id>_conf.xml

Job Statistics

These logs are created by the jobtracker. The jobtracker writes runtime statistics from jobs to these files. Those statistics include task attempts, time spent shuffling, input splits given to task attempts, start times of task attempts and other information.

The statistics files are named:

<hostname>_<epoch-of-jobtracker-start>_<job-id>_<job-name>

Standard Error

These logs are created by each tasktracker. They contain the information written to standard error (stderr) when a task attempt runs. These logs can be used for debugging: for example, a developer can include System.err.println("some useful information") calls in the job code, and the output will appear in the standard error files.
The parent directory name for these logs is constructed as follows:
/var/log/hadoop/userlogs/attempt_<job-id>_<map-or-reduce>_<attempt-id>
where <job-id> is the ID of the job that this attempt is doing work for, <map-or-reduce> is either m if the task attempt was a mapper or r if the task attempt was a reducer, and <attempt-id> is the ID of the task attempt.
For example:
/var/log/hadoop/userlogs/attempt_200908190029_001_m_00001_0
These logs are rotated according to the mapred.userlog.retain.hours property. You can clear these logs periodically without affecting Hadoop. However, consider archiving the logs if they are of interest in the job development process. Make sure you do not move or delete a file that is being written to by a running job.
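As a sketch, the retention window can be adjusted in conf/mapred-site.xml (24 hours is the usual default):

<property>
<name>mapred.userlog.retain.hours</name>
<value>24</value>
<description>How long, in hours, task user-logs are retained after job completion.</description>
</property>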

Hadoop Admin Commands:

1. HADOOP NAMENODE COMMANDS

hadoop namenode -format : Format the HDFS filesystem from the NameNode
hadoop namenode -upgrade : Upgrade the NameNode
start-dfs.sh : Start the HDFS daemons
stop-dfs.sh : Stop the HDFS daemons
start-mapred.sh : Start the MapReduce daemons
stop-mapred.sh : Stop the MapReduce daemons
hadoop namenode -recover -force : Recover NameNode metadata after a cluster failure (may lose data)

2. HADOOP FSCK COMMANDS

hadoop fsck / : Filesystem check on HDFS
hadoop fsck / -files : Display files during the check
hadoop fsck / -files -blocks : Display files and blocks during the check
hadoop fsck / -files -blocks -locations : Display files, blocks and their locations during the check
hadoop fsck / -files -blocks -locations -racks : Display the network topology for DataNode locations
hadoop fsck -delete : Delete corrupted files
hadoop fsck -move : Move corrupted files to the /lost+found directory

3. HADOOP JOB COMMANDS

hadoop job -submit <job-file> : Submit the job
hadoop job -status <job-id> : Print the job status and completion percentage
hadoop job -list all : List all jobs
hadoop job -list-active-trackers : List all available TaskTrackers
hadoop job -set-priority <job-id> <priority> : Set the priority of a job. Valid priorities: VERY_HIGH, HIGH, NORMAL, LOW, VERY_LOW
hadoop job -kill-task <task-id> : Kill a task
hadoop job -history : Display job history including job details, failed and killed jobs
4. HADOOP DFSADMIN COMMANDS

hadoop dfsadmin -report : Report filesystem info and statistics
hadoop dfsadmin -metasave file.txt : Save the namenode's primary data structures to file.txt
hadoop dfsadmin -setQuota 10 /quotatest : Set a quota of 10 files/directories on the directory /quotatest
hadoop dfsadmin -clrQuota /quotatest : Clear the quota on the directory /quotatest
hadoop dfsadmin -refreshNodes : Re-read the hosts and exclude files to update which datanodes are allowed to connect to the namenode. Mostly used to commission or decommission nodes
hadoop fs -count -q /mydir : Check the quota usage on the directory /mydir
hadoop dfsadmin -setSpaceQuota 100M /mydir : Set a 100 MB space quota on the HDFS directory /mydir
hadoop dfsadmin -clrSpaceQuota /mydir : Clear the space quota on an HDFS directory
hadoop dfsadmin -saveNamespace : Back up the metadata (fsimage and edits). Put the cluster in safe mode before running this command
5. HADOOP SAFEMODE COMMANDS

The following dfsadmin commands help the cluster to enter or leave safe mode, which is also called maintenance mode. In this mode, the Namenode does not accept any changes to the namespace and does not replicate or delete blocks.

hadoop dfsadmin -safemode enter : Enter safe mode
hadoop dfsadmin -safemode leave : Leave safe mode
hadoop dfsadmin -safemode get : Get the safe mode status
hadoop dfsadmin -safemode wait : Wait until HDFS finishes data block replication
6. HADOOP CONFIGURATION FILES

hadoop-env.sh : Sets environment variables for Hadoop
core-site.xml : Parameters for the entire Hadoop cluster
hdfs-site.xml : Parameters for HDFS and its clients
mapred-site.xml : Parameters for MapReduce and its clients
masters : Host machines for the Secondary NameNode
slaves : List of slave hosts

7. HADOOP MRADMIN COMMANDS

hadoop mradmin -safemode get : Check the JobTracker status
hadoop mradmin -refreshQueues : Reload the MapReduce queue configuration
hadoop mradmin -refreshNodes : Reload the active TaskTrackers
hadoop mradmin -refreshServiceAcl : Force the JobTracker to reload the service ACLs
hadoop mradmin -refreshUserToGroupsMappings : Force the JobTracker to reload user-to-group mappings

8. HADOOP BALANCER COMMANDS

start-balancer.sh : Balance the cluster
hadoop dfsadmin -setBalancerBandwidth <bandwidth-in-bytes> : Adjust the bandwidth used by the balancer
hadoop balancer -threshold 20 : Balance only until each node's disk usage is within 20% of the cluster average