Saturday, 19 May 2012

HADOOP INSTALLATION

Hi All,

This post walks through a Hadoop installation and cluster configuration.


HADOOP INSTALLATION AND CONFIGURATION:-
Step 1 :- Go to the Cloudera downloads site and download the hadoop-0.20.2-cdh3u3.tar file, along with other components such as PIG and HIVE.
Step 2 :- Untar the Hadoop tarball and set the installation directory as HADOOP_HOME.
Step 3 :- Hadoop can be configured in three different modes:
1)     Standalone (single node)
2)     Pseudo-distributed
3)     Fully distributed (clustered)
Here we configure a clustered environment. How we set up the cluster depends on the number of nodes we have.
Concepts to know before this cluster set-up :-
Terminology:-
1)     Name-Node (Master)
2)     Secondary Name-Node
3)     Job-Tracker
4)     Task-Tracker
5)     Data-Node
We are configuring only two nodes here, so always remember: the Name-Node (master) and Job-Tracker will run on one node, and the Task-Tracker and Data-Node will run on the other. The Secondary Name-Node is not a hot back-up for the Name-Node; it stores checkpoints of the namespace, from which we can recover if something happens to the Name-Node.
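With only two machines, one possible daemon layout (the hostnames here are assumptions, not part of the original set-up) can be sketched as:

```
master (node 1): Name-Node, Job-Tracker, Secondary Name-Node
slave  (node 2): Data-Node, Task-Tracker
```

As the note below explains, on larger clusters the Secondary Name-Node is better placed on its own machine.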
Note :-
========
The secondary name-node can be run on the same machine as the name-node, but for reasons of memory usage (the secondary has the same memory requirements as the primary), it is best to run it on a separate piece of hardware, especially for larger clusters. (This topic is discussed in more detail in “Master node scenarios” on page 254.) Machines running the name-nodes should typically be 64-bit hardware, to avoid the 3 GB limit on Java heap size in 32-bit architectures.



Steps to configure the HADOOP FILE SYSTEM (CDH3) :-
First Step :-
======
1)     SSH CONFIGURATION  FROM THE MASTER NODE TO SLAVE NODES.

[hadoop-user@master]$ which ssh
/usr/bin/ssh
[hadoop-user@master]$ which sshd
/usr/bin/sshd
[hadoop-user@master]$ which ssh-keygen
/usr/bin/ssh-keygen
[hadoop-user@master]$ ssh-keygen -t rsa
Generating  public/private rsa key pair.
Enter file in which to save the key (/home/hadoop-user/.ssh/id_rsa):
Enter passphrase (empty for no passphrase):
Enter same passphrase again
Your identification has been saved in /home/hadoop-user/.ssh/id_rsa.
Your public key has been saved in /home/hadoop-user/.ssh/id_rsa.pub.
After creating your key pair, your public key will be of the form:
[hadoop-user@master]$ more /home/hadoop-user/.ssh/id_rsa.pub
ssh-rsa AAAAB3NzaC1yc2EAAAABIwAAAQEA1WS3RG8LrZH4zL2/1oYgkV1OmVclQ2OO5vRi0Nd
K51Sy3wWpBVHx82F3x3ddoZQjBK3uvLMaDhXvncJG31JPfU7CTAfmtgINYv0kdUbDJq4TKG/fuO5q
J9CqHV71thN2M310gcJ0Y9YCN6grmsiWb2iMcXpy2pqg8UM3ZKApyIPx99O1vREWm+4moFTg
YwIl5be23ZCyxNjgZFWk5MRlT1p1TxB68jqNbPQtU7fIafS7Sasy7h4eyIy7cbLh8x0/V4/mcQsY
5dvReitNvFVte6onl8YdmnMpAh6nwCvog3UeWWJjVZTEBFkTZuV1i9HeYHxpm1wAzcnf7az78jT
IRQ== hadoop-user@master




Distribute public key and validate logins :-
[hadoop-user@master]$ scp ~/.ssh/id_rsa.pub hadoop-user@target:~/master_key
[hadoop-user@target]$ mkdir ~/.ssh
[hadoop-user@target]$ chmod 700 ~/.ssh
[hadoop-user@target]$ mv ~/master_key ~/.ssh/authorized_keys
[hadoop-user@target]$ chmod 600 ~/.ssh/authorized_keys
After generating the key, you can verify it’s correctly defined by attempting to log in to
the target node from the master:
[hadoop-user@master]$ ssh target
The authenticity of host 'target (xxx.xxx.xxx.xxx)' can’t be established.
RSA key fingerprint is 72:31:d8:1b:11:36:43:52:56:11:77:a4:ec:82:03:1d.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'target' (RSA) to the list of known hosts.
Last login: Sun Jan 4 15:32:22 2009 from master
After confirming the authenticity of a target node to the master node, you won’t be
prompted upon subsequent login attempts.
[hadoop-user@master]$ ssh target
Last login: Sun Jan 4 15:32:49 2009 from master
Configuring the Configuration Files for the Cluster Set-up
Note :-
========
When we untar the tar file, the CONF directory contains only the default configuration files: Hadoop-default.xml, hdfs-default.xml, and mapred-default.xml.
We do not edit these defaults directly; instead we put our own settings in the site-specific files: Hadoop-site.xml (core-site.xml), mapred-site.xml, and hdfs-site.xml.
We need to configure a few things before running Hadoop. Let’s take a closer look at
the Hadoop configuration directory :
[hadoop-user@master]$ cd $HADOOP_HOME
[hadoop-user@master]$ ls -l conf/

-rw-rw-r-- 1 hadoop-user hadoop 2065 Dec 1 10:07 capacity-scheduler.xml
-rw-rw-r-- 1 hadoop-user hadoop 535 Dec 1 10:07 configuration.xsl
-rw-rw-r-- 1 hadoop-user hadoop 49456 Dec 1 10:07 hadoop-default.xml
-rwxrwxr-x 1 hadoop-user hadoop 2314 Jan 8 17:01 hadoop-env.sh
-rw-rw-r-- 1 hadoop-user hadoop 2234 Jan 2 15:29 hadoop-site.xml
-rw-rw-r-- 1 hadoop-user hadoop 2815 Dec 1 10:07 log4j.properties
-rw-rw-r-- 1 hadoop-user hadoop 28 Jan 2 15:29 masters
-rw-rw-r-- 1 hadoop-user hadoop 84 Jan 2 15:29 slaves
-rw-rw-r-- 1 hadoop-user hadoop 401 Dec 1 10:07 sslinfo.xml.example

Below is the main set-up that belongs in the *.xml configuration files (core-site.xml, mapred-site.xml, and hdfs-site.xml).
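A minimal sketch of typical values for these files, assuming the Name-Node and Job-Tracker both run on a host named master on the standard CDH3 ports (adjust hostnames, ports, and the replication factor for your own cluster):

```xml
<!-- core-site.xml : where the Name-Node lives (hostname "master" is an assumption) -->
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://master:9000</value>
  </property>
</configuration>

<!-- mapred-site.xml : where the Job-Tracker lives -->
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>master:9001</value>
  </property>
</configuration>

<!-- hdfs-site.xml : HDFS replication factor -->
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>
</configuration>
```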
The key differences are:
1)     We explicitly state the hostname for the location of the Name-Node and Job-Tracker daemons.
2)     We increase the HDFS replication factor to take advantage of distributed storage. Recall that data is replicated across HDFS to increase availability and reliability.
We also need to update the masters and slaves files to reflect the locations of the other
daemons.
[hadoop-user@master]$ cat masters
backup
[hadoop-user@master]$ cat slaves
hadoop1     (change these entries according to your hostnames)
hadoop2
hadoop3

After this, run $HADOOP_HOME/bin/start-all.sh.
It will start all the daemons on the master and slave nodes.

Note :- By default the configuration file is Hadoop-default.xml. If you want to make changes, copy the existing Hadoop defaults into Hadoop-site.xml and edit the copy.

Ok. So after configuring all the above, we are ready to bring up the services.
Now export HADOOP_HOME/bin to the PATH:
export PATH=$HADOOP_HOME/bin:$PATH
Bringing up the services :- start-all.sh
Once we issue this command, it will bring up the services on the nodes mentioned in the masters and slaves files.
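The steps above can be sketched as a short shell sequence. This assumes HADOOP_HOME is already set; note that a brand-new cluster also needs a one-time namenode format before the very first start (a standard 0.20-era step, shown here as a hedged comment because it erases any existing HDFS data):

```shell
# Put Hadoop's scripts on the PATH (assumes HADOOP_HOME points at the untarred tree)
export PATH="$HADOOP_HOME/bin:$PATH"

# One-time only, on the master, before the very first start -- this wipes HDFS:
#   hadoop namenode -format

# Start the HDFS and MapReduce daemons on the hosts listed in conf/masters and conf/slaves
#   start-all.sh
```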
How can we validate :-
Once all the services are up, we can check them through the web interface URLs.
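In Hadoop 0.20 / CDH3 the web interfaces are served on well-known default ports; a sketch of where to look, assuming the Name-Node and Job-Tracker run on a host named master (substitute your own hostname):

```shell
# Default web UI ports in Hadoop 0.20 (hostname "master" is an assumption):
echo "Name-Node  web UI : http://master:50070/"
echo "Job-Tracker web UI: http://master:50030/"
# On each node, running 'jps' should also list the expected daemons
# (NameNode, JobTracker, SecondaryNameNode on the master;
#  DataNode, TaskTracker on the slaves).
```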

Regards,
Naga.
