Deployment of Apache Hadoop 2.7.0 on Ubuntu Vivid 15.04 on Azure Linux VM

Recently, on april 21st, the first release of 2015 of Apache Hadoop is committed, the version 2.7.0 is came up as dev edition. Lots of new updates have added.

  • This release drops support for JDK6 runtime & works with JDK 7+ only.
  • This release is yet not ready for production use. But, production users should wait for 2.7.1/2.7.2 release.

In Hadoop common, first time it has got support for Azure Blob storage – blob as file system for Azure.

Other than that, Hadoop HDFS has got support for file truncate, support for quotas per storage type & support for files with variable-length blocks. For Yarn & MapReduce, some of the new pluggins are added like Yarn authorization made pluggable, global caching of YARN localized resources, ability to limit running MapReduce of a job.

Here goes a step by step guide on installation of Apache Hadoop 2.7.0 on Azure Linux Virtual Machine(Ubuntu 15.04) .




Hadoop Administration: Installation scripts of Apache Hadoop (2.6.0) on Ubuntu Unicorn as Multi-Node cluster

Recently , just published a quick step by step guide on deployment of Apache Hadoop (2.6.0) single -node cluster on Ubuntu unicorn(14.10) image, you can get the full installation video here.


Here, the full deployment of Apache Hadoop (2.6.0) multi-node cluster setup details are provided. The primary hardward requirements are needed to run the setup :

1. VMware Player/Workstation(if Windows/Linux) or VMware Fusion(if OSX)

2. More than 4 GB of RAM for primary OS

3. More than 60 GB of Disk space

4. Intel VT-X capable processor.

5. Ubuntu/CentOs/Red Hat/Sese OS Image(as guest OS)

Now, the step by step multinode hadoop clustering  scripts are provided.


Checkout the Ipaddress of each master & slaves node:


Namenode > hadoopmaster >

Datanodes > hadoopslave1 >
hadoopslave2 >
hadoopslave3 >

Clone Hadoop Single node cluster as hadoopmaster

Hadoopmaster Node

$ sudo gedit /etc/hosts


$ sudo gedit /etc/hostname


$ cd /usr/local/hadoop/etc/hadoop

$ sudo gedit core-site.xml

replace localhost as hadoopmaster

$ sudo gedit hdfs-site.xml

replace value as 3 (represents no of datanode)

          $ sudo gedit yarn-site.xml

add the following configuration


$ sudo gedit mapred-site.xml.template
replace as mapred.job.tracker

replace yarn as hadoopmaster:54311

$ sudo rm -rf /usr/local/hadoop/hadoop_data

Shutdown hadoopmaster node

Clone Hadoopmaster Node as hadoopslave1, hadoopslave2, hadoopslave3

Hadoopslave Node (conf should be done on each slavenode)

$ sudo gedit /etc/hostname


          $ sudo mkdir -p /usr/local/hadoop/hadoop_data/hdfs/datanode

$ sudo chown -R trainer:trainer /usr/local/hadoop

          $ sudo gedit /usr/local/hadoop/etc/hadoop/hdfs-site.xml

remove dfs.namenode.dir property section

reboot all nodes

Hadoopmaster Node

          $ sudo gedit /usr/local/hadoop/etc/hadoop/masters


$ sudo gedit /usr/local/hadoop/etc/hadoop/slaves

remove localhost and add


$ sudo gedit /usr/local/hadoop/etc/hadoop/hdfs-site.xml

   remove dfs.datanode.dir property section

          $ sudo mkdir -p /usr/local/hadoop/hadoop_data/hdfs/namenode

$ sudo chown -R trainer:trainer /usr/local/hadoop

$ sudo ssh-copy-id -i ~/.ssh/ trainer@hadoopmaster

$ sudo ssh-copy-id -i ~/.ssh/ trainer@hadoopslave1

$ sudo ssh-copy-id -i ~/.ssh/ trainer@hadoopslave2

$ sudo ssh-copy-id -i ~/.ssh/ trainer@hadoopslave3

$ ssh hadoopmaster

$ exit

$ ssh hadoopslave1

$ exit

$  ssh hadoopslave2

$ exit

$ ssh hadoopslave3

$ exit

$ hadoop namenode -format


$ jps (check in all 3 datanodes)

for checking Hadoop web console :

http://hadoopmasteripaddress :8088/
http://hadoopmasteripaddress :50070/
http://hadoopmasteripaddress :50090/

http://hadoopmasteripaddress  :50075/


%d bloggers like this: