Installation Commands of Apache Hadoop 2.6.0 as Single Node Pseudo-Distributed mode on Ubuntu 14.10 (Step by Step)


$ sudo apt-get update

$ sudo apt-get install default-jdk

$ java -version

$ sudo apt-get install ssh

$ sudo apt-get install rsync

$ ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa

$ cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys

$ wget -c http://mirror.olnevhost.net/pub/apache/hadoop/common/current/hadoop-2.6.0.tar.gz

$ sudo tar -zxvf hadoop-2.6.0.tar.gz

$ sudo mv hadoop-2.6.0 /usr/local/hadoop

$ update-alternatives --config java

$ gedit ~/.bashrc

Append the following lines to the end of the file.

#Hadoop Variables
export JAVA_HOME=/usr/lib/jvm/java-7-openjdk-amd64
export HADOOP_HOME=/usr/local/hadoop
export PATH=$PATH:$HADOOP_HOME/bin
export PATH=$PATH:$HADOOP_HOME/sbin
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export YARN_HOME=$HADOOP_HOME
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib"

Now apply the variables.

$ source ~/.bashrc
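
Before moving on, it can help to confirm that the variables really reached the current shell. The following is a minimal sketch, not part of the original steps; the function name missing_vars is only illustrative. It prints the name of every listed variable that is unset or empty, so no output means all of them are set.

```shell
# Print the name of every listed variable that is unset or empty.
# missing_vars is a hypothetical helper name, not a Hadoop command.
missing_vars() {
  for name in "$@"; do
    # Indirect lookup that also works in plain POSIX sh.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "$name"
    fi
  done
}

# Check the variables exported in ~/.bashrc above.
missing_vars JAVA_HOME HADOOP_HOME HADOOP_MAPRED_HOME HADOOP_COMMON_HOME \
  HADOOP_HDFS_HOME YARN_HOME HADOOP_COMMON_LIB_NATIVE_DIR
```

If a name is printed, re-open ~/.bashrc, check the spelling of that export line, and run the source command again.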

Several configuration files in the Hadoop folder need editing:

  • mapred-site.xml
  • yarn-site.xml
  • core-site.xml
  • hdfs-site.xml
  • hadoop-env.sh

The files can be found in /usr/local/hadoop/etc/hadoop/. First copy the mapred-site.xml.template file to mapred-site.xml, then edit the copy.

mapred-site.xml

Next, go to the following path.

$ cd /usr/local/hadoop/etc/hadoop
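
The copy of the mapred-site template could be done roughly as follows. This is a sketch under the assumption that the configuration directory contains mapred-site.xml.template, as the Hadoop 2.6.0 tarball does; the helper name copy_mapred_template is made up for illustration. It refuses to overwrite an existing mapred-site.xml.

```shell
# Copy the shipped template to mapred-site.xml unless that file already exists.
# copy_mapred_template is a hypothetical helper name.
copy_mapred_template() {
  conf_dir=$1
  if [ -f "$conf_dir/mapred-site.xml" ]; then
    echo "mapred-site.xml already exists, leaving it untouched"
  else
    cp "$conf_dir/mapred-site.xml.template" "$conf_dir/mapred-site.xml"
  fi
}

# Run it against the configuration directory used in this guide, if present.
if [ -d /usr/local/hadoop/etc/hadoop ]; then
  copy_mapred_template /usr/local/hadoop/etc/hadoop
fi
```

If /usr/local/hadoop is still owned by root at this point, the copy needs sudo or has to wait until after the chown step.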

Add the following text between the <configuration> tags of mapred-site.xml.

<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>

yarn-site.xml

Add the following text between the <configuration> tags.

<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>

core-site.xml

Add the following text between the <configuration> tags.
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:9000</value>
</property>

hdfs-site.xml

Add the following text between the <configuration> tags.

<property>
<name>dfs.replication</name>
<value>1</value>
</property>

<property>
<name>dfs.name.dir</name>
<value>file:///home/hadoopuser/hadoopspace/hdfs/namenode</value>
</property>

<property>
<name>dfs.data.dir</name>
<value>file:///home/hadoopuser/hadoopspace/hdfs/datanode</value>
</property>

Note that dfs.data.dir accepts more than one storage location; separate the values with commas, e.g.

file:/home/hadoopuser/hadoopspace/hdfs/datanode, .disk2/Hadoop/datanode, . .
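
Once the XML files are saved, the values Hadoop actually picks up can be read back with the hdfs getconf -confKey command. The sketch below assumes hdfs is on the PATH and JAVA_HOME is set as in ~/.bashrc; check_conf is an illustrative helper name, and the expected values are the ones used in this guide.

```shell
# Compare one effective configuration value with the value this guide sets.
# check_conf is a hypothetical helper name.
check_conf() {
  key=$1
  expected=$2
  actual=$(hdfs getconf -confKey "$key" 2>/dev/null)
  if [ "$actual" = "$expected" ]; then
    echo "ok: $key = $actual"
  else
    echo "mismatch: $key = '$actual' (expected '$expected')"
    return 1
  fi
}

# Only run the checks when the hdfs command is available.
if command -v hdfs >/dev/null 2>&1; then
  check_conf fs.default.name hdfs://localhost:9000
  check_conf dfs.replication 1
fi
```

A mismatch usually means the property ended up outside the <configuration> element or was saved in a different file than intended.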

hadoop-env.sh

Add an entry for JAVA_HOME. Use the same JDK path that was set in ~/.bashrc; on Ubuntu with OpenJDK 7 this is:

export JAVA_HOME=/usr/lib/jvm/java-7-openjdk-amd64

$ mkdir -p /home/hadoopuser/hadoopspace/hdfs/namenode

$ mkdir -p /home/hadoopuser/hadoopspace/hdfs/datanode

$ sudo chown hadoopuser:hadoopuser -R /usr/local/hadoop

Next, format the namenode.

$ hdfs namenode -format

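Issue the following commands to start the HDFS and YARN daemons. Both scripts are in /usr/local/hadoop/sbin, which the PATH set earlier already includes.

$ start-dfs.sh

$ start-yarn.sh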

Issue the jps command and verify that the following daemons are running: NameNode, DataNode, SecondaryNameNode, ResourceManager and NodeManager.

$ jps
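
The check can also be scripted. In a pseudo-distributed setup started with start-dfs.sh and start-yarn.sh, the daemons normally expected are NameNode, DataNode, SecondaryNameNode, ResourceManager and NodeManager. The sketch below reads jps output from standard input and prints the names that are missing; missing_daemons is an illustrative name, not a Hadoop tool.

```shell
# Print every expected daemon that does not appear in the jps output on stdin.
# missing_daemons is a hypothetical helper name.
missing_daemons() {
  output=$(cat)
  for daemon in NameNode DataNode SecondaryNameNode ResourceManager NodeManager; do
    # jps prints "<pid> <name>"; match the name column exactly so that
    # SecondaryNameNode is not mistaken for NameNode.
    if ! printf '%s\n' "$output" | awk '{print $2}' | grep -qx "$daemon"; then
      echo "$daemon"
    fi
  done
}

# Only run the check when jps is installed.
if command -v jps >/dev/null 2>&1; then
  jps | missing_daemons
fi
```

No output means all five daemons were found. For any name that is printed, the matching log file under /usr/local/hadoop/logs is the first place to look.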

At this point Hadoop has been installed and configured. To open the web interfaces, type the following in a terminal:

$ firefox http://localhost:50070   # NameNode

$ firefox http://localhost:50075   # DataNode

$ firefox http://localhost:50090   # Secondary (checkpoint) NameNode

$ firefox http://localhost:8088    # YARN cluster
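
On a machine without a desktop browser, the same four ports can be probed from the terminal. This is a rough sketch that assumes curl is installed and that the default Hadoop 2.6.0 web ports listed above are unchanged; probe_ui is an illustrative helper name.

```shell
# Report whether a web UI answers on the given local port.
# probe_ui is a hypothetical helper name.
probe_ui() {
  name=$1
  port=$2
  if curl -s -o /dev/null --max-time 5 "http://localhost:$port/"; then
    echo "$name (port $port): reachable"
  else
    echo "$name (port $port): not reachable"
  fi
}

probe_ui NameNode 50070
probe_ui DataNode 50075
probe_ui SecondaryNameNode 50090
probe_ui ResourceManager 8088
```

A port that is not reachable usually means the corresponding daemon did not start.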


A lap around the latest PowerBI announcements, Socrata OData Feed & Real-Time Fast Streaming Data Analytics


Last month, on 27 February 2015, Microsoft announced several new PowerBI features, so let's take a quick look. First of all, in this release PowerBI steps out from behind Office 365 and Microsoft Office: you can now connect data not only from Excel workbooks and Azure but also from PowerBI Designer files, SendGrid, Salesforce CRM, Microsoft SQL Server Analysis Services and Azure Stream Analytics (private preview).

In the first demo, I collected real-time data from the White House visitor records directory through Socrata's OData feed API at http://open.whitehouse.gov/OData.svc/p86s-ychb, using either Excel -> PowerQuery -> OData Feed or Excel -> Data -> OData Feed.

Next, import the data into a PowerPivot table and build the linked tables that feed the PowerView dashboard.

You can also sign up for the PowerBI public preview dashboard here; note that the preview is currently available only to users in the United States.

The PowerMap tour was built with the latest PowerMap features, including Custom Maps and a rich set of effects. The tour, which analyses the White House visitor records, is available on YouTube.

Upload the Excel PowerView dashboard workbook to the PowerBI public preview portal and you can use the full experience, including Power Q&A, without an Office 365 environment.

The new PowerBI public preview portal offers many options for importing data, such as SQL Server Analysis Services, Excel workbooks, PowerBI Designer files, SendGrid, Salesforce CRM, Microsoft Dynamics, Marketo, GitHub and Zendesk.

The new PowerBI Designer is available as a free download from this link, and the designer preview introduces several new visualizations such as tree charts, gauges, combo charts and tabular views.

In the next demo, I extracted real-time 9-1-1 call records from http://data.seattle.gov/ and analysed two days of calls: the likely report locations and the types of reports, across the US and of course across the greater Seattle area.