How to Run the Cloudera Hadoop (CDH4) QuickStart VM (CentOS 6.2) on Windows with VMware Workstation
August 10, 2013
To run the Cloudera Hadoop CDH4 VM on Windows, you need to download the QuickStart VM from here, choosing the build that matches your virtualization product (i.e. VMware, VirtualBox, or KVM). It requires a 64-bit host OS. This VM runs CentOS 6.2 and includes CDH 4.3, Cloudera Manager 4.6, Cloudera Impala 1.0.1, and Cloudera Search 0.9 Beta.
For this demo, I used the VMware version of the Cloudera QuickStart VM, running on a 64-bit Windows 8 host.
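Since the VM needs a 64-bit host OS, it can be worth confirming that before downloading the image. The snippet below is only a minimal sketch, assuming Python 3 is installed on the host; the helper names `is_64bit_machine` and `interpreter_bits` are illustrative and not part of any Cloudera or VMware tooling. It only recognizes the common x86 names (`AMD64` on Windows, `x86_64` on Linux and macOS), and it says nothing about whether the virtualization product can run a 64-bit guest.

```python
import platform
import struct


def is_64bit_machine(machine: str) -> bool:
    """Return True if the machine string names a common 64-bit x86 architecture."""
    return machine.lower() in {"amd64", "x86_64"}


def interpreter_bits() -> int:
    """Return the pointer width of the running Python interpreter (32 or 64)."""
    return struct.calcsize("P") * 8


if __name__ == "__main__":
    machine = platform.machine()
    print("Machine type reported by the OS:", machine)
    print("Looks like a 64-bit x86 host:", is_64bit_machine(machine))
    print("Python interpreter width (bits):", interpreter_bits())
```

A 32-bit Python interpreter can run on a 64-bit OS, so the two values are printed separately rather than combined into a single verdict.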
A few points to note:
- This is a 64-bit VM, and requires a 64-bit host OS and a virtualization product that can support a
64-bit guest OS.
- This VM uses 4 GB of total RAM. The total system memory required varies depending on the size
of your data set and on the other processes that are running.
- The demo VM file is approximately 2 GB. Feel free to mirror internally or externally to minimize
- To use the VMware VM, you must use a player compatible with Workstation 8.x or higher: Player 4.x or higher, ESXi 5.x or higher, or Fusion 4.x or higher. Older versions of Workstation can be used to create a new VM using the same virtual disk (VMDK file), but some features in VMware Tools won’t be available.
- After downloading the Cloudera VM, extract the archive and locate the virtual machine configuration (.vmx) file.
- Open the .vmx file in VMware Workstation and start the VM.
- To start working with Hadoop, click Hue and log in with the default user ID ‘admin’ and password ‘admin’.
- Similarly, log in to the Cloudera Manager console with the default user ID ‘admin’ and password ‘admin’ to check the health of the Hadoop cluster.
- You can also check the Hadoop NameNode details, including the cluster summary, the NameNode logs, and the HDFS status.
- The Cloudera Hadoop (CDH4) VM ships with Eclipse preconfigured for Apache Hadoop, which makes it easier to write MapReduce jobs.
- When you open this Eclipse installation, a default MapReduce Java project is available, which runs on Java SE 1.6.
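If the Hue or Cloudera Manager login page does not load after the VM boots, a quick port check can tell you whether the services have started yet. The following is a rough sketch, not an official Cloudera tool. It assumes Python 3 is available where you run it, and it assumes the commonly used default ports (8888 for Hue, 7180 for Cloudera Manager, 50070 for the NameNode web UI), none of which are stated in this post, so adjust them to your setup. The `localhost` host is also an assumption that only holds when the script runs inside the VM.

```python
import socket

# Assumed default ports; these are not given in the post, so verify them.
CONSOLES = {
    "Hue": 8888,
    "Cloudera Manager": 7180,
    "NameNode web UI": 50070,
}


def is_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def check_consoles(host: str, consoles: dict) -> dict:
    """Map each console name to whether its port accepts connections."""
    return {name: is_port_open(host, port) for name, port in consoles.items()}


if __name__ == "__main__":
    # "localhost" is an assumption: it applies when running inside the VM.
    for name, up in check_consoles("localhost", CONSOLES).items():
        print(name + ": " + ("reachable" if up else "not reachable"))
```

A reachable port only shows that something is listening. It does not confirm that the service is healthy, so the Cloudera Manager console remains the place to check cluster health.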