
Cloudera QuickStart Virtual Machine (VM) Installation

Cloudera Distribution including Apache Hadoop (CDH) is the most popular Hadoop distribution currently available. CDH is 100% open source.

Cloudera QuickStart VMs include everything needed to try out a basic package-based CDH installation. They are useful for initial deployments such as a proof of concept (POC) or development work.

The QuickStart VMs contain a single-node Apache Hadoop cluster, complete with example data, queries, scripts, and Cloudera Manager to manage the cluster. The VMs are available as Zip archives in VMware, KVM, and VirtualBox formats. These 64-bit VMs require a 64-bit host OS and a virtualization product that can support a 64-bit guest OS. The CDH installation requires at least 4 GB of RAM.
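The requirements above can be checked before downloading anything. The following is a minimal sketch, not part of the QuickStart package: the function names are hypothetical, the list of 64-bit machine names is an assumption, and the RAM lookup relies on `os.sysconf`, which is not available on Windows (there it returns `None` and the check has to be done by hand).

```python
import os
import platform

# The 4 GB minimum stated above.
MIN_RAM_BYTES = 4 * 1024**3

# Assumed set of machine names that indicate a 64-bit host.
SIXTY_FOUR_BIT_MACHINES = {"x86_64", "amd64", "arm64", "aarch64"}


def is_64bit_host():
    """Best-effort guess based on the machine name reported by the OS."""
    return platform.machine().lower() in SIXTY_FOUR_BIT_MACHINES


def total_ram_bytes():
    """Return physical RAM in bytes, or None if the platform does not expose it."""
    try:
        return os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")
    except (AttributeError, ValueError, OSError):
        return None


def meets_requirements(is_64bit, ram_bytes):
    """True only if the host is 64-bit and the RAM is known to be at least 4 GB."""
    return bool(is_64bit) and ram_bytes is not None and ram_bytes >= MIN_RAM_BYTES
```

Calling `meets_requirements(is_64bit_host(), total_ram_bytes())` gives a quick yes or no. Since the host OS also needs memory of its own, treat 4 GB as a lower bound rather than a comfortable target.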

The steps below show how to install the Cloudera QuickStart VM:

Step 1

Download the latest QuickStart VM:

http://www.cloudera.com/downloads.html

VirtualBox version used for this demo:
http://download.virtualbox.org/virtualbox/4.3.30/VirtualBox-4.3.30-101610-Win.exe

Step 2

Download "cloudera-quickstart-vm-5.4.2-0-virtualbox.zip" and extract the zip file.
Open Oracle VM VirtualBox and click "Import Appliance".
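The extraction can also be scripted. This is a small sketch using Python's standard `zipfile` module; the function name `extract_quickstart` and the destination folder are hypothetical, and the only assumption about the archive is that it contains an `.ovf` file somewhere inside.

```python
import zipfile
from pathlib import Path


def extract_quickstart(zip_path, dest_dir):
    """Extract the archive into dest_dir and return the .ovf files found there."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest)
    # Search recursively, since the archive may unpack into a subfolder.
    return sorted(dest.rglob("*.ovf"))
```

For example, `extract_quickstart("cloudera-quickstart-vm-5.4.2-0-virtualbox.zip", "quickstart")` returns the list of OVF paths to point VirtualBox at.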

Step 3
Browse to and select the Open Virtualization Format (OVF) file, "cloudera-quickstart-vm-5.4.2-0-virtualbox.ovf".

Step 4

Review the appliance settings and click "Import".
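The same import can be done from the command line with VirtualBox's `VBoxManage import`, whose `--dry-run` option prints the appliance settings without importing anything. The sketch below wraps that command in Python; the helper names are hypothetical, and it assumes `VBoxManage` is on the PATH.

```python
import shutil
import subprocess


def build_import_command(ovf_path, dry_run=False):
    """Build the VBoxManage command line for importing an OVF appliance."""
    cmd = ["VBoxManage", "import", str(ovf_path)]
    if dry_run:
        # Show what would be imported (the "review" step) without doing it.
        cmd.append("--dry-run")
    return cmd


def import_appliance(ovf_path, dry_run=False):
    """Run the import; raises if VBoxManage is missing or the command fails."""
    if shutil.which("VBoxManage") is None:
        raise FileNotFoundError("VBoxManage not found on PATH")
    return subprocess.run(build_import_command(ovf_path, dry_run), check=True)
```

Running `import_appliance(path, dry_run=True)` first and then again without `dry_run` roughly mirrors reviewing the settings and then clicking "Import".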

Step 5

Start the QuickStart VM once the import has completed successfully.
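Starting the VM can be scripted as well. `VBoxManage list vms` prints one line per registered VM in the form `"name" {uuid}`, and `VBoxManage startvm` boots a VM by name. The sketch below parses that listing and builds the start command; the helper names are hypothetical, and the name the imported VM ends up with is an assumption, so it is worth checking the listing first.

```python
import re


def parse_vm_names(list_vms_output):
    """Extract VM names from the output of `VBoxManage list vms`."""
    names = []
    for line in list_vms_output.splitlines():
        match = re.match(r'"(.+)" \{[0-9a-fA-F-]+\}\s*$', line)
        if match:
            names.append(match.group(1))
    return names


def build_start_command(vm_name, headless=False):
    """Build the VBoxManage command line that boots the named VM."""
    vm_type = "headless" if headless else "gui"
    return ["VBoxManage", "startvm", vm_name, "--type", vm_type]
```

The resulting list can be passed to `subprocess.run(..., check=True)`. The default `gui` type opens the VM window, which is what the manual step does.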




Step 6
The Cloudera QuickStart VM is now up and running.
Use the following link to access it: http://quickstart.cloudera/#/

Hue: http://quickstart.cloudera:8888/about/
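A quick way to confirm that both pages respond is to request them and look at the HTTP status. This is a minimal sketch using only the standard library; the `probe` helper and the `ENDPOINTS` labels are hypothetical, and it assumes the script runs somewhere the `quickstart.cloudera` hostname resolves (for example, inside the VM itself).

```python
import urllib.error
import urllib.request

# URLs taken from the step above; the labels are only for display.
ENDPOINTS = {
    "Landing page": "http://quickstart.cloudera/#/",
    "Hue": "http://quickstart.cloudera:8888/about/",
}


def probe(url, timeout=5.0):
    """Return the HTTP status code, or None if the server cannot be reached."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # The server answered, just not with a success status.
        return err.code
    except (urllib.error.URLError, OSError):
        return None
```

Looping over `ENDPOINTS.items()` and printing `probe(url)` for each gives a simple health check. A result of `None` right after boot may just mean the services are still starting.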




