How to set up a basic HBase cluster

This article is a guide to setting up an HBase cluster. The cluster runs on local CentOS virtual machines using VirtualBox. I use it as a local environment for development and testing.

Prerequisites

This setup guide assumes you have gone through the Hadoop Setup Guide and the Zookeeper Setup Guide.

It assumes you are using the following software versions.

  • MacOS 10.11.3
  • Vagrant 1.8.5
  • Java 1.8.0
  • Zookeeper 3.4.8
  • Hadoop 2.7.3
  • HBase 1.2.2

Here are the steps I used:

  1. First, create a workspace.

    mkdir -p ~/vagrant_boxes/hbase

    cd ~/vagrant_boxes/hbase

  2. Next, create a new vagrant box. I’m using a minimal CentOS vagrant box.

    vagrant box add "CentOS 6.5 x86_64" https://github.com/2creatives/vagrant-centos/releases/download/v6.5.3/centos65-x86_64-20140116.box

  3. We are going to create a vagrant box with the packages we need. So, first we initialize the vagrant box.

    vagrant init -m "CentOS 6.5 x86_64" hbase_base

  4. Next, change the Vagrantfile to the following:

      Vagrant.configure(2) do |config|
        config.vm.box = "CentOS 6.5 x86_64"
        config.vm.box_url = "hbase_base"
        config.ssh.insert_key = false
      end
    

  5. Now, install HBase and its dependencies.

    vagrant up

    vagrant ssh

    sudo yum install java-1.8.0-openjdk-devel

    sudo yum install wget

    wget -P ~ https://www.apache.org/dist/hbase/stable/hbase-1.2.2-bin.tar.gz

    gunzip -c *gz | tar xvf -

  6. Open up your ~/.bash_profile and append the following lines.

      export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk.x86_64
      export PATH=$PATH:$JAVA_HOME/bin
      export HBASE_HOME=~/hbase-1.2.2
      export PATH=$PATH:$HBASE_HOME/bin
      export HBASE_CONF_DIR=$HBASE_HOME/conf
    

  7. Source the profile.

    source ~/.bash_profile

  8. Create a ~/.ssh/config file to disable host key checking for SSH. This is acceptable only because these are development servers. The indentation before StrictHostKeyChecking is for readability; OpenSSH accepts a tab or spaces there.

      Host *
            StrictHostKeyChecking no
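
If you would rather script this step than open an editor, here is a minimal sketch. The function name write_ssh_config is hypothetical, and it takes the target directory as an argument so you can try it on a scratch directory first. It writes the same two lines shown above, with a literal tab as the indentation.

```shell
# Write the SSH client config from this step into the given directory.
# write_ssh_config is a hypothetical helper name, not part of OpenSSH.
write_ssh_config() {
  # $1: target directory (normally ~/.ssh)
  mkdir -p "$1" || return 1
  chmod 700 "$1" || return 1
  # \t produces a literal tab before StrictHostKeyChecking
  printf 'Host *\n\tStrictHostKeyChecking no\n' > "$1/config" || return 1
  chmod 600 "$1/config"
}
```

To apply it to your real configuration, run write_ssh_config ~/.ssh. Note that it overwrites any existing config file in that directory.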
    

  9. Now run these commands to finish the password-less authentication.

    chmod 600 ~/.ssh/config

    ssh-keygen -f ~/.ssh/id_rsa -t rsa -P ""

    cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys

  10. In /etc/hosts, remove any lines starting with 127.0.*, and add the following lines.

      192.168.50.11 zoo1.example.com
      192.168.50.12 zoo2.example.com
      192.168.50.13 zoo3.example.com
      192.168.50.14 zoo4.example.com
      192.168.50.15 zoo5.example.com
      192.168.50.21 hdfs-namenode.example.com
      192.168.50.22 hdfs-datanode1.example.com
      192.168.50.23 hdfs-datanode2.example.com
      192.168.50.24 hdfs-datanode3.example.com
      192.168.50.25 hdfs-datanode4.example.com
      192.168.50.31 hbase-master.example.com
      192.168.50.32 hbase-region1.example.com
      192.168.50.33 hbase-region2.example.com
      192.168.50.34 hbase-region3.example.com
      192.168.50.35 hbase-region4.example.com
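
The address list above follows a regular pattern, so it can also be generated. This is only a sketch that reproduces the exact hostnames and IPs from this step; the function name print_cluster_hosts is hypothetical.

```shell
# Print the /etc/hosts entries for the Zookeeper, HDFS, and HBase nodes.
# print_cluster_hosts is a hypothetical helper name.
print_cluster_hosts() {
  # Zookeeper nodes: 192.168.50.11 through .15
  for i in 1 2 3 4 5; do
    echo "192.168.50.1$i zoo$i.example.com"
  done
  # HDFS name node, then data nodes on .22 through .25
  echo "192.168.50.21 hdfs-namenode.example.com"
  for i in 1 2 3 4; do
    echo "192.168.50.2$((i + 1)) hdfs-datanode$i.example.com"
  done
  # HBase master, then region servers on .32 through .35
  echo "192.168.50.31 hbase-master.example.com"
  for i in 1 2 3 4; do
    echo "192.168.50.3$((i + 1)) hbase-region$i.example.com"
  done
}
```

Running print_cluster_hosts | sudo tee -a /etc/hosts appends the entries. You still need to remove the 127.0.* lines by hand.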
    

  11. In ~/hbase-1.2.2/conf/hbase-env.sh, append the following lines to the bottom of the file.

      export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk.x86_64
      export HBASE_MANAGES_ZK=false
    

  12. Edit ~/hbase-1.2.2/conf/hbase-site.xml to contain the following:

      <configuration>
        <property>
          <name>hbase.rootdir</name>
          <value>hdfs://hdfs-namenode.example.com:9000/hbase</value>
        </property>
        <property>
          <name>hbase.master.hostname</name>
          <value>hbase-master.example.com</value>
        </property>
        <property>
          <name>hbase.cluster.distributed</name>
          <value>true</value>
        </property>
        <property>
          <name>hbase.zookeeper.quorum</name>
          <value>zoo1.example.com,zoo2.example.com,zoo3.example.com,zoo4.example.com,zoo5.example.com</value>
        </property>
        <property>
          <name>hbase.zookeeper.property.dataDir</name>
          <value>/tmp/zookeeper</value>
        </property>
      </configuration>
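
A typo in this file is easy to miss, so it can help to read a value back before starting the cluster. The sketch below is a hypothetical helper (get_site_property is my name for it) that assumes each <name> and <value> sits on its own line, as in the file above. It is not a general XML parser.

```shell
# Print the <value> that follows a given <name> in a Hadoop-style XML config.
# Assumes <name> and <value> are each on their own line.
get_site_property() {
  # $1: property name, $2: path to the XML file
  awk -v name="$1" '
    # Remember that we have seen the exact <name> tag we are looking for
    index($0, "<name>" name "</name>") { found = 1; next }
    # The next <value> line belongs to that property: strip the tags and print
    found && /<value>/ {
      sub(/.*<value>/, "")
      sub(/<\/value>.*/, "")
      print
      exit
    }
  ' "$2"
}
```

For example, get_site_property hbase.zookeeper.quorum ~/hbase-1.2.2/conf/hbase-site.xml | tr ',' '\n' prints one quorum host per line, which you can compare against /etc/hosts.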
    

  13. In ~/hbase-1.2.2/conf/regionservers, remove localhost and add the following lines:

      hbase-region1.example.com
      hbase-region2.example.com
      hbase-region3.example.com
      hbase-region4.example.com
    

  14. The docs say you can create a “backup-masters” file in the conf directory, but I had a problem starting my cluster when I did. So, I skipped this step.

  15. Exit the SSH session and copy the VM for the other hbase nodes.

    exit

    vagrant halt

    vagrant package

    vagrant box add hbase ~/vagrant_boxes/hbase/package.box

  16. Edit the Vagrantfile to look like the following. This creates 5 HBase nodes using the new HBase VM.

      Vagrant.configure("2") do |config|
        config.vm.define "hbase-master" do |node|
          node.vm.box = "hbase"
          node.vm.box_url = "hbase-master.example.com"
          node.vm.hostname = "hbase-master.example.com"
          node.vm.network :private_network, ip: "192.168.50.31"
          node.ssh.insert_key = false
    
          # Change hostname
          node.vm.provision "shell", inline: "hostname hbase-master.example.com", privileged: true
        end
    
        (1..4).each do |i|
          config.vm.define "hbase-region#{i}" do |node|
            node.vm.box = "hbase"
            node.vm.box_url = "hbase-region#{i}.example.com"
            node.vm.hostname = "hbase-region#{i}.example.com"
            node.vm.network :private_network, ip: "192.168.50.3#{i+1}"
            node.ssh.insert_key = false
    
            # Change hostname
            node.vm.provision "shell", inline: "hostname hbase-region#{i}.example.com", privileged: true
          end
        end
      end
    

  17. Bring the new Vagrant VMs up.

    vagrant up --no-provision

  18. Start HBase. For some reason, I can't start HBase from the provisioner. So, I SSH in and start it manually.

    vagrant provision

    vagrant ssh hbase-master

    ~/hbase-1.2.2/bin/start-hbase.sh

To test the cluster:

  1. Log into the Master Server and run ‘jps’ on the command line. You should see at least these two processes.

    jps
    Jps
    HMaster

  2. Log into one of the Region Servers and run ‘jps’ on the command line. You should see at least these two processes.

    jps
    Jps
    HRegionServer
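
To check for a daemon without reading the jps output by eye, you could filter it. This is a sketch: has_java_process is a hypothetical helper that reads jps-style output on standard input and matches the last field of each line, so it works whether or not the process ID column is present.

```shell
# Succeed if the named Java process appears in jps-style output on stdin.
# has_java_process is a hypothetical helper name.
has_java_process() {
  # $1: process name to look for, e.g. HMaster or HRegionServer
  awk -v name="$1" '
    $NF == name { found = 1 }
    END { exit found ? 0 : 1 }
  '
}
```

On a Region Server, jps | has_java_process HRegionServer && echo "region server is running" prints the message only when the process is up.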

  3. Go to http://192.168.50.31:16010/ and you should see all of the Region Servers running.

  4. From the Master Server, start the HBase shell.

    vagrant ssh hbase-master

    sudo ~/hbase-1.2.2/bin/hbase shell

  5. At the command prompt, you should be able to create a table.

    create 'test', 'cf'

  6. And you should be able to list the table.

    list

  7. And you should be able to put data into the table.

    put 'test', 'row1', 'cf:a', 'value1'

    put 'test', 'row2', 'cf:b', 'value2'

    put 'test', 'row3', 'cf:c', 'value3'

  8. And you should be able to view all the data in the table.

    scan 'test'

  9. Or just get one row.

    get 'test', 'row1'
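
The shell commands in steps 5 through 9 can also be run in one pass, which is handy after rebuilding the VMs. The sketch below only prints the commands; smoke_test_commands is a hypothetical helper name. Note that the HBase shell needs plain straight quotes around table and column names.

```shell
# Emit the smoke-test commands from steps 5-9 as HBase shell input.
# smoke_test_commands is a hypothetical helper name.
smoke_test_commands() {
  cat <<'EOF'
create 'test', 'cf'
list
put 'test', 'row1', 'cf:a', 'value1'
put 'test', 'row2', 'cf:b', 'value2'
put 'test', 'row3', 'cf:c', 'value3'
scan 'test'
get 'test', 'row1'
EOF
}
```

On the Master Server, smoke_test_commands | ~/hbase-1.2.2/bin/hbase shell -n should feed them to the shell in non-interactive mode. I am assuming here that your HBase version supports the -n flag; if it does not, paste the commands into an interactive shell instead.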
