Overview of HDInsight HBase

Continuing my series on HDInsight and the different cluster types within it, today I’ll cover HBase. HBase is a NoSQL database that provides random access and strong consistency for structured, semi-structured and unstructured data.

It’s a schema-less database, organized by families of columns rather than by a fixed schema. Another way to describe it is that it’s modeled after Google’s Bigtable, where data is stored in the rows of a table and then grouped by column family. As it’s schema-less, neither the columns themselves nor the data types inside the columns need to be defined before using the data; only the column families are declared when a table is created.
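To make the row and column-family layout concrete, here is a minimal sketch in plain Python. It is a toy in-memory model, not the HBase client API; the class `ToyTable` and the row keys, families and columns used are hypothetical names chosen for illustration.

```python
from collections import defaultdict


class ToyTable:
    """Toy in-memory model of an HBase-style table (not the HBase API)."""

    def __init__(self, column_families):
        # Column families are the only thing fixed when the table is created.
        self.column_families = set(column_families)
        # row key -> column family -> column qualifier -> value (raw bytes)
        self.rows = defaultdict(lambda: defaultdict(dict))

    def put(self, row_key, family, qualifier, value):
        if family not in self.column_families:
            raise KeyError(f"unknown column family: {family}")
        # Values are stored as uninterpreted bytes; no data type is declared.
        self.rows[row_key][family][qualifier] = value

    def get(self, row_key, family, qualifier):
        # Returns None when the row, family or column has never been written.
        return self.rows.get(row_key, {}).get(family, {}).get(qualifier)


table = ToyTable(["info", "metrics"])
# Two rows in the same family can carry completely different columns.
table.put("user#001", "info", "name", b"Ada")
table.put("user#002", "info", "email", b"bob@example.com")
```

Note that the two rows never agreed on a set of columns in advance; each write simply names the family and column it wants to use.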

Some other key things to be aware of with HBase:

  • As with all the HDInsight components, this is implemented as a managed cluster and a Platform as a Service offering in which we can separate compute nodes from storage.
  • It has a scale-out architecture that provides automatic sharding, or horizontal partitioning, of tables: the rows of a table are split across nodes, rather than the columns being split into separate tables as we would in typical table normalization.
  • Strong consistency for reads and writes, as it’s part of the architecture of HBase.
  • Automatic failover is built in, so when a node fails, its work can fail over to other nodes in the cluster.
  • In-memory caching for reads and writes, which helps with performance by moving your data in and out more quickly.
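The horizontal partitioning described above can be sketched as a lookup from row key to partition. This is a simplified illustration, assuming each partition (HBase calls these regions) owns a contiguous, sorted range of row keys; the split points and region names below are made up for the example.

```python
import bisect

# Hypothetical split points: each region holds a contiguous range of row keys.
SPLIT_KEYS = ["g", "n", "t"]
REGIONS = ["region-0", "region-1", "region-2", "region-3"]


def region_for(row_key: str) -> str:
    """Return the region whose key range contains row_key."""
    # bisect_right counts how many split points are <= row_key,
    # which is the index of the region that owns the key.
    return REGIONS[bisect.bisect_right(SPLIT_KEYS, row_key)]


for key in ["apple", "melon", "zebra"]:
    print(key, "->", region_for(key))
```

Because whole rows are routed by key, adding nodes spreads the rows of a table over more machines without changing how any single row is laid out.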

Some of the most common workloads:

  • A search engine, as with the Google Bigtable example I mentioned, which builds indexes that map terms to the webpages that contain them.
  • A key-value store. Facebook uses HBase for their messaging system because it’s ideal for storing and managing internet communications.
  • A repository for collecting sensor data, where large amounts of data are pulled into a NoSQL table that can then be used to build dashboards for reporting.
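For the sensor data workload, one common approach is to build the row key from the sensor ID plus a timestamp, so that all readings for one sensor sit next to each other and a dashboard can read a time window with a single range scan. The sketch below is a toy in-memory store, not the HBase API; `make_row_key`, `SortedStore` and the key layout are assumptions made for illustration.

```python
import bisect


def make_row_key(sensor_id: str, epoch_seconds: int) -> str:
    # Zero-pad the timestamp so lexicographic order matches chronological order.
    return f"{sensor_id}#{epoch_seconds:010d}"


class SortedStore:
    """Toy key-value store that keeps row keys sorted, like rows in a table."""

    def __init__(self):
        self._keys = []
        self._values = {}

    def put(self, key, value):
        if key not in self._values:
            bisect.insort(self._keys, key)
        self._values[key] = value

    def scan(self, start_key, stop_key):
        """Yield (key, value) pairs with start_key <= key < stop_key."""
        lo = bisect.bisect_left(self._keys, start_key)
        hi = bisect.bisect_left(self._keys, stop_key)
        for key in self._keys[lo:hi]:
            yield key, self._values[key]


store = SortedStore()
store.put(make_row_key("sensor-a", 1000), 21.5)
store.put(make_row_key("sensor-a", 1060), 21.7)
store.put(make_row_key("sensor-a", 1120), 22.0)
store.put(make_row_key("sensor-b", 1000), 18.2)

# Readings for sensor-a in the window [1000, 1100).
window = list(store.scan(make_row_key("sensor-a", 1000),
                         make_row_key("sensor-a", 1100)))
```

Here `window` holds the two sensor-a readings taken at 1000 and 1060; the later sensor-a reading and the sensor-b reading fall outside the scanned key range.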

I still have a few HDInsight technologies to cover in this series. Many of these are interrelated and work together to complete and update a data architecture.

 
