Apache HBase

Apache HBase is an open-source implementation of the design described in Google’s Bigtable paper, using HDFS as its storage layer. It originated at Powerset around 2007, became a Hadoop subproject in 2008, and was promoted to a top-level Apache project in 2010. HBase provides random, strongly consistent read/write access to very large, sparse tables: petabyte-scale datasets with billions of rows and millions of columns. It remains widely deployed at large enterprises that already run Hadoop, but greenfield projects today usually choose Cassandra, ScyllaDB, or a managed wide-column service instead.

Key Features:

Bigtable Data Model. Sparse multidimensional sorted map: (row key, column family, column qualifier, timestamp) → value. Columns within a family are stored together; different rows can have entirely different columns.
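Strong Consistency. Each region is served by exactly one region server at a time, so reads and writes for a row are linearizable and single-row mutations are atomic. Cassandra, by contrast, offers tunable consistency and is eventually consistent unless quorum reads and writes are requested.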
HDFS Storage. Data is persisted as HFiles, an immutable SSTable-style file format, on HDFS, and so inherits HDFS’s durability, replication factor, and locality model.
Region Servers. Tables are split into contiguous row-key ranges (regions), each assigned to a region server. A region splits automatically once it exceeds a configured size threshold.
HBase Shell + REST + Thrift. Multiple client surfaces; the Java client API is the native one.
Coprocessors. Server-side hooks that run inside the region server: observers behave like triggers, and endpoints behave like stored procedures for tasks such as aggregation.
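
The data model described above can be pictured with a toy in-memory structure. The sketch below is only an illustration of the (row key, column family, column qualifier, timestamp) → value mapping; the class name SparseTable and its methods are hypothetical and are not the HBase client API.

```python
from collections import defaultdict


class SparseTable:
    """Toy in-memory model of a Bigtable-style table (not the HBase API)."""

    def __init__(self):
        # row key -> {(family, qualifier) -> {timestamp -> value}}
        self._rows = defaultdict(lambda: defaultdict(dict))

    def put(self, row, family, qualifier, value, ts):
        """Store one versioned cell; absent columns cost nothing (sparse)."""
        self._rows[row][(family, qualifier)][ts] = value

    def get(self, row, family, qualifier, max_ts=None):
        """Return the newest value at or before max_ts, or None if absent."""
        versions = self._rows.get(row, {}).get((family, qualifier), {})
        candidates = [t for t in versions if max_ts is None or t <= max_ts]
        return versions[max(candidates)] if candidates else None

    def scan(self, start_row, stop_row):
        """Yield (row, sorted columns) for start_row <= row < stop_row."""
        for row in sorted(self._rows):
            if start_row <= row < stop_row:
                yield row, sorted(self._rows[row])


table = SparseTable()
table.put(b"user#001", "info", "name", b"Ada", ts=1)
table.put(b"user#001", "info", "name", b"Ada L.", ts=2)
table.put(b"user#002", "stats", "logins", b"7", ts=1)
```

Here the two rows carry entirely different columns, a read without a timestamp returns the newest version, and a scan walks rows in sorted key order. A real table keeps this ordering on disk rather than sorting at read time.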

HBase vs. Cassandra:

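HBase. Strong consistency, one serving region server per region, HDFS-backed. Best when you already run Hadoop and want strong consistency.
Cassandra. Tunable consistency (eventual unless quorums are used), masterless, native multi-datacenter replication. Best for write-heavy and multi-DC workloads.
Both implement Bigtable-style wide-column data models, but their query interfaces differ: HBase exposes get/put/scan operations keyed by row, while Cassandra uses CQL over partition and clustering keys.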

Use Cases:

Operational data stores on top of an existing Hadoop cluster.
Massive time-series tables (OpenTSDB is built on HBase).
Real-time random access to data also queried by MapReduce / Spark batch jobs.
Facebook Messages, Yahoo, AdRoll, and many financial-services workloads ran on HBase historically.
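
For the time-series use case, much of the schema design comes down to the row key, because rows are stored in sorted key order. The sketch below shows one possible layout, assuming a metric name followed by a fixed-width time bucket; the function names and the layout itself are hypothetical and are not OpenTSDB's actual schema.

```python
import struct


def bucket_start(epoch_seconds, bucket_seconds=3600):
    """Round a Unix timestamp down to the start of its bucket."""
    return epoch_seconds - epoch_seconds % bucket_seconds


def make_row_key(metric, epoch_seconds, bucket_seconds=3600):
    """Build a row key: metric name, NUL separator, big-endian uint32 bucket.

    Big-endian encoding makes byte order match numeric order, so keys for
    one metric sort chronologically. Assumes the metric contains no NUL byte.
    """
    bucket = bucket_start(epoch_seconds, bucket_seconds)
    return metric.encode("utf-8") + b"\x00" + struct.pack(">I", bucket)


def scan_bounds(metric, start_seconds, end_seconds, bucket_seconds=3600):
    """Return (start_key, stop_key) covering the half-open range [start, end)."""
    last_bucket = bucket_start(end_seconds - 1, bucket_seconds)
    stop = metric.encode("utf-8") + b"\x00" + struct.pack(">I", last_bucket + bucket_seconds)
    return make_row_key(metric, start_seconds, bucket_seconds), stop
```

With this layout, all points for one metric in a time window sit in one contiguous key range, so a single scan retrieves them. Leading with the metric name rather than the timestamp also avoids sending every new write to the same region.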

Notes:

HBase’s master-region-server topology has more operational complexity than Cassandra’s peer-to-peer ring. For new deployments today the question usually becomes “HBase or Cassandra/Scylla?” — HBase wins when strong consistency is required and Hadoop infrastructure is already in place; otherwise Cassandra or a managed Bigtable / DynamoDB equivalent is typically simpler.
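
The region-to-server assignment behind this topology can be pictured as a sorted lookup table from region start keys to servers. The following is a minimal sketch under that assumption; RegionMap, its methods, and the server names are hypothetical. In a real cluster this mapping lives in the hbase:meta catalog table and is cached by clients.

```python
import bisect


class RegionMap:
    """Toy routing table: region start key -> hosting server (names made up)."""

    def __init__(self, regions):
        # regions: list of (start_key, server); the first start key must be b"".
        self._regions = sorted(regions)
        self._starts = [start for start, _ in self._regions]

    def locate(self, row_key):
        """Return the server whose region's key range contains row_key."""
        idx = bisect.bisect_right(self._starts, row_key) - 1
        return self._regions[idx][1]

    def split(self, split_key, new_server):
        """Split the region containing split_key; keys >= split_key move."""
        if split_key in self._starts:
            raise ValueError("a region already starts at this key")
        self._regions = sorted(self._regions + [(split_key, new_server)])
        self._starts = [start for start, _ in self._regions]


regions = RegionMap([(b"", "rs1"), (b"m", "rs2")])
regions.split(b"t", "rs3")  # rows from b"t" onward are now served by rs3
```

Because every row key maps to exactly one region, and each region to one server, there is a single place where writes to a given row are ordered. That is the source of the strong consistency, and also of the extra coordination machinery compared with a peer-to-peer ring.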