- Introduction
- 1. What is HBase?
- 2. HBase Architecture
- 3. Data Model
- 4. RowKey Design Strategies
- 5. Essential HBase Shell Commands
- 6. Java API Code Examples
- 7. Read/Write Performance Optimization
- 8. Region Split and Compaction
- 9. Monitoring and Management
- 10. HBase vs Cassandra vs MongoDB
- 11. Troubleshooting
- 12. Operations Checklists
- Conclusion
Introduction
HBase is a distributed NoSQL database modeled after Google's Bigtable paper. It runs on top of HDFS and is optimized for large-scale random reads and writes, scaling to billions of rows and millions of columns. It is widely used where low-latency access to huge datasets is required, such as time-series data, log analysis, and real-time serving layers.
This article systematically covers everything from HBase core concepts to real-world operational know-how.
1. What is HBase?
Key Characteristics
| Characteristic | Description |
|---|---|
| Distributed | Horizontally scalable on top of HDFS |
| Column-Family Based | Data stored in column family units |
| Versioned | Each cell can store multiple versions (timestamps) |
| Strong Consistency | Guarantees atomic reads/writes for a single row |
| Auto-Sharding | Data automatically distributed in Region units |
| High Throughput | Capable of handling hundreds of thousands of QPS |
When to Use HBase?
Suitable cases:
- Massive datasets with billions or more rows
- Fast random reads/writes needed (< 10ms)
- Time-series data (IoT sensors, logs, metrics)
- Wide tables (hundreds to thousands of columns)
- Integration with HDFS is required
Unsuitable cases:
- Small datasets with only millions of rows or fewer
- Complex JOINs or transactions required
- Full-text search (Elasticsearch is more suitable)
- Ad-hoc analytical queries (Hive, Spark SQL are more suitable)
2. HBase Architecture
Overall Structure
┌────────────────────────────────────────────────────────┐
│                         Client                         │
│         (HBase Shell, Java API, REST, Thrift)          │
└───────────────────────┬────────────────────────────────┘
                        │
                ┌───────▼───────┐
                │   ZooKeeper   │
                │ (coordination │
                │  & discovery) │
                └───┬───────┬───┘
                    │       │
            ┌───────▼──┐  ┌─▼──────────┐
            │ HMaster  │  │  HMaster   │
            │ (Active) │  │ (Standby)  │
            └───────┬──┘  └────────────┘
                    │
     ┌──────────────┼──────────────┐
     │              │              │
┌────▼─────┐   ┌────▼─────┐   ┌────▼─────┐
│RegionSvr │   │RegionSvr │   │RegionSvr │
│┌────────┐│   │┌────────┐│   │┌────────┐│
││Region A││   ││Region C││   ││Region E││
│├────────┤│   │├────────┤│   │├────────┤│
││Region B││   ││Region D││   ││Region F││
│└────────┘│   │└────────┘│   │└────────┘│
└──────────┘   └──────────┘   └──────────┘
     │              │              │
     └──────────────┼──────────────┘
                    │
             ┌──────▼──────┐
             │    HDFS     │
             │  (Storage)  │
             └─────────────┘
Roles of Each Component
| Component | Role | Impact on Failure |
|---|---|---|
| HMaster | Region assignment, DDL processing, load balancing | DDL unavailable, auto-balancing stops (reads/writes still work) |
| RegionServer | Data read/write processing, Region management | Access to affected Regions unavailable (auto-recovery) |
| ZooKeeper | HMaster election, RS status tracking, meta location | Entire cluster goes down |
| HDFS | Persistent data storage | Risk of data loss |
Region Internal Structure
┌─────────────── Region ──────────────────┐
│                                         │
│ ┌─── Column Family: cf1 ──────────────┐ │
│ │ ┌──────────┐                        │ │
│ │ │ MemStore │ (memory, write buf)    │ │
│ │ └──────────┘                        │ │
│ │ ┌──────────┐ ┌──────────┐           │ │
│ │ │ HFile 1  │ │ HFile 2  │           │ │
│ │ └──────────┘ └──────────┘           │ │
│ └─────────────────────────────────────┘ │
│                                         │
│ ┌─── Column Family: cf2 ──────────────┐ │
│ │ ┌──────────┐                        │ │
│ │ │ MemStore │                        │ │
│ │ └──────────┘                        │ │
│ │ ┌──────────┐                        │ │
│ │ │ HFile 1  │                        │ │
│ │ └──────────┘                        │ │
│ └─────────────────────────────────────┘ │
│                                         │
│ ┌──────────────────────────────────┐    │
│ │      WAL (Write-Ahead Log)       │    │
│ └──────────────────────────────────┘    │
└─────────────────────────────────────────┘
Write Path
1. Client → Put request to RegionServer
2. Append to the WAL (Write-Ahead Log) on HDFS, for failure recovery
3. Write to MemStore (memory)
4. Return success response to Client
5. When MemStore is full → Flush to HFile (stored on HDFS)
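The MemStore-then-flush mechanism above can be sketched with a toy in-memory model. Plain Java collections stand in for the real MemStore and HFile classes, the WAL append is omitted, and the tiny flush threshold is only for illustration:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Toy model of the write path: puts land in a sorted in-memory MemStore,
// which is flushed to an immutable, sorted "HFile" once it exceeds a
// threshold. This is a sketch of the mechanism, not the real HBase classes.
class WritePathSketch {
    static final int FLUSH_THRESHOLD_BYTES = 64; // tiny, for the demo only

    final TreeMap<String, String> memStore = new TreeMap<>();
    final List<TreeMap<String, String>> hFiles = new ArrayList<>();
    int memStoreBytes = 0;

    void put(String rowKey, String value) {
        memStore.put(rowKey, value);                  // step 3: write to MemStore
        memStoreBytes += rowKey.length() + value.length();
        if (memStoreBytes >= FLUSH_THRESHOLD_BYTES) { // step 5: flush when full
            hFiles.add(new TreeMap<>(memStore));      // new immutable "HFile"
            memStore.clear();
            memStoreBytes = 0;
        }
    }
}
```

Because both the MemStore and the flushed files are sorted maps, every flush produces a file that is already in rowkey order, which is what makes the later compaction merges cheap.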
Read Path
1. Client → Get request to RegionServer
2. Check Block Cache (memory)
3. Check MemStore (memory)
4. Check HFile (disk, quickly filtered via Bloom Filter)
5. Merge results (return latest version)
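The merge in step 5 can be sketched the same way: the same cell may exist in several sources (MemStore plus multiple HFiles), and the value with the newest timestamp wins. A simplified model, not HBase's actual merge logic:

```java
import java.util.List;
import java.util.Map;

// Toy sketch of the read-path merge: each source maps rowKey -> versioned
// value; the read scans every source and returns the newest version.
class ReadMergeSketch {
    static class Versioned {
        final long timestamp;
        final String value;
        Versioned(long timestamp, String value) {
            this.timestamp = timestamp;
            this.value = value;
        }
    }

    // Sources are checked in order (MemStore, then each HFile); the cell
    // with the newest timestamp wins regardless of which source holds it.
    static Versioned read(String rowKey, List<Map<String, Versioned>> sources) {
        Versioned newest = null;
        for (Map<String, Versioned> source : sources) {
            Versioned v = source.get(rowKey);
            if (v != null && (newest == null || v.timestamp > newest.timestamp)) {
                newest = v;
            }
        }
        return newest;
    }
}
```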
3. Data Model
Logical Data Model
Table: user_activity
─────────────────────────────────────────────────────────────────
RowKey | Column Family: info | Column Family: stats
| name | email | login_count | last_login
─────────────────────────────────────────────────────────────────
user001 | Kim YJ | yj@email.com | 142 | 2026-03-08
user002 | Park SJ | sj@email.com | 87 | 2026-03-07
user003 | Lee HN | hn@email.com | 256 | 2026-03-08
─────────────────────────────────────────────────────────────────
Physical Storage Structure
# Data is actually stored separately per Column Family
# Key-Value format: (RowKey, CF:Qualifier, Timestamp) → Value
# Column Family: info
(user001, info:name, t3) → "Kim YJ"
(user001, info:name, t1) → "Kim YJ (old)" # Previous version
(user001, info:email, t2) → "yj@email.com"
(user002, info:name, t4) → "Park SJ"
(user002, info:email, t4) → "sj@email.com"
# Column Family: stats (separate HFile)
(user001, stats:login_count, t5) → 142
(user001, stats:last_login, t5) → "2026-03-08"
Key Terminology
| Term | Description | RDBMS Analogy |
|---|---|---|
| Table | Data container | Table |
| Row | Row identified by RowKey | Row |
| Column Family | Column group (physical storage unit) | (none) |
| Column Qualifier | Column name within a CF | Column |
| Cell | Value at (Row, CF:Qualifier, Timestamp) | Cell |
| Timestamp | Cell version (default: milliseconds) | (none) |
| Region | Horizontal partition unit of a table | Partition |
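The Cell and Timestamp semantics from the table can be sketched as a toy versioned cell that keeps only the newest N values, mirroring how HBase discards versions beyond the configured VERSIONS count. This is an illustrative model, not an HBase class:

```java
import java.util.Comparator;
import java.util.TreeMap;

// Toy sketch of cell versioning: one (row, cf:qualifier) coordinate keeps
// up to maxVersions values keyed by timestamp, newest first.
class VersionedCellSketch {
    private final int maxVersions;
    // timestamp -> value, sorted descending so the first entry is newest
    private final TreeMap<Long, String> versions =
        new TreeMap<>(Comparator.reverseOrder());

    VersionedCellSketch(int maxVersions) { this.maxVersions = maxVersions; }

    void put(long timestamp, String value) {
        versions.put(timestamp, value);
        while (versions.size() > maxVersions) {
            versions.pollLastEntry(); // drop the oldest version
        }
    }

    String get() { return versions.firstEntry().getValue(); } // newest version
    int versionCount() { return versions.size(); }
}
```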
4. RowKey Design Strategies
RowKey design determines 80% of HBase performance.
Hotspot Problem
# Bad RowKey: Sequential keys
# → All writes concentrate on the last Region (hotspot)
RowKey: 20260308_000001
RowKey: 20260308_000002
RowKey: 20260308_000003
↓ All writes concentrate on one Region!
┌──────────┐ ┌──────────┐ ┌──────────┐
│ Region 1 │ │ Region 2 │ │ Region 3 │
│ (idle)   │ │ (idle)   │ │(overload)│
└──────────┘ └──────────┘ └──────────┘
Solution 1: Salting (Prefix Hashing)
// Original RowKey: "20260308_user001"
// With Salt: hash("20260308_user001") % NUM_REGIONS + "_" + originalKey
int numRegions = 10;
String originalKey = "20260308_user001";
int salt = (originalKey.hashCode() & 0x7fffffff) % numRegions; // mask keeps it non-negative (Math.abs fails for Integer.MIN_VALUE)
String saltedKey = String.format("%02d_%s", salt, originalKey);
// Result: "07_20260308_user001"
// Data is evenly distributed across Regions
// Region 0: 00_xxx, Region 1: 01_xxx, ... Region 9: 09_xxx
Pros: Even write load distribution Cons: Must scan all Regions for range scans
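A self-contained version of the salting approach. The bit mask keeps the salt non-negative (`Math.abs` alone returns a negative value for `Integer.MIN_VALUE`), and `scanPrefixes` illustrates the cost just noted: a range scan must fan out to every salt bucket. The bucket count of 10 is an illustrative choice:

```java
// Sketch of salting: a stable salt prefix derived from the key spreads
// sequential keys across regions.
class SaltingSketch {
    static final int NUM_BUCKETS = 10; // illustrative; match your region count

    static String saltedKey(String originalKey) {
        // Mask to a non-negative int before taking the modulus.
        int salt = (originalKey.hashCode() & 0x7fffffff) % NUM_BUCKETS;
        return String.format("%02d_%s", salt, originalKey);
    }

    // A range scan over salted keys must issue one scan per salt bucket.
    static String[] scanPrefixes(String dayPrefix) {
        String[] prefixes = new String[NUM_BUCKETS];
        for (int i = 0; i < NUM_BUCKETS; i++) {
            prefixes[i] = String.format("%02d_%s", i, dayPrefix);
        }
        return prefixes;
    }
}
```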
Solution 2: Key Reversing
// Reverse domain-based RowKeys so pages from the same domain sort together
// Original: "www.google.com"  → Stored: "com.google.www"
// Original: "mail.google.com" → Stored: "com.google.mail"
// Timestamp reversing
long reverseTimestamp = Long.MAX_VALUE - System.currentTimeMillis();
String rowKey = userId + "_" + reverseTimestamp;
// Latest data is sorted first (most common query pattern)
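One detail the snippet above glosses over: HBase compares rowkeys as raw bytes, so the reverse timestamp must be zero-padded to a fixed width (Long.MAX_VALUE has 19 digits) or lexicographic order will not match numeric order. A runnable sketch:

```java
// Reverse-timestamp rowkey with fixed-width zero padding, so that the
// byte-wise sort order HBase uses matches numeric (newest-first) order.
class ReverseTimestampSketch {
    static String rowKey(String userId, long eventTimeMillis) {
        long reverse = Long.MAX_VALUE - eventTimeMillis;
        return userId + "_" + String.format("%019d", reverse); // 19-digit pad
    }
}
```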
Solution 3: Hashing
// Use first N characters of MD5 hash as prefix
String hashPrefix = DigestUtils.md5Hex(userId).substring(0, 4);
String rowKey = hashPrefix + "_" + userId + "_" + timestamp;
// Result: "a3f2_user001_20260308120000"
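The `DigestUtils` call above comes from Apache commons-codec; the same 4-character prefix can be computed with only the JDK's `MessageDigest`, as in this sketch:

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hash-prefix rowkeys using only the JDK: the first 2 bytes (4 hex chars)
// of the MD5 digest become a stable, evenly distributed prefix.
class HashPrefixSketch {
    static String hashPrefix(String userId) {
        try {
            byte[] digest = MessageDigest.getInstance("MD5")
                .digest(userId.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (int i = 0; i < 2; i++) {          // 2 bytes -> 4 hex chars
                hex.append(String.format("%02x", digest[i]));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 unavailable", e);
        }
    }

    static String rowKey(String userId, String timestamp) {
        return hashPrefix(userId) + "_" + userId + "_" + timestamp;
    }
}
```

Keeping the original userId in the key (after the prefix) preserves the ability to compute the full rowkey for point Gets, which pure hashing of the whole key would lose.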
RowKey Design Principles Summary
| Principle | Description |
|---|---|
| Keep it short | RowKey is repeatedly stored in every Cell, so length wastes storage |
| Consider read patterns | Design for the most frequent queries |
| Prevent hotspots | Avoid sequential/monotonically increasing keys |
| Reverse versioning | Use reverse timestamp to read latest data first |
| Composite key delimiter | Use _ or \x00 |
RowKey Examples by Use Case
# Time-series data (IoT sensors)
salt_deviceId_reverseTimestamp
Example: 03_sensor042_9223370449055775807
# User activity logs
userId_reverseTimestamp_activityType
Example: user001_9223370449055775807_login
# Web page crawling
reversedDomain_path_timestamp
Example: com.google_/search_20260308
# Messaging system
chatRoomId_reverseTimestamp_messageId
Example: room001_9223370449055775807_msg12345
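The time-series pattern above (salt_deviceId_reverseTimestamp) can be assembled into one helper. The bucket count and field widths here are illustrative choices, not HBase requirements:

```java
// Sketch of a composite time-series rowkey: salt bucket for write
// distribution, deviceId for locality, zero-padded reverse timestamp so
// the newest reading for a device sorts first.
class SensorRowKeySketch {
    static final int SALT_BUCKETS = 10; // illustrative

    static String rowKey(String deviceId, long readingTimeMillis) {
        int salt = (deviceId.hashCode() & 0x7fffffff) % SALT_BUCKETS;
        long reverseTs = Long.MAX_VALUE - readingTimeMillis;
        return String.format("%02d_%s_%019d", salt, deviceId, reverseTs);
    }
}
```

Salting by deviceId (rather than by the full key) keeps all readings for one device in a single bucket, so a per-device scan touches only one salt prefix.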
5. Essential HBase Shell Commands
Table Management
# Start HBase Shell
hbase shell
# Create table
create 'user_activity', \
{NAME => 'info', VERSIONS => 3, COMPRESSION => 'SNAPPY', BLOOMFILTER => 'ROW'}, \
{NAME => 'stats', VERSIONS => 1, COMPRESSION => 'SNAPPY', TTL => 2592000}
# List tables
list
# Table details
describe 'user_activity'
# Disable/drop table
disable 'user_activity'
drop 'user_activity'
# Alter table structure (must be disabled)
disable 'user_activity'
alter 'user_activity', {NAME => 'info', VERSIONS => 5}
alter 'user_activity', {NAME => 'logs'} # Add new CF
enable 'user_activity'
# Create pre-split table (prevent hotspots)
create 'events', 'data', SPLITS => ['10', '20', '30', '40', '50', '60', '70', '80', '90']
Data CRUD
# Put (insert/update)
put 'user_activity', 'user001', 'info:name', 'Kim YJ'
put 'user_activity', 'user001', 'info:email', 'yj@email.com'
put 'user_activity', 'user001', 'stats:login_count', '142'
# Get (single row query)
get 'user_activity', 'user001'
get 'user_activity', 'user001', {COLUMN => 'info:name'}
get 'user_activity', 'user001', {COLUMN => 'info:name', VERSIONS => 3}
get 'user_activity', 'user001', {TIMERANGE => [1709856000000, 1709942400000]}
# Scan (range scan)
scan 'user_activity'
scan 'user_activity', {LIMIT => 10}
scan 'user_activity', {STARTROW => 'user001', STOPROW => 'user010'}
scan 'user_activity', {COLUMNS => ['info:name', 'stats:login_count']}
scan 'user_activity', {FILTER => "SingleColumnValueFilter('info','name',=,'binary:Kim YJ')"}
# Delete
delete 'user_activity', 'user001', 'info:email' # Delete specific column
deleteall 'user_activity', 'user001' # Delete entire row
# Count (caution: slow on large datasets)
count 'user_activity'
count 'user_activity', INTERVAL => 100000
# Truncate
truncate 'user_activity'
Administrative Commands
# Cluster status
status
status 'detailed'
status 'simple'
# Region management
list_regions 'user_activity'
# Manual Region split
split 'user_activity', 'user500'
# Major Compaction (run manually, avoid peak hours)
major_compact 'user_activity'
# Flush
flush 'user_activity'
# Balancer status
balancer_enabled
balance_switch true
6. Java API Code Examples
Connection Setup
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.*;
import org.apache.hadoop.hbase.client.*;
import org.apache.hadoop.hbase.util.Bytes;
// Create Configuration
Configuration config = HBaseConfiguration.create();
config.set("hbase.zookeeper.quorum", "zk1,zk2,zk3");
config.set("hbase.zookeeper.property.clientPort", "2181");
// Connection (thread-safe, same lifetime as application)
Connection connection = ConnectionFactory.createConnection(config);
// Table (NOT thread-safe, close after use)
Table table = connection.getTable(TableName.valueOf("user_activity"));
CRUD Operations
// Put (write)
Put put = new Put(Bytes.toBytes("user001"));
put.addColumn(Bytes.toBytes("info"), Bytes.toBytes("name"), Bytes.toBytes("Kim YJ"));
put.addColumn(Bytes.toBytes("info"), Bytes.toBytes("email"), Bytes.toBytes("yj@email.com"));
put.addColumn(Bytes.toBytes("stats"), Bytes.toBytes("login_count"), Bytes.toBytes(142));
table.put(put);
// Batch Put (bulk writes)
List<Put> puts = new ArrayList<>();
for (int i = 0; i < 10000; i++) {
Put p = new Put(Bytes.toBytes(String.format("user%06d", i)));
p.addColumn(Bytes.toBytes("info"), Bytes.toBytes("name"),
Bytes.toBytes("User " + i));
puts.add(p);
}
table.put(puts);
// Get (read)
Get get = new Get(Bytes.toBytes("user001"));
get.addColumn(Bytes.toBytes("info"), Bytes.toBytes("name"));
Result result = table.get(get);
byte[] value = result.getValue(Bytes.toBytes("info"), Bytes.toBytes("name"));
System.out.println("Name: " + Bytes.toString(value));
// Scan (range scan)
Scan scan = new Scan();
scan.withStartRow(Bytes.toBytes("user001"));
scan.withStopRow(Bytes.toBytes("user100"));
scan.addColumn(Bytes.toBytes("info"), Bytes.toBytes("name"));
scan.setCaching(500); // Rows returned per RPC
scan.setBatch(10); // Max columns per Result (splits very wide rows into chunks)
try (ResultScanner scanner = table.getScanner(scan)) {
for (Result r : scanner) {
String rowKey = Bytes.toString(r.getRow());
String name = Bytes.toString(r.getValue(Bytes.toBytes("info"), Bytes.toBytes("name")));
System.out.println(rowKey + ": " + name);
}
}
// Delete
Delete delete = new Delete(Bytes.toBytes("user001"));
delete.addColumn(Bytes.toBytes("info"), Bytes.toBytes("email"));
table.delete(delete);
// Release resources
table.close();
connection.close();
BufferedMutator (Bulk Async Writes)
BufferedMutatorParams params = new BufferedMutatorParams(TableName.valueOf("user_activity"))
.writeBufferSize(5 * 1024 * 1024); // 5MB buffer
try (BufferedMutator mutator = connection.getBufferedMutator(params)) {
for (int i = 0; i < 1000000; i++) {
Put put = new Put(Bytes.toBytes(String.format("user%08d", i)));
put.addColumn(Bytes.toBytes("info"), Bytes.toBytes("name"),
Bytes.toBytes("User " + i));
mutator.mutate(put);
}
mutator.flush(); // Send remaining data in buffer
}
7. Read/Write Performance Optimization
Write Optimization
| Method | Description | Effect |
|---|---|---|
| Disable WAL | put.setDurability(Durability.SKIP_WAL) | Faster but risk of data loss |
| Batch Put | table.put(List<Put>) | Reduced network round trips |
| BufferedMutator | Async buffered writes | Optimal for bulk loading |
| Pre-split | Split Regions in advance at table creation | Prevent initial hotspots |
| BulkLoad | Generate HFiles directly via MapReduce | For massive data loading |
| Compression | Use SNAPPY, LZ4 | Reduced I/O |
Read Optimization
| Method | Description | Effect |
|---|---|---|
| Block Cache | LRU + Bucket Cache configuration | Cache frequently read blocks |
| Bloom Filter | ROW or ROWCOL level | Prevent unnecessary HFile reads |
| Scan Caching | scan.setCaching(500) | Reduced RPC calls |
| Column Specification | Query only needed columns | Reduced I/O |
| Coprocessor | Server-side processing | Reduced network traffic |
| Short-circuit Read | Direct local DataNode reads | Eliminate HDFS overhead |
BulkLoad Example
# 1. Convert CSV to HFile (MapReduce)
hbase org.apache.hadoop.hbase.mapreduce.ImportTsv \
-Dimporttsv.separator=',' \
-Dimporttsv.columns='HBASE_ROW_KEY,info:name,info:email,stats:login_count' \
-Dimporttsv.bulk.output='/tmp/hbase-bulkload' \
user_activity \
/input/users.csv
# 2. Load HFiles into HBase
hbase org.apache.hadoop.hbase.tool.LoadIncrementalHFiles \
/tmp/hbase-bulkload \
user_activity
8. Region Split and Compaction
Region Split
# Regions automatically split when reaching a certain size
# Default split policy: IncreasingToUpperBoundRegionSplitPolicy
# Split threshold calculation:
# min(r^3 * 2 * memstore.flush.size, hbase.hregion.max.filesize)
# r = number of Regions of the same table on the same RegionServer
# Manual configuration
hbase.hregion.max.filesize = 10737418240 # 10GB
# Split process:
# 1. Determine split point (midpoint)
# 2. Create two new Regions (daughters)
# 3. Deactivate the old Region
# 4. Update meta table
# 5. Activate new Regions
# 6. Clean up old Region (after compaction)
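The split-threshold formula can be checked numerically. This sketch assumes the defaults quoted in this section (128MB MemStore flush size, 10GB max filesize); it models the size check only, not the split itself:

```java
// Sketch of the IncreasingToUpperBoundRegionSplitPolicy size check:
// threshold = min(r^3 * 2 * flushSize, maxFileSize), where r is the number
// of Regions of the same table hosted on this RegionServer.
class SplitSizeSketch {
    static final long FLUSH_SIZE = 128L * 1024 * 1024;           // 128 MB
    static final long MAX_FILE_SIZE = 10L * 1024 * 1024 * 1024;  // 10 GB

    static long splitThreshold(int regionCount) {
        long r = regionCount;
        long increasing = r * r * r * 2 * FLUSH_SIZE; // grows cubically
        return Math.min(increasing, MAX_FILE_SIZE);   // capped at max filesize
    }
}
```

With these defaults the first region splits at 256MB, two regions at 2GB each, and from four regions on the 10GB cap applies, so young tables split early while mature tables settle at the configured maximum.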
Compaction
# Minor Compaction
# - Merges small HFiles into larger HFiles
# - Retains delete markers (tombstones)
# - Runs automatically at regular intervals
# Major Compaction
# - Merges all HFiles into a single HFile
# - Removes delete markers, expired versions
# - Heavy disk I/O → avoid peak hours
# - Default auto-run every 7 days
# Manual Major Compaction
major_compact 'user_activity'
# Disable automatic Major Compaction
# hbase-site.xml
# hbase.hregion.majorcompaction = 0
# → Run manually during off-peak hours via cron
<!-- hbase-site.xml Compaction configuration -->
<configuration>
<!-- Minor Compaction trigger threshold -->
<property>
<name>hbase.hstore.compactionThreshold</name>
<value>3</value> <!-- Minor Compaction when 3+ HFiles -->
</property>
<!-- Major Compaction interval (0 = disabled) -->
<property>
<name>hbase.hregion.majorcompaction</name>
<value>604800000</value> <!-- 7 days, set to 0 to disable -->
</property>
<!-- MemStore flush size -->
<property>
<name>hbase.hregion.memstore.flush.size</name>
<value>134217728</value> <!-- 128MB -->
</property>
</configuration>
9. Monitoring and Management
Key Monitoring Metrics
| Category | Metric | Alert Threshold |
|---|---|---|
| RegionServer | GC Pause Time | > 5 seconds |
| RegionServer | Heap Used % | > 80% |
| RegionServer | Compaction Queue Size | > 10 (sustained) |
| RegionServer | MemStore Size | 90% of flush threshold |
| Region | Request Count (R/W) | Concentrated on specific Region |
| Region | Store File Count | > 10 (Compaction needed) |
| HMaster | Dead RegionServers | > 0 |
| HMaster | RIT (Regions in Transition) | > 0 (sustained) |
HBase Web UI
# HMaster UI: http://hmaster:16010
# RegionServer UI: http://regionserver:16030
# Check JMX metrics
curl http://regionserver:16030/jmx?qry=Hadoop:service=HBase,name=RegionServer,sub=Server
Operations Script
#!/bin/bash
# HBase cluster status check script
echo "=== HBase Cluster Status ==="
echo "status" | hbase shell 2>/dev/null | grep -E "servers|dead|regions"
echo ""
echo "=== Region Distribution ==="
echo "status 'detailed'" | hbase shell 2>/dev/null | grep -E "regionserver|regions="
echo ""
echo "=== Table Sizes ==="
for table in $(echo "list" | hbase shell 2>/dev/null | grep -v "TABLE\|row(s)"); do
size=$(hdfs dfs -du -s -h /hbase/data/default/$table 2>/dev/null | awk '{print $1 $2}')
echo "$table: $size"
done
10. HBase vs Cassandra vs MongoDB
| Item | HBase | Cassandra | MongoDB |
|---|---|---|---|
| Data Model | Column-Family | Wide Column | Document (JSON) |
| Consistency | Strong consistency | Tunable (AP by default) | Tunable |
| Scalability | Horizontal scaling | Horizontal scaling (excellent) | Horizontal scaling (Sharding) |
| Query Language | Scan/Get API | CQL (SQL-like) | MQL (JSON-like) |
| Secondary Index | Limited | Built-in support | Rich indexing |
| JOIN | Not supported | Not supported | $lookup (limited) |
| Operational Complexity | High (requires HDFS, ZK) | Medium | Low |
| Suitable Workloads | Large-scale sequential scans + random reads | High write throughput | Flexible schema, CRUD |
| Ecosystem | Hadoop (Hive, Spark) | Independent | Atlas, Realm |
| Max Data Scale | PB-scale | PB-scale | TB~PB-scale |
11. Troubleshooting
RegionServer Failure
# Check RegionServer status
echo "status" | hbase shell
# Check specific RegionServer logs
tail -f /var/log/hbase/hbase-hbase-regionserver-hostname.log
# Reassign a stuck Region (HBase shell)
assign 'ENCODED_REGION_NAME'
# hbck2 (HBase 2.x)
hbase hbck -j /path/to/hbase-hbck2.jar assigns <encoded-region-name>
GC Tuning
# hbase-env.sh
export HBASE_REGIONSERVER_OPTS="
-Xmx32g -Xms32g
-XX:+UseG1GC
-XX:MaxGCPauseMillis=100
-XX:+ParallelRefProcEnabled
-XX:G1HeapRegionSize=16m
-XX:InitiatingHeapOccupancyPercent=65
-verbose:gc
-XX:+PrintGCDetails
-XX:+PrintGCDateStamps
-Xloggc:/var/log/hbase/gc-regionserver.log
"
Hotspot Region Isolation
# Check request count per Region
echo "status 'detailed'" | hbase shell | grep -A2 "requestsPerSecond"
# Manual split of hotspot Region
split 'ENCODED_REGION_NAME', 'split_key'
# Or table-level split
split 'user_activity', 'user500000'
RIT (Regions in Transition) Resolution
# Check RIT status
echo "status 'detailed'" | hbase shell | grep "transition"
# Force assignment
hbase hbck -j /path/to/hbase-hbck2.jar assigns <region-encoded-name>
# Repair meta table
hbase hbck -j /path/to/hbase-hbck2.jar fixMeta
12. Operations Checklists
Initial Design Checklist
- RowKey design: Apply hotspot prevention strategies (Salting, Hashing)
- Minimize Column Family count (2~3 or fewer recommended)
- TTL configuration: Automatically delete unnecessary data
- VERSIONS configuration: Keep only needed version count
- Pre-split: Split Regions according to expected data distribution
- Bloom Filter: Configure ROW or ROWCOL
- Compression: Configure SNAPPY or LZ4
- Evaluate need for Coprocessors
Daily Operations Checklist
- Monitor RegionServer heap usage
- Check Compaction Queue size
- Check GC Pause time (alert if > 5 seconds)
- Check for dead RegionServers
- Check for Region hotspots
- Check HDFS capacity
- Check ZooKeeper status
Regular Inspection Checklist
- Run Major Compaction during off-peak hours
- Run hbase hbck to verify table integrity
- Check Region balancing status
- Analyze disk usage trends per table
- Analyze and tune GC logs
- Test backup/recovery (Snapshot, ExportSnapshot)
- Check HBase, Hadoop security patches
Conclusion
HBase is a powerful tool for scenarios requiring low-latency processing of massive datasets. However, it requires a completely different mindset from RDBMS.
Key Takeaways:
- RowKey is everything: Hotspot prevention and design aligned with read patterns are the key
- Keep Column Families minimal: As physical storage units, keep to 2~3 or fewer
- Compaction management: Run Major Compaction manually during off-peak hours
- Monitoring is essential: Focus on GC Pause, Region distribution, and Compaction Queue
- Separate read/write patterns: Use BulkLoad for bulk writes, Block Cache + Bloom Filter for real-time reads