HBase Row Key Design

Background

Row key design has a large effect on HBase performance. HBase divides a table into regions by row key range, so a poorly designed row key prefix sends most traffic to a few regions (hot spots) and degrades performance significantly. One way to prevent this is to place a salt at the very beginning of the row key so that rows are spread across different regions.

For example, suppose you store messages with the row key `send date:send time:message_id`. Messages sent around the same time share a prefix, so the following keys sort next to each other and are handled by the same region server, which degrades performance.

230611:063031:1231231

230611:063032:1231232

230611:063032:1231233

230611:063033:1231234

230611:063033:1231235

What if we put the message_id at the front?

1231231:230611:063031

1231232:230611:063032

1231233:230611:063032

1231234:230611:063033

1231235:230611:063033

This would also cause writes to concentrate on a single region due to the sequentially increasing message_id.
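The effect of both designs can be sketched with a small simulation. This is only an illustration, not HBase code: the split points in `SPLIT_POINTS` and the helper `region_for` are hypothetical, and the sketch assumes that each region owns a contiguous, lexicographically sorted key range (start key inclusive, end key exclusive).

```python
from bisect import bisect_right

# Hypothetical split points for illustration: 5 regions covering
# [.., "2"), ["2", "4"), ["4", "6"), ["6", "8"), ["8", ..).
SPLIT_POINTS = ["2", "4", "6", "8"]


def region_for(row_key: str) -> int:
    """Return the index of the region whose key range contains row_key."""
    return bisect_right(SPLIT_POINTS, row_key)


date_first = [
    "230611:063031:1231231",
    "230611:063032:1231232",
    "230611:063032:1231233",
    "230611:063033:1231234",
    "230611:063033:1231235",
]
id_first = [
    "1231231:230611:063031",
    "1231232:230611:063032",
    "1231233:230611:063032",
    "1231234:230611:063033",
    "1231235:230611:063033",
]

date_regions = {region_for(key) for key in date_first}
id_regions = {region_for(key) for key in id_first}

print(date_regions)  # {1}: every date-first key lands in the same region
print(id_regions)    # {0}: every id-first key lands in the same region
```

Under these assumed split points, each design puts all five rows into one region, which is the hot spot described above.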

A good way to prevent this is to prefix the row key with a salt derived from a well-distributed hash value.

With salt added, the row key structure would become `salt:send date:send time:message_id`.

The salt is the return value of a hash function applied to another key. A hash function's output has a fixed length and is effectively random, which helps distribute rows evenly across regions.

Of the commonly used hash functions, such as SHA-1, SHA-256, and MD5, let's use MD5.

Using MD5_function(message_id) as the salt, the row keys would look like this:

8D4646EB2D7067126EB08ADB0672F7BB:230611:063031:1231231

715782C59C0561E9B6CE0F3D522C32F1:230611:063032:1231232

57F962C03EF3526EC6E95CEB50785C4C:230611:063032:1231233

8B353D5CC07E13577608711F4602FCB7:230611:063033:1231234

430EDB0C535BF08174E122EFECFA711D:230611:063033:1231235
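A minimal sketch of how such a key could be built, assuming the message_id is hashed as a UTF-8 string and the digest is written as uppercase hex to match the examples above. The function name `salted_row_key` and its parameters are hypothetical, not part of any HBase API.

```python
import hashlib


def salted_row_key(send_date: str, send_time: str, message_id: str) -> str:
    """Build `salt:send date:send time:message_id` with salt = MD5(message_id)."""
    # Assumption: the message_id is hashed as UTF-8 text and rendered as
    # 32 uppercase hex characters.
    salt = hashlib.md5(message_id.encode("utf-8")).hexdigest().upper()
    return f"{salt}:{send_date}:{send_time}:{message_id}"


row_key = salted_row_key("230611", "063031", "1231231")
print(row_key)
```

Because the salt is derived only from the message_id, a reader that knows the message_id can recompute the same prefix when looking up a row.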

Since the prefixes are no longer sequential, we can expect consecutive messages to be scattered across different regions, and therefore across different region servers. Load is then balanced among the HBase region servers, which improves overall performance.
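The scattering can be checked with a rough simulation. This sketch assumes a table pre-split into 16 regions on the first hex character of the salt; the split points, the fixed date and time, and the helper names are all hypothetical and chosen only for illustration.

```python
import hashlib
from bisect import bisect_right
from collections import Counter

# Hypothetical pre-split: 16 regions, one per leading hex character.
SPLIT_POINTS = list("123456789ABCDEF")


def region_for(row_key: str) -> int:
    """Return the index of the region whose key range contains row_key."""
    return bisect_right(SPLIT_POINTS, row_key)


def salted_row_key(message_id: int) -> str:
    """Prefix the key with the uppercase hex MD5 of the message_id."""
    salt = hashlib.md5(str(message_id).encode("utf-8")).hexdigest().upper()
    # The date and time are fixed here because only the prefix decides the region.
    return f"{salt}:230611:063031:{message_id}"


# 1,000 sequentially increasing message ids, as in the unsalted examples.
counts = Counter(
    region_for(salted_row_key(message_id))
    for message_id in range(1231231, 1231231 + 1000)
)

for region, rows in sorted(counts.items()):
    print(f"region {region:2d}: {rows} rows")
```

With sequential ids as input, the rows should spread over the regions in roughly equal shares instead of piling up in one of them.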
