How to Use HBase YCSB (Yahoo Cloud Serving Benchmark)

Overview

Sometimes you want to push data into HBase to measure the maximum performance of a cluster, or simply to fill HBase with dummy data. For both cases, I would like to introduce a tool called YCSB (Yahoo Cloud Serving Benchmark).

Installation

Install Maven

To build YCSB from source code, Maven version 3.x or higher is required. Install it with the following command and then check the Maven version. In my case, I used Maven 3.6.3.

sudo apt install maven

mvn -version

Git Clone

Clone the project from [YCSB-github](https://github.com/brianfrankcooper/YCSB/), for example with `git clone https://github.com/brianfrankcooper/YCSB.git`.

Maven Build

cd YCSB

mvn clean package

Install Python 2.7

The `bin/ycsb` launcher script is written for Python 2.x. Therefore, set up a virtualenv that runs Python 2.7 and activate it.

pip install virtualenv

virtualenv py27 --python=python2.7

source py27/bin/activate

Create a Table in HBase

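Before running the benchmark, create a table named `usertable` with a column family named `family` in HBase, following the [YCSB HBase2](https://github.com/brianfrankcooper/YCSB/tree/master/hbase2) documentation. The session below pre-splits the table so that the load is spread across regions from the start.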

hbase:001:0> n_splits = 50

=> 50

hbase:002:0> create 'usertable', 'family', {SPLITS => (1..n_splits).map {|i| "user#{1000+i*(9999-1000)/n_splits}"}}

2022-11-19 12:29:26,454 INFO [RPCClient-NioEventLoopGroup-1-1] Configuration.deprecation (Configuration.java:logDeprecation(1394)) - hbase.client.pause.cqtbe is deprecated. Instead, use hbase.client.pause.server.overloaded

Created table usertable

2022-11-19 12:29:30,765 INFO [RPCClient-NioEventLoopGroup-1-2] client.AsyncHBaseAdmin (RawAsyncHBaseAdmin.java:onFinished(2569)) - Operation: CREATE, Table Name: default:usertable completed

Took 4.7173 seconds

=> Hbase::Table - usertable
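To see what the `SPLITS` expression produces, the same arithmetic can be reproduced outside the HBase shell. The sketch below is a Bash function (`split_keys` is a name introduced here for illustration, not part of HBase or YCSB) that prints the split keys for a given `n_splits`. It relies on Bash integer arithmetic truncating the same way Ruby integer division does for positive numbers.

```shell
#!/usr/bin/env bash
# Print the split keys generated by the HBase shell expression
#   "user#{1000+i*(9999-1000)/n_splits}"  for i in 1..n_splits
# split_keys is a hypothetical helper name used only for this illustration.
split_keys() {
  local n_splits="$1"
  local i
  for ((i = 1; i <= n_splits; i++)); do
    # Integer division: the fractional part is dropped, as in Ruby.
    echo "user$((1000 + i * (9999 - 1000) / n_splits))"
  done
}

# First three split keys for n_splits = 50:
split_keys 50 | head -n 3   # prints user1179, user1359, user1539
```

With `n_splits = 50` this yields 50 split keys, from `user1179` to `user9999`, so the table starts out with 51 regions.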

YCSB Folder

YCSB needs the cluster's hbase-site.xml on its classpath to connect to HBase, so create a new folder to hold it.

Then log in to one of the servers in the HBase cluster, copy the contents of `${HBASE_HOME}/conf/hbase-site.xml`, and paste them into a file of the same name in this folder. You can also copy the file with scp.

/YCSB$ mkdir youngju-hbase

/YCSB$ cd youngju-hbase/

/YCSB/youngju-hbase$ vim hbase-site.xml

root@ubuntu01:~# cat /usr/local/hbase/conf/hbase-site.xml

<?xml version="1.0"?>

<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
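Before running a benchmark, it can help to confirm that the copied file actually landed in the folder. The following is a minimal sketch, assuming the cluster is located through ZooKeeper and therefore declares `hbase.zookeeper.quorum` in hbase-site.xml. The function name `check_hbase_site` is made up for this example, and the check is a plain text match rather than a real XML validation.

```shell
#!/usr/bin/env bash
# Check that a directory holds an hbase-site.xml that names a ZooKeeper quorum.
# Usage: check_hbase_site DIR
# check_hbase_site is a hypothetical helper, not part of YCSB or HBase.
check_hbase_site() {
  local conf="$1/hbase-site.xml"
  if [ ! -f "$conf" ]; then
    echo "missing: $conf"
    return 1
  fi
  # Assumption: the cluster is found via ZooKeeper, so this property is set.
  if ! grep -q '<name>hbase.zookeeper.quorum</name>' "$conf"; then
    echo "no hbase.zookeeper.quorum property in $conf"
    return 1
  fi
  echo "ok: $conf"
}

# Run from the YCSB home directory against the folder created above.
check_hbase_site youngju-hbase || true
```

If the property is absent, the HBase client falls back to its built-in defaults, so a failed check here is a hint to look at the file rather than proof of a broken setup.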

Run YCSB

From the YCSB home directory, run `bin/ycsb load hbase2 -P workloads/workloada -cp youngju-hbase/ -p table=usertable -p columnfamily=family`. This loads 1000 records, as shown below. Make sure the Python 2.7 virtual environment is activated so that the launcher script runs without issues.

(py27) YCSB$ bin/ycsb load hbase2 -P workloads/workloada -cp youngju-hbase/ -p table=usertable -p columnfamily=family

[WARN] Running against a source checkout. In order to get our runtime dependencies we'll have to invoke Maven. Depending on the state of your system, this may take ~30-45 seconds

[DEBUG] Running 'mvn -pl site.ycsb:hbase2-binding -am package -DskipTests dependency:build-classpath -DincludeScope=compile -Dmdep.outputFilterFile=true'

java -cp youngju-hbase/:/home/youngju/work/YCSB/hbase2/conf:/home/youngju/work/YCSB/hbase2/target/hbase2-binding-0.18.0-SNAPSHOT.jar:/home/youngju/.m2/repository/org/apache/htrace/htrace-core4/4.1.0-incubating/htrace-core4-4.1.0-incubating.jar:/home/youngju/.m2/repository/org/slf4j/slf4j-api/1.7.25/slf4j-api-1.7.25.jar:/home/youngju/.m2/repository/commons-logging/commons-logging/1.2/commons-logging-1.2.jar:/home/youngju/.m2/repository/org/apache/yetus/audience-annotations/0.5.0/audience-annotations-0.5.0.jar:/home/youngju/.m2/repository/com/github/stephenc/findbugs/findbugs-annotations/1.3.9-1/findbugs-annotations-1.3.9-1.jar:/home/youngju/.m2/repository/org/hdrhistogram/HdrHistogram/2.1.4/HdrHistogram-2.1.4.jar:/home/youngju/.m2/repository/log4j/log4j/1.2.17/log4j-1.2.17.jar:/home/youngju/.m2/repository/org/codehaus/jackson/jackson-mapper-asl/1.9.4/jackson-mapper-asl-1.9.4.jar:/home/youngju/.m2/repository/org/apache/hbase/hbase-shaded-client/2.2.3/hbase-shaded-client-2.2.3.jar:/home/youngju/.m2/repository/org/slf4j/slf4j-log4j12/1.7.25/slf4j-log4j12-1.7.25.jar:/home/youngju/.m2/repository/org/codehaus/jackson/jackson-core-asl/1.9.4/jackson-core-asl-1.9.4.jar:/home/youngju/work/YCSB/core/target/core-0.18.0-SNAPSHOT.jar site.ycsb.Client -db site.ycsb.db.hbase2.HBaseClient2 -P workloads/workloada -p table=usertable -p columnfamily=family -load

Command line: -db site.ycsb.db.hbase2.HBaseClient2 -P workloads/workloada -p table=usertable -p columnfamily=family -load

YCSB Client 0.18.0-SNAPSHOT

Loading workload...

log4j:WARN No appenders could be found for logger (org.apache.htrace.core.Tracer).

log4j:WARN Please initialize the log4j system properly.

log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.

Starting test.

DBWrapper: report latency for each error is false and specific error codes to track for latency are: []

[OVERALL], RunTime(ms), 11783

[OVERALL], Throughput(ops/sec), 84.86803021301876

[TOTAL_GCS_PS_Scavenge], Count, 2

[TOTAL_GC_TIME_PS_Scavenge], Time(ms), 10

[TOTAL_GC_TIME_%_PS_Scavenge], Time(%), 0.08486803021301877

[TOTAL_GCS_PS_MarkSweep], Count, 1

[TOTAL_GC_TIME_PS_MarkSweep], Time(ms), 12

[TOTAL_GC_TIME_%_PS_MarkSweep], Time(%), 0.10184163625562251

[TOTAL_GCs], Count, 3

[TOTAL_GC_TIME], Time(ms), 22

[TOTAL_GC_TIME_%], Time(%), 0.18670966646864126

[CLEANUP], Operations, 2

[CLEANUP], AverageLatency(us), 4844.5

[CLEANUP], MinLatency(us), 13

[CLEANUP], MaxLatency(us), 9679

[CLEANUP], 95thPercentileLatency(us), 9679

[CLEANUP], 99thPercentileLatency(us), 9679

[INSERT], Operations, 1000

[INSERT], AverageLatency(us), 9895.29

[INSERT], MinLatency(us), 4144

[INSERT], MaxLatency(us), 1103871

[INSERT], 95thPercentileLatency(us), 14295

[INSERT], 99thPercentileLatency(us), 23871

[INSERT], Return=OK, 1000
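The `[OVERALL], Throughput(ops/sec)` figure is the number of operations divided by the runtime in seconds, which is easy to verify by hand. The sketch below recomputes it with awk from the numbers printed above. The function name `throughput` is introduced here only for illustration.

```shell
#!/usr/bin/env bash
# Recompute throughput (ops/sec) from an operation count and RunTime(ms).
# Usage: throughput OPERATIONS RUNTIME_MS
# throughput is a hypothetical helper name used only in this example.
throughput() {
  local ops="$1" runtime_ms="$2"
  awk -v ops="$ops" -v ms="$runtime_ms" \
    'BEGIN { printf "%.2f\n", ops / (ms / 1000) }'
}

# Load phase above: 1000 inserts in 11783 ms.
throughput 1000 11783   # prints 84.87
```

The result matches the reported 84.868 ops/sec after rounding. Note that the runtime includes client startup and cleanup, so a load of only 1000 records says little about sustained cluster throughput.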

Next, change only `load` to `run` and execute `bin/ycsb run hbase2 -P workloads/workloada -cp youngju-hbase/ -p table=usertable -p columnfamily=family`. With `workloada`, this performs a roughly even mix of READ and UPDATE operations, as shown below.

(py27) YCSB$ bin/ycsb run hbase2 -P workloads/workloada -cp youngju-hbase/ -p table=usertable -p columnfamily=family

[WARN] Running against a source checkout. In order to get our runtime dependencies we'll have to invoke Maven. Depending on the state of your system, this may take ~30-45 seconds

[DEBUG] Running 'mvn -pl site.ycsb:hbase2-binding -am package -DskipTests dependency:build-classpath -DincludeScope=compile -Dmdep.outputFilterFile=true'

java -cp youngju-hbase/:/home/youngju/work/YCSB/hbase2/conf:/home/youngju/work/YCSB/hbase2/target/hbase2-binding-0.18.0-SNAPSHOT.jar:/home/youngju/.m2/repository/org/apache/htrace/htrace-core4/4.1.0-incubating/htrace-core4-4.1.0-incubating.jar:/home/youngju/.m2/repository/org/slf4j/slf4j-api/1.7.25/slf4j-api-1.7.25.jar:/home/youngju/.m2/repository/commons-logging/commons-logging/1.2/commons-logging-1.2.jar:/home/youngju/.m2/repository/org/apache/yetus/audience-annotations/0.5.0/audience-annotations-0.5.0.jar:/home/youngju/.m2/repository/com/github/stephenc/findbugs/findbugs-annotations/1.3.9-1/findbugs-annotations-1.3.9-1.jar:/home/youngju/.m2/repository/org/hdrhistogram/HdrHistogram/2.1.4/HdrHistogram-2.1.4.jar:/home/youngju/.m2/repository/log4j/log4j/1.2.17/log4j-1.2.17.jar:/home/youngju/.m2/repository/org/codehaus/jackson/jackson-mapper-asl/1.9.4/jackson-mapper-asl-1.9.4.jar:/home/youngju/.m2/repository/org/apache/hbase/hbase-shaded-client/2.2.3/hbase-shaded-client-2.2.3.jar:/home/youngju/.m2/repository/org/slf4j/slf4j-log4j12/1.7.25/slf4j-log4j12-1.7.25.jar:/home/youngju/.m2/repository/org/codehaus/jackson/jackson-core-asl/1.9.4/jackson-core-asl-1.9.4.jar:/home/youngju/work/YCSB/core/target/core-0.18.0-SNAPSHOT.jar site.ycsb.Client -db site.ycsb.db.hbase2.HBaseClient2 -P workloads/workloada -p table=usertable -p columnfamily=family -t

Command line: -db site.ycsb.db.hbase2.HBaseClient2 -P workloads/workloada -p table=usertable -p columnfamily=family -t

YCSB Client 0.18.0-SNAPSHOT

Loading workload...

log4j:WARN No appenders could be found for logger (org.apache.htrace.core.Tracer).

log4j:WARN Please initialize the log4j system properly.

log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.

Starting test.

DBWrapper: report latency for each error is false and specific error codes to track for latency are: []

[OVERALL], RunTime(ms), 10020

[OVERALL], Throughput(ops/sec), 99.8003992015968

[TOTAL_GCS_PS_Scavenge], Count, 2

[TOTAL_GC_TIME_PS_Scavenge], Time(ms), 14

[TOTAL_GC_TIME_%_PS_Scavenge], Time(%), 0.13972055888223553

[TOTAL_GCS_PS_MarkSweep], Count, 1

[TOTAL_GC_TIME_PS_MarkSweep], Time(ms), 17

[TOTAL_GC_TIME_%_PS_MarkSweep], Time(%), 0.16966067864271456

[TOTAL_GCs], Count, 3

[TOTAL_GC_TIME], Time(ms), 31

[TOTAL_GC_TIME_%], Time(%), 0.3093812375249501

[READ], Operations, 485

[READ], AverageLatency(us), 8099.581443298969

[READ], MinLatency(us), 2850

[READ], MaxLatency(us), 197247

[READ], 95thPercentileLatency(us), 25519

[READ], 99thPercentileLatency(us), 49439

[READ], Return=OK, 485

[CLEANUP], Operations, 2

[CLEANUP], AverageLatency(us), 2581.5

[CLEANUP], MinLatency(us), 9

[CLEANUP], MaxLatency(us), 5155

[CLEANUP], 95thPercentileLatency(us), 5155

[CLEANUP], 99thPercentileLatency(us), 5155

[UPDATE], Operations, 515

[UPDATE], AverageLatency(us), 10536.794174757282

[UPDATE], MinLatency(us), 4048

[UPDATE], MaxLatency(us), 226815

[UPDATE], 95thPercentileLatency(us), 35359

[UPDATE], 99thPercentileLatency(us), 66687

[UPDATE], Return=OK, 515
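Because every summary line has the form `[SECTION], Metric, Value`, the results are easy to pick apart with standard tools when comparing runs. The following is a small sketch that assumes the YCSB output has been saved to a file. The helper name `ycsb_metric` is an invention for this example, and the sample lines are copied from the run output above.

```shell
#!/usr/bin/env bash
# Extract one value from YCSB's "[SECTION], Metric, Value" summary lines.
# Usage: ycsb_metric SECTION METRIC FILE
# ycsb_metric is a hypothetical helper, not part of YCSB.
ycsb_metric() {
  awk -F', ' -v section="[$1]" -v metric="$2" \
    '$1 == section && $2 == metric { print $3 }' "$3"
}

# Sample lines copied from the run output above. In practice, redirect the
# output of bin/ycsb to a file and pass that file instead.
log=$(mktemp)
cat > "$log" <<'EOF'
[OVERALL], Throughput(ops/sec), 99.8003992015968
[READ], Operations, 485
[READ], 99thPercentileLatency(us), 49439
[UPDATE], Operations, 515
[UPDATE], 99thPercentileLatency(us), 66687
EOF

reads=$(ycsb_metric READ Operations "$log")
updates=$(ycsb_metric UPDATE Operations "$log")
echo "total ops: $((reads + updates))"
echo "read p99 (us): $(ycsb_metric READ "99thPercentileLatency(us)" "$log")"
```

For the run above, the 485 reads and 515 updates add up to the 1000 operations of the workload, which is the roughly 50/50 split expected from `workloada`.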

Summary

[YCSB](https://github.com/brianfrankcooper/YCSB/) is a useful tool for benchmarking HBase 1.x, 2.x, and 3.x. Although I have not tried them, it also supports performance testing for various other databases such as Redis, DynamoDB, Elasticsearch, Cassandra, and MongoDB. In real-world scenarios, however, no one knows in advance what data will arrive or in what pattern, so you should not rely solely on YCSB results. In particular, HBase can suffer from hotspots: if requests are concentrated on a single region server, the overall QPS drops. For a detailed look at HBase architecture and row-key design, refer to [Cho Daehyeop's post on HBase and Google's Bigtable Architecture](https://bcho.tistory.com/1217).

