How to Build Hadoop 3.4 (Ubuntu 22.04)

Overview

Hadoop is such a massive piece of software that even setting up the build environment can be challenging.

The [official Hadoop build guide](https://github.com/apache/hadoop/blob/trunk/BUILDING.txt) provides detailed instructions on how to build it.

The guide covers building on Linux, CentOS, macOS, and Windows, but building through Docker is the recommended route.

You can set up a build environment directly on a physical Linux machine, but restoring that machine to its previous state afterwards is difficult. The build is also sensitive to software versions, so you may have to downgrade packages that are already working well on your machine.

Using a Docker container instead dramatically reduces the time and effort needed to set up the build environment. The steps are documented below.

Install Docker

    sudo apt-get update
    sudo apt-get install \
        ca-certificates \
        curl \
        gnupg \
        lsb-release

    sudo mkdir -p /etc/apt/keyrings
    curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --dearmor -o /etc/apt/keyrings/docker.gpg

    echo \
      "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.gpg] https://download.docker.com/linux/ubuntu \
      $(lsb_release -cs) stable" | sudo tee /etc/apt/sources.list.d/docker.list > /dev/null

    sudo apt-get update
    sudo apt-get install docker-ce docker-ce-cli containerd.io docker-compose-plugin
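To see what the `echo ... | sudo tee` step writes to `/etc/apt/sources.list.d/docker.list`, the sketch below rebuilds the same entry from an architecture and a release codename. `docker_repo_line` is a hypothetical helper name used only for illustration; on Ubuntu 22.04, `lsb_release -cs` reports the codename `jammy`.

```shell
#!/bin/sh
# Sketch: print the APT source entry for Docker's Ubuntu repository.
# docker_repo_line is a hypothetical helper, not part of Docker's tooling.
docker_repo_line() {
    arch="$1"       # e.g. the output of: dpkg --print-architecture
    codename="$2"   # e.g. the output of: lsb_release -cs
    printf 'deb [arch=%s signed-by=/etc/apt/keyrings/docker.gpg] https://download.docker.com/linux/ubuntu %s stable\n' \
        "$arch" "$codename"
}

# Example for an amd64 machine running Ubuntu 22.04 ("jammy").
docker_repo_line amd64 jammy
```

Comparing this output with `cat /etc/apt/sources.list.d/docker.list` is a quick way to confirm that the repository was registered for the right release.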

Verify Docker Installation

sudo docker run hello-world

If you see the following message, the installation was successful.

    Hello from Docker!
    This message shows that your installation appears to be working correctly.

    To generate this message, Docker took the following steps:
     1. The Docker client contacted the Docker daemon.
     2. The Docker daemon pulled the "hello-world" image from the Docker Hub.
        (amd64)
     3. The Docker daemon created a new container from that image which runs the
        executable that produces the output you are currently reading.
     4. The Docker daemon streamed that output to the Docker client, which sent it
        to your terminal.

    To try something more ambitious, you can run an Ubuntu container with:
     $ docker run -it ubuntu bash

    Share images, automate workflows, and more with a free Docker ID:
     https://hub.docker.com/

    For more examples and ideas, visit:
     https://docs.docker.com/get-started/
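If you would rather check the result in a script than read the message by eye, a minimal sketch is to search the output for the greeting line. `contains_docker_greeting` is a hypothetical helper name; the demonstration feeds it canned text so it runs without Docker.

```shell
#!/bin/sh
# Sketch: succeed only if stdin contains Docker's hello-world greeting.
# contains_docker_greeting is a hypothetical helper name.
contains_docker_greeting() {
    # -q: print nothing, report via exit status; -F: match the string literally.
    grep -qF 'Hello from Docker!'
}

# Demonstration on canned text. With Docker installed, pipe real output instead:
#   sudo docker run hello-world | contains_docker_greeting
if printf 'Hello from Docker!\nThis message shows ...\n' | contains_docker_greeting; then
    echo "docker OK"
else
    echo "docker check failed"
fi
```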

Setting Up the Hadoop Build Environment

After cloning the trunk branch from the official Hadoop GitHub repository, running `sudo ./start-build-env.sh` in that directory will automatically set up the build environment for Hadoop.

    git clone https://github.com/apache/hadoop.git
    cd hadoop
    sudo ./start-build-env.sh

If the build environment is set up successfully, the following text will appear in the terminal.

    Successfully built 147e63abcbef
    Successfully tagged hadoop-build-1000:latest

     _   _           _                    ______
    | | | |         | |                   |  _  \
    | |_| | __ _  __| | ___   ___  _ __   | | | |_____   __
    |  _  |/ _` |/ _` |/ _ \ / _ \| '_ \  | | | / _ \ \ / /
    | | | | (_| | (_| | (_) | (_) | |_) | | |/ /  __/\ V /
    \_| |_/\__,_|\__,_|\___/ \___/| .__/  |___/ \___| \_(_)
                                  | |
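The `Successfully tagged hadoop-build-1000:latest` line names the image that was just built. The sketch below derives that tag so you can look the image up later. It assumes the numeric suffix is the UID of the user who invoked the script (1000 in the log above); that is inferred from the output, not stated in this post. `build_image_tag` is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: derive the build image tag seen in the "Successfully tagged" line.
# Assumption: the suffix is the invoking user's UID (1000 in the log above).
build_image_tag() {
    uid="$1"
    printf 'hadoop-build-%s:latest\n' "$uid"
}

# sudo sets SUDO_UID to the UID of the invoking user; otherwise use the current UID.
tag="$(build_image_tag "${SUDO_UID:-$(id -u)}")"
echo "expected image: $tag"
```

With Docker installed, `sudo docker image inspect "$tag"` exits with a non-zero status if no such image exists.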

Build Hadoop

Run the following commands in the build environment shell to generate the source and binary distributions.

    export MAVEN_OPTS="-Xms256m -Xmx1536m"
    sudo JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64 mvn package -Pdist,src -DskipTests -Dtar

You can change the build options as needed. The options listed in the official build guide are shown below.

    Building distributions:

    Create binary distribution without native code and without Javadocs:

      $ mvn package -Pdist -DskipTests -Dtar -Dmaven.javadoc.skip=true

    Create binary distribution with native code:

      $ mvn package -Pdist,native -DskipTests -Dtar

    Create source distribution:

      $ mvn package -Psrc -DskipTests

    Create source and binary distributions with native code:

      $ mvn package -Pdist,native,src -DskipTests -Dtar

    Create a local staging version of the website (in /tmp/hadoop-site):

      $ mvn site site:stage -Preleasedocs,docs -DstagingDirectory=/tmp/hadoop-site
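The packaging variants above differ mainly in the `-P` profile list, and in the examples `-Dtar` appears whenever the `dist` profile is selected. The sketch below composes the command line from a profile list following that pattern. `hadoop_package_cmd` is a hypothetical helper, and the rule for `-Dtar` is read off the examples rather than taken from Maven or Hadoop documentation. It only prints the command and does not run Maven.

```shell
#!/bin/sh
# Sketch: print the "mvn package" command for a comma-separated profile list.
# hadoop_package_cmd is a hypothetical helper; it does not invoke Maven.
hadoop_package_cmd() {
    profiles="$1"
    cmd="mvn package -P${profiles} -DskipTests"
    # Add -Dtar only when "dist" is one of the profiles, as in the examples above.
    case ",${profiles}," in
        *,dist,*) cmd="${cmd} -Dtar" ;;
    esac
    printf '%s\n' "$cmd"
}

hadoop_package_cmd dist,native   # binary distribution with native code
hadoop_package_cmd src           # source distribution only
```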

If the build succeeds, Maven prints a BUILD SUCCESS message as shown below. The build took about 23 minutes on my laptop.

    [INFO] Apache Hadoop Client Packaging Integration Tests ... SUCCESS [  3.600 s]
    [INFO] Apache Hadoop Distribution ......................... SUCCESS [ 25.185 s]
    [INFO] Apache Hadoop Client Modules ....................... SUCCESS [  0.024 s]
    [INFO] Apache Hadoop Tencent COS Support .................. SUCCESS [  4.816 s]
    [INFO] Apache Hadoop OBS support .......................... SUCCESS [ 21.104 s]
    [INFO] Apache Hadoop Cloud Storage ........................ SUCCESS [  3.470 s]
    [INFO] Apache Hadoop Cloud Storage Project ................ SUCCESS [  0.016 s]
    [INFO] ------------------------------------------------------------------------
    [INFO] BUILD SUCCESS
    [INFO] ------------------------------------------------------------------------
    [INFO] Total time: 23:13 min
    [INFO] Finished at: 2022-12-25T02:17:47Z
    [INFO] ------------------------------------------------------------------------
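Because the build runs for a long time, it can help to save the Maven output to a file and check it afterwards. The sketch below looks for the `BUILD SUCCESS` line and extracts the `Total time` value from such a log. The helper names and the log file are hypothetical; the demonstration uses a short excerpt of the output above.

```shell
#!/bin/sh
# Sketch: inspect a saved Maven log. Both helpers are hypothetical names.

# Exit status 0 if the log contains the success banner line.
build_succeeded() {
    grep -q '^\[INFO] BUILD SUCCESS' "$1"
}

# Print whatever follows "Total time:" in the log.
build_total_time() {
    sed -n 's/^\[INFO] Total time: *//p' "$1"
}

# Demonstration on an excerpt written to a temporary file.
log="$(mktemp)"
cat > "$log" <<'EOF'
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 23:13 min
EOF

if build_succeeded "$log"; then
    echo "build took $(build_total_time "$log")"
fi
rm -f "$log"
```

To produce a real log, append `2>&1 | tee build.log` to the `mvn package` command and pass `build.log` to the helpers.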

The build output is located in the `hadoop/hadoop-dist/target` folder.

You can install Hadoop from the binary tarball in that folder (`hadoop-3.4.0-SNAPSHOT.tar.gz` in the listing below).

    285f7027d3f:~/hadoop/hadoop-dist/target$ ll
    total 662864
    drwxr-xr-x  9 root       root            4096 Dec 25 02:17 ./
    drwxr-xr-x  3 youngjukim youngjukim      4096 Dec 24 17:05 ../
    drwxr-xr-x  2 root       root            4096 Dec 25 02:16 antrun/
    drwxr-xr-x  3 root       root            4096 Dec 25 02:16 classes/
    drwxr-xr-x 10 root       root            4096 Dec 25 02:16 hadoop-3.4.0-SNAPSHOT/
    -rw-r--r--  1 root       root        37263892 Dec 25 01:54 hadoop-3.4.0-SNAPSHOT-src.tar.gz
    -rw-r--r--  1 root       root       641461679 Dec 25 02:17 hadoop-3.4.0-SNAPSHOT.tar.gz
    drwxr-xr-x  2 root       root            4096 Dec 25 02:10 hadoop-tools-deps/
    drwxr-xr-x  3 root       root            4096 Dec 25 02:16 maven-shared-archive-resources/
    -rw-r--r--  1 root       root              30 Dec 25 02:16 .plxarc
    drwxr-xr-x  3 root       root            4096 Dec 25 02:16 test-classes/
    drwxr-xr-x  2 root       root            4096 Dec 25 02:16 test-dir/
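The target folder holds both a `-src` archive and the binary archive, so a script that installs Hadoop needs to pick the right one. The sketch below selects the binary tarball by skipping names that end in `-src.tar.gz`. `find_binary_tarball` is a hypothetical helper, and the demonstration uses empty placeholder files in a temporary directory rather than a real build.

```shell
#!/bin/sh
# Sketch: print the binary distribution tarball found in a directory.
# find_binary_tarball is a hypothetical helper name.
find_binary_tarball() {
    for f in "$1"/hadoop-*.tar.gz; do
        case "$f" in
            *-src.tar.gz)
                ;;  # source distribution: skip
            *)
                # Guard against the unexpanded pattern when nothing matches.
                if [ -e "$f" ]; then
                    printf '%s\n' "$f"
                fi
                ;;
        esac
    done
}

# Demonstration with placeholder files named like the listing above.
demo="$(mktemp -d)"
touch "$demo/hadoop-3.4.0-SNAPSHOT.tar.gz" "$demo/hadoop-3.4.0-SNAPSHOT-src.tar.gz"
find_binary_tarball "$demo"
rm -rf "$demo"
```

Against a real build you could then unpack it with `tar -xzf "$(find_binary_tarball ~/hadoop/hadoop-dist/target)" -C <install-dir>`, where the install directory is your choice.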
