Hadoop Distributed File System (HDFS) is a distributed file system designed to run on commodity hardware. It is part of the Apache Hadoop project and is used by many big data applications as a primary storage layer. In this tutorial, you will learn how to install HDFS, which is available from http://hadoop.apache.org/, on Elementary OS Latest.
Open Terminal and update the package list:
sudo apt update
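Install SSH. The ssh metapackage provides both the client and the server, which the Hadoop start scripts use to launch the daemons: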
sudo apt install ssh
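The start-dfs.sh script used later in this tutorial launches the daemons over SSH, so the current user generally needs to be able to log in to localhost without a password prompt. The following is a minimal sketch of one way to set that up, assuming a passphrase-less RSA key is acceptable on this machine. The function name ensure_ssh_key and its optional directory argument are hypothetical helpers for illustration, not part of Hadoop or OpenSSH.

```shell
# Hypothetical helper: create a passphrase-less RSA key (if none exists)
# and authorize it for logins to this machine.
# Optional first argument: the SSH directory to use (defaults to ~/.ssh).
ensure_ssh_key() {
  dir="${1:-$HOME/.ssh}"
  mkdir -p "$dir" && chmod 700 "$dir" || return 1
  # Generate a key only when one is not already present.
  if [ ! -f "$dir/id_rsa" ]; then
    ssh-keygen -q -t rsa -N '' -f "$dir/id_rsa" || return 1
  fi
  # Append the public key unless that exact line is already authorized.
  touch "$dir/authorized_keys"
  if ! grep -qxF "$(cat "$dir/id_rsa.pub")" "$dir/authorized_keys"; then
    cat "$dir/id_rsa.pub" >> "$dir/authorized_keys"
  fi
  chmod 600 "$dir/authorized_keys"
}
```

Run ensure_ssh_key with no arguments to use ~/.ssh, then confirm that ssh localhost opens a shell without asking for a password.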
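Download the Hadoop 3.3.1 release archive. Older releases such as this one are served from archive.apache.org rather than downloads.apache.org:
wget https://archive.apache.org/dist/hadoop/common/hadoop-3.3.1/hadoop-3.3.1.tar.gz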
Extract the downloaded file:
tar -xvf hadoop-3.3.1.tar.gz
Move the extracted folder to the /usr/local/ directory:
sudo mv hadoop-3.3.1 /usr/local/hadoop
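Set the JAVA_HOME environment variable for the current shell session. This path assumes OpenJDK 8 is already installed (on Ubuntu-based systems it is typically provided by the openjdk-8-jdk package); adjust it if your JDK lives elsewhere: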
export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
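Because a wrong JAVA_HOME only shows up later as a failure when Hadoop starts, it can help to check the path right away. The sketch below assumes the same JDK location as above; check_java_home is a hypothetical helper name, and /usr/lib/jvm is assumed to be where the system keeps its JVMs.

```shell
# Hypothetical helper: check that a directory looks like a Java home,
# i.e. that it contains an executable bin/java.
check_java_home() {
  if [ -x "$1/bin/java" ]; then
    echo "ok: $1/bin/java"
  else
    echo "missing: $1/bin/java"
    return 1
  fi
}

# Check the configured path; if it is absent, list the installed JVMs (if any).
check_java_home "${JAVA_HOME:-/usr/lib/jvm/java-8-openjdk-amd64}" \
  || ls /usr/lib/jvm 2>/dev/null \
  || true
```

If the check reports a missing path, install a JDK or point JAVA_HOME at one of the directories listed under /usr/lib/jvm.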
Append the following lines to the end of the ~/.bashrc file:
export HADOOP_HOME=/usr/local/hadoop
export PATH=$PATH:$HADOOP_HOME/bin
Reload the ~/.bashrc file:
source ~/.bashrc
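After reloading, the hadoop and hdfs launchers should be reachable from any directory. A quick way to confirm this is sketched below; check_hadoop_on_path is a hypothetical helper name, while hadoop version is the launcher's own subcommand for printing the installed release.

```shell
# Hypothetical helper: report whether the hadoop launcher is reachable via PATH.
check_hadoop_on_path() {
  if command -v hadoop >/dev/null 2>&1; then
    echo "found: $(command -v hadoop)"
  else
    echo "not found: hadoop is not on PATH (HADOOP_HOME=${HADOOP_HOME:-unset})"
    return 1
  fi
}

# Print the installed Hadoop version only when the launcher was found.
check_hadoop_on_path && hadoop version || true
```

If the launcher is not found, re-check the two export lines in ~/.bashrc and open a new terminal.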
Edit the hadoop-env.sh file:
sudo nano /usr/local/hadoop/etc/hadoop/hadoop-env.sh
Find the line that contains export JAVA_HOME (it is commented out by default), uncomment it, and set it to the following path:
export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
Save and close the file.
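Edit the core-site.xml file:
sudo nano /usr/local/hadoop/etc/hadoop/core-site.xml
Replace the empty configuration element with the following, which sets the default file system URI to the local NameNode: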
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://localhost:9000</value>
</property>
</configuration>
Save and close the file.
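Edit the hdfs-site.xml file:
sudo nano /usr/local/hadoop/etc/hadoop/hdfs-site.xml
Replace the empty configuration element with the following, which sets a replication factor of 1 (appropriate for a single node) and the local directories for NameNode metadata and DataNode blocks: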
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>/usr/local/hadoop/hadoop_data/hdfs/namenode</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>/usr/local/hadoop/hadoop_data/hdfs/datanode</value>
</property>
</configuration>
Save and close the file.
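Typos in the XML files are easy to miss, so it can be worth asking Hadoop which values it actually resolves before formatting the file system. The sketch below uses the hdfs getconf -confKey subcommand for that; show_hdfs_conf is a hypothetical helper name introduced only for this example.

```shell
# Hypothetical helper: print the effective value of each given Hadoop
# configuration key, as resolved from the *-site.xml files.
show_hdfs_conf() {
  if ! command -v hdfs >/dev/null 2>&1; then
    echo "hdfs not found on PATH"
    return 1
  fi
  for key in "$@"; do
    printf '%s = %s\n' "$key" "$(hdfs getconf -confKey "$key")"
  done
}

# Show the four properties configured in this tutorial.
show_hdfs_conf fs.defaultFS dfs.replication \
  dfs.namenode.name.dir dfs.datanode.data.dir || true
```

With the files edited as above, the values should match what was entered: hdfs://localhost:9000, 1, and the two directories under /usr/local/hadoop/hadoop_data/hdfs.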
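Format the HDFS file system. Run this only once, because reformatting erases the existing HDFS metadata:
hdfs namenode -format
Start the HDFS daemons:
/usr/local/hadoop/sbin/start-dfs.sh
Verify that HDFS is running by listing its root directory:
hdfs dfs -ls /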
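This lists the contents of the root directory in the HDFS file system. On a freshly formatted file system the root directory is empty, so no output and no error means the command succeeded.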
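Since an empty listing is not very convincing on its own, a small write-and-read round trip gives a clearer signal that the NameNode and DataNode are both working. The following is a sketch under the assumption that the daemons are already running; hdfs_smoke_test and the /tmp/hdfs-smoke-test scratch directory are hypothetical names chosen for this example.

```shell
# Hypothetical helper: write a small file into HDFS, read it back,
# and clean up. Assumes the daemons started by start-dfs.sh are running.
hdfs_smoke_test() {
  if ! command -v hdfs >/dev/null 2>&1; then
    echo "hdfs not found on PATH"
    return 1
  fi
  target="/tmp/hdfs-smoke-test"   # hypothetical scratch directory in HDFS
  local_file=$(mktemp) || return 1
  echo "hello hdfs" > "$local_file"

  # Create the directory, upload the file, and print it back from HDFS.
  hdfs dfs -mkdir -p "$target" &&
    hdfs dfs -put -f "$local_file" "$target/hello.txt" &&
    hdfs dfs -cat "$target/hello.txt"
  status=$?

  # Remove the scratch data from HDFS and the local temp file.
  hdfs dfs -rm -r -f -skipTrash "$target" >/dev/null 2>&1
  rm -f "$local_file"
  return "$status"
}

hdfs_smoke_test || echo "smoke test did not complete"
```

If the round trip fails, running jps (shipped with the JDK) shows whether the NameNode, DataNode, and SecondaryNameNode processes are actually up.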
Congratulations! You have successfully installed and verified HDFS on Elementary OS Latest. Now you can use HDFS to store and process big data on your system.