Hadoop Distributed File System (HDFS) is a distributed file system designed to store data across multiple machines. It is a core component of the Hadoop ecosystem and is widely used in big data processing. In this tutorial, we will guide you through the installation process of HDFS on the latest version of EndeavourOS.
Before we begin, make sure you have the following prerequisites:
Hadoop requires a Java Development Kit (JDK). If a JDK is not already installed on your system, run the following command to install it:
sudo pacman -S jdk8-openjdk
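Before installing, it can help to check whether a Java runtime is already reachable. The following is a minimal sketch of such a check; the JAVA_STATUS variable is a name chosen here for illustration only.

```shell
# Check whether a `java` binary is already reachable on PATH.
if command -v java >/dev/null 2>&1; then
    JAVA_STATUS="found"
    # `java -version` writes to stderr, so redirect it to show the first line.
    java -version 2>&1 | head -n 1
else
    JAVA_STATUS="missing"
fi
echo "java: $JAVA_STATUS"
```

If the status is "missing", the pacman command above installs the JDK.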
Download the latest Hadoop release archive from http://hadoop.apache.org/ and save it to a directory of your choice.
Extract the downloaded Hadoop package using the following command:
tar -xzf hadoop-x.y.z.tar.gz
Replace x.y.z with the version number of the Hadoop package you downloaded.
Set up the environment variables for Hadoop by adding the following lines to your .bashrc file:
export HADOOP_HOME=/path/to/hadoop/directory
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
Replace /path/to/hadoop/directory with the path of the directory where the Hadoop package was extracted.
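Once the variables are in place, a quick sanity check can confirm that HADOOP_HOME really points at an extracted Hadoop tree. The sketch below assumes the standard layout, in which the hdfs command lives under bin and the start-dfs.sh script lives under sbin; check_hadoop_home is a hypothetical helper name, not something Hadoop provides.

```shell
# Return success if the given directory looks like an extracted Hadoop tree.
# check_hadoop_home is a hypothetical helper name, not part of Hadoop.
check_hadoop_home() {
    dir="$1"
    [ -n "$dir" ] && [ -f "$dir/bin/hdfs" ] && [ -f "$dir/sbin/start-dfs.sh" ]
}

# Report on the directory the current shell has in HADOOP_HOME (if any).
if check_hadoop_home "${HADOOP_HOME:-}"; then
    echo "HADOOP_HOME looks valid: $HADOOP_HOME"
else
    echo "HADOOP_HOME is unset or does not point at a Hadoop directory"
fi
```

Run source ~/.bashrc (or open a new terminal) first so that the new variables are visible in the current shell.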
Edit the core-site.xml file located in the $HADOOP_HOME/etc/hadoop directory:
nano $HADOOP_HOME/etc/hadoop/core-site.xml
Add the following property inside the <configuration> element:
<property>
  <name>fs.defaultFS</name>
  <value>hdfs://localhost:9000</value>
</property>
Save and exit the file.
Then edit the hdfs-site.xml file located in the $HADOOP_HOME/etc/hadoop directory:
nano $HADOOP_HOME/etc/hadoop/hdfs-site.xml
Add the following property inside the <configuration> element:
<property>
  <name>dfs.replication</name>
  <value>1</value>
</property>
Save and exit the file.
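To make clear where each property sits within its file, the following sketch writes minimal complete versions of both files. It deliberately writes into a scratch directory rather than $HADOOP_HOME/etc/hadoop, so nothing in a real installation is overwritten; sketch_dir is a name chosen for this example, and the files are reduced to the bare minimum.

```shell
# Write example copies into a scratch directory so that nothing under
# $HADOOP_HOME is overwritten; sketch_dir is a name chosen for this example.
sketch_dir="$(mktemp -d)"

# core-site.xml: the property sits directly inside <configuration>.
cat > "$sketch_dir/core-site.xml" <<'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>
EOF

# hdfs-site.xml: same structure, with the replication factor set to 1.
cat > "$sketch_dir/hdfs-site.xml" <<'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>
EOF

echo "Example files written to $sketch_dir"
```

A replication factor of 1 suits a single-machine setup, because there is only one DataNode available to hold each block.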
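Before starting the HDFS service for the first time, you need to format the NameNode. Formatting initializes the file system metadata and discards any that already exists, so do it only on a fresh installation. Run the following command to format the NameNode: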
hdfs namenode -format
Start the HDFS service by running the following command:
start-dfs.sh
The HDFS service is now running on your system.
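One way to confirm that startup worked is to look for the daemon processes with jps, the JVM process lister that ships with the JDK. On a single-machine setup, start-dfs.sh normally launches a NameNode, a DataNode, and a SecondaryNameNode. The sketch below reports any of these that are absent; missing_daemons is a hypothetical helper name introduced for this example.

```shell
# Print each expected HDFS daemon that does not appear in jps-style input
# read from stdin. missing_daemons is a hypothetical helper, not a Hadoop tool.
missing_daemons() {
    input="$(cat)"
    for name in NameNode DataNode SecondaryNameNode; do
        # -w matches whole words only, so "SecondaryNameNode" does not
        # count as a match for "NameNode".
        printf '%s\n' "$input" | grep -qw "$name" || echo "$name"
    done
}

# Only query jps when it is actually available on PATH.
if command -v jps >/dev/null 2>&1; then
    missing="$(jps | missing_daemons)"
    if [ -z "$missing" ]; then
        echo "all HDFS daemons are running"
    else
        echo "not running:" $missing
    fi
fi
```

If a daemon is reported as not running, the log files under $HADOOP_HOME/logs are usually the first place to look. Once all three are up, hdfs dfs -ls / should list the (initially empty) root of the new file system.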
In this tutorial, we installed and configured HDFS on the latest version of EndeavourOS. You can now start using HDFS to store and process your data.