Hadoop 2.X QuickStart on Mac OS X 10.12 Sierra
This Mac tutorial shows, step by step, how to install and get started with vanilla Apache Hadoop/MapReduce in pseudo-distributed mode on a Mac OS X 10.12 Sierra desktop.
Hadoop is a distributed master-slave framework that consists of the Hadoop Distributed File System (HDFS) for storage and MapReduce for computation.
The guide describes a system-wide installation with root privileges, but you can easily convert the procedure to a local one.
Apache Hadoop requires the Java JDK 7+ to be installed, so install or update it on your Mac OS X 10.12 Sierra system if needed.
Apache Hadoop 2.x includes the following modules:
- Hadoop Common: The common utilities that support the other Hadoop modules.
- Hadoop Distributed File System (HDFS): A distributed file system that provides high-throughput access to application data.
- Hadoop YARN: A framework for job scheduling and cluster resource management.
- Hadoop MapReduce: A YARN-based system for parallel processing of large data sets.
- Download the Latest Apache Hadoop Stable Binary Release
- Double-Click on Archive to Extract
- Open a Terminal Window (Press “Enter” to Execute Commands)
- Relocate the Apache Hadoop Directory
sudo mv $HOME/Downloads/hadoop* /usr/local
sudo ln -s /usr/local/hadoop-[2.x] /usr/local/hadoop
sudo mkdir /usr/local/hadoop/tmp
sudo chown -R root:wheel /usr/local/hadoop*
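As a quick sanity check after relocating, here is a minimal sketch (not part of the original procedure) that confirms the symlink resolves to a real directory. The function name `check_hadoop_link` is hypothetical; `/usr/local/hadoop` is the link created in this step.

```shell
# Hypothetical helper: report whether a path is a symlink that resolves
# to an existing directory, as /usr/local/hadoop should after this step.
check_hadoop_link() {
  link="$1"
  if [ -L "$link" ] && [ -d "$link" ]; then
    echo "ok: $link -> $(readlink "$link")"
  else
    echo "missing: $link"
    return 1
  fi
}
```

Call it as `check_hadoop_link /usr/local/hadoop`. If it reports `missing`, the likely cause is that the `hadoop-[2.x]` placeholder in the `ln -s` command was not replaced with the name of the directory you actually extracted.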
- Check if Java JDK 7+ is Installed
java -version
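If you would rather check the "7+" requirement in a script than by eye, the following is a small sketch. It assumes the usual `java -version` output, which prints a quoted version string such as `1.8.0_121` on standard error; the helper names `java_major` and `check_java` are hypothetical.

```shell
# Hypothetical helper: turn a Java version string into its major number.
# Old-style strings start with "1." (1.7.0_80 -> 7); newer ones do not (11.0.2 -> 11).
java_major() {
  v="$1"
  case "$v" in
    1.*) v="${v#1.}" ;;
  esac
  echo "${v%%[._-]*}"
}

# Hypothetical helper: succeed only if an installed JDK reports major version 7 or later.
check_java() {
  command -v java >/dev/null 2>&1 || { echo "java not found"; return 1; }
  # java -version writes to stderr; take the text between the first pair of quotes.
  ver=$(java -version 2>&1 | awk -F'"' '/version/ {print $2; exit}')
  [ -n "$ver" ] || { echo "could not parse java version"; return 1; }
  major=$(java_major "$ver")
  if [ "$major" -ge 7 ] 2>/dev/null; then
    echo "JDK $major found"
  else
    echo "JDK $major is too old"
    return 1
  fi
}
```

Running `check_java` prints the detected major version, or explains why it could not be determined.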
How-to Install Required Oracle JDK 7+ on MacOS X:
- Set JAVA_HOME in the Hadoop Env File
sudo su
If you get “User is not in the sudoers file”, your account needs administrator rights before you can continue.
mkdir /usr/local/hadoop/conf
nano /usr/local/hadoop/conf/hadoop-env.sh
Append:
export JAVA_HOME=$(/usr/libexec/java_home)
Press Ctrl+X, then Y and Enter, to Save & Exit
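If you prefer not to edit the file by hand, the same line can be appended from the shell. This is only a sketch: `append_once` is a hypothetical helper that adds a line to a file only when that exact line is not already present, so running it twice does not duplicate the export.

```shell
# Hypothetical helper: append a line to a file unless the exact line already exists.
append_once() {
  line="$1"
  file="$2"
  touch "$file"
  # -x matches whole lines, -F treats the text literally (no regex), -q is silent.
  grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}
```

From the root shell opened above, the call would look like `append_once 'export JAVA_HOME=$(/usr/libexec/java_home)' /usr/local/hadoop/conf/hadoop-env.sh`. The single quotes matter: they keep `$(...)` unexpanded so it is evaluated each time Hadoop reads the file.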
- Configuration for Pseudo-Distributed mode
nano /usr/local/hadoop/conf/core-site.xml
The Content Should Look Like:
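<configuration>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/usr/local/hadoop/tmp</value>
  </property>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:8020</value>
  </property>
</configuration>
Next: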
nano /usr/local/hadoop/conf/hdfs-site.xml
The Content Should Look Like:
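<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <name>dfs.name.dir</name>
    <value>/usr/local/hadoop/cache/hadoop/dfs/name</value>
  </property>
</configuration>
Finally: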
nano /usr/local/hadoop/conf/mapred-site.xml
The Content Should Look Like:
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>localhost:8021</value>
  </property>
</configuration>
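All three site files are a list of name/value properties inside a `<configuration>` element, so they can also be generated rather than typed. The sketch below assumes that layout; `hadoop_site_xml` is a hypothetical helper that takes alternating property names and values and prints the XML. It does no XML escaping, which is fine for the plain paths and host:port values used in this guide.

```shell
# Hypothetical helper: print a Hadoop site file from "name value" argument pairs.
hadoop_site_xml() {
  printf '<?xml version="1.0"?>\n<configuration>\n'
  while [ "$#" -ge 2 ]; do
    printf '  <property>\n    <name>%s</name>\n    <value>%s</value>\n  </property>\n' "$1" "$2"
    shift 2
  done
  printf '</configuration>\n'
}
```

For example, `hadoop_site_xml mapred.job.tracker localhost:8021 > /usr/local/hadoop/conf/mapred-site.xml` (run from the root shell) would write the last of the three files.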
- SetUp Local Path & Environment
exit
cd $HOME
nano .profile
Insert:
HADOOP_HOME=/usr/local/hadoop
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
JAVA_HOME is set according to the installed Oracle Java JDK 7+ version.
Then Load the New Setup:
source $HOME/.profile
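To confirm the new setup was picked up, you can check that both Hadoop directories are now on the `PATH`. A minimal sketch, using a hypothetical helper named `path_contains`:

```shell
# Hypothetical helper: succeed if the given directory is an entry in $PATH.
# Wrapping both sides in colons avoids matching a mere prefix of another entry.
path_contains() {
  case ":$PATH:" in
    *":$1:"*) return 0 ;;
    *) return 1 ;;
  esac
}
```

After sourcing the profile, `path_contains /usr/local/hadoop/bin && path_contains /usr/local/hadoop/sbin && echo "PATH ok"` should print `PATH ok`.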
- SetUp the Needed Local SSH Connection
Enable SSH Connection:
System Preferences >> Sharing >> Remote Login
Testing Connection:
ssh 127.0.0.1
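Hadoop's start scripts connect to the local machine over SSH, so it can help to know whether a login works without any prompt. The sketch below is one way to test that, under stated assumptions: `ssh_ready` is a hypothetical helper, and `SSH_BIN` is a hypothetical override that exists only so the function can be exercised without a real SSH server. `BatchMode=yes` tells `ssh` to fail instead of asking for a password.

```shell
# Hypothetical helper: succeed only if a non-interactive SSH login to the host works.
ssh_ready() {
  host="${1:-127.0.0.1}"
  # BatchMode=yes disables password prompts; ConnectTimeout bounds the wait in seconds.
  "${SSH_BIN:-ssh}" -o BatchMode=yes -o ConnectTimeout=5 "$host" true >/dev/null 2>&1
}
```

If `ssh_ready 127.0.0.1` fails while the interactive `ssh 127.0.0.1` above succeeds, the login probably still depends on typing a password, and you should expect password prompts when the daemons are started.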
- Formatting HDFS
hdfs namenode -format
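Formatting wipes any existing HDFS metadata, so it is worth running only once. The following sketch guards against reformatting by accident. It assumes the name directory configured in hdfs-site.xml above and relies on a formatted NameNode directory containing a `current/VERSION` file; `needs_format` is a hypothetical helper.

```shell
# Hypothetical helper: succeed if the NameNode directory has not been formatted yet,
# judged by the absence of the current/VERSION file that formatting creates.
needs_format() {
  [ ! -f "$1/current/VERSION" ]
}
```

With this setup the check would be `needs_format /usr/local/hadoop/cache/hadoop/dfs/name`; run the format command from this step only when it succeeds.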
- Starting Up Hadoop
start-all.sh
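To confirm that everything came up, `jps` (shipped with the JDK) lists the running Java processes. The sketch below reads `jps` output on standard input and prints whichever of the five daemons of a Hadoop 2.x pseudo-distributed setup are absent; `missing_daemons` is a hypothetical helper.

```shell
# Hypothetical helper: read `jps` output ("<pid> <name>" per line) on stdin
# and print the name of each expected Hadoop daemon that is not listed.
missing_daemons() {
  out=$(cat)
  for d in NameNode DataNode SecondaryNameNode ResourceManager NodeManager; do
    # Compare the second field exactly so NameNode does not match SecondaryNameNode.
    printf '%s\n' "$out" | awk -v d="$d" '$2 == d {found=1} END {exit !found}' || echo "$d"
  done
}
```

Run it as `jps | missing_daemons`. No output means all five daemons are running; any name printed points to the log file to look at under `/usr/local/hadoop/logs`.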