Hadoop 2.X QuickStart on Mac OS X 10.11 El Capitan
This Mac Tutorial Shows You Step-by-Step How to Install and Get Started with vanilla Apache Hadoop/MapReduce in Pseudo-Distributed Mode on a Mac OS X 10.11 El Capitan 64-bit Desktop.
Hadoop is a distributed master-slave system that consists of the Hadoop Distributed File System (HDFS) for storage and MapReduce for computational capabilities.
The Guide Describes a System-Wide Installation with Root Privileges, but You Can Easily Convert the Procedure to a Local One.
Apache Hadoop Requires the Java JDK 7+ Installed, so if Needed Follow the Guide on How-to Install Oracle JDK on Mac.

Apache Hadoop 2.x Includes the following Modules:
- Hadoop Common: The common utilities that support the other Hadoop modules.
- Hadoop Distributed File System (HDFS): A distributed file system that provides high-throughput access to application data.
- Hadoop YARN: A framework for job scheduling and cluster resource management.
- Hadoop MapReduce: A YARN-based system for parallel processing of large data sets.
-
Download Latest Apache Hadoop Stable Release:
-
Double-Click on Archive to Extract
-
Open Terminal Window
(Press “Enter” to Execute Commands) -
Relocate Apache Hadoop Directory
rm $HOME/Downloads/hadoop*tar.gz
sudo mv $HOME/Downloads/hadoop* /usr/local/hadoop
sudo mkdir /usr/local/hadoop/tmp
sudo chown -R root:wheel /usr/local/hadoop*
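To verify that the relocation and ownership change took effect, you can list the new directories (a quick sanity check, not part of the original steps):

```shell
# Both directories should exist and be owned by root:wheel
ls -ld /usr/local/hadoop /usr/local/hadoop/tmp
```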
-
Check if Java JDK 7+ is Installed
java -version
How-to Install Required Oracle JDK 7+ on MacOS X:
-
Set JAVA_HOME in Hadoop Env File
sudo su
If You Get a “User is Not in Sudoers file” Error, then Look: Solution
nano /usr/local/hadoop/etc/hadoop/hadoop-env.sh
Append:
export JAVA_HOME=$(/usr/libexec/java_home)
Ctrl+x to Save & Exit 🙂
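To confirm that the java_home helper actually resolves to an installed JDK before relying on it in hadoop-env.sh, you can run it directly (macOS-specific; the printed path depends on which JDK version is installed):

```shell
# macOS-only helper: prints the home directory of the default JDK
/usr/libexec/java_home
# Capture it exactly the way hadoop-env.sh does
JAVA_HOME=$(/usr/libexec/java_home)
echo "$JAVA_HOME"
```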
-
Configuration for Pseudo-Distributed mode
nano /usr/local/hadoop/etc/hadoop/core-site.xml
The Content Should Look Like:
<configuration>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/usr/local/hadoop/tmp</value>
  </property>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:8020</value>
  </property>
</configuration>
Next:
nano /usr/local/hadoop/etc/hadoop/hdfs-site.xml
The Content Should Look Like:
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <name>dfs.name.dir</name>
    <value>/usr/local/hadoop/cache/hadoop/dfs/name</value>
  </property>
</configuration>
Last:
nano /usr/local/hadoop/etc/hadoop/mapred-site.xml
The Content Should Look Like:
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>localhost:8021</value>
  </property>
</configuration>
-
Set Up Local Path & Environment
exit
cd $HOME
nano .profile
Insert:
export HADOOP_HOME=/usr/local/hadoop
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
JAVA_HOME is Set According to the Installed Oracle Java JDK 7+ Version…
Then Load New Setup:
source $HOME/.profile
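As a quick sanity check (assuming the install lives under /usr/local/hadoop as above), the hadoop launcher should now resolve from any directory:

```shell
# Prints the installed Hadoop release, e.g. a "Hadoop 2.x.y" line
hadoop version
# Confirm the bin directory really made it onto PATH
echo "$PATH" | grep -q "/usr/local/hadoop/bin" && echo "PATH OK"
```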
-
Set Up the Needed Local SSH Connection
Enable SSH Connection: System Preferences >> Sharing
To Enable SSH Login without a Password:
ssh-keygen -t rsa
Press Enter at each prompt…
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
chmod og-wx ~/.ssh/authorized_keys
Testing Connection:
ssh 127.0.0.1
exit
-
Formatting HDFS
cd /usr/local/hadoop
sudo su
bin/hdfs namenode -format
-
Starting Up Hadoop
sbin/start-all.sh
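Once the start scripts finish, a quick way to check which daemons came up is the JDK's jps tool; in pseudo-distributed mode you would expect to see (names can vary slightly between releases) a NameNode, DataNode, SecondaryNameNode, ResourceManager, and NodeManager:

```shell
# List running JVMs; each Hadoop daemon shows up as one line
jps
# Example filter for the HDFS master process
jps | grep -q NameNode && echo "NameNode is running"
```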
-
Apache Hadoop Quick-Start Guide:
Hadoop MapReduce Quick-Start
Eclipse Hadoop 2.X Integration with Free Plugin:
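With the daemons running, you can smoke-test the whole stack by submitting one of the example jobs bundled with the release; the jar name below is a glob because the exact version suffix depends on the release you downloaded:

```shell
cd /usr/local/hadoop
# Estimate pi with 4 map tasks of 100 samples each (small, finishes quickly)
bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-*.jar pi 4 100
```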