How to Install Hadoop on Mac High Sierra 10.13

Hadoop Mac High Sierra Installation Guide



Hi Mac User! This Tutorial Shows You, Step by Step, How to Install Hadoop on Mac High Sierra 10.13 OS X.

Note that this is a vanilla Hadoop install on Mac High Sierra in pseudo-distributed mode.

Hadoop is a distributed master-slave system that consists of the Hadoop Distributed File System (HDFS) for storage and MapReduce for computation.

The guide describes a system-wide installation with root privileges, but you can easily convert the procedure to a local one.

Finally, Apache Hadoop requires a Java JDK 7+ setup on the system. If needed, follow the linked guide to install Oracle JDK on Mac.


In addition, Apache Hadoop 2.x includes the following modules:

  • Hadoop Common: The common utilities that support the other Hadoop modules.
  • Hadoop Distributed File System (HDFS): A distributed file system that provides high-throughput access to application data.
  • Hadoop YARN: A framework for job scheduling and cluster resource management.
  • Hadoop MapReduce: A YARN-based system for parallel processing of large data sets.

Apache Hadoop Setup on Mac High Sierra

  1. Download the Latest Apache Hadoop Stable Release:

    Apache Hadoop Stable tar.gz

  2. Double-Click on the Archive to Extract It
  3. Then Open a Terminal Window
    (Press “Enter” to Execute Commands)

  4. Relocate the Apache Hadoop Directory

    rm $HOME/Downloads/hadoop*tar.gz
    sudo mv $HOME/Downloads/hadoop* /usr/local/
    sudo ln -s /usr/local/hadoop-[2.x] /usr/local/hadoop
    sudo mkdir /usr/local/hadoop/tmp
    sudo chown -R root:wheel /usr/local/hadoop*
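The `hadoop-[2.x]` placeholder above stands for whatever versioned directory the archive produced. A minimal sketch for resolving that name instead of typing it, assuming only one such directory exists in the parent folder; the helper name `find_hadoop_dir` is hypothetical and not part of Hadoop:

```shell
# Hypothetical helper: print the first versioned Hadoop directory
# (hadoop-<version>) found directly under the given parent directory.
find_hadoop_dir() {
    parent="$1"
    for candidate in "$parent"/hadoop-*; do
        # Skip archives, symlinks and unmatched globs; keep real directories only.
        if [ -d "$candidate" ] && [ ! -L "$candidate" ]; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1
}

# Usage (assumes the extracted folder was already moved to /usr/local):
#   sudo ln -s "$(find_hadoop_dir /usr/local)" /usr/local/hadoop
```

The function only prints a path, so it can be tried safely before running the `ln -s` line with root privileges.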
  5. Check Whether Java JDK 7+ is Installed
    java -version

    How to Install the Required Oracle JDK 7+ on macOS:

    Install Oracle JDK 7+ for Mac
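Java releases up to 8 report themselves as `1.7`, `1.8` and so on, while later ones use `9`, `11`, etc., so comparing against "7+" by eye is easy to get wrong. A rough sketch of the check, where `java_major` is a hypothetical helper name and the parsing assumes the usual `version "x.y.z"` line that `java -version` prints to stderr:

```shell
# Hypothetical helper: turn a Java version string into its major number.
java_major() {
    version="$1"
    case "$version" in
        1.*) version="${version#1.}" ;;   # legacy 1.x scheme used up to Java 8
    esac
    # Keep only the leading digits.
    printf '%s\n' "${version%%[!0-9]*}"
}

# Only run the check when a java command is present at all.
if command -v java >/dev/null 2>&1; then
    installed=$(java -version 2>&1 | sed -n 's/.*version "\([^"]*\)".*/\1/p' | head -n 1)
    major=$(java_major "$installed")
    if [ -n "$major" ] && [ "$major" -ge 7 ]; then
        echo "Java $installed is new enough for Hadoop"
    else
        echo "No JDK 7+ detected, install one first"
    fi
fi
```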
  6. Set JAVA_HOME in the Hadoop Env File
    First log in as superuser:

    sudo su

    If you get “User is Not in Sudoers file”, see: Solution
    Then make the needed directory:

    mkdir /usr/local/hadoop/conf
    nano /usr/local/hadoop/conf/hadoop-env.sh

    Then append:

    export JAVA_HOME=$(/usr/libexec/java_home)
    

    Press Ctrl+X, then Y and Enter, to save and exit 🙂
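If you would rather not open an editor, the same line can be appended from the command line. This is only a sketch: `append_once` is a hypothetical helper, written so that running it twice does not duplicate the line:

```shell
# Hypothetical helper: append a line to a file only if it is not already there.
append_once() {
    line="$1"
    file="$2"
    touch "$file"
    # -F: fixed string, -x: match the whole line, -q: no output
    grep -Fxq -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Usage as root, mirroring the step above (single quotes keep $(...) literal):
#   append_once 'export JAVA_HOME=$(/usr/libexec/java_home)' /usr/local/hadoop/conf/hadoop-env.sh
```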

  7. Configure Pseudo-Distributed Mode
    nano /usr/local/hadoop/conf/core-site.xml

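    The content should look like this:

    <configuration>
      <property>
        <name>hadoop.tmp.dir</name>
        <value>/usr/local/hadoop/tmp</value>
      </property>
      <property>
        <name>fs.default.name</name>
        <value>hdfs://localhost:8020</value>
      </property>
    </configuration>
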

    Next:

    nano /usr/local/hadoop/conf/hdfs-site.xml

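    The content should look like this:

    <configuration>
      <property>
        <name>dfs.replication</name>
        <value>1</value>
      </property>

      <property>
        <name>dfs.name.dir</name>
        <value>/usr/local/hadoop/cache/hadoop/dfs/name</value>
      </property>
    </configuration>
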

    Last:

    nano /usr/local/hadoop/conf/mapred-site.xml

    The Content Should Look Like:

    <configuration>
      <property>
        <name>mapred.job.tracker</name>
        <value>localhost:8021</value>
      </property>
    </configuration>

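All three files share the same layout: a `configuration` element wrapping `property` entries, each with a `name` and a `value`. As a sketch of that structure, the hypothetical helper `write_site_xml` below generates such a file from name/value pairs; it assumes the values contain no XML special characters:

```shell
# Hypothetical helper: write a Hadoop *-site.xml file from name/value pairs.
# Values are written verbatim, so they must not contain <, > or &.
write_site_xml() {
    file="$1"
    shift
    {
        echo '<?xml version="1.0"?>'
        echo '<configuration>'
        while [ "$#" -ge 2 ]; do
            echo '  <property>'
            echo "    <name>$1</name>"
            echo "    <value>$2</value>"
            echo '  </property>'
            shift 2
        done
        echo '</configuration>'
    } > "$file"
}

# Usage as root, with the values from this step:
#   write_site_xml /usr/local/hadoop/conf/hdfs-site.xml \
#       dfs.replication 1 \
#       dfs.name.dir /usr/local/hadoop/cache/hadoop/dfs/name
```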
  8. Set Up the Local Path & Environment

    exit
    nano $HOME/.bashrc

    Append:

    export HADOOP_HOME=/usr/local/hadoop
    export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
    

    JAVA_HOME follows the installed Oracle Java JDK 7+ version.
    Then load the new setup:

    bash
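Because every new shell reads `.bashrc` again, a plain `PATH=$PATH:...` line keeps appending the same directories in nested shells. A possible variant that adds each directory once and then reports whether the `hadoop` launcher is reachable; `path_add` is a hypothetical helper name:

```shell
export HADOOP_HOME=/usr/local/hadoop

# Hypothetical helper: add a directory to PATH only if it is not already listed.
path_add() {
    case ":$PATH:" in
        *":$1:"*) ;;               # already present, nothing to do
        *) PATH="$PATH:$1" ;;
    esac
}

path_add "$HADOOP_HOME/bin"
path_add "$HADOOP_HOME/sbin"
export PATH

# Report whether the hadoop launcher can now be found.
if command -v hadoop >/dev/null 2>&1; then
    echo "hadoop found at $(command -v hadoop)"
else
    echo "hadoop not found on PATH"
fi
```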
  9. Set Up the Needed Local SSH Connection
    Enable the SSH connection:

    System Preferences >> Sharing, then enable Remote Login

    To enable SSH login without a password:

    ssh-keygen -t rsa

    Press Enter at each prompt to accept the defaults…

    cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
    chmod og-wx ~/.ssh/authorized_keys

    Test the connection:

    ssh 127.0.0.1
    exit
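Repeating the `cat ... >> authorized_keys` line adds the same key again each time. A hedged alternative that authorizes a key once and restricts the file to its owner; `authorize_key` is a hypothetical helper, not an OpenSSH command:

```shell
# Hypothetical helper: add a public key to an authorized_keys file once,
# then make the file readable and writable by its owner only.
authorize_key() {
    pubkey_file="$1"
    auth_file="$2"
    mkdir -p "$(dirname "$auth_file")"
    touch "$auth_file"
    key=$(cat "$pubkey_file")
    # Append only when the exact key line is not present yet.
    grep -Fxq -- "$key" "$auth_file" || printf '%s\n' "$key" >> "$auth_file"
    chmod 600 "$auth_file"
}

# Usage with the key generated above:
#   authorize_key ~/.ssh/id_rsa.pub ~/.ssh/authorized_keys
# Non-interactive login test (fails instead of asking for a password):
#   ssh -o BatchMode=yes 127.0.0.1 true
```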
  10. Format HDFS

    cd /usr/local/hadoop
    sudo su
    bin/hadoop namenode -format


  11. Finally, Start Up Hadoop

    start-all.sh
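To confirm that the daemons actually came up, the JDK's `jps` tool lists running Java processes. The sketch below assumes the five daemons a Hadoop 2.x pseudo-distributed setup normally runs after `start-all.sh` (NameNode, DataNode, SecondaryNameNode, ResourceManager, NodeManager); `missing_daemons` is a hypothetical helper name:

```shell
# Hypothetical helper: read `jps` output on stdin and print every expected
# Hadoop 2.x daemon that does not appear in it.
missing_daemons() {
    running=$(cat)
    for daemon in NameNode DataNode SecondaryNameNode ResourceManager NodeManager; do
        # jps prints "<pid> <name>"; compare the name as a whole field so that
        # SecondaryNameNode is not mistaken for NameNode.
        printf '%s\n' "$running" \
            | awk -v d="$daemon" '$2 == d { found = 1 } END { exit !found }' \
            || printf '%s\n' "$daemon"
    done
}

# Run the check only when jps is available (it ships with the JDK).
if command -v jps >/dev/null 2>&1; then
    jps | missing_daemons
fi
```

An empty result means all five names were found; any name printed is a daemon worth looking up in the Hadoop logs directory.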
  12. Apache Hadoop Quick-Start Guide:

    Hadoop MapReduce Quick-Start
  13. Eclipse Hadoop 2.X Integration with Free Plugin:

    Hadoop 2.X Eclipse Plugin SetUp