This article is reposted from QingYingX's Blog: original link

Disable the Firewall & SELinux

Firewall

  1. Check the firewall status: systemctl status firewalld

  2. Stop the firewall: systemctl stop firewalld

  3. Permanently disable the firewall: systemctl disable firewalld (a combined shortcut follows this list)
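
On newer systemd versions (v220 and later), the stop and disable steps can be combined into one command; a minimal sketch:

# Stop firewalld immediately and keep it from starting at boot
systemctl disable --now firewalld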

SELinux

  1. Temporary: run setenforce 0. SELinux will come back on after a reboot.

  2. Permanent: run vim /etc/selinux/config, change SELINUX=enforcing to SELINUX=disabled, then save and exit (a non-interactive equivalent follows below).
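
Both steps can also be scripted; a minimal sketch, assuming the stock /etc/selinux/config layout:

# Switch SELinux to permissive mode for the current boot
setenforce 0
# Disable SELinux permanently (takes effect on the next reboot)
sed -i 's/^SELINUX=enforcing$/SELINUX=disabled/' /etc/selinux/config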

Configure Passwordless SSH Login

  1. Preparation: edit the hosts file and add each machine's IP address and hostname: sudo vim /etc/hosts
    The default contents of hosts are shown below, followed by example entries for this cluster:

127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
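
For the three-node cluster used throughout this guide, the appended entries would look something like this (the IP addresses are placeholders, substitute your own):

192.168.1.10   master
192.168.1.11   slave1
192.168.1.12   slave2
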
  • Run ssh-keygen in a local terminal; every prompt can be skipped by just pressing Enter.

  • Upload the public key to each machine: ssh-copy-id User@Hostname (the sketch after this list puts the steps together)

  • To remove a host's stale key from known_hosts: ssh-keygen -R <Hostname>
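
A sketch of the whole sequence for the three nodes named above, assuming you log in as root as the rest of this guide does:

# Generate a key pair, accepting every default
ssh-keygen

# Push the public key to each node (enter the password once per host)
ssh-copy-id root@master
ssh-copy-id root@slave1
ssh-copy-id root@slave2

# Verify that passwordless login now works
ssh root@slave1 hostname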

Configure the Java Environment

  1. Extract the JDK archive: tar -zxvf jdkx.x.x.tar.gz

  2. Open the configuration file with vim /etc/profile, press Shift+G to jump to the last line, press o to open a new line below it, and append the following:

export JAVA_HOME=
export PATH=$JAVA_HOME/bin:$PATH

  3. Apply the changes: source /etc/profile

  4. Verify the installation: the output of java -version should look roughly like

[root@master java]# java -version
java version "x.x.x_xxx"
Java(TM) SE Runtime Environment (build x.x.x_xxx-xxx)
Java HotSpot(TM) 64-Bit Server VM (build xx.xxx-xxx, mixed mode)

Hadoop Configuration Files

Unless noted otherwise, the files below live in $HADOOP_HOME/etc/hadoop/; the start/stop scripts are in $HADOOP_HOME/sbin/.

hadoop-env.sh

export JAVA_HOME=
export HADOOP_HOME=
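
Hadoop's startup scripts launch the daemons over ssh, where shell profile variables may not be inherited, so JAVA_HOME must be set explicitly here. The values below are hypothetical placeholders; point them at wherever you actually extracted the JDK and Hadoop:

export JAVA_HOME=/opt/java/jdk1.8.0_281   # hypothetical JDK location
export HADOOP_HOME=/opt/hadoop-3.3.1      # hypothetical Hadoop location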

workers

This file lists the hostnames of the worker nodes (in Hadoop 3.x it replaces the slaves file of Hadoop 2.x). Because master is listed too, the master node will also run a DataNode and a NodeManager.

master
slave1
slave2

core-site.xml

<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://master:9000</value>
    </property>
    <property>
        <name>io.file.buffer.size</name>
        <value>131072</value>
    </property>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/opt/dfs/tmp</value>
    </property>
    <property>
        <name>hadoop.http.staticuser.user</name>
        <value>root</value>
    </property>
    <property>
        <name>dfs.permissions.enabled</name>
        <value>false</value>
    </property>
</configuration>
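
Once the Hadoop binaries are on the PATH (see the /etc/profile step below), you can sanity-check that a property is being picked up, for example:

# Print the effective value of fs.defaultFS; it should echo hdfs://master:9000
hdfs getconf -confKey fs.defaultFS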

yarn-site.xml

<configuration>
    <property>
        <name>yarn.resourcemanager.hostname</name>
        <value>master</value>
    </property>
    <property>
        <name>yarn.resourcemanager.webapp.address</name>
        <value>master:8088</value>
    </property>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
    <property>
        <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
        <value>org.apache.hadoop.mapred.ShuffleHandler</value>
    </property>
    <property>
        <name>yarn.nodemanager.vmem-check-enabled</name>
        <value>false</value>
    </property>
    <property>
        <name>mapreduce.map.memory.mb</name>
        <value>4096</value>
    </property>
    <property>
        <name>mapreduce.reduce.memory.mb</name>
        <value>4096</value>
    </property>
</configuration>
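
Two notes on the values above: yarn.nodemanager.vmem-check-enabled=false stops NodeManagers from killing containers that exceed virtual-memory estimates, a common need on small lab machines, and the two mapreduce.*.memory.mb entries are per-job MapReduce settings that conventionally belong in mapred-site.xml, though they are generally still picked up from here. After editing, Hadoop 3.x can check a file for well-formed XML:

# Validate that a configuration file is well-formed XML
hadoop conftest -conffile $HADOOP_HOME/etc/hadoop/yarn-site.xml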

/etc/profile

Add the environment variables:

export HADOOP_HOME=
export PATH=$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH

Then run source /etc/profile to apply them.
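
To confirm that the PATH change took effect:

# Should print the Hadoop version banner
hadoop version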

mapred-site.xml

<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
    <property>
        <name>mapreduce.application.classpath</name>
        <value>*****</value>
    </property>
</configuration>

Obtain the value for mapreduce.application.classpath by running hadoop classpath.
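
For example:

# Prints the full classpath; paste the output into mapreduce.application.classpath above
hadoop classpath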

hdfs-site.xml

<configuration>
    <property>
        <name>dfs.namenode.name.dir</name>
        <value>file:/opt/dfs/name</value>
    </property>
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>file:/opt/dfs/data</value>
    </property>
    <property>
        <name>dfs.replication</name>
        <value>3</value>
    </property>
    <property>
        <name>dfs.namenode.http-address</name>
        <value>0.0.0.0:50070</value>
    </property>
</configuration>
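
Note that dfs.namenode.http-address pins the NameNode web UI to port 50070, the Hadoop 2.x default (Hadoop 3.x would otherwise use 9870). Optionally, pre-create the storage directories configured above on every node to avoid permission surprises:

# Run on each node (as root, matching this guide)
mkdir -p /opt/dfs/name /opt/dfs/data /opt/dfs/tmp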

start-dfs.sh & stop-dfs.sh

Add the following near the top of both scripts (required when running the HDFS daemons as root in Hadoop 3.x):

HDFS_DATANODE_USER=root
HDFS_NAMENODE_USER=root
HDFS_SECONDARYNAMENODE_USER=root

start-yarn.sh & stop-yarn.sh

Likewise, add the following near the top of both scripts:

YARN_RESOURCEMANAGER_USER=root
YARN_NODEMANAGER_USER=root

Initialize the Cluster

hdfs namenode -format
Success marker: a line containing "successfully formatted" (you may need to scroll up a few lines in the output).

Start the Cluster

start-all.sh

PS: the JobHistory Server must be started separately:
mapred --daemon start historyserver
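
After startup, running jps on the master should show roughly the daemons below (master also runs a DataNode and NodeManager because it is listed in workers); the slave nodes show just DataNode and NodeManager:

jps
# Typical master output: NameNode, SecondaryNameNode, ResourceManager,
# DataNode, NodeManager, JobHistoryServer, Jps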

View the Hadoop Report

hdfs dfsadmin -report

Thanks Again

Once again, many thanks to QingYingX for his contribution! He is a very reliable teammate!

If you have finished this post, be sure to go check out his blog -> Click Here!