Hadoop 3.2.0 Cluster Installation
Cluster Installation Steps
This article installs Hadoop 3.2.0 on CentOS 6.5; the same steps also apply to other Hadoop versions on other Linux distributions.
The cluster has 1 Master node and 2 Slave nodes, and can be scaled out as needed.
Configure the IP address
[root@master ~]# vim /etc/sysconfig/network-scripts/ifcfg-eth0
DEVICE=eth0
HWADDR=00:0C:29:E9:CA:59
TYPE=Ethernet
UUID=ca629425-b5c0-4dab-a66d-68831a690d8e
ONBOOT=yes
NM_CONTROLLED=yes
BOOTPROTO=static
IPADDR=169.254.1.100
NETMASK=255.255.255.0
Set the Master node's IP address to 169.254.1.100, and the two Slave nodes' addresses to 169.254.1.101 and 169.254.1.102 respectively.
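For example, Slave01's file would look like the following (HWADDR and UUID are machine-specific, so they are omitted here; use your own hardware's values):
[root@slave01 ~]# vim /etc/sysconfig/network-scripts/ifcfg-eth0
DEVICE=eth0
TYPE=Ethernet
ONBOOT=yes
NM_CONTROLLED=yes
BOOTPROTO=static
IPADDR=169.254.1.101
NETMASK=255.255.255.0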
Apply the IP settings
[root@master ~]# service network restart
Parameter reference:
https://www.cnblogs.com/dkblog/archive/2011/12/28/2305004.html
NM_CONTROLLED parameter reference:
https://blog.csdn.net/petrosofts/article/details/80346348
Configure the hostname
[root@master ~]# vim /etc/sysconfig/network
NETWORKING=yes
HOSTNAME=master
Set the Master node's hostname to master, and the two Slave nodes' hostnames to slave01 and slave02 respectively.
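For example, on Slave01 the file becomes:
[root@slave01 ~]# vim /etc/sysconfig/network
NETWORKING=yes
HOSTNAME=slave01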
Reboot for the hostname to take effect
[root@master ~]# reboot
Configure hosts (IP-to-hostname mapping)
[root@master ~]# vim /etc/hosts
#127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
#::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
169.254.1.100 master
169.254.1.101 slave01
169.254.1.102 slave02
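Once all three nodes are up, a quick sanity check that the mapping resolves:
[root@master ~]# ping -c 1 slave01
[root@master ~]# ping -c 1 slave02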
Disable the firewall and SELinux
[root@master ~]# service iptables stop (stops the firewall for the current session)
[root@master ~]# chkconfig iptables off (keeps it off across reboots)
[root@master ~]# setenforce 0 (disables SELinux for the current session)
[root@master ~]# vim /etc/selinux/config
Change:
SELINUX=disabled (takes effect after reboot)
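After the reboot, the SELinux state can be confirmed with getenforce, which should print Disabled:
[root@master ~]# getenforce
Disabled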
Tune the swappiness parameter and disable transparent huge page compaction
[root@master ~]# echo 10 > /proc/sys/vm/swappiness
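Writing to /proc only lasts until the next reboot; to make the setting persistent, one common approach is to append it to /etc/sysctl.conf:
[root@master ~]# vim /etc/sysctl.conf
vm.swappiness=10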
Disable transparent huge pages
[root@master ~]# echo never > /sys/kernel/mm/transparent_hugepage/defrag
[root@master ~]# vim /etc/rc.local
Append at the end:
echo never > /sys/kernel/mm/transparent_hugepage/defrag
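On most kernels there is a companion enabled switch alongside defrag; if it exists on your system, it is usually disabled the same way (and the matching line added to /etc/rc.local as well):
echo never > /sys/kernel/mm/transparent_hugepage/enabled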
Cloudera distributions require these changes; for Apache releases they can be skipped.
Configure time synchronization
Master node configuration:
[root@master ~]# vim /etc/ntp.conf
restrict 0.0.0.0 mask 0.0.0.0 nomodify notrap
server 127.127.1.0
fudge 127.127.1.0 stratum 10
driftfile /var/lib/ntp/drift
broadcastdelay 0.008
keys /etc/ntp/keys
includefile /etc/ntp/crypto/pw
restrict 127.0.0.1
restrict -6 ::1
Configuration on the two Slave nodes:
[root@slave01 ~]# vim /etc/ntp.conf
restrict 0.0.0.0 mask 0.0.0.0 nomodify notrap
restrict default kod nomodify notrap nopeer noquery
restrict -6 default kod nomodify notrap nopeer noquery
server master prefer
fudge 127.127.1.0 stratum 10
driftfile /var/lib/ntp/drift
broadcastdelay 0.008
keys /etc/ntp/keys
restrict 127.0.0.1
restrict -6 ::1
Run on all nodes:
[root@master ~]# /etc/rc.d/init.d/ntpd start //start the ntp service
[root@master ~]# chkconfig ntpd on //enable the ntp service at boot
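To confirm the Slave nodes are actually syncing from master, ntpq can be run on a Slave; after a few minutes the master entry should show a nonzero reach value:
[root@slave01 ~]# ntpq -p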
Install the JDK
Download link: https://www.oracle.com/technetwork/java/javase/downloads/jdk8-downloads-2133151.html
[root@master ~]# rpm -ivh jdk-8u201-linux-x64.rpm
Verify that the JDK installed successfully
[root@master ~]# java -version
java version "1.8.0_201"
Java(TM) SE Runtime Environment (build 1.8.0_201-b09)
Java HotSpot(TM) 64-Bit Server VM (build 25.201-b09, mixed mode)
Configure environment variables
[root@master ~]# vim /etc/profile
Append at the end:
export JAVA_HOME=/usr/java/jdk1.8.0_201-amd64
export JAVA_BIN=/usr/java/jdk1.8.0_201-amd64/bin
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar:$JAVA_HOME/jre/lib/rt.jar
export PATH=$PATH:$JAVA_HOME/bin
Apply the environment variables
[root@master ~]# source /etc/profile
Create the hadoop user
[root@master ~]# useradd hadoop (create the user)
[root@master ~]# echo "123" | passwd hadoop --stdin (set the password)
Configure passwordless SSH
[root@master ~]# su - hadoop
[hadoop@master ~]$ ssh-keygen -t rsa
[hadoop@master ~]$ ssh-copy-id -i /home/hadoop/.ssh/id_rsa.pub master
[hadoop@master ~]$ ssh-copy-id -i /home/hadoop/.ssh/id_rsa.pub slave01
[hadoop@master ~]$ ssh-copy-id -i /home/hadoop/.ssh/id_rsa.pub slave02
Run the steps above on all three nodes.
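A quick check that passwordless login works, e.g. from master (each command should print the remote date without prompting for a password):
[hadoop@master ~]$ ssh slave01 date
[hadoop@master ~]$ ssh slave02 date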
Install Hadoop
Download link: http://mirrors.shu.edu.cn/apache/
Extract the installation package
[hadoop@master ~]$ tar -zxvf hadoop-3.2.0.tar.gz
Create the dfs directories
[hadoop@master hadoop-3.2.0]$ mkdir -p dfs/name
[hadoop@master hadoop-3.2.0]$ mkdir -p dfs/data
[hadoop@master hadoop-3.2.0]$ mkdir -p dfs/namesecondary
Enter the Hadoop configuration directory
[hadoop@master ~]$ cd /home/hadoop/hadoop-3.2.0/etc/hadoop
Edit core-site.xml
Default configuration reference: http://hadoop.apache.org/docs/r3.2.0/hadoop-project-dist/hadoop-common/core-default.xml
Add the following inside <configuration></configuration>:
[hadoop@master hadoop]$ vi core-site.xml
<property>
<name>fs.defaultFS</name>
<value>hdfs://master:9000</value>
<description>NameNode URI.</description>
</property>
<property>
<name>io.file.buffer.size</name>
<value>131072</value>
<description>Size of read/write buffer used in SequenceFiles.</description>
</property>
Edit hdfs-site.xml
Default configuration reference: http://hadoop.apache.org/docs/r3.2.0/hadoop-project-dist/hadoop-hdfs/hdfs-default.xml
Add the following inside <configuration></configuration>:
[hadoop@master hadoop]$ vi hdfs-site.xml
<property>
<name>dfs.namenode.secondary.http-address</name>
<value>master:50090</value>
<description>The secondary namenode http server address and port.</description>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:///home/hadoop/hadoop-3.2.0/dfs/name</value>
<description>Path on the local filesystem where the NameNode stores the namespace and transactions logs persistently.</description>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:///home/hadoop/hadoop-3.2.0/dfs/data</value>
<description>Comma separated list of paths on the local filesystem of a DataNode where it should store its blocks.</description>
</property>
<property>
<name>dfs.namenode.checkpoint.dir</name>
<value>file:///home/hadoop/hadoop-3.2.0/dfs/namesecondary</value>
<description>Determines where on the local filesystem the DFS secondary name node should store the temporary images to merge. If this is a comma-delimited list of directories then the image is replicated in all of the directories for redundancy.</description>
</property>
<property>
<name>dfs.replication</name>
<value>2</value>
</property>
Edit mapred-site.xml
Default configuration reference: http://hadoop.apache.org/docs/r3.2.0/hadoop-mapreduce-client/hadoop-mapreduce-client-core/mapred-default.xml
In Hadoop 3.x mapred-site.xml already ships in etc/hadoop, so the old Hadoop 2.x step of copying it from mapred-site.xml.template is no longer needed.
Add the following inside <configuration></configuration>:
[hadoop@master hadoop]$ vi mapred-site.xml
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
<description>The runtime framework for executing MapReduce jobs. Can be one of local, classic or yarn.</description>
</property>
<property>
<name>mapreduce.jobhistory.address</name>
<value>master:10020</value>
<description>MapReduce JobHistory Server IPC host:port</description>
</property>
<property>
<name>mapreduce.jobhistory.webapp.address</name>
<value>master:19888</value>
<description>MapReduce JobHistory Server Web UI host:port</description>
</property>
<property>
<name>mapreduce.application.classpath</name>
<value>
/home/hadoop/hadoop-3.2.0/etc/hadoop,
/home/hadoop/hadoop-3.2.0/share/hadoop/common/*,
/home/hadoop/hadoop-3.2.0/share/hadoop/common/lib/*,
/home/hadoop/hadoop-3.2.0/share/hadoop/hdfs/*,
/home/hadoop/hadoop-3.2.0/share/hadoop/hdfs/lib/*,
/home/hadoop/hadoop-3.2.0/share/hadoop/mapreduce/*,
/home/hadoop/hadoop-3.2.0/share/hadoop/mapreduce/lib/*,
/home/hadoop/hadoop-3.2.0/share/hadoop/yarn/*,
/home/hadoop/hadoop-3.2.0/share/hadoop/yarn/lib/*
</value>
</property>
Edit yarn-site.xml
Default configuration reference: http://hadoop.apache.org/docs/r3.2.0/hadoop-yarn/hadoop-yarn-common/yarn-default.xml
Add the following inside <configuration></configuration>:
[hadoop@master hadoop]$ vi yarn-site.xml
<property>
<name>yarn.resourcemanager.hostname</name>
<value>master</value>
<description>The hostname of the RM.</description>
</property>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
<description>Shuffle service that needs to be set for Map Reduce applications.</description>
</property>
Edit hadoop-env.sh
Append at the end:
[hadoop@master hadoop]$ vi hadoop-env.sh
export JAVA_HOME=/usr/java/jdk1.8.0_201-amd64
Edit the workers file
[hadoop@master hadoop]$ vi workers
slave01
slave02
Copy the hadoop-3.2.0 directory to the two Slave nodes
[hadoop@master ~]$ scp -r /home/hadoop/hadoop-3.2.0 hadoop@slave01:/home/hadoop/
[hadoop@master ~]$ scp -r /home/hadoop/hadoop-3.2.0 hadoop@slave02:/home/hadoop/
Add Hadoop to the user's environment variables
[hadoop@master ~]$ vi .bash_profile
PATH=$PATH:$HOME/bin
export HADOOP_HOME=/home/hadoop/hadoop-3.2.0
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
Apply the environment variables
[hadoop@master ~]$ . .bash_profile
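Verify that the hadoop command is now on the PATH; the first line of output should read Hadoop 3.2.0:
[hadoop@master ~]$ hadoop version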
Format the NameNode (the older hadoop namenode -format still works but is deprecated in 3.x)
[hadoop@master ~]$ hdfs namenode -format
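If formatting succeeded, the name directory now holds the initial metadata (a current subdirectory containing a VERSION file and an fsimage):
[hadoop@master ~]$ ls /home/hadoop/hadoop-3.2.0/dfs/name/current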
Start Hadoop
[hadoop@master ~]$ start-all.sh
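start-all.sh simply runs the HDFS and YARN start scripts in turn; starting them separately works just as well and makes it easier to see which service fails:
[hadoop@master ~]$ start-dfs.sh
[hadoop@master ~]$ start-yarn.sh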
Verify Hadoop
Process check
On the Master node:
[hadoop@master ~]$ jps
29633 NameNode
30071 Jps
29820 SecondaryNameNode
29965 ResourceManager
On Slave01:
[hadoop@slave01 ~]$ jps
28083 NodeManager
27978 DataNode
28158 Jps
On Slave02:
[hadoop@slave02 ~]$ jps
28176 Jps
28054 NodeManager
27947 DataNode
Web UI check (in Hadoop 3.x the NameNode web UI listens on port 9870 rather than the old 50070)
http://169.254.1.100:9870
http://169.254.1.100:8088
Check the status report
[hadoop@master ~]$ hdfs dfsadmin -report
19/02/21 16:03:03 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Configured Capacity: 60932890624 (56.75 GB)
Present Capacity: 48979066880 (45.62 GB)
DFS Remaining: 48979017728 (45.62 GB)
DFS Used: 49152 (48 KB)
DFS Used%: 0.00%
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks:
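As a final end-to-end smoke test, the example jar bundled with the release can run a small MapReduce job on the cluster; a successful run ends with an estimated value of Pi:
[hadoop@master ~]$ hadoop jar /home/hadoop/hadoop-3.2.0/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.2.0.jar pi 2 10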