Environment
This environment uses three machines: master (10.0.0.2), slave1 (10.0.0.3), and slave2 (10.0.0.4).
I. Initialize the Hadoop environment
1. Create the hadoop account
useradd -d /data/hadoop -u 600 -g root hadoop
# set the hadoop user's password
passwd hadoop
2. Change the hostnames
- Set the master's hostname to master, and the slaves to slave1 and slave2 respectively.
vi /etc/hostname
master
Note: on slave1 enter slave1 here; on slave2 enter slave2.
- Edit the hosts file; copy the same configuration to the other machines
vi /etc/hosts
10.0.0.2 master
10.0.0.3 slave1
10.0.0.4 slave2
127.0.0.1 localhost localhost.localdomain localhost
- Edit the network configuration
vi /etc/sysconfig/network
# Created by anaconda
NETWORKING=yes
NETWORKING_IPV6=no
HOSTNAME=master
- Reboot master, slave1, and slave2 so the changes take effect
3. Set up passwordless SSH login
- Generate a key pair
Run ssh-keygen -t rsa and press Enter through every prompt.
- Copy the public key to slave1 and slave2
Run ssh-copy-id -i ~/.ssh/id_rsa.pub slave1, then run the same command for slave2.
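The two ssh-copy-id runs can be collapsed into one loop. A minimal sketch, assuming the slave1/slave2 hostnames from /etc/hosts above and the default key at $HOME/.ssh/id_rsa.pub; with DRY_RUN=1 it only prints the commands so they can be reviewed first:

```shell
# Distribute the public key to every slave; DRY_RUN=1 prints instead of runs.
DRY_RUN=1
for host in slave1 slave2; do
  cmd="ssh-copy-id -i $HOME/.ssh/id_rsa.pub $host"
  if [ "$DRY_RUN" = "1" ]; then
    echo "$cmd"   # review the command without executing it
  else
    $cmd          # actually push the key (prompts for the password once)
  fi
done
```

Set DRY_RUN=0 to actually push the keys; afterwards `ssh slave1` should log in without a password prompt.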
II. Install Java
1. Download the JDK from the Oracle website
2. Unpack the archive into the installation directory
tar -xzvf jdk-8u91-linux-x64.tar.gz
3. Set the Java environment variables
vi ~/.bash_profile
The .bash_profile file should look like this:
# .bash_profile
# Get the aliases and functions
if [ -f ~/.bashrc ]; then
    . ~/.bashrc
fi
# User specific environment and startup programs
PATH=$PATH:$HOME/.local/bin:$HOME/bin:/data/hadoop/hadoop-2.6.4/share/
export PATH
JAVA_HOME=/data/hadoop/jdk1.8.0_91
CLASSPATH=.:$JAVA_HOME/lib
PATH=$JAVA_HOME/bin:$PATH
export JAVA_HOME CLASSPATH PATH
Note:
- CLASSPATH must start with .: (the current-directory entry); without it you will get the "Could not find or load main class" error.
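The effect of these assignments is easy to check without a JVM. A small sketch, reusing the same paths as above, that shows how the variables expand and confirms the leading `.:` entry is present:

```shell
# Reproduce the .bash_profile assignments and inspect the results.
JAVA_HOME=/data/hadoop/jdk1.8.0_91
CLASSPATH=.:$JAVA_HOME/lib
PATH=$JAVA_HOME/bin:$PATH

echo "CLASSPATH=$CLASSPATH"
# Verify the current directory ('.') leads the classpath.
case "$CLASSPATH" in
  .:*) has_dot=yes ;;
  *)   has_dot=no  ;;
esac
echo "leading . present: $has_dot"
```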
III. Install Hadoop
1. Download Hadoop. We use hadoop-2.6.4.tar.gz, which is a prebuilt binary release, so it only needs to be unpacked.
2. Unpack the archive into the installation directory
tar -xzvf hadoop-2.6.4.tar.gz
3. Set the environment variables
vi ~/.bashrc
# .bashrc
# Source global definitions
if [ -f /etc/bashrc ]; then
    . /etc/bashrc
fi
# Uncomment the following line if you don't like systemctl's auto-paging feature:
export SYSTEMD_PAGER=
# User specific aliases and functions
export HADOOP_PREFIX=$HOME/hadoop-2.6.4
export HADOOP_COMMON_HOME=$HADOOP_PREFIX
export HADOOP_HDFS_HOME=$HADOOP_PREFIX
export HADOOP_MAPRED_HOME=$HADOOP_PREFIX
export HADOOP_YARN_HOME=$HADOOP_PREFIX
export HADOOP_CONF_DIR=$HADOOP_PREFIX/etc/hadoop
export PATH=$PATH:$HADOOP_PREFIX/bin:$HADOOP_PREFIX/sbin
Run source ~/.bashrc to apply the configuration.
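A quick sanity check after sourcing, assuming the same HADOOP_PREFIX as in the file above: confirm that the Hadoop bin directory really ended up on PATH.

```shell
# Re-create the two relevant lines from ~/.bashrc.
HADOOP_PREFIX=$HOME/hadoop-2.6.4
PATH=$PATH:$HADOOP_PREFIX/bin:$HADOOP_PREFIX/sbin

# Look for the bin directory among the PATH entries.
on_path=no
case ":$PATH:" in
  *":$HADOOP_PREFIX/bin:"*) on_path=yes ;;
esac
echo "hadoop bin on PATH: $on_path"
```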
4. Edit the Hadoop configuration files
- Enter the configuration directory
cd /data/hadoop/hadoop-2.6.4/etc/hadoop
- Edit hadoop-env.sh and set JAVA_HOME to the Java installation path
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# Set Hadoop-specific environment variables here.

# The only required environment variable is JAVA_HOME.  All others are
# optional.  When running a distributed configuration it is best to
# set JAVA_HOME in this file, so that it is correctly defined on
# remote nodes.

# The java implementation to use.
export JAVA_HOME=/data/hadoop/jdk1.8.0_91
- Register the slaves. Edit the slaves file so it contains:
#localhost
slave1
slave2
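Instead of editing interactively, the slaves file can be written with a here-document. A sketch that writes the two hostnames to a local slaves.example file; in the real cluster the target is $HADOOP_CONF_DIR/slaves:

```shell
# Write the datanode hostnames; in practice the target file is
# /data/hadoop/hadoop-2.6.4/etc/hadoop/slaves.
cat > slaves.example <<'EOF'
slave1
slave2
EOF
cat slaves.example
```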
- Edit core-site.xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. See accompanying LICENSE file.
-->
<!-- Put site-specific property overrides in this file. -->
<configuration>
<property>
<name>fs.defaultFS</name> <value>hdfs://master:9000</value>
</property>
<property>
<name>hadoop.tmp.dir</name> <value>/data/hadoop/tmp/hadoop-master</value> <description>Abase for other temporary directories.</description>
</property>
</configuration>
- Edit hdfs-site.xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. See accompanying LICENSE file.
-->
<!-- Put site-specific property overrides in this file. -->
<configuration>
<property>
<name>dfs.namenode.secondary.http-address</name> <value>master:50090</value>
</property>
<property>
<name>dfs.datanode.data.dir</name> <value>file:///data/hadoop/tmp/hdfs/datanode</value>
</property>
<property>
<name>dfs.namenode.name.dir</name> <value>file:///data/hadoop/tmp/hdfs/namenode</value>
</property>
<property>
<name>dfs.namenode.checkpoint.dir</name> <value>file:///data/hadoop/tmp/hdfs/namesecondary</value>
</property>
<property>
<name>dfs.replication</name> <value>2</value>
</property>
</configuration>
Note: dfs.replication is the HDFS replication factor (how many copies of each block are stored), and it should not exceed the number of datanodes. With two slaves in this setup, we use 2.
- Edit mapred-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. See accompanying LICENSE file.
-->
<!-- Put site-specific property overrides in this file. -->
<configuration>
<property>
<name>mapreduce.framework.name</name> <value>yarn</value>
</property>
<property>
<name>mapreduce.jobtracker.staging.root.dir</name> <value>/user</value>
</property>
<property>
<name>mapreduce.jobhistory.address</name> <value>master:10020</value>
</property>
<property>
<name>mapreduce.jobhistory.webapp.address</name> <value>master:19888</value>
</property>
</configuration>
- Edit yarn-site.xml
<?xml version="1.0"?>
<!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. See accompanying LICENSE file.
-->
<configuration>
<!-- Site specific YARN configuration properties -->
<property>
<name>yarn.nodemanager.aux-services</name> <value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.resourcemanager.hostname</name> <value>master</value>
</property>
</configuration>
5. Package everything under the hadoop user's home directory and copy it to slave1 and slave2.
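The packaging and copy can be scripted. A dry-run sketch, assuming /data/hadoop is the hadoop user's home as created earlier; with DRY_RUN=1 it only prints what it would do:

```shell
# Pack the hadoop home once, then copy and unpack on every slave.
DRY_RUN=1
run() { if [ "$DRY_RUN" = "1" ]; then echo "$*"; else "$@"; fi; }

run tar -C /data -czf /tmp/hadoop-home.tar.gz hadoop
for host in slave1 slave2; do
  run scp /tmp/hadoop-home.tar.gz "$host:/tmp/"
  run ssh "$host" tar -C /data -xzf /tmp/hadoop-home.tar.gz
done
```

Set DRY_RUN=0 to execute for real; this relies on the passwordless SSH configured in step I.3.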
IV. Start and verify Hadoop
1. Start Hadoop
On the very first start, format HDFS on the master with hdfs namenode -format; then run start-all.sh.
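After startup, jps on each node shows which daemons came up: roughly, the master should run NameNode, SecondaryNameNode, and ResourceManager, and each slave DataNode and NodeManager. A small sketch that prints the per-node checks to run (pipe its output to sh to execute them, given the passwordless SSH set up earlier):

```shell
# Print one jps check per node; requires passwordless SSH to run for real.
n=0
for host in master slave1 slave2; do
  echo "ssh $host jps"
  n=$((n+1))
done
```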
2. Check that the cluster is healthy
hdfs dfsadmin -report
[hadoop@master hadoop]$ hdfs dfsadmin -report
Configured Capacity: 20867301376 (19.43 GB)
Present Capacity: 16041099264 (14.94 GB)
DFS Remaining: 15645147136 (14.57 GB)
DFS Used: 395952128 (377.61 MB)
DFS Used%: 2.47%
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0
Live datanodes (2):
Name: 10.0.0.3:50010 (slave1)
Hostname: slave1
Decommission Status : Normal
Configured Capacity: 10433650688 (9.72 GB)
DFS Used: 197976064 (188.80 MB)
Non DFS Used: 2413105152 (2.25 GB)
DFS Remaining: 7822569472 (7.29 GB)
DFS Used%: 1.90%
DFS Remaining%: 74.97%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Thu Jun 23 11:03:48 CST 2016
Name: 10.0.0.4:50010 (slave2)
Hostname: slave2
Decommission Status : Normal
Configured Capacity: 10433650688 (9.72 GB)
DFS Used: 197976064 (188.80 MB)
Non DFS Used: 2413096960 (2.25 GB)
DFS Remaining: 7822577664 (7.29 GB)
DFS Used%: 1.90%
DFS Remaining%: 74.97%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Thu Jun 23 11:03:49 CST 2016
Note: since this cluster has two datanodes, seeing two live datanodes in this report means everything is working correctly.
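The "two live datanodes" check can be automated by parsing the report. A sketch that extracts the live-datanode count; here a saved sample of the output above stands in for a live hdfs dfsadmin -report call:

```shell
# In production: report=$(hdfs dfsadmin -report)
report='Configured Capacity: 20867301376 (19.43 GB)
Live datanodes (2):
Name: 10.0.0.3:50010 (slave1)
Name: 10.0.0.4:50010 (slave2)'

# Pull the number out of the "Live datanodes (N):" line.
live=$(printf '%s\n' "$report" | sed -n 's/^Live datanodes (\([0-9]*\)):.*/\1/p')
echo "live datanodes: $live"
```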