I. Preparation:
1. Download and install Hadoop: http://hadoop.apache.org/releases.html
2. Set up passwordless SSH trust between the nodes, install and configure the JDK, edit /etc/hosts, and create a new user (a minimal SSH sketch follows the host list below): https://abc.htmltoo.com/thread-692.htm
master 116.57.56.220
slave1 116.57.86.221
slave2 116.57.86.222
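For reference, a minimal sketch of the passwordless-SSH setup, assuming the hadoop user created below and the three hostnames listed above:
# run on master as the hadoop user; repeat from each node that must reach the others
ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa   # generate a key pair with no passphrase
ssh-copy-id hadoop@master                  # copy the public key to every node
ssh-copy-id hadoop@slave1
ssh-copy-id hadoop@slave2
ssh slave1 hostname                        # should print "slave1" without asking for a password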
Create the new user:
sudo useradd -d /home/hadoop -m hadoop # -d sets the home directory path, -m creates that directory
sudo passwd hadoop # set the password
hadoop:x:1002:1002::/home/hadoop:/bin/bash # if the prompt only shows $ and some shell features do not work, set the user's login shell to /bin/bash in /etc/passwd as shown here
vi /etc/sudoers # or, preferably, visudo
hadoop ALL=(ALL) ALL # the newly created user needs sudo privileges added in /etc/sudoers
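To verify the account, switch to the new user and check that sudo works:
su - hadoop
sudo whoami   # should print: root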
II. Deployment:
core-site.xml:
<property>
<name>fs.defaultFS</name>
<value>hdfs://master:9000</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>file:///home/hadoop3.0/usr/local/hadoop/hadoop-3.0.0-alpha1/tmp</value>
</property>
hdfs-site.xml:
mkdir -p /data/hd3/namenode /data/hd3/datanode # create these directories on every node
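These directories must be writable by the account that runs the Hadoop daemons; assuming the hadoop user created above:
sudo chown -R hadoop:hadoop /data/hd3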
<property>
<name>dfs.replication</name>
<value>2</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>/data/hd3/namenode</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>/data/hd3/datanode</value>
</property>
<property>
<name>dfs.namenode.secondary.http-address</name>
<value>slave1:9001</value>
</property>
yarn-site.xml:
<property>
<name>yarn.resourcemanager.hostname</name>
<value>master</value>
</property>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.env-whitelist</name>
<value>JAVA_HOME,HADOOP_COMMON_HOME,HADOOP_HDFS_HOME,HADOOP_CONF_DIR,CLASSPATH_PREPEND_DISTCACHE,HADOOP_YARN_HOME,HADOOP_MAPRED_HOME</value>
</property>
<property>
<name>yarn.nodemanager.vmem-check-enabled</name>
<value>false</value>
</property>
<property>
<name>yarn.nodemanager.resource.memory-mb</name>
<value>49152</value>
</property>
<property>
<name>yarn.scheduler.maximum-allocation-mb</name>
<value>49152</value>
</property>
Or, as an alternative yarn-site.xml configuration:
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
<property>
<name>yarn.resourcemanager.resource-tracker.address</name>
<value>master:8025</value>
</property>
<property>
<name>yarn.resourcemanager.scheduler.address</name>
<value>master:8030</value>
</property>
<property>
<name>yarn.resourcemanager.address</name>
<value>master:8040</value>
</property>
mapred-site.xml:
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
Or, as an alternative mapred-site.xml configuration:
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
<property>
<name>mapreduce.application.classpath</name>
<value>
/home/hadoop3.0/usr/local/hadoop/hadoop-3.0.0-alpha1/etc/hadoop,
/home/hadoop3.0/usr/local/hadoop/hadoop-3.0.0-alpha1/share/hadoop/common/*,
/home/hadoop3.0/usr/local/hadoop/hadoop-3.0.0-alpha1/share/hadoop/common/lib/*,
/home/hadoop3.0/usr/local/hadoop/hadoop-3.0.0-alpha1/share/hadoop/hdfs/*,
/home/hadoop3.0/usr/local/hadoop/hadoop-3.0.0-alpha1/share/hadoop/hdfs/lib/*,
/home/hadoop3.0/usr/local/hadoop/hadoop-3.0.0-alpha1/share/hadoop/mapreduce/*,
/home/hadoop3.0/usr/local/hadoop/hadoop-3.0.0-alpha1/share/hadoop/mapreduce/lib/*,
/home/hadoop3.0/usr/local/hadoop/hadoop-3.0.0-alpha1/share/hadoop/yarn/*,
/home/hadoop3.0/usr/local/hadoop/hadoop-3.0.0-alpha1/share/hadoop/yarn/lib/*
</value>
</property>
workers:
slave1
slave2
hadoop-env.sh:
export JAVA_HOME=/home/cloud/jdk1.8.0_144
Edit the profile file:
vim /etc/profile
export HADOOP_PREFIX=/home/cloud/hadoop-3.0.0
export HADOOP_HDFS_HOME=/home/cloud/hadoop-3.0.0
export HADOOP_CONF_DIR=/home/cloud/hadoop-3.0.0/etc/hadoop
export PATH=$PATH:$HADOOP_PREFIX/bin:$HADOOP_PREFIX/sbin # so that hdfs, start-dfs.sh etc. can be found
source /etc/profile
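To confirm the environment is picked up correctly, you can run:
hadoop version   # should print the Hadoop 3.0.0 version banner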
OK, the configuration is now complete; next, start Hadoop.
III. Startup:
hdfs namenode -format # the NameNode must be formatted before the first start
start-dfs.sh # start HDFS
start-yarn.sh # start YARN
If nothing reports an error, you should see output like the following.
On the master node, check with the jps command:
$ jps
6993 SecondaryNameNode
7715 NodeManager
9524 Jps
7371 ResourceManager
6492 NameNode
6669 DataNode
On the slave nodes, run jps:
$ jps
21360 DataNode
30233 Jps
21643 NodeManager
IV. Checking the web UI:
Visit http://10.0.0.1:8088/ (use the master node's address) for the YARN ResourceManager web UI.
Visit http://10.0.0.1:9870 for the HDFS NameNode web UI; note that the port is now 9870, not 50070.
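The same information is also available from the command line, for example to confirm that both DataNodes have registered:
hdfs dfsadmin -report   # lists live DataNodes, capacity and usage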
V. Testing a MapReduce job:
Test with the bundled examples.
1. Create the HDFS user directories required to run MapReduce jobs
hdfs dfs -mkdir /user
hdfs dfs -mkdir /user/hduser
2. Copy the input files into the distributed file system
hdfs dfs -mkdir /user/hduser/input
hdfs dfs -put etc/hadoop/*.xml /user/hduser/input
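You can confirm the upload with:
hdfs dfs -ls /user/hduser/input   # should list the copied *.xml files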
3. Run one of the provided example programs
hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-3.0.0-alpha1.jar grep /user/hduser/input output 'dfs[a-z.]+'
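While the job runs, it can also be monitored from the command line:
yarn application -list   # shows running YARN applications and their state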
4. Examine the output files:
Copy the output files from the distributed file system to the local file system and view them:
hdfs dfs -get output output
cat output/*
Or view the output files directly on the distributed file system:
bin/hdfs dfs -cat output/*
When you are finished, stop the daemons:
sbin/stop-dfs.sh
sbin/stop-yarn.sh # also stop YARN, since start-yarn.sh was run earlier
Shell commands for inspecting data on HDFS:
# List the files and directories under the given path. /user/hadoop/output is a directory on HDFS, not a local one
hadoop fs -ls /user/hadoop/output
hadoop fs -cat /user/hadoop/output/* # view the file contents
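A few other commonly used HDFS shell commands, for reference (the paths here are only placeholders):
hadoop fs -put localfile.txt /user/hadoop/   # upload a local file to HDFS
hadoop fs -get /user/hadoop/output ./output  # download a directory from HDFS
hadoop fs -du -h /user/hadoop                # show space used, in human-readable units
hadoop fs -rm -r /user/hadoop/output         # delete a directory recursively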