JDK Installation
...
Install the rsync Service and Configure hosts
...
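The hosts file contents are elided above; as a sketch, using the addresses that appear later in this guide (192.168.1.240-242), the entries on every node would look like:
[root@localhost admin]# vim /etc/hosts  ## append on all three nodes
192.168.1.240 Master
192.168.1.241 Slave1
192.168.1.242 Slave2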
Passwordless SSH Login Between Nodes
1. Configure on all cluster hosts (Master/Slave1/Slave2)
...
2. Configure on the Master node
...
3. Configure on the Slave1 node
...
4. Configure on the Slave2 node
...
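The per-node commands are elided above; a minimal sketch of the usual flow, assuming the admin account on each host, is:
[admin@localhost ~]$ ssh-keygen -t rsa          ## on every node, accept the defaults
[admin@localhost ~]$ ssh-copy-id admin@Master   ## then push each node's key to every host
[admin@localhost ~]$ ssh-copy-id admin@Slave1
[admin@localhost ~]$ ssh-copy-id admin@Slave2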
Scala Installation and Deployment (required on all cluster hosts)
[root@localhost admin]# wget https://downloads.lightbend.com/scala/2.12.0-M1/scala-2.12.0-M1.tgz
[root@localhost admin]# tar xf scala-2.12.0-M1.tgz
[root@localhost admin]# mv scala-2.12.0-M1 scala
[root@localhost admin]# chown -R admin:admin scala
[root@localhost admin]# vim /etc/profile  ## append the following
#scala path
export SCALA_HOME=/home/admin/scala
export PATH=$PATH:$SCALA_HOME/bin
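The appended variables only take effect after the profile is re-read; re-source it and confirm the install:
[root@localhost admin]# source /etc/profile
[root@localhost admin]# scala -version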
Hadoop Deployment
1. Download and extract the Hadoop package
[root@localhost admin]# wget https://mirrors.tuna.tsinghua.edu.cn/apache/hadoop/common/hadoop-2.8.0/hadoop-2.8.0.tar.gz
[root@localhost admin]# tar xf hadoop-2.8.0.tar.gz
[root@localhost admin]# mv hadoop-2.8.0 hadoop
2. Set the Java environment variable in hadoop-env.sh
[root@localhost admin]# vim /home/admin/hadoop/etc/hadoop/hadoop-env.sh
export JAVA_HOME=/home/admin/java  ### set to the actual JDK install path (the Spark section later uses /home/admin/jdk; use whichever matches your JDK location)
3. Configure slaves
[root@localhost admin]# vim /home/admin/hadoop/etc/hadoop/slaves
Slave1
Slave2
4. Configure core-site.xml
[root@localhost admin]# vim /home/admin/hadoop/etc/hadoop/core-site.xml
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://Master:9000</value>
  </property>
  <property>
    <name>io.file.buffer.size</name>
    <value>131072</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/home/admin/hadoop/tmp</value>
  </property>
</configuration>
5. Configure hdfs-site.xml
[root@localhost admin]# vim /home/admin/hadoop/etc/hadoop/hdfs-site.xml
<configuration>
  <property>
    <name>dfs.namenode.secondary.http-address</name>
    <value>Master:50090</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>2</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:/home/admin/hadoop/hdfs/name</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:/home/admin/hadoop/hdfs/data</value>
  </property>
</configuration>
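The tmp, name, and data directories referenced in these two files do not exist yet; creating them up front, with the paths exactly as configured above, avoids ownership surprises later:
[root@localhost admin]# mkdir -p /home/admin/hadoop/tmp /home/admin/hadoop/hdfs/name /home/admin/hadoop/hdfs/data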
6. Configure mapred-site.xml (the 2.8.0 tarball ships only a template, so copy it first)
[root@localhost admin]# cp /home/admin/hadoop/etc/hadoop/mapred-site.xml.template /home/admin/hadoop/etc/hadoop/mapred-site.xml
[root@localhost admin]# vim /home/admin/hadoop/etc/hadoop/mapred-site.xml
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.address</name>
    <value>Master:10020</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.webapp.address</name>
    <value>Master:19888</value>
  </property>
</configuration>
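One caveat with the JobHistory settings above: start-all.sh does not start the history server, so once Hadoop is up it has to be launched separately, e.g.:
[admin@localhost ~]$ /home/admin/hadoop/sbin/mr-jobhistory-daemon.sh start historyserver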
7. Configure yarn-site.xml
[root@localhost admin]# vim /home/admin/hadoop/etc/hadoop/yarn-site.xml
<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.resourcemanager.address</name>
    <value>Master:8032</value>
  </property>
  <property>
    <name>yarn.resourcemanager.scheduler.address</name>
    <value>Master:8030</value>
  </property>
  <property>
    <name>yarn.resourcemanager.resource-tracker.address</name>
    <value>Master:8031</value>
  </property>
  <property>
    <name>yarn.resourcemanager.admin.address</name>
    <value>Master:8033</value>
  </property>
  <property>
    <name>yarn.resourcemanager.webapp.address</name>
    <value>Master:8088</value>
  </property>
</configuration>
8. Copy the Hadoop tree to each slave node
[root@localhost admin]# chown -R admin:admin hadoop
[root@localhost admin]# scp -r /home/admin/hadoop admin@Slave1:/home/admin
[root@localhost admin]# scp -r /home/admin/hadoop admin@Slave2:/home/admin
9. Format HDFS and start Hadoop
[root@localhost admin]# su admin
[admin@localhost ~]$ /home/admin/hadoop/bin/hdfs namenode -format
[admin@localhost ~]$ /home/admin/hadoop/sbin/start-all.sh
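If startup succeeded, jps on the Master should list NameNode, SecondaryNameNode, and ResourceManager, and each slave should show DataNode and NodeManager; the HDFS web UI defaults to http://Master:50070:
[admin@localhost ~]$ jps
[admin@localhost ~]$ /home/admin/hadoop/bin/hdfs dfsadmin -report  ## should report 2 live datanodes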
Spark Deployment
1. Download and extract the Spark package
[admin@localhost ~]$ su - root
[root@localhost admin]# wget https://d3kbcqa49mib13.cloudfront.net/spark-2.1.1-bin-hadoop2.7.tgz
[root@localhost admin]# tar xf spark-2.1.1-bin-hadoop2.7.tgz
[root@localhost admin]# mv spark-2.1.1-bin-hadoop2.7 spark
2. Set the Spark environment variables
[root@localhost admin]# vim /etc/profile  ## append the following
#spark path
export SPARK_HOME=/home/admin/spark
export PATH=$PATH:$SPARK_HOME/bin
[root@localhost admin]# source /etc/profile
[root@localhost admin]# cd /home/admin/spark/conf
[root@localhost conf]# cp spark-env.sh.template spark-env.sh
[root@localhost conf]# vim spark-env.sh
export JAVA_HOME=/home/admin/jdk
export SCALA_HOME=/home/admin/scala
export HADOOP_HOME=/home/admin/hadoop
export HADOOP_CONF_DIR=/home/admin/hadoop/etc/hadoop
# SPARK_MASTER_IP is the pre-2.x name, superseded by SPARK_MASTER_HOST; kept here for compatibility
export SPARK_MASTER_IP=192.168.1.240
export SPARK_MASTER_HOST=192.168.1.240
# bind address of this node; overridden per node in steps 5 and 6 below
export SPARK_LOCAL_IP=192.168.1.240
export SPARK_WORKER_MEMORY=1g
export SPARK_WORKER_CORES=2
export SPARK_HOME=/home/admin/spark
export SPARK_DIST_CLASSPATH=$(/home/admin/hadoop/bin/hadoop classpath)
3. Configure slaves (listing Master here means it runs a Worker alongside the Spark master)
[root@localhost conf]# cp slaves.template slaves
[root@localhost conf]# vim slaves
Master
Slave1
Slave2
4. Copy the Spark tree to each slave node
[root@localhost conf]# cd /home/admin
[root@localhost admin]# chown -R admin:admin spark
[root@localhost admin]# su admin
[admin@localhost ~]$ scp -r /home/admin/spark admin@Slave1:/home/admin/
[admin@localhost ~]$ scp -r /home/admin/spark admin@Slave2:/home/admin/
5. Configure on the Slave1 node
[admin@localhost ~]$ vim /home/admin/spark/conf/spark-env.sh
export SPARK_LOCAL_IP=192.168.1.241
6. Configure on the Slave2 node
[admin@localhost ~]$ vim /home/admin/spark/conf/spark-env.sh
export SPARK_LOCAL_IP=192.168.1.242
7. Start Spark on the Master node
[admin@localhost ~]$ /home/admin/spark/sbin/start-all.sh
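As a quick smoke test (paths and addresses as configured above): jps should now additionally show Master and Worker processes, the Spark web UI is served at http://192.168.1.240:8080, and the bundled SparkPi example can be run against the standalone master:
[admin@localhost ~]$ /home/admin/spark/bin/run-example --master spark://192.168.1.240:7077 SparkPi 10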