CDH5 Deployment: A Complete Hands-On Walkthrough

2019-02-17 19:57

 

I. Cluster Layout

Three cloud servers were provisioned on Tencent Cloud for testing, each running CentOS 7.2 with 4 cores, 16 GB of RAM, and a 500 GB local disk.

master (10.135.37.224)   slave1 (10.104.29.73)   slave2 (10.135.187.242)
CM Server                -                       -
CM Agent                 CM Agent                CM Agent
NameNode                 DataNode                DataNode
MySQL                    -                       -
Zookeeper                Zookeeper               Zookeeper
-                        flume                   flume
-                        hive                    hive
-                        impala                  impala

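II. Environment Preparation

1. Network configuration

1.1. Set the server hostnames

Rename the three test servers to master, slave1, and slave2 respectively.

vi /etc/hostname #editing the file directly did not persist in this environment
hostnamectl --static set-hostname master #persistent and saved automatically; use slave1 / slave2 on the other two nodes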

1.2. Edit the hosts file

vi /etc/hosts
#enter the following in /etc/hosts on all three servers
127.0.0.1 localhost.localdomain localhost
10.135.37.224 master
10.104.29.73 slave1
10.135.187.242 slave2

#on CentOS 7.2, changes to /etc/hosts were lost after a "reboot"
#fix: edit /etc/cloud/templates/hosts.redhat.tmpl so that it matches /etc/hosts
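
Since the hosts entries can be silently reverted after a reboot, it helps to check what each name actually maps to. The following is a minimal sketch, assuming a standard hosts-format file; `hosts_ip` is a hypothetical helper name, not part of any tool.

```shell
# Print the IP address that a hosts-format file maps a hostname to.
# hosts_ip is a hypothetical helper name used only for this illustration.
hosts_ip() {
    # $1 = hostname to look up, $2 = hosts file (defaults to /etc/hosts)
    awk -v name="$1" '
        $1 ~ /^#/ { next }    # skip comment lines
        { for (i = 2; i <= NF; i++) if ($i == name) { print $1; exit } }
    ' "${2:-/etc/hosts}"
}

# Usage on any node, once /etc/hosts has been edited:
#   hosts_ip master    # this cluster expects 10.135.37.224
#   hosts_ip slave1    # this cluster expects 10.104.29.73
```

An empty result means the name is missing from the file, which is the symptom to look for after a reboot.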

1.3. Disable IPv6

echo "alias net-pf-10 off" >> /etc/modprobe.d/dist.conf
echo "alias ipv6 off" >> /etc/modprobe.d/dist.conf 

2. Installing JDK 1.7

#remove the OpenJDK packages bundled with the system
rpm -qa|grep java
#the package names below are examples; substitute the names that rpm -qa reports
rpm -e --nodeps java-1.6.0-openjdk-1.6.0.0-1.66.1.13.0.el6.x86_64
rpm -e --nodeps java-1.7.0-openjdk-1.7.0.45-2.4.3.3.el6.x86_64
rpm -e --allmatches --nodeps tzdata-java-2013g-1.el6.noarch
rpm -ivh jdk-7u79-linux-x64.rpm
#configure the environment variables: append the three export lines to /etc/profile, then reload it
vi /etc/profile
export JAVA_HOME=/usr/java/jdk1.7.0_79
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
export PATH=$JAVA_HOME/bin:$PATH
source /etc/profile

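3. Passwordless SSH

3.1. On master, slave1, and slave2, generate a key pair and an authorized_keys file.

#step 1: create ~/.ssh if it is missing, then generate an RSA key pair
mkdir -p $HOME/.ssh && \
chmod 700 $HOME/.ssh/ && \
ssh-keygen -t rsa
#step 2: authorize the node's own public key
cat $HOME/.ssh/id_rsa.pub >> $HOME/.ssh/authorized_keys && \
chmod 600 $HOME/.ssh/authorized_keys

3.2. Append the public keys in the authorized_keys files of slave1 and slave2 to the authorized_keys file on master.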

echo "ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDFA1JLha++svGZKJBWFNchIrc2sQvx/Gr46/tFArN8MROFdk6vJ4DDEoYEm41j/TV3jW6PyyuIJTICl8/AwgdvFMdiFZTjdr9RIYt2jjyS6O27hUOgdQF1GbAMnQPixOWxaBgmmGoXC4k0Ci1lri75eYTxFHopdy+1pdE1Odp9hMJv/CR2hqVvdor7LzV0CHNlj33yFtqggLTU0W/SRKCTk/N2v9dQFgnWqqWAsOb2krKaSR8S8tYrBaFNstEFkksK2GN56PeRKpALDWYcVt4m26EOVoV12TKfBojfL1xNwnzj+c8fOXBTNs98HVbcxAFQv2WIVgu9ND2PEmFMQfNp root@slave1" >> $HOME/.ssh/authorized_keys && \
echo "ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDiMxRcrQEkotJW2siqxqUy1EZB7sV4tki8nz+Ypql0DrFFip/8gWdI8yuqu9wbDSYs/5RCq/N3ZZSU+iHBTTmyG2aysFtet83ufa/M8auefdsib7X12nfzqSY0EP095JxyQOWZxrEnmQ4QeV5uNmFsjEGQwlSkZMSew3IPmzTdwfWAoGcx4wV50ntCy4jrelOYD42e5+nrycgKiwgC4qNUQ4S61wtU1JBKEJ6t8DFyhQYFBjXRdq1J/5fycFefW+LmrmDxy3hrVt0gRy1+xnjgIk0kYv9oS8jn/lE01tapXgjQ/ILTZLTHzKv/aGAHOCXO0WRDQPOCl5V805p8t9kn root@slave2" >> $HOME/.ssh/authorized_keys

3.3. Delete the existing authorized_keys files on slave1 and slave2, then run the following copy commands on master so that all three nodes share the same file

scp $HOME/.ssh/authorized_keys root@slave1:$HOME/.ssh/
scp $HOME/.ssh/authorized_keys root@slave2:$HOME/.ssh/
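
Before moving on, it is worth confirming that key-based login really works between the nodes. A minimal sketch, assuming the hostnames from the hosts file above; `check_ssh` is a hypothetical helper name.

```shell
# Report whether key-based SSH login works for each host given as an argument.
# check_ssh is a hypothetical helper name used only for this illustration.
check_ssh() {
    failed=0
    for host in "$@"; do
        # BatchMode=yes makes ssh fail instead of prompting for a password
        if ssh -o BatchMode=yes -o ConnectTimeout=5 "$host" true 2>/dev/null; then
            echo "$host: ok"
        else
            echo "$host: FAILED"
            failed=1
        fi
    done
    return "$failed"
}

# Usage from master:
#   check_ssh master slave1 slave2
```

Because batch mode cannot answer prompts, a host whose key is not yet in known_hosts is also reported as FAILED; logging in once interactively clears that.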

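4. Disabling the firewall (a temporary stop is enough; optional)

4.1. Stop the firewall

Starting with version 7, CentOS uses firewalld by default. It is built on top of iptables, but the iptables service itself is not installed, so stopping the firewalld service is all that is needed.

sudo systemctl stop firewalld.service && \
sudo systemctl disable firewalld.service

4.2. Disable SELinux

#takes effect immediately, until the next reboot
setenforce 0
#permanent after a reboot: set the following line in /etc/selinux/config
vi /etc/selinux/config
SELINUX=disabled
#verify
getenforce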
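
When the same change has to be made on three nodes, editing the file by hand gets tedious. The sketch below makes the SELINUX=disabled edit non-interactively; `disable_selinux_config` is a hypothetical helper name, and it assumes the file already contains a line beginning with `SELINUX=`.

```shell
# Set SELINUX=disabled in an SELinux config file without opening an editor.
# disable_selinux_config is a hypothetical helper name; a .bak copy is kept.
disable_selinux_config() {
    # $1 = config file (defaults to /etc/selinux/config)
    # The pattern requires "=" right after SELINUX, so SELINUXTYPE= is untouched.
    sed -i.bak 's/^SELINUX=.*/SELINUX=disabled/' "${1:-/etc/selinux/config}"
}

# Usage on each node (as root):
#   disable_selinux_config
```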

 

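5. Configuring NTP (time synchronization)

Configure the NTP service on every node; all DataNode nodes synchronize their clocks against the master node.

(That is, the two slave nodes follow the master's time.)

yum install ntp -y #install from the package repository
#after installation and before configuring, synchronize the time manually once on every node:
ntpdate -u cn.pool.ntp.org #alternatives: ntp1.aliyun.com through ntp7.aliyun.com (Aliyun's seven NTP servers)
#edit the configuration file on each node
vi /etc/ntp.conf
#master node:
restrict 127.0.0.1
restrict -6 ::1
restrict default nomodify notrap
server cn.pool.ntp.org prefer
driftfile /var/lib/ntp/drift
includefile /etc/ntp/crypto/pw
keys /etc/ntp/keys

#slave nodes:
restrict default kod nomodify notrap nopeer noquery
restrict -6 default kod nomodify notrap nopeer noquery
server master prefer
driftfile /var/lib/ntp/drift
includefile /etc/ntp/crypto/pw
keys /etc/ntp/keys
restrict 127.0.0.1
restrict -6 ::1

#start the service: start master first, and start the slave nodes only after master has synchronized
systemctl start ntpd.service
#enable at boot
systemctl enable ntpd.service
#check the service status
systemctl status ntpd.service
#check whether the clock is synchronized (this is important)
ntpstat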
[root@master ~]# ntpstat
synchronised to NTP server (60.190.217.142) at stratum 3 
   time correct to within 102 ms
   polling server every 256 s
[root@slave1 ~]# ntpstat
synchronised to NTP server (10.135.37.224) at stratum 4 
   time correct to within 109 ms
   polling server every 256 s
[root@slave2 ~]# ntpstat
synchronised to NTP server (10.135.37.224) at stratum 4 
   time correct to within 114 ms
   polling server every 256 s
   
#Tencent Cloud's own NTP server, usable as an alternative time source:
#ntpupdate.tencentyun.com
ntpdate -u ntpupdate.tencentyun.com

#stop the service
systemctl stop ntpd.service
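
The instruction to start the slaves only after master has synchronized can be turned into a small wait loop. This is a sketch, relying on the fact that ntpstat exits with status 0 only when the clock is synchronised; `wait_for_ntp_sync` and its default retry counts are assumptions chosen for illustration.

```shell
# Poll ntpstat until the local clock reports as synchronised, or give up.
# wait_for_ntp_sync is a hypothetical helper name used only for this illustration.
wait_for_ntp_sync() {
    # $1 = maximum attempts (default 30), $2 = seconds between attempts (default 10)
    attempts="${1:-30}"
    delay="${2:-10}"
    i=0
    while [ "$i" -lt "$attempts" ]; do
        # ntpstat exits 0 only when the clock is synchronised
        if ntpstat >/dev/null 2>&1; then
            echo "synchronised after $i retries"
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    echo "not synchronised after $attempts attempts" >&2
    return 1
}

# Usage on master, before starting ntpd on the slaves:
#   wait_for_ntp_sync && echo "safe to start ntpd on slave1 and slave2"
```

Initial synchronisation can take several minutes after ntpd starts, so a generous attempt count is reasonable.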

6. Installing MySQL

6.1. Container-based installation

Because this is a test environment, MySQL can simply be run as a Docker container.

#Docker installation guide for CentOS
https://docs.docker.com/engine/installation/linux/docker-ce/centos/ #docker-machine
#docker-compose installation guide
https://docs.docker.com/compose/install/
#docker-compose file for MySQL
version: '3'
services:
  mysql:
    image: mysql:5.7
    ports:
      - 3306:3306
    environment:
      MYSQL_ROOT_PASSWORD: "gizwits"
      MYSQL_USER: "kevin"
      MYSQL_PASSWORD: "gizwits"
    

6.2. Conventional installation

See the document "mysql 安装及其异常情况解决办法" (MySQL installation and troubleshooting).

III. Installing Cloudera Manager

1. Automatic installation (not recommended)

wget http://archive.cloudera.com/cm5/installer/latest/cloudera-manager-installer.bin
#"latest" can be replaced with a specific version; for 5.14.1 the URL becomes:
#http://archive.cloudera.com/cm5/installer/5.14.1/cloudera-manager-installer.bin
chmod u+x cloudera-manager-installer.bin
sudo ./cloudera-manager-installer.bin

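2. Offline installation

  1. Version selection: the production servers run CM 5.3.0, but the test servers run CentOS 7.2, and according to Cloudera's website CM 5.3.0 does not support installation on CentOS 7.2. This test environment therefore uses CM 5.14.x instead; the commands in the following steps download 5.14.3.

  2. Download the CM package

Download location: http://archive.cloudera.com/cm5

To install from CentOS RPM packages instead, see Cloudera's RPM-based installation instructions.

The approach used here is to unpack the tar.gz file, which also works on other Linux distributions.

The steps are as follows:

#download the CentOS 7 build of cloudera-manager and unpack it into the target directory
mkdir -p /opt/cloudera-manager && \
wget -q http://archive.cloudera.com/cm5/cm/5/cloudera-manager-centos7-cm5.14.3_x86_64.tar.gz -O /tmp/cm5.14.3.tar.gz && \
tar -xzvf /tmp/cm5.14.3.tar.gz -C /opt/cloudera-manager
#the download only needs to happen on the master node; then copy the tarball to the slave nodes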
[root@master ~]# scp /tmp/cm5.14.3.tar.gz slave1:/tmp
cm5.14.3.tar.gz                                                                                                                             100%  794MB 129.1MB/s   00:06    
[root@master ~]# scp /tmp/cm5.14.3.tar.gz slave2:/tmp
cm5.14.3.tar.gz                                                                                                                             100%  794MB 128.1MB/s   00:06    
#then, on each slave node, unpack the tarball into /opt/cloudera-manager
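
The copy and unpack steps for the slaves can be combined into one loop run from master. A minimal sketch, assuming passwordless SSH as root is already in place; `distribute_cm` is a hypothetical helper name.

```shell
# Copy the CM tarball to each host and unpack it into /opt/cloudera-manager.
# distribute_cm is a hypothetical helper name; it assumes passwordless SSH works.
distribute_cm() {
    # $1 = tarball path on the local node, remaining arguments = target hosts
    tarball="$1"
    shift
    for host in "$@"; do
        # Stop at the first host where the copy or the remote unpack fails
        scp "$tarball" "$host:/tmp/" &&
        ssh "$host" "mkdir -p /opt/cloudera-manager && tar -xzf /tmp/$(basename "$tarball") -C /opt/cloudera-manager" ||
        return 1
    done
}

# Usage on master:
#   distribute_cm /tmp/cm5.14.3.tar.gz slave1 slave2
```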
  3. Create the cloudera-scm user on all nodes

#Cloudera Manager and the managed services run as cloudera-scm by default, so this user must exist

$ sudo useradd --system --home=/opt/cloudera-manager/cm-5.14.3/run/cloudera-scm-server --no-create-home --shell=/bin/false --comment "Cloudera SCM User" cloudera-scm

  4. On the master node, create the local data directory for the Cloudera Manager server. Note that the directory must be owned by cloudera-scm.

sudo mkdir -p /var/lib/cloudera-scm-server && \
sudo chown cloudera-scm:cloudera-scm /var/lib/cloudera-scm-server
  5. Configure the Agent on all slave nodes

#on every slave node, edit /opt/cloudera-manager/cm-5.14.3/etc/cloudera-scm-agent/config.ini and set server_host to the master node's hostname
vi /opt/cloudera-manager/cm-5.14.3/etc/cloudera-scm-agent/config.ini
#server_port can keep its default value
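
The server_host change is a one-line edit, so it can also be scripted instead of made in vi. The sketch below assumes config.ini contains a line beginning with `server_host=`; `set_agent_server_host` is a hypothetical helper name.

```shell
# Point a Cloudera Manager agent at its server by rewriting server_host in config.ini.
# set_agent_server_host is a hypothetical helper name; a .bak copy of the file is kept.
set_agent_server_host() {
    # $1 = server hostname, $2 = path to the agent's config.ini
    sed -i.bak "s/^server_host=.*/server_host=$1/" "$2"
}

# Usage on each slave:
#   set_agent_server_host master /opt/cloudera-manager/cm-5.14.3/etc/cloudera-scm-agent/config.ini
```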
  6. Download mysql-connector-java.jar and save it as /usr/share/java/mysql-connector-java.jar on all nodes

wget -q https://cdn.mysql.com//Downloads/Connector-J/mysql-connector-java-5.1.46.tar.gz -O /tmp/mysql-connector-java-5.1.46.tar.gz && \
tar -xzvf /tmp/mysql-connector-java-5.1.46.tar.gz -C /tmp && \
mkdir -p /usr/share/java && \
cp /tmp/mysql-connector-java-5.1.46/mysql-connector-java-5.1.46.jar /usr/share/java/mysql-connector-java.jar
#note: the path must be exactly /usr/share/java/mysql-connector-java.jar, and the jar must be named mysql-connector-java.jar
  7. Create an scm user with all privileges and the password scm, then create the scm database.

#create the scm user with all privileges; the password is scm
mysql> grant all privileges on *.* to 'scm'@'%' identified by 'scm' with grant option;
Query OK, 0 rows affected, 1 warning (0.00 sec)
#if the validate_password plugin is enabled, this may fail with an error that the password is too simple
#create the scm database
mysql> create database scm DEFAULT CHARSET utf8 COLLATE utf8_general_ci;
Query OK, 1 row affected (0.00 sec)
  8. Initialize Cloudera Manager, which mainly means initializing its database.

/opt/cloudera-manager/cm-5.14.3/share/cmf/schema/scm_prepare_database.sh  mysql scm scm scm
#the arguments after the script name are: database-type, database-name, username, password
[root@master ~]# /opt/cloudera-manager/cm-5.14.3/share/cmf/schema/scm_prepare_database.sh  mysql scm scm scm
JAVA_HOME=/usr/java/jdk1.7.0_79
Verifying that we can write to /opt/cloudera-manager/cm-5.14.3/etc/cloudera-scm-server
Creating SCM configuration file in /opt/cloudera-manager/cm-5.14.3/etc/cloudera-scm-server
Executing:  /usr/java/jdk1.7.0_79/bin/java -cp /usr/share/java/mysql-connector-java.jar:/usr/share/java/oracle-connector-java.jar:/opt/cloudera-manager/cm-5.14.3/share/cmf/schema/../lib/* com.cloudera.enterprise.dbutil.DbCommandExecutor /opt/cloudera-manager/cm-5.14.3/etc/cloudera-scm-server/db.properties com.cloudera.cmf.db.
log4j:ERROR Could not find value for key log4j.appender.A
log4j:ERROR Could not instantiate appender named "A".
Thu May 31 21:25:56 CST 2018 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
[2018-05-31 21:25:57,196] INFO     0[main] - com.cloudera.enterprise.dbutil.DbCommandExecutor.testDbConnection(DbCommandExecutor.java) - Successfully connected to database.
All done, your SCM database is configured correctly!
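#initialization succeeded
#the output does contain log4j errors, though, so it is worth checking them to rule out problems later
#in /opt/cloudera-manager/cm-5.14.3/etc/cloudera-scm-server/, both log4j.properties and db.properties look fine; the appender named "A" is simply not defined in log4j.properties
#this appears to be a minor bug and can be ignored
  9. Prepare the parcels on the master node for installing CDH5.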

#download location: http://archive.cloudera.com/cdh5/parcels/5.14/
wget -q http://archive.cloudera.com/cdh5/parcels/5.14/CDH-5.14.2-1.cdh5.14.2.p0.3-el7.parcel -O /tmp/CDH-5.14.2-1.cdh5.14.2.p0.3-el7.parcel && \
mkdir -p /opt/cloudera/parcel-repo && \
cp /tmp/CDH-5.14.2-1.cdh5.14.2.p0.3-el7.parcel /opt/cloudera/parcel-repo/ && \
wget -q http://archive.cloudera.com/cdh5/parcels/5.14/CDH-5.14.2-1.cdh5.14.2.p0.3-el7.parcel.sha1 -O /tmp/CDH-5.14.2-1.cdh5.14.2.p0.3-el7.parcel.sha && \
cp /tmp/CDH-5.14.2-1.cdh5.14.2.p0.3-el7.parcel.sha /opt/cloudera/parcel-repo/CDH-5.14.2-1.cdh5.14.2.p0.3-el7.parcel.sha && \
wget -q http://archive.cloudera.com/cdh5/parcels/5.14/manifest.json -O /tmp/manifest.json && \
cp /tmp/manifest.json /opt/cloudera/parcel-repo/
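
A parcel of this size is worth checking against its published digest before Cloudera Manager tries to distribute it. A minimal sketch, assuming the .sha file holds the SHA-1 digest as its first field and that sha1sum is available; `verify_parcel` is a hypothetical helper name.

```shell
# Compare a parcel's SHA-1 digest with the digest stored next to it in a .sha file.
# verify_parcel is a hypothetical helper name used only for this illustration.
verify_parcel() {
    # $1 = path to the .parcel file; the expected digest is read from "$1.sha"
    expected=$(awk '{ print $1; exit }' "$1.sha")
    actual=$(sha1sum "$1" | awk '{ print $1 }')
    # Succeed only when a digest was found and it matches the computed one
    [ -n "$expected" ] && [ "$expected" = "$actual" ]
}

# Usage on master:
#   verify_parcel /opt/cloudera/parcel-repo/CDH-5.14.2-1.cdh5.14.2.p0.3-el7.parcel && echo "parcel ok"
```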
  10. Start the Server on the master node and the Agent on all nodes

[root@master ~]# /opt/cloudera-manager/cm-5.14.3/etc/init.d/cloudera-scm-server start
Starting cloudera-scm-server:                              [  OK  ]
[root@master ~]# /opt/cloudera-manager/cm-5.14.3/etc/init.d/cloudera-scm-agent start
Starting cloudera-scm-agent:                               [  OK  ]
[root@master ~]# ssh slave1
Last login: Fri Jun  1 10:18:30 2018 from 10.135.37.224
[root@slave1 ~]# /opt/cloudera-manager/cm-5.14.3/etc/init.d/cloudera-scm-agent start
Starting cloudera-scm-agent:                               [  OK  ]
[root@slave1 ~]# ssh slave2
[root@slave2 ~]# /opt/cloudera-manager/cm-5.14.3/etc/init.d/cloudera-scm-agent start
Starting cloudera-scm-agent:                               [  OK  ]
#follow the cloudera-scm-server startup log
tail -fn 100 /opt/cloudera-manager/cm-5.14.3/log/cloudera-scm-server/cloudera-scm-server.log
#follow the cloudera-scm-agent startup log
tail -fn 200 /opt/cloudera-manager/cm-5.14.3/log/cloudera-scm-agent/cloudera-scm-agent.log

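IV. Installing CDH

1. Log in to the CM admin page

URL: http://master:7180 (http://193.112.212.196:7180)

The initial username and password are both admin.

2. Select the free Cloudera Express edition

3. Specify the hosts for the CDH cluster

4. Select the CDH version to install

The JDK was already installed earlier, so leave the JDK option unchecked.

5. After installation succeeds, fix the reported issues

  • Adjust the swappiness threshold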

echo "vm.swappiness=10" >> /etc/sysctl.conf && \
sysctl -p && \
cat /proc/sys/vm/swappiness
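
Appending with `>>` adds a second vm.swappiness line every time the command is rerun. The sketch below replaces an existing entry and appends only when the key is absent; `set_sysctl_key` is a hypothetical helper name, and dots in the key are treated as regex wildcards, which is harmless for keys like this one.

```shell
# Set a key in a sysctl-style file, replacing an existing entry instead of appending a duplicate.
# set_sysctl_key is a hypothetical helper name; a .bak copy is kept when the file is rewritten.
set_sysctl_key() {
    # $1 = key, $2 = value, $3 = file (defaults to /etc/sysctl.conf)
    file="${3:-/etc/sysctl.conf}"
    if grep -q "^[[:space:]]*$1[[:space:]]*=" "$file"; then
        sed -i.bak "s|^[[:space:]]*$1[[:space:]]*=.*|$1=$2|" "$file"
    else
        echo "$1=$2" >> "$file"
    fi
}

# Usage on each node (as root), followed by sysctl -p to apply:
#   set_sysctl_key vm.swappiness 10
```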
  • Disable transparent huge pages

#first check whether transparent huge pages are enabled
cat /sys/kernel/mm/transparent_hugepage/enabled
#[always] in the output means transparent huge pages are enabled; [never] means they are disabled
#the following commands take effect without a reboot
[root@slave2 ~]# echo never > /sys/kernel/mm/transparent_hugepage/defrag
[root@slave2 ~]# echo never > /sys/kernel/mm/transparent_hugepage/enabled
[root@slave2 ~]# cat /sys/kernel/mm/transparent_hugepage/enabled
always madvise [never]
[root@slave2 ~]# cat /sys/kernel/mm/transparent_hugepage/defrag
always madvise [never]
#to reapply the setting at every boot
vi /etc/rc.local
#add the following; on CentOS 7, rc.local must also be executable (chmod +x /etc/rc.d/rc.local)
#two separate checks are used so that both files are written
if test -f /sys/kernel/mm/transparent_hugepage/enabled; then
 echo never > /sys/kernel/mm/transparent_hugepage/enabled
fi
if test -f /sys/kernel/mm/transparent_hugepage/defrag; then
 echo never > /sys/kernel/mm/transparent_hugepage/defrag
fi
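
To check the setting on several nodes without reading the bracketed output by eye, the active value can be extracted directly. A minimal sketch; `thp_active` is a hypothetical helper name.

```shell
# Print the active setting from a transparent_hugepage control file,
# i.e. the word shown in square brackets (for example "never").
# thp_active is a hypothetical helper name used only for this illustration.
thp_active() {
    # $1 = control file (defaults to the "enabled" file)
    sed -n 's/.*\[\([a-z]*\)\].*/\1/p' "${1:-/sys/kernel/mm/transparent_hugepage/enabled}"
}

# Usage on each node:
#   thp_active
#   thp_active /sys/kernel/mm/transparent_hugepage/defrag
```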

V. Installing CDH Component Services

1. Select services

2. Assign service roles

3. Database setup

  • First create a database for each role

[root@master ~]# mysql -uroot -pgo4gizwits

mysql> create database hive DEFAULT CHARSET utf8 COLLATE utf8_general_ci;
Query OK, 1 row affected (0.00 sec)

mysql> create database hue DEFAULT CHARSET utf8 COLLATE utf8_general_ci;
Query OK, 1 row affected (0.00 sec)

mysql> create database amon DEFAULT CHARSET utf8 COLLATE utf8_general_ci;
Query OK, 1 row affected (0.00 sec)

mysql> create database oozie DEFAULT CHARSET utf8 COLLATE utf8_general_ci;
Query OK, 1 row affected (0.00 sec)

mysql> show databases;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| amon               |
| hive               |
| hue                |
| mysql              |
| oozie              |
| performance_schema |
| scm                |
| sys                |
+--------------------+
9 rows in set (0.00 sec)

mysql> grant all on *.* to 'kevin'@'%' identified by 'gizwits' with grant option;
Query OK, 0 rows affected, 1 warning (0.00 sec)
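
The four CREATE DATABASE statements differ only in the database name, so they can be generated from a list. This is a sketch rather than the procedure used in this walkthrough; `create_db_sql` is a hypothetical helper name, and IF NOT EXISTS is added so the statements can be rerun safely.

```shell
# Emit one CREATE DATABASE statement per name, using the same charset and collation as above.
# create_db_sql is a hypothetical helper name used only for this illustration.
create_db_sql() {
    for db in "$@"; do
        echo "CREATE DATABASE IF NOT EXISTS $db DEFAULT CHARSET utf8 COLLATE utf8_general_ci;"
    done
}

# Usage on master (mysql prompts for the root password):
#   create_db_sql hive hue amon oozie | mysql -uroot -p
```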

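  • Hue cannot connect to its database

The Hue database connection test fails with:
Unexpected error. Unable to verify database connection

The log shows: ImportError: libxslt.so.1: cannot open shared object file: No such file or directory

The cause is that CentOS is missing some libraries; installing them resolves it:

yum install krb5-devel cyrus-sasl-gssapi cyrus-sasl-devel libxml2-devel libxslt-devel mysql mysql-devel openldap-devel python-devel python-simplejson sqlite-devel

  • Once the connection test succeeds, click Continue to proceed with the installation.

Important: all MySQL databases here live on the master node, so it is recommended to install Hive, Hue, Activity Monitor, and Oozie Server on the master node as well.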

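4. Initial configuration of each component service

  • HDFS, DataNode, NameNode, and related settings

  • Hive data warehouse and Impala settings

  • Host/Service Monitor and Oozie settings

  • Sqoop settings

  • NodeManager and Zookeeper settings

5. The components are then installed automatically; wait until every component has been installed successfully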

 
