ZooKeeper 3.6.0 Cluster Deployment Guide

A work in progress, improved bit by bit…

Download the package

https://zookeeper.apache.org/releases.html

cd /tmp && \
wget https://downloads.apache.org/zookeeper/zookeeper-3.6.0/apache-zookeeper-3.6.0-bin.tar.gz
tar -zxvf apache-zookeeper-3.6.0-bin.tar.gz -C /usr/local/
ln -s /usr/local/apache-zookeeper-3.6.0-bin /usr/local/zookeeper

Configure environment variables

vi ~/.bashrc
export ZOOKEEPER_HOME=/usr/local/zookeeper
export PATH=$PATH:$ZOOKEEPER_HOME/bin

Edit the configuration files

zoo.cfg

cat zoo.cfg
# Copied from the ClickHouse documentation
# https://clickhouse.tech/docs/en/operations/tips/#zookeeper
tickTime=2000
initLimit=30000
syncLimit=10
maxClientCnxns=2000
maxSessionTimeout=60000000
autopurge.snapRetainCount=10
autopurge.purgeInterval=1
preAllocSize=131072
snapCount=3000000
leaderServes=yes
standaloneEnabled=false
reconfigEnabled=true
4lw.commands.whitelist=*
dataDir=/data/zookeeper/test/data
dataLogDir=/data/zookeeper/test/logs
dynamicConfigFile=/usr/local/apache-zookeeper-3.6.0-bin/conf/zoo_replicated1.cfg.dynamic

dynamicConfigFile

vim /usr/local/apache-zookeeper-3.6.0-bin/conf/zoo_replicated1.cfg.dynamic

The file is identical on all three nodes.

#cat zoo_replicated1.cfg.dynamic
server.1=172.16.24.2:2888:3888:participant;2181
server.2=172.16.24.13:2888:3888:participant;2181
server.3=172.16.24.109:2888:3888:participant;2181

Note: do not use 0.0.0.0 here, otherwise you run into a bug.

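Port used for client connections
clientPort: 2181
TCP port used for communication between nodes
peerPort: 2888
TCP port used for leader election
leaderPort: 3888

participant means the server is a voting member of the ensemble (the other possible role is observer).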
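
Each line in the dynamic file follows the pattern server.<id>=<host>:<peerPort>:<leaderPort>:<role>;<clientPort>. As a rough sketch, the helper below prints such lines for a list of IPs. Note that gen_dynamic_cfg is a hypothetical name of my own, not something shipped with ZooKeeper, and it assumes that ids are handed out as 1, 2, 3, ... in argument order and that every node uses the ports from this document.

```shell
# Hypothetical helper (not part of ZooKeeper): print one server.N line per IP.
# Assumption: ids are assigned 1, 2, 3, ... in argument order, and all nodes
# use the same ports as in this document (2888, 3888, 2181).
gen_dynamic_cfg() {
    local id=1
    local ip
    for ip in "$@"; do
        printf 'server.%d=%s:2888:3888:participant;2181\n' "$id" "$ip"
        id=$((id + 1))
    done
}

# Example: the three nodes used in this document.
gen_dynamic_cfg 172.16.24.2 172.16.24.13 172.16.24.109
```

Redirecting the output into the file named by dynamicConfigFile on each node keeps the three copies identical.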

myid

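# myid must be written into the dataDir set in zoo.cfg; adjust the path below if your dataDir differs
# on master (server.1)
echo "1" > /data/zookeeper/ch_9000/data/myid
# on slave1 (server.2)
echo "2" > /data/zookeeper/ch_9000/data/myid
# on slave2 (server.3)
echo "3" > /data/zookeeper/ch_9000/data/myid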
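
Since the id in myid has to match the server.N entry for that machine, it can be looked up from the dynamic file instead of being typed by hand. This is only a sketch: myid_for_ip is a hypothetical helper of mine, and it assumes the server lines use plain IPv4 addresses as in this document.

```shell
# Hypothetical helper: print the id of the server.N line whose host is the
# given IP. Returns non-zero if the IP is not listed.
# Assumption: hosts are plain IPv4 addresses, as in this document.
myid_for_ip() {
    local cfg="$1"
    local ip="$2"
    local esc id
    # Escape the dots so they match literally in the sed pattern.
    esc=$(printf '%s' "$ip" | sed 's/\./\\./g')
    id=$(sed -n "s/^server\.\([0-9][0-9]*\)=${esc}:.*/\1/p" "$cfg")
    [ -n "$id" ] && printf '%s\n' "$id"
}

# Usage on a node (paths are the ones from this document, adjust as needed):
#   myid_for_ip /usr/local/zookeeper/conf/zoo_replicated1.cfg.dynamic 172.16.24.13 \
#       > /data/zookeeper/ch_9000/data/myid
```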

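Configure rolling log output for ZooKeeper

See bin/zkEnv.sh for the details.

By default ZooKeeper writes its log to a single file and never cleans it up, so after a while the log becomes very large!
Here we configure rolling log output with a per-file size limit and a cap on the number of files kept (256 MB and 20 files in the log4j.properties shown below).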

  1. zookeeper-env.sh
    Create a zookeeper-env.sh file in the ./conf directory and set its permissions with sudo chmod 755 zookeeper-env.sh

    #cat conf/zookeeper-env.sh
    #!/usr/bin/env bash
    # Tip: custom configuration file; do not modify zkEnv.sh itself
    # Change the log directory and switch to rolling-file output
    ZOO_LOG_DIR="/usr/local/zookeeper/logs"
    ZOO_LOG4J_PROP="INFO,ROLLINGFILE"
  2. log4j.properties: change how the log is written

    zookeeper.root.logger=INFO, ROLLINGFILE
    #zookeeper.root.logger=INFO, CONSOLE
    zookeeper.console.threshold=INFO
    zookeeper.log.dir=/usr/local/zookeeper/logs
    zookeeper.log.file=zookeeper.log
    zookeeper.log.threshold=INFO
    # If anything needs tuning, change these two values
    zookeeper.log.maxfilesize=256MB
    zookeeper.log.maxbackupindex=20
  3. Create the log directory
    mkdir -p /usr/local/zookeeper/logs

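Configure the JVM that runs ZooKeeper

See bin/zkEnv.sh for the details.

Create a java.env file in the ./conf directory and set its permissions with sudo chmod 755 java.env. It is mainly used to configure things like GC logging and memory.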

#!/usr/bin/env bash
# Configure the JVM parameters sensibly
# Note: this file is sourced by zkEnv.sh, so export is not needed
# Set the Java classpath
#CLASSPATH=""
# Set the JVM startup parameters; the JVMFLAGS variable can also be set
SERVER_JVMFLAGS="-Xms1024m -Xmx2048m $JVMFLAGS"

Start the ZooKeeper service (on all nodes)

# zkServer.sh start
ZooKeeper JMX enabled by default
Using config: /usr/local/zookeeper/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
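
Once every node is started, zkServer.sh status prints a "Mode:" line, and a healthy three-node ensemble should show one leader and two followers. A small sketch for pulling out just that value (zk_mode is a hypothetical helper name, not a ZooKeeper command):

```shell
# Hypothetical helper: read `zkServer.sh status` output on stdin and print
# only the value of its "Mode:" line (for example leader or follower).
zk_mode() {
    sed -n 's/^Mode: *//p'
}

# Usage on each node:
#   zkServer.sh status 2>&1 | zk_mode
```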

Problems encountered

Problem 1

Verifying the ZooKeeper service with the stat command returned an error

#telnet 127.0.0.1 2181
Trying 127.0.0.1...
Connected to 127.0.0.1.
Escape character is '^]'.
stat
stat is not executed because it is not in the whitelist.
Connection closed by foreign host.

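This is where it went wrong. Since 3.5.3 there is a new parameter, 4lw.commands.whitelist, and four-letter commands that are not on the whitelist are rejected. Setting 4lw.commands.whitelist=* in zoo.cfg enables all of them.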
https://zookeeper.apache.org/doc/r3.6.0/zookeeperAdmin.html
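
To check ahead of time whether a four-letter command will be accepted, the whitelist line in zoo.cfg can be inspected. The following is only a sketch: is_4lw_allowed is a hypothetical helper of mine, and it assumes that a missing whitelist entry means only srvr is allowed, which is how I read the default described in the admin guide linked above.

```shell
# Hypothetical helper: succeed if four-letter command $2 is allowed by the
# 4lw.commands.whitelist entry in config file $1.
is_4lw_allowed() {
    local cfg="$1"
    local cmd="$2"
    local list
    # If the key appears more than once, use the last occurrence.
    list=$(sed -n 's/^4lw\.commands\.whitelist *= *//p' "$cfg" | tail -n 1)
    # Assumption: with no whitelist entry, only srvr is enabled.
    [ -n "$list" ] || list=srvr
    # Wrap the comma-separated list in commas so each entry can be matched whole.
    case ",$(printf '%s' "$list" | tr -d ' ')," in
        *",*,"* | *",$cmd,"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Usage:
#   is_4lw_allowed /usr/local/zookeeper/conf/zoo.cfg stat && echo "stat is allowed"
```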

Problem 2

Previously I had configured the dynamicConfigFile like this:

On each of the three nodes, the node's own entry used 0.0.0.0

On node 1:
#cat zoo_ch_9000.cfg.dynamic
server.1=0.0.0.0:2888:3888:participant;0.0.0.0:2181
server.2=172.16.24.13:2888:3888:participant;0.0.0.0:2181
server.3=172.16.24.109:2888:3888:participant;0.0.0.0:2181
On node 2:
#cat zoo_ch_9000.cfg.dynamic
server.1=172.16.24.2:2888:3888:participant;0.0.0.0:2181
server.2=0.0.0.0:2888:3888:participant;0.0.0.0:2181
server.3=172.16.24.109:2888:3888:participant;0.0.0.0:2181
On node 3:
#cat zoo_ch_9000.cfg.dynamic
server.1=172.16.24.2:2888:3888:participant;0.0.0.0:2181
server.2=172.16.24.13:2888:3888:participant;0.0.0.0:2181
server.3=0.0.0.0:2888:3888:participant;0.0.0.0:2181

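With this setup the cluster works right after installation, but once the node with myid=1 went down and was restarted, it could never rejoin the cluster.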

2020-04-15 16:03:24,420 [myid:1] - INFO [WorkerSender[myid=1]:QuorumCnxManager@462] - Have smaller server identifier, so dropping the connection: (3, 1)
2020-04-15 16:03:24,622 [myid:1] - INFO [QuorumPeer[myid=1](plain=0.0.0.0:2181)(secure=disabled):QuorumCnxManager@462] - Have smaller server identifier, so dropping the connection: (2, 1)
2020-04-15 16:03:24,623 [myid:1] - INFO [QuorumPeer[myid=1](plain=0.0.0.0:2181)(secure=disabled):QuorumCnxManager@462] - Have smaller server identifier, so dropping the connection: (3, 1)
2020-04-15 16:03:24,623 [myid:1] - INFO [QuorumPeer[myid=1](plain=0.0.0.0:2181)(secure=disabled):FastLeaderElection@966] - Notification time out: 400
2020-04-15 16:03:25,024 [myid:1] - INFO [QuorumPeer[myid=1](plain=0.0.0.0:2181)(secure=disabled):QuorumCnxManager@462] - Have smaller server identifier, so dropping the connection: (2, 1)
2020-04-15 16:03:25,025 [myid:1] - INFO [QuorumPeer[myid=1](plain=0.0.0.0:2181)(secure=disabled):QuorumCnxManager@462] - Have smaller server identifier, so dropping the connection: (3, 1)
2020-04-15 16:03:25,025 [myid:1] - INFO [QuorumPeer[myid=1](plain=0.0.0.0:2181)(secure=disabled):FastLeaderElection@966] - Notification time out: 800

This looks like a bug:

https://issues.apache.org/jira/browse/ZOOKEEPER-2938

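Starting 3.6.0 with the configuration from 3.4.14 still shows the problem, which suggests it may not be a configuration issue.
After testing repeatedly, the culprit is 0.0.0.0. In fact, I had never used 0.0.0.0 before joining Yunzhanghu (云账户). The servers at Mafengwo (马蜂窝), where I worked before, were also dual-NIC, and neither the ops team's Kafka nor the big data team's Kafka was configured with 0.0.0.0. I only saw this usage after coming here, and I inherited the configuration on the principle of "there may be pitfalls, so stay consistent with production". I still do not really understand this configuration, although a Baidu search suggested that on ECS or Docker things break if it is not set up this way.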
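
To avoid stepping on this again, the dynamic file can be checked for 0.0.0.0 in the quorum address before starting a node. A minimal sketch follows; check_no_wildcard_peer is a hypothetical helper of mine, and it only looks at the host right after "=", not at the client address after ";".

```shell
# Hypothetical check: print every server.N line whose quorum (peer) host,
# i.e. the host right after "=", is 0.0.0.0.
# Exit status: 0 if no such line exists, 1 if at least one was found,
# 2 if the file cannot be read.
check_no_wildcard_peer() {
    [ -r "$1" ] || return 2
    if grep -E '^server\.[0-9]+=0\.0\.0\.0:' "$1"; then
        return 1
    fi
    return 0
}

# Usage:
#   check_no_wildcard_peer /usr/local/zookeeper/conf/zoo_replicated1.cfg.dynamic \
#       || echo "replace 0.0.0.0 with the node's real IP"
```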


Copyright © 2013 - 2020 Fan() All Rights Reserved.
