Galera Cluster stores write-sets in a special cache called the Write-set Cache, or GCache. GCache is a memory allocator for write-sets. Its primary purpose is to minimize the write-set footprint in RAM, which Galera Cluster achieves by offloading write-set storage to disk.
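GCache uses three types of storage:
1. **Permanent In-Memory Store:** write-sets are allocated using the operating system's default memory allocator. This is useful on systems that have spare RAM. The store has a hard size limit.
2. **Permanent Ring-Buffer File:** write-sets are pre-allocated on disk during cache initialization. This is intended as the main write-set store.
3. **On-Demand Page Store:** write-sets are allocated to memory-mapped page files at runtime, as needed.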
Galera Cluster uses an allocation algorithm that attempts to store write-sets in the above order. That is, it first tries the permanent in-memory store. If there is not enough space for the write-set there, it tries the permanent ring-buffer file. If that also fails, it falls back to the page store, which always succeeds unless the write-set is larger than the available disk space.
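The fallback order described above can be sketched as a toy model. This is not Galera's implementation: the `BoundedStore` class, the `allocate` function, and the simple capacity counters are hypothetical, and the real ring buffer reuses released space rather than only counting upward.

```python
from dataclasses import dataclass


@dataclass
class BoundedStore:
    """Toy store with a hard size limit (hypothetical, for illustration)."""
    name: str
    capacity: int
    used: int = 0

    def try_store(self, size: int) -> bool:
        # Refuse the write-set if it would exceed the hard limit.
        if self.used + size > self.capacity:
            return False
        self.used += size
        return True


def allocate(size: int, mem: BoundedStore, ring: BoundedStore, free_disk: int) -> str:
    """Return the name of the store that accepted a write-set of `size` bytes."""
    # 1. permanent in-memory store, 2. permanent ring-buffer file
    for store in (mem, ring):
        if store.try_store(size):
            return store.name
    # 3. on-demand page store: only fails when the disk cannot hold the write-set
    if size > free_disk:
        raise MemoryError("write-set larger than available disk space")
    return "page"


mem = BoundedStore("mem", capacity=10)
ring = BoundedStore("ring", capacity=20)
placements = [allocate(s, mem, ring, free_disk=1000) for s in (8, 8, 15)]
print(placements)
```

The first write-set fits in memory, the second no longer does and lands in the ring buffer, and the third overflows the ring buffer and goes to the page store.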
Reconfiguring gcache on a potential DONOR node ahead of the downtime requires shutting that node down, because gcache cannot be resized dynamically. Restoring it to its original size requires another shutdown. So that makes “three shutdowns” for a single downtime. *No way... not acceptable with busy production clusters and the possibility of more errors.*
1. **-1 (default):** no freeze; the purge operates as normal.
2. **x (should be a valid seqno in gcache):** freeze purge of write-sets >= x. The best way to select x is to use the wsrep_last_applied value from the node that you plan to shut down as an indicator. (Use wsrep_last_applied * 0.90. Retaining this extra 10% tricks the [safety gap heuristic algorithm of IST](https://www.percona.com/blog/2017/11/15/understanding-ist-donor-selected/).)
3. **now:** freeze purge of write-sets >= smallest seqno currently in gcache. Instant freeze of gcache-purge. (If tracing x (above) is difficult, simply use “now” and you are good).
Set this on an existing node of the cluster (that will continue to be part of the cluster and can act as potential DONOR). This node continues to retain the write-sets, thereby allowing the restarting node to rejoin using IST. (You can feed the said node as a preferred DONOR through wsrep_sst_donor while restarting the said rejoining node.)
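As a rough sketch of the procedure above, the helper below derives x from `wsrep_last_applied` and builds the statement to run on the DONOR candidate. The text above does not name the option, so `gcache.freeze_purge_at_seqno` is an assumption (it is the name Percona XtraDB Cluster uses for this feature); both function names are hypothetical.

```python
def freeze_seqno(last_applied: int, retain_percent: int = 10) -> int:
    """Pick x as wsrep_last_applied minus a safety margin (10% by default)."""
    # Integer arithmetic avoids float rounding on large sequence numbers.
    return last_applied * (100 - retain_percent) // 100


def freeze_statement(value) -> str:
    """Build the SQL to run on the node that should keep its write-sets.

    `value` may be a seqno, the string "now", or -1 to unfreeze.
    Assumption: the provider option is gcache.freeze_purge_at_seqno.
    """
    return (
        "SET GLOBAL wsrep_provider_options = "
        f"'gcache.freeze_purge_at_seqno={value}';"
    )


# wsrep_last_applied as read from the node that is about to be shut down
x = freeze_seqno(1000)
print(freeze_statement(x))
print(freeze_statement("now"))
print(freeze_statement(-1))  # release the freeze once the node has rejoined
```

Remember to set the value back to -1 after the rejoin, otherwise the DONOR keeps accumulating write-sets.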
tar -zxvf mysql-5.5.59-linux-glibc2.12-x86_64.tar.gz -C /usr/local/mysql-5.5.59/
tar -zxvf mysql-5.6.39-linux-glibc2.12-x86_64.tar.gz -C /usr/local/mysql-5.6.39/
tar -zxvf mysql-5.7.21-linux-glibc2.12-x86_64.tar.gz -C /usr/local/mysql-5.7.21/
Coordinator stopped because there were error(s) in the worker(s). The most recent failure being: Worker 1 failed executing transaction 'ANONYMOUS' at master log mysql-bin.000004, end_log_pos 812. See error log and/or performance_schema.replication_applier_status_by_worker table for more details about this failure or others, if any.
Check the error log:
2018-02-10T19:52:52.347979+08:00 3 [Warning] Slave I/O for channel '': Notifying master by SET @master_binlog_checksum= @@global.binlog_checksum failed with error: Unknown system variable 'binlog_checksum', Error_code: 1193
2018-02-10T19:52:52.348080+08:00 3 [Warning] Slave I/O for channel '': Unknown system variable 'SERVER_UUID' on master. A probable cause is that the variable is not supported on the master (version: 5.5.59-log), even though it is on the slave (version: 5.7.21-log), Error_code: 1193
2018-02-10T19:52:52.445947+08:00 5 [ERROR] Slave SQL for channel '': Worker 1 failed executing transaction 'ANONYMOUS' at master log , end_log_pos 2651; Error 'You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'GET, POST, FILE, CLASS, METHOD ) VALUES ( 'maokaixin', 1518159802, 3395959414, '' at line 1' on query. Default database: 'fandb'. Query: 'INSERT INTO postlog ( USERNAME, TIME, IP, GET, POST, FILE, CLASS, METHOD ) VALUES ( 'maokaixin', 1518159802, 3395959414, 'gameid:0;', '', '', 'api', 'ajaxGetServers' )', Error_code: 1064
2018-02-10T19:52:52.446217+08:00 4 [Warning] Slave SQL for channel '': ... The slave coordinator and worker threads are stopped, possibly leaving data in inconsistent state. A restart should restore consistency automatically, although using non-transactional storage for data or info tables or DDL queries could lead to problems. In such cases you have to examine your data (see documentation for details). Error_code: 1756
2018-02-10T19:52:52.446235+08:00 4 [Note] Slave SQL thread for channel '' exiting, replication stopped in log 'mysql-bin.000003' at position 2313
2018-02-10T19:53:15.708058+08:00 7 [Note] Slave SQL thread for channel '' initialized, starting replication in log 'mysql-bin.000003' at position 2313, relay log './mysql-relay.000009' position: 304
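Find the cause
The first two warnings appear because the master has neither the binlog_checksum variable nor the SERVER_UUID variable (apparently the slave queries the master for these two variables when it starts replicating). The ERROR that follows, surprisingly, reports a syntax error: "error in your SQL syntax".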
After parsing the binlog, the SQL statement turns out to be:
INSERT INTO postlog ( USERNAME, TIME, IP, GET, POST, FILE, CLASS, METHOD ) VALUES( 'maokaixin', 1518159802, 3395959414, 'gameid:0;', '', '', 'api', 'ajaxGetServers' );
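The parser stops right at `GET`, and the most likely explanation is that `GET` became a reserved word in MySQL 5.6.4 (for `GET DIAGNOSTICS`), so a statement that is valid on the 5.5 master is rejected by the 5.7 slave. One way to avoid this class of problem is to back-quote such column names when the statement is built. The sketch below assumes that diagnosis; the `RESERVED` set is a tiny illustrative subset, and `quote_ident` and `build_insert` are hypothetical helpers, not part of any MySQL client library.

```python
# Illustrative subset only; the full list is in the MySQL reference manual.
RESERVED = {"GET"}


def quote_ident(name: str) -> str:
    """Back-quote an identifier if it collides with a reserved word."""
    if name.upper() in RESERVED:
        # Double any embedded backtick, as MySQL identifier quoting requires.
        return "`" + name.replace("`", "``") + "`"
    return name


def build_insert(table: str, columns: list) -> str:
    """Build a parameterized INSERT with reserved column names quoted."""
    cols = ", ".join(quote_ident(c) for c in columns)
    placeholders = ", ".join(["%s"] * len(columns))
    return f"INSERT INTO {quote_ident(table)} ({cols}) VALUES ({placeholders})"


columns = ["USERNAME", "TIME", "IP", "GET", "POST", "FILE", "CLASS", "METHOD"]
sql = build_insert("postlog", columns)
print(sql)
```

Quoting every identifier unconditionally would also work and removes the need to maintain a reserved-word list.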
[root@OA_P 13:04:15 /etc/ansible/roles]
#ansible OA* -m yum -a"name=mutt state=present"
OA_P | FAILED! => {
    "changed": false,
    "msg": "python2 bindings for rpm are needed for this module. python2 yum module is needed for this module"
}
OA_S | FAILED! => {
    "changed": false,
    "msg": "python2 bindings for rpm are needed for this module. python2 yum module is needed for this module"
}
^C
[root@OA_P 13:04:28 /etc/ansible/roles]
#ansible OA* -m yum -a"name=mutt state=present" -e "ansible_python_interpreter=/usr/bin/python"
OA_S | SUCCESS => {
    "changed": false,
    "msg": "",
    "rc": 0,
    "results": [
        "5:mutt-1.5.20-8.20091214hg736b6a.el6.x86_64 providing mutt is already installed"
    ]
}
OA_P | FAILED! => {
    "changed": false,
    "msg": "The following packages have pending transactions: mutt-x86_64",
    "rc": 125,
    "results": []
}
This is because Python 2.6 can import yum while Python 2.7 cannot:
#python
Python 2.7.14 (default, Dec 15 2017, 23:08:56)
[GCC 4.4.7 20120313 (Red Hat 4.4.7-18)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import yum
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ImportError: No module named yum

#python2.6
Python 2.6.6 (r266:84292, Aug 18 2016, 15:13:37)
[GCC 4.4.7 20120313 (Red Hat 4.4.7-17)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import yum
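The manual check above can be scripted. The sketch below probes a list of candidate interpreters and returns the first one that can import a given module; the result is what you would pass as `ansible_python_interpreter`. The function names are hypothetical, and the script is meant to run under a modern Python 3 on the control side while probing whatever interpreters you list.

```python
import subprocess
import sys


def can_import(interpreter: str, module: str) -> bool:
    """Return True if `interpreter` exists and can import `module`."""
    try:
        result = subprocess.run(
            [interpreter, "-c", f"import {module}"],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except OSError:
        # The interpreter path does not exist or is not executable.
        return False
    return result.returncode == 0


def first_interpreter_with(module: str, candidates):
    """Return the first candidate that can import `module`, or None."""
    for interpreter in candidates:
        if can_import(interpreter, module):
            return interpreter
    return None


# Demonstration with the interpreter running this script; on the hosts above
# you would pass e.g. ["/usr/bin/python", "/usr/bin/python2.6"] and "yum".
print(first_interpreter_with("json", ["/nonexistent/python", sys.executable]))
```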
Unused lines are blanked. For instance, if you used to have multiple save directives but the current configuration has fewer, or none because you disabled RDB persistence (for example with config set save ''), all the unused save lines are blanked.
[root@cn_mu_binlog_backup ~]# redis-cli -p 6379 config get save
1) "save"
2) ""
[root@cn_mu_binlog_backup ~]# cat /usr/local/redis/etc/6379_redis.conf|grep save
# save <seconds> <changes>
# Will save the DB if both the given number of seconds and the given
# In the example below the behaviour will be to save:
# Note: you can disable saving completely by commenting out all "save" lines.
# It is also possible to remove all the previously configured save
# points by adding a save directive with a single empty string argument
# save ""
save 900 1
save 300 10
save 60 10000
...
Rewrite the configuration file:
[root@cn_mu_binlog_backup ~]# redis-cli -p 6379 config rewrite
OK
[root@cn_mu_binlog_backup ~]# cat /usr/local/redis/etc/6379_redis.conf|grep save
# save <seconds> <changes>
# Will save the DB if both the given number of seconds and the given
# In the example below the behaviour will be to save:
# Note: you can disable saving completely by commenting out all "save" lines.
# It is also possible to remove all the previously configured save
# points by adding a save directive with a single empty string argument
# save ""
# (at least one save point) and the latest background save failed.
stop-writes-on-bgsave-error yes
# If you want to save some CPU in the saving child set it to 'no' but
# algorithms (in order to save memory), so you can tune it for speed or
# the configured save points).
# saving process (a background save or AOF log background rewriting) is
# Lists are also encoded in a special way to save a lot of space.
# order to save a lot of space. This encoding is only used when the length and
In order to make sure the redis.conf file is always consistent, that is, on errors or crashes you always end with the old file, or the new one, the rewrite is performed with a single write(2) call that has enough content to be at least as big as the old file. Sometimes additional padding in the form of comments is added in order to make sure the resulting file is big enough, and later the file gets truncated to remove the padding at the end.
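The pad, write, truncate sequence described above can be sketched as follows. This only illustrates the idea and is not Redis's actual code: the function name `rewrite_in_place` and the use of `#` characters as comment padding are assumptions.

```python
import os


def rewrite_in_place(path: str, new_content: bytes) -> None:
    """Overwrite `path` with one write call, padded to cover the old file."""
    old_size = os.path.getsize(path) if os.path.exists(path) else 0
    payload = new_content
    if len(new_content) < old_size:
        # Pad with comment characters so the single write is at least as
        # large as the old file and no stale bytes of it remain visible.
        payload += b"#" * (old_size - len(new_content))

    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        written = os.write(fd, payload)  # the single write(2) call
        if written != len(payload):
            raise OSError("short write, file may be inconsistent")
        # Drop the padding so only the new content is left.
        os.ftruncate(fd, len(new_content))
    finally:
        os.close(fd)
```

If the process dies before the write, the old file is intact. If it dies after the write but before the truncate, the file holds the new content followed by comment padding, which is still a valid configuration.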