Want IST Not SST for Node Rejoins? We Have a Solution!

2018-06-02

Krunal Bauskar | February 13, 2018 | Posted In: High-availability, MySQL, Percona XtraDB Cluster

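What if we told you that there is a sure way to get IST rather than SST for node rejoins? You can guarantee that a returning node rejoins using IST. Sound interesting? Keep reading.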

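Normally, when a node is taken out of the cluster for a while (for maintenance, or simply shut down), the gcache on the other nodes of the cluster supplies the write-set(s) that the node missed once it is re-introduced to the cluster. This works if you have configured a large gcache, or if the downtime is short enough. Neither option is a good fit for a production cluster: you may not want to size gcache generously all the time, and you cannot always shorten the maintenance window.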

Reconfiguring gcache on a potential DONOR node before the downtime requires shutting that node down (gcache cannot be resized dynamically). Restoring it to its original size needs another shutdown. So that makes "three shutdowns" for a single downtime. *No way ... that is not acceptable for busy production clusters, and it opens the door to more errors.*

Introducing “gcache.freeze_purge_at_seqno”

Given this pain point, we introduced gcache.freeze_purge_at_seqno in Percona XtraDB Cluster 5.7.20. It controls the purging of gcache, so that more data is retained to facilitate IST when the node rejoins.

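Every transaction in the Galera cluster world is assigned a unique global sequence number (seqno). Events are tracked using this seqno (for example wsrep_last_applied, wsrep_last_committed, wsrep_replicated, wsrep_local_cached_downto, etc.). wsrep_local_cached_downto is the sequence number up to which gcache has been purged. Say wsrep_local_cached_downto = N: then gcache holds data for [N, wsrep_replicated] and has purged the data for [1, N).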
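To make the interval arithmetic concrete, here is a minimal Python sketch. It uses the notation above (gcache holds [N, wsrep_replicated]) and assumes a joiner needs every write-set after its own last applied seqno. The function names are hypothetical, and the check deliberately ignores the safety gap heuristic that the real donor selection applies.

```python
def gcache_range(cached_downto: int, replicated: int) -> range:
    """Seqnos still held in gcache: [cached_downto, replicated]."""
    return range(cached_downto, replicated + 1)


def can_rejoin_with_ist(cached_downto: int, replicated: int,
                        joiner_last_applied: int) -> bool:
    """True if the donor still holds the first write-set the joiner misses.

    Simplified model: no safety gap is taken into account.
    """
    first_missing = joiner_last_applied + 1
    if first_missing > replicated:
        # The joiner is already caught up, nothing has to be transferred.
        return True
    return first_missing in gcache_range(cached_downto, replicated)


# Donor has purged everything below seqno 500 and replicated up to 1000.
print(can_rejoin_with_ist(500, 1000, joiner_last_applied=700))
print(can_rejoin_with_ist(500, 1000, joiner_last_applied=300))
```

In this model the first call succeeds because write-set 701 is still cached, while the second fails because write-set 301 was purged long ago, which is the situation that forces an SST.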

gcache.freeze_purge_at_seqno takes three values:

1. **-1 (default):** no freeze, the purge operates as normal.
2. **x (should be a valid seqno in gcache):** freeze purge of write-sets >= x. The best way to select x is to take the wsrep_last_applied value from the node that you plan to shut down and use it as an indicator (wsrep_last_applied * 0.90; retaining this extra 10% is meant to trick the [safety gap heuristic algorithm of IST](https://www.percona.com/blog/2017/11/15/understanding-ist-donor-selected/)).
3. **now:** freeze purge of write-sets >= the smallest seqno currently in gcache. This is an instant freeze of the gcache purge. (If tracking x as described above is difficult, simply use "now" and you are good.)
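The choice of x can be sketched in a few lines of Python. This is only an illustration of the 90% rule described in the list: the helper names are hypothetical, and clamping to wsrep_local_cached_downto is an assumption added here so that the result is always a seqno still present in gcache (which is effectively what "now" gives you).

```python
def choose_freeze_seqno(last_applied: int, cached_downto: int) -> int:
    """Pick x as 90% of the leaving node's wsrep_last_applied.

    Integer arithmetic avoids float rounding. If that point has already
    been purged on the donor, fall back to the oldest seqno still cached
    (assumption: equivalent in effect to using "now").
    """
    candidate = last_applied * 90 // 100
    return max(candidate, cached_downto)


def freeze_statement(value) -> str:
    """Build the SET statement for a seqno, -1, or the keyword 'now'."""
    return ('SET GLOBAL wsrep_provider_options='
            f'"gcache.freeze_purge_at_seqno = {value}";')


x = choose_freeze_seqno(last_applied=1000, cached_downto=500)
print(freeze_statement(x))
```

The statement is meant to be run on the node that stays in the cluster and will act as DONOR, not on the node going down.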

Set this on an existing node of the cluster (that will continue to be part of the cluster and can act as potential DONOR). This node continues to retain the write-sets, thereby allowing the restarting node to rejoin using IST. (You can feed the said node as a preferred DONOR through wsrep_sst_donor while restarting the said rejoining node.)

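Remember to set it back to -1 once the node has rejoined. This keeps the DONOR from hogging disk space beyond the maintenance window. On the next purge cycle, all the old retained write-sets are freed as well (reclaiming the space back to its original level).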
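Put together, the procedure is a short ordered checklist. The Python sketch below only assembles that checklist as data, so it can be printed or reviewed before a maintenance window. The function name and the step wording are hypothetical, and the donor name is assumed to be the name by which the cluster knows that node.

```python
def rejoin_plan(donor_name: str) -> list:
    """Return (node, action) steps for one maintenance window, in order."""
    set_opt = ('SET GLOBAL wsrep_provider_options='
               '"gcache.freeze_purge_at_seqno = {}";')
    return [
        # 1. Stop purging on the node that stays in the cluster.
        ("donor", set_opt.format("now")),
        # 2. Take the other node down and do the maintenance work.
        ("joiner", "shut down and perform maintenance"),
        # 3. Bring it back, naming the frozen node as preferred DONOR.
        ("joiner", f"restart with --wsrep_sst_donor={donor_name}"),
        # 4. Once the joiner is back, let the donor purge again.
        ("donor", set_opt.format(-1)),
    ]


for node, action in rejoin_plan("node1"):
    print(f"[{node}] {action}")
```

Keeping the last step in the same list as the first makes it harder to forget to unfreeze the purge after the rejoin.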

Note:

To find out the existing value of gcache.freeze_purge_at_seqno, query wsrep_provider_options:
SELECT @@wsrep_provider_options;

To set gcache.freeze_purge_at_seqno:
SET GLOBAL wsrep_provider_options="gcache.freeze_purge_at_seqno = now";
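wsrep_provider_options comes back as one long string of semicolon-separated "key = value" pairs, so picking out a single option by eye is tedious. A small Python sketch of a parser follows. The function name is hypothetical, the sample string is a shortened, illustrative value rather than real output, and the parser assumes that values contain neither ";" nor a leading "=".

```python
def parse_provider_options(raw: str) -> dict:
    """Split 'key = value; key = value; ' into a dict of strings."""
    options = {}
    for item in raw.split(";"):
        key, sep, value = item.partition("=")
        if sep:  # skip empty fragments such as the trailing one
            options[key.strip()] = value.strip()
    return options


# Illustrative, shortened sample; a real node returns many more options.
sample = ("gcache.freeze_purge_at_seqno = -1; "
          "gcache.page_size = 128M; gcache.size = 128M; ")

options = parse_provider_options(sample)
print(options["gcache.freeze_purge_at_seqno"])  # prints -1
```

In practice the raw string would come from the result of the SELECT above, fetched through whatever MySQL client library you already use.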

Why should you use it?

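- gcache grows dynamically (using the existing page store mechanism) and shrinks once the user sets the option back to -1. This means you only use the extra disk space when it is needed.
- No restart is needed. The user can concentrate on the node that needs maintenance.
- No complex math or understanding of seqno is involved (simply use "now").
- It is less prone to error, as SST is one of the major error-prone areas of the cluster.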

So why wait? Give it a try! It is part of Percona XtraDB Cluster from 5.7.20 onwards, and it helps you get IST, not SST, for node rejoins.

Note: If you need more information about gcache, check here and here