A disaster caused by view merging and skewed data distribution on a hash-join column

Table sizes

SQL> select count(*) from agent.TB_AGENT_INFO;

COUNT(*)
----------
1751

SQL> select count(*) from TB_CHANNEL_INFO ;

COUNT(*)
----------
1807

SQL> select count(*) from TB_USER_CHANNEL;

COUNT(*)
----------
7269

SQL> select count(*) from OSS_USER_STATION;

COUNT(*)
----------
2149

SQL> select count(*) from tb_user_zgy ;

COUNT(*)
----------
43

SQL> select count(*) from act.tb_user_agent_relat;

COUNT(*)
----------
29612

SQL> select count(*) from agent.base_data_user_info ;

COUNT(*)
----------
30005

SQL> select count(*) from agent.base_data_invest_info;

COUNT(*)
----------
3530163

The slow SQL

select a.city,
a.agent_id,
a.username,
a.real_name,
phone,
zgy_name,
login_count,
user_count,
count(distinct b.invest_id) user_invested,
sum(b.order_amount / 100) invest_amount
from (select a.city,
a.agent_id,
a.username,
a.real_name, -- owner name
a.phone, -- owner phone
d.real_name zgy_name, -- assigned account manager
count(distinct case
when c.str_day <= '20160821' then
c.login_name
end) login_count,
count(distinct case
when c.str_day <= '20160821' then
decode(c.status, 1, c.invest_id, null)
end) user_count
from (select agent_id, city, username, real_name, phone
from agent.TB_AGENT_INFO
where agent_id in
(SELECT agent_id
FROM (SELECT distinct *
FROM TB_CHANNEL_INFO t
START WITH t.CHANNEL_ID in
(select CHANNEL_ID
from TB_USER_CHANNEL
where USER_ID = 596)
CONNECT BY PRIOR
t.CHANNEL_ID = t.PARENT_CHANNEL_ID)
WHERE agent_id IS NOT NULL)) a
left join oss_user_station e
on a.agent_id = e.agent_id
and e.user_type = 0
left join tb_user_zgy d
on e.username = d.username
left join act.tb_user_agent_relat c
on a.agent_id = c.agent_id
group by a.city,
a.username,
a.real_name,
a.phone,
d.real_name,
a.agent_id) a
left join (select invest_id, order_amount, agent_id, str_day
from agent.base_data_invest_info
where str_day >= '20150801' and str_day<='20160821') b
on a.agent_id = b.agent_id
group by a.city,
a.agent_id,
a.username,
a.real_name,
a.phone,
a.zgy_name,
a.login_count,
a.user_count
The query can be read as two parts: view a, a join of a pile of small tables, which is then joined with the single big table.
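One way to attack a query of this shape (a sketch of a generic approach, not necessarily the fix this post arrives at) is to keep the optimizer from merging the aggregated small-table view into the outer query, so the roughly 1,700-row result of view a is computed first and the single big table is hash-joined against that. Table and column names below come from the query above; the inner view body is elided:

```sql
-- Sketch: MATERIALIZE forces view a to be evaluated on its own
-- instead of being merged with the outer GROUP BY.
with a as (
  select /*+ materialize */
         city, agent_id, username, real_name, phone,
         zgy_name, login_count, user_count
    from ( -- ... the small-table joins and GROUP BY shown above ...
         )
)
select a.city, a.agent_id, a.username, a.real_name, a.phone,
       a.zgy_name, a.login_count, a.user_count,
       count(distinct b.invest_id) user_invested,
       sum(b.order_amount / 100)   invest_amount
  from a
  left join agent.base_data_invest_info b
    on a.agent_id = b.agent_id
   and b.str_day >= '20150801' and b.str_day <= '20160821'
 group by a.city, a.agent_id, a.username, a.real_name, a.phone,
          a.zgy_name, a.login_count, a.user_count;
```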

Execution plan:
----------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time |
----------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 55M| 6616M| | 3934K (1)| 13:06:52 |
| 1 | HASH GROUP BY | | 55M| 6616M| | 3934K (1)| 13:06:52 |
| 2 | VIEW | VW_DAG_1 | 55M| 6616M| | 3934K (1)| 13:06:52 |
| 3 | HASH GROUP BY | | 55M| 6301M| 7681M| 3934K (1)| 13:06:52 |
| 4 | VIEW | VM_NWVW_0 | 55M| 6301M| | 2456K (1)| 08:11:15 |
| 5 | SORT GROUP BY | | 55M| 10G| 11G| 2456K (1)| 08:11:15 |
|* 6 | HASH JOIN RIGHT OUTER | | 55M| 10G| | 21643 (2)| 00:04:20 |
| 7 | TABLE ACCESS FULL | TB_USER_AGENT_RELAT | 27937 | 1200K| | 102 (0)| 00:00:02 |
|* 8 | HASH JOIN OUTER | | 3374K| 511M| | 21392 (1)| 00:04:17 |
|* 9 | HASH JOIN SEMI | | 1712 | 188K| | 2007 (1)| 00:00:25 |
|* 10 | HASH JOIN RIGHT OUTER | | 1712 | 173K| | 32 (0)| 00:00:01 |
| 11 | TABLE ACCESS FULL | TB_USER_ZGY | 43 | 903 | | 3 (0)| 00:00:01 |
|* 12 | HASH JOIN RIGHT OUTER | | 1712 | 138K| | 29 (0)| 00:00:01 |
|* 13 | TABLE ACCESS FULL | OSS_USER_STATION | 1075 | 25800 | | 6 (0)| 00:00:01 |
| 14 | TABLE ACCESS FULL | TB_AGENT_INFO | 1712 | 98K| | 23 (0)| 00:00:01 |
| 15 | VIEW | VW_NSO_1 | 16271 | 143K| | 1975 (1)| 00:00:24 |
|* 16 | VIEW | | 16271 | 143K| | 1975 (1)| 00:00:24 |
| 17 | HASH UNIQUE | | 16271 | 8882K| 10M| 1975 (1)| 00:00:24 |
|* 18 | CONNECT BY WITHOUT FILTERING (UNIQUE)| | | | | | |
|* 19 | HASH JOIN RIGHT SEMI | | 530 | 146K| | 29 (0)| 00:00:01 |
|* 20 | TABLE ACCESS FULL | TB_USER_CHANNEL | 600 | 7800 | | 7 (0)| 00:00:01 |
| 21 | TABLE ACCESS FULL | TB_CHANNEL_INFO | 1807 | 476K| | 22 (0)| 00:00:01 |
| 22 | TABLE ACCESS FULL | TB_CHANNEL_INFO | 1807 | 476K| | 22 (0)| 00:00:01 |
|* 23 | TABLE ACCESS FULL | BASE_DATA_INVEST_INFO | 3374K| 148M| | 19375 (1)| 00:03:53 |
----------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

6 - access("AGENT_ID"="C"."AGENT_ID"(+))
8 - access("AGENT_ID"="AGENT_ID"(+))
9 - access("AGENT_ID"="AGENT_ID")
10 - access("C"."USERNAME"="D"."USERNAME"(+))
12 - access("AGENT_ID"="C"."AGENT_ID"(+))
13 - filter("C"."USER_TYPE"(+)=0)
16 - filter("AGENT_ID" IS NOT NULL)
18 - access("T"."PARENT_CHANNEL_ID"=PRIOR "T"."CHANNEL_ID")
19 - access("T"."CHANNEL_ID"="CHANNEL_ID")
20 - filter("USER_ID"=596)
23 - filter("STR_DAY"(+)>='20150801' AND "STR_DAY"(+)<='20160821')
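Given the diagnosis in the title — uneven data distribution on the hash-join column — a natural first check (not shown in the post's output above, so treat it as a suggested probe) is how the big table's rows spread across agent_id:

```sql
-- If a few agent_id values own most of the 3.5M rows, the outer hash
-- join fans them out against the grouped view and the intermediate
-- result balloons toward the 55M-row estimate in the plan.
select *
  from (select agent_id, count(*) cnt
          from agent.base_data_invest_info
         where str_day >= '20150801' and str_day <= '20160821'
         group by agent_id
         order by cnt desc)
 where rownum <= 10;
```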

To catch the slow-running SQL, look at the wait events of the statements currently executing:

select (sysdate-a.logon_time)*24*60 minutes,
a.username,
a.BLOCKING_INSTANCE,
a.BLOCKING_SESSION,
a.program,
a.machine,
a.osuser,
a.status,
a.sid,
a.serial#,
a.event,
a.p1,
a.p2,
a.p3,
a.sql_id,
a.sql_child_number,
b.sql_text
from v$session a, v$sql b
where
a.sql_address = b.address
and a.sql_hash_value = b.hash_value
and a.sql_child_number=b.child_number
and a.event not in ('SQL*Net message from client','Space Manager: slave idle wait')
-- and a.username like '%USERNAME%'
order by 1 desc;

Then check its execution plan:

Select * from table(dbms_xplan.display_cursor('91prd0dh0bs8d',0,'ALL'));

Hexo Blog Configuration

Based on the hexo-theme-hiker theme.

First, the complete _config.yml under myblog:

# Hexo Configuration
## Docs: https://hexo.io/docs/configuration.html
## Source: https://github.com/hexojs/hexo/

# Site
title: Fan()
subtitle:
description:
author: a dba's blog
language: zh-CN
timezone: Asia/Shanghai

# URL
## If your site is put in a subdirectory, set url as 'http://yoursite.com/child' and root as '/child/'
url: http://yoursite.com
root: /Fandb.github.io
permalink: :year/:month/:day/:title/
permalink_defaults:

# Directory
source_dir: source
public_dir: public
tag_dir: tags
archive_dir: archives
category_dir: categories
code_dir: downloads/code
i18n_dir: :lang
skip_render:

# Writing
new_post_name: :title.md # File name of new posts
default_layout: post
titlecase: false # Transform title into titlecase
external_link: true # Open external links in new tab
filename_case: 0
render_drafts: false
post_asset_folder: false
relative_link: false
future: true
highlight:
  enable: true
  line_number: true
  auto_detect: true
  tab_replace:

# Home page setting
# path: Root path for your blogs index page. (default = '')
# per_page: Posts displayed per page. (0 = disable pagination)
# order_by: Posts order. (Order by date descending by default)
index_generator:
  path: ''
  per_page: 6
  order_by: -date

# Category & Tag
default_category: uncategorized
category_map:
tag_map:

# Date / Time format
## Hexo uses Moment.js to parse and display date
## You can customize the date format as defined in
## http://momentjs.com/docs/#/displaying/format/
date_format: YYYY-MM-DD
time_format: HH:mm:ss

# Pagination
## Set per_page to 0 to disable pagination
per_page: 10
pagination_dir: page

# Extensions
## Plugins: https://hexo.io/plugins/
## Themes: https://hexo.io/themes/
theme: hiker

# Deployment
## Docs: https://hexo.io/docs/deployment.html
deploy:
  type: git
  repository:
    github: https://github.com/Fanduzi/Fandb.github.io.git
    coding: https://git.coding.net/Fandb/Fandb.blog.git
  branch: master

Note

Every colon must be followed by a space.
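A minimal illustration:

```yaml
language: zh-CN    # OK — colon followed by a space
language:zh-CN     # broken — parsed as a single scalar, the value is lost
```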

Setting Up a Personal Blog Quickly with Hexo + GitHub

Install dependencies

Install git

brew install git   # Homebrew refuses to run under sudo

Install Node.js

On a Mac the simplest approach is to download and run the pkg installer directly. The latest releases are available at the address below; pick the file ending in .pkg:
https://nodejs.org/download/release/latest/
After installing, update the environment variables:

vi ~/.bash_profile
# add:
export PATH=/usr/local/bin:$PATH

Switch the npm registry to the Taobao mirror

npm config set registry http://registry.npm.taobao.org/

Install Hexo

Before installing, a few commonly used hexo commands:

$ hexo g #short for hexo generate: generate the static files
$ hexo s #short for hexo server: start a server, mainly for local preview
$ hexo d #short for hexo deploy: publish the local files to GitHub
$ hexo n #short for hexo new: create a new post

Install with npm:
sudo npm install -g hexo

If that errors out, try:
npm install --unsafe-perm -g hexo

Percona Monitoring and Management Architecture

PMM is based on a simple client-server model that lets it scale efficiently. It consists of the following modules:

  • PMM Client, installed on every database server you want monitored. It collects server metrics and query analytics data to provide a complete performance overview; the collected data is sent to the PMM Server.
  • PMM Server, the central part of PMM. It aggregates the collected data and presents it in a web interface as tables, dashboards, and graphs.

The modules are packaged so that installation and use are simple, with no need to care about the internals. To exploit PMM's full potential, though, the internal structure does matter.

PMM is a collection of tools designed to work together seamlessly. Some are developed by Percona, some are third-party open-source tools.

Pitfalls Encountered While Deploying PMM

OS: CentOS release 6.4 (Final)
Kernel: 2.6.32-358.el6.x86_64

1. The image can't be pulled from inside the company network

Pull it in my own environment, then save the image:

[root@slave oracle]# docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
docker.io/percona/pmm-server 1.2.0 eb82a0e154c8 2 weeks ago 1.266 GB
docker.io/percona/pmm-server latest eb82a0e154c8 2 weeks ago 1.266 GB
[root@slave oracle]# docker save eb82a0e154c8 > pmm-server.tar

[root@slave oracle]# scp pmm-server.tar 10.4.2.43:~/
root@10.4.2.43's password:
pmm-server.tar 100% 1232MB 902.5KB/s 23:18

Load it on the original environment:

[root@test2 ~]# docker load < pmm-server.tar 

The repository and tag come out as:

[root@test2 ~]# docker images
REPOSITORY TAG IMAGE ID CREATED VIRTUAL SIZE
<none> <none> abdf7c1b7a63 2 weeks ago 1.266 GB
<none> <none> 3690474eb5b4 11 months ago 0 B

Fix the tag:

[root@test2 ~]# docker images
REPOSITORY TAG IMAGE ID CREATED VIRTUAL SIZE
<none> <none> abdf7c1b7a63 2 weeks ago 1.266 GB
<none> <none> 3690474eb5b4 11 months ago 0 B
[root@test2 ~]# docker tag abdf7c1b7a63 docker.io/percona/pmm-server:1.2.0

2. The Docker container won't start: no such file or directory statusCode=404

I searched for ages without pinning down the cause, and suspected the kernel version was too old.

CentOS 6.4 ships with kernel 2.6.32-358.23.2.el6.x86_64, while Docker requires a kernel newer than 3.0 and recommends 3.8 or later.

https://yq.aliyun.com/ziliao/48262

So, upgrade the kernel:

wget http://elrepo.org/linux/kernel/el6/x86_64/RPMS/kernel-lt-3.10.107-1.el6.elrepo.x86_64.rpm

If the link is dead, browse http://elrepo.org/linux/kernel/el6/x86_64/RPMS/ yourself and pick a suitable package.

rpm -ivh kernel-lt-3.10.107-1.el6.elrepo.x86_64.rpm

Edit grub.conf:

vi /etc/grub.conf
# grub.conf generated by anaconda
#
# Note that you do not have to rerun grub after making changes to this file
# NOTICE: You have a /boot partition. This means that
# all kernel and initrd paths are relative to /boot/, eg.
# root (hd0,0)
# kernel /vmlinuz-version ro root=/dev/mapper/VolGroup-LogVol00
# initrd /initrd-[generic-]version.img
#boot=/dev/sda
default=1
timeout=5
splashimage=(hd0,0)/grub/splash.xpm.gz
hiddenmenu
title CentOS (3.10.107-1.el6.elrepo.x86_64)
root (hd0,0)
kernel /vmlinuz-3.10.107-1.el6.elrepo.x86_64 ro root=/dev/mapper/VolGroup-LogVol00 rd_NO_LUKS LANG=en_US.UTF-8 rd_NO_MD SYSFONT=latarcyrheb-sun16 crashkernel=auto KEYBOARDTYPE=pc KEYTABLE=us rd_LVM_LV=VolGroup/LogVol00 rd_NO_DM rhgb quiet numa=off elevator=deadline
initrd /initramfs-3.10.107-1.el6.elrepo.x86_64.img
title CentOS (2.6.32-358.el6.x86_64)
root (hd0,0)
kernel /vmlinuz-2.6.32-358.el6.x86_64 ro root=/dev/mapper/VolGroup-LogVol00 rd_NO_LUKS LANG=en_US.UTF-8 rd_NO_MD SYSFONT=latarcyrheb-sun16 crashkernel=auto KEYBOARDTYPE=pc KEYTABLE=us rd_LVM_LV=VolGroup/LogVol00 rd_NO_DM rhgb quiet numa=off elevator=deadline
initrd /initramfs-2.6.32-358.el6.x86_64.img

The title CentOS (3.10.107-1.el6.elrepo.x86_64) entry now sits at position 0, so change
default=1 to default=0.
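The index that default= refers to is simply the order of the title stanzas in the file. A quick way to list them with their zero-based indices (run here over an inline copy of the two titles, since the awk filter is the point; on a real system you'd point it at /etc/grub.conf):

```shell
printf '%s\n' \
  'title CentOS (3.10.107-1.el6.elrepo.x86_64)' \
  'title CentOS (2.6.32-358.el6.x86_64)' |
awk '/^title/ {print (n++) ": " $0}'
# 0: title CentOS (3.10.107-1.el6.elrepo.x86_64)
# 1: title CentOS (2.6.32-358.el6.x86_64)
```

On a real system: `awk '/^title/ {print (n++) ": " $0}' /etc/grub.conf`.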
Reboot the OS.

After the reboot:

[root@test2 ~]# cat /etc/redhat-release 
CentOS release 6.9 (Final)
[root@test2 ~]# uname -r
3.10.107-1.el6.elrepo.x86_64

Creating the containers now succeeds:

[root@test2 ~]# docker load < pmm-server.tar 
[root@test2 ~]# docker images
REPOSITORY TAG IMAGE ID CREATED VIRTUAL SIZE
<none> <none> abdf7c1b7a63 2 weeks ago 1.266 GB
<none> <none> 3690474eb5b4 11 months ago 0 B
[root@test2 ~]# docker tag abdf7c1b7a63 docker.io/percona/pmm-server:1.2.0
[root@test2 ~]# docker create \
> -v /opt/prometheus/data \
> -v /opt/consul-data \
> -v /var/lib/mysql \
> -v /var/lib/grafana \
> --name pmm-data \
> percona/pmm-server:1.2.0 /bin/true
094c63bd911b5139a267abe7939e5c4442cdc857970dedaccb9ae0cb5f165fc9
[root@test2 ~]# docker run -d \
> -p 80:80 \
> --volumes-from pmm-data \
> --name pmm-server \
> --restart always \
> percona/pmm-server:1.2.0
69195dca404bc607fa12a9cd6436a9786a71dcf226a0e4c1d6bf0b9879a14f03
[root@test2 ~]# docker ps -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
69195dca404b percona/pmm-server:1.2.0 "/opt/entrypoint.sh" 11 seconds ago Up 9 seconds 0.0.0.0:80->80/tcp, 443/tcp pmm-server
094c63bd911b percona/pmm-server:1.2.0 "/bin/true" 21 seconds ago pmm-data

Note

Any server with a new enough kernel will do for the PMM Server.

For the PMM Client the kernel version doesn't matter.

3. pmm-data container status is Exited
The kernel version is too old.

4. The MySQL dashboard shows no data
The firewall wasn't turned off.

pmm-admin check-network

# CentOS 7
systemctl stop firewalld       # stop it now
systemctl disable firewalld    # keep it off across reboots

5. The container timezone is UTC, eight hours off from our system's CST
Enter the container and run:

cp /usr/share/zoneinfo/Asia/Shanghai /etc/localtime
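A quick way to confirm the zone you are copying has the offset you expect (CST here is China Standard Time, UTC+8):

```shell
# Print the UTC offset Asia/Shanghai resolves to
TZ=Asia/Shanghai date +%z
# +0800
```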

http://www.cnblogs.com/w2206/p/6904446.html

Deploying PMM

1. Run PMM Server on a host. It is used to access the collected data, view time-based graphs, and perform performance analysis.

PMM Server can be run in three ways; only the Docker method is covered here.

Running PMM Server Using Docker

The Docker image for PMM Server is publicly hosted at https://hub.docker.com/r/percona/pmm-server/. To run PMM Server from the Docker image, the host must be able to run Docker 1.12.6 or later and have network access.

# Install docker
# First install the EPEL repo
wget -P /etc/yum.repos.d/ http://mirrors.aliyun.com/repo/epel-6.repo
yum clean all
yum makecache
yum -y install docker-io
yum upgrade device-mapper-libs -y
service docker start

Change the path Docker stores containers under (by default /var/lib/docker), here with a symlink:

service docker stop
cp -r /var/lib/docker/* /data/docker/
mv /var/lib/docker /var/lib/docker.bak
ln -s /data/docker /var/lib/docker
service docker start

Disable the firewall:

systemctl stop firewalld       # stop it now
systemctl disable firewalld    # keep it off across reboots

Or keep iptables running and open the two ports PMM needs:
#PMM
$IPTABLES -A INPUT -p tcp --dport 42002 -j ACCEPT
$IPTABLES -A INPUT -p tcp --dport 42000 -j ACCEPT

Where do these two port numbers come from? They appear in pmm-admin check-network output (the official docs describe them as well):
* Connection: Client <-- Server
-------------- ---------- --------------------- ------- ---------- ---------
SERVICE TYPE NAME REMOTE ENDPOINT STATUS HTTPS/TLS PASSWORD
-------------- ---------- --------------------- ------- ---------- ---------
linux:metrics pt_slave1 120.27.136.247:42000 OK YES -
mysql:metrics pt_slave1 120.27.136.247:42002 OK YES -

Install Percona Toolkit (required by Query Analytics):
yum install percona-xtrabackup-24.x86_64 percona-xtrabackup-24-debuginfo.x86_64 percona-xtrabackup-test-24.x86_64 percona-toolkit.x86_64 percona-toolkit-debuginfo.x86_64 percona-toolkit.x86_64 -y
yum install qpress* -y

Create a dedicated user and grant privileges:
GRANT SELECT, PROCESS, SUPER, REPLICATION CLIENT, RELOAD ON *.* TO 'pmm'@'localhost' IDENTIFIED BY 'pass' WITH MAX_USER_CONNECTIONS 10;
GRANT SELECT, UPDATE, DELETE, DROP ON performance_schema.* TO 'pmm'@'localhost';
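A routine follow-up check (not from the original post) that the account landed with the intended host and connection cap:

```sql
-- A stray space in the host string would create a separate, useless
-- 'pmm'@' localhost' account that never matches a real connection.
SELECT user, host, max_user_connections FROM mysql.user WHERE user = 'pmm';
SHOW GRANTS FOR 'pmm'@'localhost';
```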

mysql:metrics needs the REPLICATION CLIENT privilege, and mysql:queries needs SUPER.
See [What privileges are required to monitor a MySQL instance?](https://www.percona.com/doc/percona-monitoring-and-management/faq.html#id11) for details.

Step 1. Create the PMM Data Container

$ docker create \
-v /opt/prometheus/data \
-v /opt/consul-data \
-v /var/lib/mysql \
-v /var/lib/grafana \
--name pmm-data \
percona/pmm-server:latest /bin/true

Note

If the image is not found locally, Docker pulls it from Docker Hub.

Make sure you are running an up-to-date version of Docker.

This container does not run, it simply exists to make sure you retain all PMM data when you upgrade to a newer pmm-server image. Do not remove or re-create this container, unless you intend to wipe out all PMM data and start over.

The command above does the following:

  • docker create instructs the Docker daemon to create a container from an image.
  • The -v options initialize the container's data volumes.
  • The --name option assigns a custom name that can be used to refer to the container on the Docker network, in this case pmm-data.
  • percona/pmm-server:latest is the name and version tag of the image the container is created from.
  • /bin/true is the command the container runs.

Step 2.Create and Run the PMM Server Container

docker run -d \
-p 80:80 \
--volumes-from pmm-data \
--name pmm-server \
-e SERVER_USER=hehe \
-e SERVER_PASSWORD=ninainaide \
-e METRICS_RETENTION=4320h \
-e METRICS_MEMORY=35651584 \
-e METRICS_RESOLUTION=3s \
-e DISABLE_TELEMETRY=true \
--restart always \
percona/pmm-server:1.5.2

The command above does the following:

  • docker run instructs the Docker daemon to run a container from the image.
  • The -d option starts the container in detached mode (in the background).
  • The -p option maps the port used to access the PMM Server web UI. For example, if port 80 is unavailable, you can map the landing page to port 8080 with -p 8080:80.
  • The --volumes-from option mounts the volumes from the pmm-data container (see Step 1. Create the PMM Data Container).
  • The --name option assigns a custom name that can be used to refer to the container on the Docker network, in this case pmm-server.
  • The --restart option defines the container's restart policy. Setting it to always ensures the Docker daemon starts the container on boot and restarts it if it exits.
  • percona/pmm-server:1.5.2 is the name and version tag of the image the container is run from.
  • -e SERVER_USER and -e SERVER_PASSWORD set, for safety, the username and password required to access the PMM web pages (otherwise anyone with your PMM address can get in). See Security Features in Percona Monitoring and Management.
  • -e METRICS_MEMORY: by default Prometheus uses 768 MB of memory to store recent data chunks; use this parameter to change it. See How to control memory consumption for PMM?
  • -e METRICS_RESOLUTION: if the network between pmm-server and the client is shaky, raise this value, within 1-5s; above 5s Prometheus won't start. See What resolution is used for metrics?
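As a quick sanity check on the METRICS_RETENTION value above, 4320 hours works out to 180 days of metrics history:

```shell
echo $((4320 / 24))
# 180
```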

2. Install the PMM Client on every server you want to monitor

# Install the Percona repo
rpm -ivh https://www.percona.com/downloads/percona-release/redhat/0.1-4/percona-release-0.1-4.noarch.rpm
sudo yum install pmm-client

3.Connect PMM Client to PMM Server

After installation, the PMM Client does not connect to the PMM Server automatically.

To connect the client, specify the server's IP address with pmm-admin config --server. For example, if PMM Server runs at 192.168.100.1 and the PMM Client is installed on a machine with IP 192.168.200.1:

$ sudo pmm-admin config --server 192.168.100.1 --server-user jsmith --server-password pass1234
OK, PMM server is alive.

PMM Server | 192.168.100.1
Client Name | ubuntu-amd64
Client Address | 192.168.200.1

Note

If you changed the default port 80 when running PMM Server, specify it after the server’s IP address. For example:

$ sudo pmm-admin config --server 192.168.100.1:8080 --server-user jsmith --server-password pass1234

You can add the --client-name option to set the client name; otherwise the hostname is used, and a default ECS hostname like iZbp1igpeohje7z5ugkdr9Z is hardly what we want.

4.Start data collection

After you connect the client to PMM Server, enable data collection from the database instance by adding a monitoring service.

To enable general system metrics, MySQL metrics, and MySQL query analytics, run:

sudo pmm-admin add mysql --user mysql --password mysql --socket /data/mysqldata/3306/mysql.sock

To enable general system metrics, MongoDB metrics, and MongoDB query analytics, run:

sudo pmm-admin --dev-enable add mongodb

Note

MongoDB query analytics is experimental and requires the --dev-enable option when adding. Without it, only general system metrics and MongoDB metrics are added.

To enable ProxySQL performance metrics, run:

sudo pmm-admin add proxysql:metrics

To see what is being monitored, run:

$ sudo pmm-admin list

If, for example, you enabled system and MongoDB metrics monitoring, the output will look something like this:

$ sudo pmm-admin list
pmm-admin 1.1.3

PMM Server | 192.168.100.1
Client Name | ubuntu-amd64
Client Address | 192.168.200.1
Service manager | linux-systemd

---------------- ----------- ----------- -------- ---------------- --------
SERVICE TYPE NAME LOCAL PORT RUNNING DATA SOURCE OPTIONS
---------------- ----------- ----------- -------- ---------------- --------
linux:metrics mongo-main 42000 YES -
mongodb:metrics mongo-main 42003 YES localhost:27017

For more information about adding instances, run pmm-admin add --help.

After installing, always verify with pmm-admin check-network; the output below is what healthy looks like:

[root@iZbp1igpeohje7z5ugkdr9Z log]# pmm-admin check-network
PMM Network Status

Server Address | xx.xx.xx.xx
Client Address | xx.xx.xx.xx

* System Time
NTP Server (0.pool.ntp.org) | 2017-09-28 15:44:39 +0800 CST
PMM Server | 2017-09-28 07:44:38 +0000 GMT
PMM Client | 2017-09-28 15:44:39 +0800 CST
PMM Server Time Drift | OK
PMM Client Time Drift | OK
PMM Client to PMM Server Time Drift | OK

* Connection: Client --> Server
-------------------- -------
SERVER SERVICE STATUS
-------------------- -------
Consul API OK
Prometheus API OK
Query Analytics API OK

Connection duration | 31.032856ms
Request duration | 31.557981ms
Full round trip | 62.590837ms


* Connection: Client <-- Server
-------------- ---------- -------------------- ------- ---------- ---------
SERVICE TYPE NAME REMOTE ENDPOINT STATUS HTTPS/TLS PASSWORD
-------------- ---------- -------------------- ------- ---------- ---------
linux:metrics pt_slave5 101.37.14.213:42000 OK YES -
mysql:metrics pt_slave5 101.37.14.213:42002 OK YES -

Change the container's timezone (try leaving it alone and see whether the timestamps in Grafana alert mails come out right :) ):
cp /usr/share/zoneinfo/Asia/Shanghai /etc/localtime

Restart:

[root@test2 ~]# docker ps -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
69195dca404b percona/pmm-server:1.5.2 "/opt/entrypoint.sh" 28 hours ago Exited (128) 6 minutes ago 0.0.0.0:80->80/tcp, 443/tcp pmm-server
094c63bd911b percona/pmm-server:1.5.2 "/bin/true" 28 hours ago pmm-data
[root@test2 ~]# docker start 69195dca404b
Error response from daemon: Cannot start container 69195dca404b: Error getting container 69195dca404bc607fa12a9cd6436a9786a71dcf226a0e4c1d6bf0b9879a14f03 from driver devicemapper: Error mounting '/dev/mapper/docker-253:0-6947028-69195dca404bc607fa12a9cd6436a9786a71dcf226a0e4c1d6bf0b9879a14f03' on '/var/lib/docker/devicemapper/mnt/69195dca404bc607fa12a9cd6436a9786a71dcf226a0e4c1d6bf0b9879a14f03': device or resource busy
Error: failed to start containers: [69195dca404b]

umount /var/lib/docker/devicemapper/mnt/69195dca404bc607fa12a9cd6436a9786a71dcf226a0e4c1d6bf0b9879a14f03
[root@test2 ~]# docker ps -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
69195dca404b percona/pmm-server:1.5.2 "/opt/entrypoint.sh" 28 hours ago Exited (128) 8 minutes ago 0.0.0.0:80->80/tcp, 443/tcp pmm-server
094c63bd911b percona/pmm-server:1.5.2 "/bin/true" 28 hours ago pmm-data
[root@test2 ~]# docker start 69195dca404b
69195dca404b
[root@test2 ~]# docker ps -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
69195dca404b percona/pmm-server:1.5.2 "/opt/entrypoint.sh" 28 hours ago Up 5 seconds 0.0.0.0:80->80/tcp, 443/tcp pmm-server
094c63bd911b percona/pmm-server:1.5.2 "/bin/true" 28 hours ago pmm-data

What MHA Does During Monitoring and Failover

Here is the basic flow MHA (masterha_manager) follows for monitoring and failover.

Verify the replication configuration and identify the current master

  • MHA identifies the current master by connecting to every host described in the configuration file. You don't have to tell it which host is the master; MHA checks the replication settings automatically and figures it out.

    Note: MHA itself cannot build a replication environment; it monitors one that already exists.

  • If any slave is dead at this stage, the script terminates for safety reasons (if a slave is dead, MHA of course cannot recover it). Any slave failure when monitoring starts makes monitoring exit; MHA does not repair slaves.

  • If any required script is not installed on all nodes, MHA aborts without starting monitoring.
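The configuration file that this flow reads might look like the minimal sketch below (hostnames, paths, and credentials are hypothetical; the parameter names are standard MHA ones):

```ini
# /etc/conf/masterha/app1.cnf -- minimal sketch
[server default]
manager_workdir=/var/log/masterha/app1
manager_log=/var/log/masterha/app1/manager.log
user=mha            # MySQL account MHA connects with
password=secret
ssh_user=root       # account used for SSH checks between nodes

[server1]
hostname=db1        # the current master is auto-detected, not declared
[server2]
hostname=db2
[server3]
hostname=db3
```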

masterha_manager argument reference

masterha_manager: Command to run MHA Manager

MHA Manager can be started by executing masterha_manager command.

# masterha_manager --conf=/etc/conf/masterha/app1.cnf

masterha_manager takes below arguments.

Common arguments



Copyright © 2013 - 2022 Fan() All Rights Reserved.
