ClickHouse查询分布式表LEFT JOIN改RIGHT JOIN的大坑

由一个慢查询衍生出的问题

我们线上有一个ClickHouse集群, 总共6个服务器, 配置均为16C 64G SSD, 集群配置为三分片两副本

有两个表这里称为small_tablebig_table. 都是ReplicatedMergeTree引擎(三个分片两个副本).

small_table有79w数据, big_table有5亿数据(数据在之后的示例中没有任何变化), 在下文中small_tablebig_table都为分布式表, 可以获取全量数据, small_table_localbig_table_local为各节点上的本地表名称

1
2
3
4
5
6
7
8
9
10
11
12
13
14
SELECT
table,
formatReadableSize(sum(data_compressed_bytes)) AS tc,
formatReadableSize(sum(data_uncompressed_bytes)) AS tu,
sum(data_compressed_bytes) / sum(data_uncompressed_bytes) AS ratio
FROM system.columns
WHERE (database = currentDatabase()) AND (table IN ('small_table_local', 'big_table_local'))
GROUP BY table
ORDER BY table ASC
┌─table─────────────────────────┬─tc────────┬─tu────────┬──────────────ratio─┐
│ small_table_local │ 12.87 MiB │ 14.91 MiB │ 0.8633041477100831
│ big_table_local │ 15.46 GiB │ 57.31 GiB │ 0.2697742507036428
└───────────────────────────────┴───────────┴───────────┴────────────────────┘
1
2
3
4
5
6
SELECT count(*)
FROM small_table
┌─count()─┐
794469
└─────────┘
1
2
3
4
5
6
7
SELECT count(*)
FROM big_table
┌───count()─┐
519898780
└───────────┘

有如下查询

1
SELECT a.UID,B.UID from dwh.small_table a LEFT JOIN dwh.big_table b on a.UID = b.UID

这个查询在ClickHouse中要跑近300秒

1
2
3
4
5
6
7
8
9
10
11
12
13
#time clickhouse-client --time --progress --query="
SELECT
a.UID, B.UID
FROM
dwh.small_table a
LEFT JOIN
dwh.big_table b ON a.UID = b.UID
" > /dev/null
293.769
real 4m53.798s
user 0m0.574s
sys 0m0.225s

单个节点

而在TIDB只需要20秒(节点数和配置比CH略好, 数据略多于CH, 未使用TIFlash)

1
2
3
4
5
6
7
8
9
10
11
12
13
# time mysql -uroot -hxx.xx.xx -P4000 -p dwh -e "
SELECT
a.UID, B.UID
FROM
dwh.small_table a
LEFT JOIN
dwh.big_table b ON a.UID = b.UID;
" > /dev/null
Enter password:
real 0m20.955s
user 0m11.292s
sys 0m2.321s

本人接触ClickHouse不久, 没什么实战经验, 看到这结果就感觉肯定是自己使用姿势不对

JOIN操作时一定要把数据量小的表放在右边

一通百度Google, 看到一篇来自携程的文章每天十亿级数据更新,秒出查询结果,ClickHouse在携程酒店的应用, 其中有一段话:

JOIN操作时一定要把数据量小的表放在右边,ClickHouse中无论是Left Join 、Right Join还是Inner Join永远都是拿着右表中的每一条记录到左表中查找该记录是否存在,所以右表必须是小表。

有点神奇..

我们知道在常见的关系型数据库如Oralce、MySQL中, LEFT JOIN和RIGTH JOIN是可以等价改写的, 那么我改成RIGHT JOIN不就”把小表放在右边”了吗, 于是SQL改写为

1
SELECT a.UID,B.UID from dwh.big_table b RIGHT JOIN dwh.small_table a on a.UID = b.UID

实测

1
2
3
4
5
6
7
8
9
10
11
12
13
#time clickhouse-client --time --progress --query="
SELECT
a.UID, B.UID
FROM
dwh.big_table b
RIGHT JOIN
dwh.small_table a ON a.UID = b.UIDT
" > /dev/null
19.588
real 0m19.609s
user 0m0.742s
sys 0m0.293s

没想到还真好使… 难道CH优化器这么弱?

谨慎起见, 我比对了一下结果, 简单count一下吧

LEFT JOIN

1
2
3
4
5
6
7
8
9
10
11
12
13
14
#time clickhouse-client --time --progress --query="
SELECT
COUNT(*)
FROM
dwh.small_table a
LEFT JOIN
dwh.big_table b ON a.UID = b.UID
"
6042735 --count
917.560 --时间
real 15m17.580s
user 0m0.253s
sys 0m0.489s

RIGHT JOIN

1
2
3
4
5
6
7
8
9
10
11
12
13
14
#time clickhouse-client --time --progress --query="
SELECT
COUNT(*)
FROM
dwh.big_table b
RIGHT JOIN
dwh.small_table a ON a.UID = b.UID
"
6897617 --count
11.655 --时间
real 0m11.675s
user 0m0.014s
sys 0m0.017s

RIGHT JOIN数据不对啊

ClickHouse分布式表A LEFT JOIN B != B RIGHT JOIN A

创建测试表

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
ch-node-05 default@localhost:9000 [dwh]
:) show create table t1;
SHOW CREATE TABLE t1
┌─statement─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
CREATE TABLE dwh.t1 (`I_ID` String, `CTIME` DateTime) ENGINE = Distributed('ch_cluster_all', 'dwh', 't1_local', rand()) │
└───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
1 rows in set. Elapsed: 0.001 sec.
ch-node-05 default@localhost:9000 [dwh]
:) show create table t2;
SHOW CREATE TABLE t2
┌─statement─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
CREATE TABLE dwh.t2 (`I_ID` String, `CTIME` DateTime) ENGINE = Distributed('ch_cluster_all', 'dwh', 't2_local', rand()) │
└───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
1 rows in set. Elapsed: 0.001 sec.
ch-node-05 default@localhost:9000 [dwh]
:) show create table t1_local;
SHOW CREATE TABLE t1_local
┌─statement──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
CREATE TABLE dwh.t1_local (`I_ID` String, `CTIME` DateTime) ENGINE = ReplicatedReplacingMergeTree('/clickhouse/dwh/tables/{layer}-{shard}/t1', '{replica}') PARTITION BY toDate(CTIME) ORDER BY I_ID SETTINGS index_granularity = 8192
└────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
1 rows in set. Elapsed: 0.001 sec.
ch-node-05 default@localhost:9000 [dwh]
:) show create table t2_local;
SHOW CREATE TABLE t2_local
┌─statement──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
CREATE TABLE dwh.t2_local (`I_ID` String, `CTIME` DateTime) ENGINE = ReplicatedReplacingMergeTree('/clickhouse/dwh/tables/{layer}-{shard}/t2', '{replica}') PARTITION BY toDate(CTIME) ORDER BY I_ID SETTINGS index_granularity = 8192
└────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
1 rows in set. Elapsed: 0.001 sec.

数据

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
ch-node-05 default@localhost:9000 [dwh]
:) select * from t1;
SELECT *
FROM t1
┌─I_ID─┬───────────────CTIME─┐
12020-08-27 15:24:05
22020-08-27 15:24:50
82020-08-27 15:24:50
└──────┴─────────────────────┘
┌─I_ID─┬───────────────CTIME─┐
32020-08-27 15:24:50
52020-08-27 15:24:50
92020-08-27 15:24:50
└──────┴─────────────────────┘
┌─I_ID─┬───────────────CTIME─┐
102020-08-27 15:24:50
32020-08-27 15:24:50
62020-08-27 15:24:50
72020-08-27 15:24:50
└──────┴─────────────────────┘
10 rows in set. Elapsed: 0.003 sec.
ch-node-05 default@localhost:9000 [dwh]
:) select * from t2;
SELECT *
FROM t2
┌─I_ID─┬───────────────CTIME─┐
12020-08-27 15:25:14
└──────┴─────────────────────┘
┌─I_ID─┬───────────────CTIME─┐
22020-08-27 15:25:33
52020-08-27 15:25:33
└──────┴─────────────────────┘
┌─I_ID─┬───────────────CTIME─┐
32020-08-27 15:25:33
32020-08-27 15:25:33
└──────┴─────────────────────┘
5 rows in set. Elapsed: 0.003 sec.
ch-node-05 default@localhost:9000 [dwh]
:) SELECT
:-] _shard_num,
:-] count(*)
:-] FROM
:-] (
:-] SELECT
:-] _shard_num,
:-] a.*
:-] FROM dwh.t1 AS a
:-] )
:-] GROUP BY _shard_num
:-] WITH ROLLUP;
SELECT
_shard_num,
count(*)
FROM
(
SELECT
_shard_num,
a.*
FROM dwh.t1 AS a
)
GROUP BY _shard_num
WITH ROLLUP
┌─_shard_num─┬─count()─┐
33
23
14
└────────────┴─────────┘
┌─_shard_num─┬─count()─┐
010
└────────────┴─────────┘
4 rows in set. Elapsed: 0.004 sec.
ch-node-05 default@localhost:9000 [dwh]
:) SELECT
:-] _shard_num,
:-] count(*)
:-] FROM
:-] (
:-] SELECT
:-] _shard_num,
:-] a.*
:-] FROM dwh.t2 AS a
:-] )
:-] GROUP BY _shard_num
:-] WITH ROLLUP;
SELECT
_shard_num,
count(*)
FROM
(
SELECT
_shard_num,
a.*
FROM dwh.t2 AS a
)
GROUP BY _shard_num
WITH ROLLUP
┌─_shard_num─┬─count()─┐
32
21
12
└────────────┴─────────┘
┌─_shard_num─┬─count()─┐
05
└────────────┴─────────┘
4 rows in set. Elapsed: 0.005 sec.

测试LEFT JOIN RIGHT JOIN

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
ch-node-05 default@localhost:9000 [dwh]
:) SELECT
:-] a.I_ID,
:-] b.I_ID
:-] FROM dwh.t2 AS a
:-] LEFT JOIN dwh.t1 AS b ON a.I_ID = b.I_ID
:-] ORDER BY a.I_ID ASC;
SELECT
a.I_ID,
b.I_ID
FROM dwh.t2 AS a
LEFT JOIN dwh.t1 AS b ON a.I_ID = b.I_ID
ORDER BY a.I_ID ASC
┌─I_ID─┬─b.I_ID─┐
11
└──────┴────────┘
┌─I_ID─┬─b.I_ID─┐
22
33
33
33
33
└──────┴────────┘
┌─I_ID─┬─b.I_ID─┐
55
└──────┴────────┘
7 rows in set. Elapsed: 0.006 sec.
ch-node-05 default@localhost:9000 [dwh]
:) SELECT
:-] a.I_ID,
:-] b.I_ID
:-] FROM dwh.t1 AS b
:-] RIGHT JOIN dwh.t2 AS a ON a.I_ID = b.I_ID
:-] ORDER BY a.I_ID ASC;
SELECT
a.I_ID,
b.I_ID
FROM dwh.t1 AS b
RIGHT JOIN dwh.t2 AS a ON a.I_ID = b.I_ID
ORDER BY toUInt32(a.I_ID) ASC
┌─a.I_ID─┬─I_ID─┐
1 │ │
11
1 │ │
2 │ │
22
└────────┴──────┘
┌─a.I_ID─┬─I_ID─┐
2 │ │
└────────┴──────┘
┌─a.I_ID─┬─I_ID─┐
33
33
└────────┴──────┘
┌─a.I_ID─┬─I_ID─┐
3 │ │
3 │ │
33
33
5 │ │
└────────┴──────┘
┌─a.I_ID─┬─I_ID─┐
5 │ │
└────────┴──────┘
┌─a.I_ID─┬─I_ID─┐
55
└────────┴──────┘

可以看到RIGHT JOIN返回了一些错误的数据

难道想要对于这个SQL, 就不能用分布式表了吗? 如果我想用RIGHT JOIN就只能用单机? 说好的水平扩展呢?

那看一下单机查询速度吧..

为此我在一个CH节点创建两个表small_table_totalbig_table_toal 他们都不是分布式表, 都拥有全量数据

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
SELECT count(*)
FROM `big_table_toal`
┌───count()─┐
519898780
└───────────┘
1 rows in set. Elapsed: 0.001 sec.
SELECT count(*)
FROM `small_table_total`
┌─count()─┐
794469
└─────────┘

分布式表只能和分布式表关联, 分布式表无法和本地表关联

1
2
3
4
5
> #clickhouse-client --time --progress --query="SELECT count(*) from dwh.big_table b RIGHT JOIN dwh.small_table_total a on a.UID = b.UID"
> → Progress: 0.00 rows, 0.00 B (0.00 rows/s., 0.00 B/s.) Received exception from server (version 20.3.5):
> Code: 60. DB::Exception: Received from localhost:9000. DB::Exception: Received from bj2-ch-node-04:9000. DB::Exception: Table dwh.small_table_total doesn't exist..
> 0.107
>

测试查询速度

1
2
3
4
5
6
7
8
9
10
#clickhouse-client --time --progress --query="
SELECT
COUNT(*)
FROM
dwh.big_table_total b
RIGHT JOIN
dwh.small_table_total a ON a.UID = b.UID
"
6042735 --count
7.262 --用时

好家伙, 数据准确, 而且比分片了还快

难道只能用本地表?

分布式表要想RIGHT JOIN返回正确结果, 只能改写SQL

原始语句

1
2
3
4
5
6
SELECT
a.UID, b.UID
FROM
dwh.small_table a
LEFT JOIN
dwh.big_table b ON a.UID = b.UID

改写为INNER JOIN, 但是没有改表顺序(性能差)

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
SELECT
a.id, b.uid
FROM
dwh.small_table a
GLOBAL INNER JOIN
dwh.big_table b ON a.UID = b.UID
UNION ALL SELECT
a.UID, NULL
FROM
dwh.small_table a
WHERE
a.UID GLOBAL NOT IN (SELECT
UID
FROM
dwh.big_table)

这里我还没理解为什么要用GLOBAL JOIN

在我的例子中 ,这个语句根本跑不成功, GLOBAL太耗费内存了

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
SELECT
a.UID,
b.UID
FROM dwh.small_table AS a
GLOBAL INNER JOIN dwh.big_table AS b ON a.UID = b.UID
UNION ALL
SELECT
a.UID,
NULL
FROM dwh.small_table AS a
WHERE a.UID GLOBAL NOT IN
(
SELECT UID
FROM dwh.big_table
)
↑ Progress: 220.53 million rows, 29.82 GB (51.24 million rows/s., 6.93 GB/s.) ████████████████████████████████████▋ 20%Received exception from server (version 20.3.5):
Code: 241. DB::Exception: Received from localhost:9000. DB::Exception: Memory limit (for query) exceeded: would use 50.00 GiB (attempt to allocate chunk of 4249200 bytes), maximum: 50.00 GiB: (while reading column CH_BILL_USER_TELEPHONE): (while reading from part /data/clickhouse/ch_9000/data/dwh/big_table_local/201910_0_5_1/ from mark 216 with max_rows_to_read = 8192):
Code: 241, e.displayText() = DB::Exception: Memory limit (for query) exceeded: would use 50.00 GiB (attempt to allocate chunk of 4227680 bytes), maximum: 50.00 GiB: (while reading column CH_XXX_NAME): (while reading from part /data/clickhouse/ch_9000/data/dwh/big_table_local/202001_6_11_1/ from mark 240 with max_rows_to_read = 8192) (version 20.3.5.21 (official build)):
Code: 241, e.displayText() = DB::Exception: Memory limit (for query) exceeded: would use 50.00 GiB (attempt to allocate chunk of 5211280 bytes), maximum: 50.00 GiB: (avg_value_size_hint = 66, avg_chars_size = 69.6, limit = 8192): (while reading column CH_BROKER_NAME): (while reading from part /data/clickhouse/ch_9000/data/dwh/big_table_local/202007_6_11_1/ from mark 24 with max_rows_to_read = 8192) (version 20.3.5.21 (official build)):
Code: 241, e.displayText() = DB::Exception: Memory limit (for query) exceeded: would use 50.00 GiB (attempt to allocate chunk of 4572064 bytes), maximum: 50.00 GiB: (avg_value_size_hint = 66.00048828125, avg_chars_size = 69.6005859375, limit = 8192): (while reading column CH_XX_NAME): (while reading from part /data/clickhouse/ch_9000/data/dwh/big_table_local/201805_2_2_0/ from mark 24 with max_rows_to_read = 8192) (version 20.3.5.21 (official build)): While executing CreatingSetsTransform.
0 rows in set. Elapsed: 4.404 sec. Processed 220.53 million rows, 29.82 GB (50.07 million rows/s., 6.77 GB/s.)

如果去掉GLOBAL JOIN, 也不行(GLOBAL IN不能去)

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
SELECT
a.UID,
b.UID
FROM dwh.small_table AS a
INNER JOIN dwh.big_table AS b ON a.UID = b.UID
UNION ALL
SELECT
a.UID,
NULL
FROM dwh.small_table AS a
WHERE a.UID GLOBAL NOT IN
(
SELECT UID
FROM dwh.big_table
)
↑ Progress: 1.91 billion rows, 105.59 GB (6.36 million rows/s., 352.30 MB/s.) ██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▊ 90%Received exception from server (version 20.3.5):
Code: 241. DB::Exception: Received from localhost:9000. DB::Exception: Received from clickhouse-node-01:9000. DB::Exception: Memory limit (total) exceeded: would use 50.10 GiB (attempt to allocate chunk of 133829856 bytes), maximum: 50.00 GiB: While executing CreatingSetsTransform.
0 rows in set. Elapsed: 299.809 sec. Processed 1.91 billion rows, 105.59 GB (6.35 million rows/s., 352.18 MB/s.)

改写表顺序, 让小表在”右边”

1
2
3
4
5
6
7
8
9
10
SELECT a.UID FROM dwh.big_table b
GLOBAL INNER JOIN dwh.small_table a
ON a.UID = b.UID
UNION ALL
SELECT a.UID,null FROM dwh.small_table a
WHERE a.UID GLOBAL NOT IN
(
SELECT UID FROM dwh.big_table
WHERE UID GLOBAL IN (SELECT id FROM dwh.small_table)
)

实测

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
time clickhouse-client --time --progress --query="
SELECT a.UID,b.UID FROM dwh.big_table b
GLOBAL INNER JOIN dwh.small_table a
on a.UID = b.UID
UNION ALL
SELECT a.UID,null FROM dwh.small_table a
WHERE a.UID GLOBAL NOT IN
(
SELECT UID FROM dwh.big_table
WHERE UID GLOBAL IN (SELECT UID FROM dwh.small_table)
)" >/dev/null
21.142
real 0m21.164s
user 0m1.133s
sys 0m0.378s

看一下行数对不对

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
time clickhouse-client --time --progress --query="
SELECT sum(cnt) FROM (
SELECT count(*) cnt FROM dwh.big_table b
GLOBAL INNER JOIN dwh.small_table a
on a.UID = b.UID
UNION ALL
SELECT count(*) cnt FROM dwh.small_table a
WHERE a.UID GLOBAL NOT IN
(
SELECT UID FROM dwh.big_table
WHERE UID GLOBAL IN (SELECT UID FROM dwh.small_table)
))"
6042735 --count
12.525 --用时
real 0m12.545s
user 0m0.018s
sys 0m0.012s

最后一个问题

不知道你们是否注意到了

ClickHouse

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
SELECT
a.I_ID,
b.I_ID
FROM dwh.t1 AS b
RIGHT JOIN dwh.t2 AS a ON a.I_ID = b.I_ID
ORDER BY a.I_ID ASC
┌─a.I_ID─┬─I_ID─┐
1 │ │
11
1 │ │
2 │ │
22
└────────┴──────┘
┌─a.I_ID─┬─I_ID─┐
2 │ │
└────────┴──────┘
┌─a.I_ID─┬─I_ID─┐
33
33
└────────┴──────┘
┌─a.I_ID─┬─I_ID─┐
3 │ │
3 │ │
33
33
5 │ │
└────────┴──────┘
┌─a.I_ID─┬─I_ID─┐
5 │ │
└────────┴──────┘
┌─a.I_ID─┬─I_ID─┐
55
└────────┴──────┘

MySQL(例子无关)

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
root@localhost 15:15:26 [fanboshi]> select a.id,b.id from t1 a left join t3 b on a.id=b.id;
+----+------+
| id | id |
+----+------+
| 1 | 1 |
| 2 | 2 |
| 3 | 3 |
| 4 | 4 |
| 5 | NULL |
| 6 | NULL |
| 7 | NULL |
| 8 | NULL |
| 9 | NULL |
| 11 | NULL |
| 13 | NULL |
| 15 | NULL |
| 17 | NULL |
| 19 | NULL |
| 21 | NULL |
| 23 | NULL |
| 25 | NULL |
| 27 | NULL |
| 29 | NULL |
| 30 | NULL |
| 31 | NULL |
+----+------+
21 rows in set (0.00 sec)

在CH里外链接不想MySQL那样”用null补未匹配的数据”而是用该列数据类型的默认值填充

https://github.com/ClickHouse/ClickHouse/blob/master/src/Core/Settings.h#L189

join_use_nulls可以在语句,用户profile添加

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
SELECT
a.I_ID,
b.I_ID
FROM dwh.t1 AS b
RIGHT JOIN dwh.t2 AS a ON a.I_ID = b.I_ID
ORDER BY toUInt32(a.I_ID) ASC
SETTINGS join_use_nulls = 1
┌─a.I_ID─┬─I_ID─┐
1 │ ᴺᵁᴸᴸ │
11
1 │ ᴺᵁᴸᴸ │
2 │ ᴺᵁᴸᴸ │
22
└────────┴──────┘
┌─a.I_ID─┬─I_ID─┐
2 │ ᴺᵁᴸᴸ │
└────────┴──────┘
┌─a.I_ID─┬─I_ID─┐
33
33
└────────┴──────┘
┌─a.I_ID─┬─I_ID─┐
3 │ ᴺᵁᴸᴸ │
3 │ ᴺᵁᴸᴸ │
33
33
5 │ ᴺᵁᴸᴸ │
└────────┴──────┘
┌─a.I_ID─┬─I_ID─┐
5 │ ᴺᵁᴸᴸ │
└────────┴──────┘
┌─a.I_ID─┬─I_ID─┐
55
└────────┴──────┘
15 rows in set. Elapsed: 0.015 sec.

参考资料

我提了issue, 详细原因请看https://github.com/ClickHouse/ClickHouse/issues/14160

总之除了LEFT JOIN 外 For other OUTER JOINs there's no general solution to return expected result yet

Powered by Hexo and Hexo-theme-hiker

Copyright © 2013 - 2020 Fan() All Rights Reserved.

访客数 : | 访问量 :