Spread SELECTs from reporting apps among cluster nodes
Reporting apps usually generate various customer reports from SELECT query results.
The load generated by such SELECTs on a ClickHouse cluster may vary depending on the number of online customers and on the types of reports being generated. This load must be limited in order to prevent cluster overload.
All the SELECTs may be routed to a distributed table on a single node. But this increases resource usage (RAM, CPU and network) on that node compared to the other nodes, since it must do the final aggregation, sorting and filtering of the data obtained from the cluster nodes (shards).
It would be better to create identical distributed tables on each shard and spread SELECTs among all the available shards.
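The idea of spreading SELECTs among shards can be illustrated with a minimal round-robin sketch. This is only an illustration of the concept, not how chproxy is implemented; the shard addresses and the `make_picker` helper are hypothetical names chosen for the example.

```python
from itertools import cycle

# Hypothetical shard addresses; replace with the real cluster nodes.
SHARDS = ["shard1:8123", "shard2:8123", "shard3:8123"]


def make_picker(nodes):
    """Return a function that hands out nodes in round-robin order."""
    ring = cycle(nodes)
    return lambda: next(ring)


pick_node = make_picker(SHARDS)

# Each incoming SELECT goes to the next node in the ring, so the final
# aggregation, sorting and filtering work is shared by all shards
# instead of landing on a single node.
assignments = [pick_node() for _ in range(6)]
print(assignments)
```

With identical distributed tables on every shard, any node picked this way can serve the query, so each node ends up doing roughly the same share of the final aggregation work.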
The following minimal chproxy config may be used for this use case: