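ClickHouse Metrics Collection
Introduction
Collects ClickHouse metrics and reports them to DataFlux.
Prerequisites
- clickhouse-server is installed (see the ClickHouse installation documentation)
- DataKit is installed (see the DataKit installation documentation)
Configuration
Go to the conf.d/db directory under the DataKit installation directory, copy clickhouse.conf.sample, and name the copy clickhouse.conf. For example: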
# Read metrics from one or many ClickHouse servers
[[inputs.clickhouse]]
## Username for authorization on ClickHouse server
## example: username = "default"
username = "default"
## Password for authorization on ClickHouse server
## example: password = "super_secret"
## HTTP(s) timeout while getting metrics values
## The timeout includes connection time, any redirects, and reading the response body.
## example: timeout = 1s
# timeout = 5s
## List of servers for metrics scraping
## metrics scrape via HTTP(s) clickhouse interface
## https://clickhouse.tech/docs/en/interfaces/http/
## example: servers = ["http://127.0.0.1:8123","https://custom-server.mdb.yandexcloud.net"]
servers = ["http://127.0.0.1:8123"]
## If "auto_discovery" is "true", the plugin tries to connect to all servers available in the cluster
## using the same "user:password" given in the "username" and "password" parameters
## and gets the server hostname list from the "system.clusters" table
## see
## - https://clickhouse.tech/docs/en/operations/system_tables/#system-clusters
## - https://clickhouse.tech/docs/en/operations/server_settings/settings/#server_settings_remote_servers
## - https://clickhouse.tech/docs/en/operations/table_engines/distributed/
## - https://clickhouse.tech/docs/en/operations/table_engines/replication/#creating-replicated-tables
## example: auto_discovery = false
# auto_discovery = true
## Filter cluster names in "system.clusters" when "auto_discovery" is "true"
## when this filter present then "WHERE cluster IN (...)" filter will apply
## please use only full cluster names here, regexp and glob filters are not allowed
## for "/etc/clickhouse-server/config.d/remote.xml"
## <yandex>
## <remote_servers>
## <my-own-cluster>
## <shard>
## <replica><host>clickhouse-ru-1.local</host><port>9000</port></replica>
## <replica><host>clickhouse-ru-2.local</host><port>9000</port></replica>
## </shard>
## <shard>
## <replica><host>clickhouse-eu-1.local</host><port>9000</port></replica>
## <replica><host>clickhouse-eu-2.local</host><port>9000</port></replica>
## </shard>
## </my-own-cluster>
## </remote_servers>
##
## </yandex>
##
## example: cluster_include = ["my-own-cluster"]
# cluster_include = []
## Filter cluster names in "system.clusters" when "auto_discovery" is "true"
## when this filter present then "WHERE cluster NOT IN (...)" filter will apply
## example: cluster_exclude = ["my-internal-not-discovered-cluster"]
# cluster_exclude = []
## Optional TLS Config
# tls_ca = "/etc/telegraf/ca.pem"
# tls_cert = "/etc/telegraf/cert.pem"
# tls_key = "/etc/telegraf/key.pem"
## Use TLS but skip chain & host verification
# insecure_skip_verify = false
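Note: the auto_discovery parameter makes the collector automatically discover the other nodes of a distributed ClickHouse deployment and add their addresses to the collection list. For example, if servers is set to ["http://192.168.1.10:8123"] and auto_discovery is true, the collector discovers the other ClickHouse instances in the same cluster as 192.168.1.10:8123 (say, 192.168.1.20 and 192.168.1.30), assigns them port 8123, adds both addresses to the collection list, and from then on collects metrics from them as well. For details, see the official documentation linked in the configuration file.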
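The address expansion described above can be sketched in a few lines of Python. This is only an illustration of the described behavior, not DataKit's actual implementation. The function name `expand_servers` is hypothetical, and the sketch assumes that discovered hosts reuse the scheme and port of the configured server.

```python
from urllib.parse import urlsplit


def expand_servers(seed_url, discovered_hosts):
    """Return the seed URL plus one URL per newly discovered host.

    Hypothetical helper: each discovered host is assumed to reuse the
    scheme and port of the configured (seed) server.
    """
    parts = urlsplit(seed_url)
    servers = [seed_url]
    for host in discovered_hosts:
        if host == parts.hostname:
            continue  # the seed server is already in the list
        netloc = f"{host}:{parts.port}" if parts.port else host
        url = f"{parts.scheme}://{netloc}"
        if url not in servers:
            servers.append(url)
    return servers


if __name__ == "__main__":
    for server in expand_servers(
        "http://192.168.1.10:8123",
        ["192.168.1.10", "192.168.1.20", "192.168.1.30"],
    ):
        print(server)
```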
auto_discovery is enabled by default (true). For a non-distributed ClickHouse, set this parameter to false manually.
After configuring, restart DataKit for the changes to take effect.
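The comments in the sample configuration say that cluster_include and cluster_exclude become "WHERE cluster IN (...)" and "WHERE cluster NOT IN (...)" filters on system.clusters. The sketch below shows what such a query could look like. The function name and the selected columns are assumptions made for illustration, and the query the collector really sends may differ.

```python
def build_clusters_query(cluster_include=(), cluster_exclude=()):
    """Build a system.clusters query with optional cluster name filters.

    Hypothetical helper: only full cluster names are supported, matching
    the note in the sample configuration.
    """

    def quoted(names):
        # Escape single quotes so a name cannot break out of the literal.
        return ", ".join("'" + name.replace("'", "\\'") + "'" for name in names)

    conditions = []
    if cluster_include:
        conditions.append(f"cluster IN ({quoted(cluster_include)})")
    if cluster_exclude:
        conditions.append(f"cluster NOT IN ({quoted(cluster_exclude)})")

    sql = "SELECT cluster, shard_num, host_name FROM system.clusters"
    if conditions:
        sql += " WHERE " + " AND ".join(conditions)
    return sql


if __name__ == "__main__":
    print(build_clusters_query(cluster_include=["my-own-cluster"]))
```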
Collected Metrics
- Metric set clickhouse_events
Metric | Kind | Data type |
---|---|---|
source | tags | string |
cluster | tags | string |
shard_num | tags | string |
arena_alloc_bytes | fields | int |
arena_alloc_chunks | fields | int |
compressed_read_buffer_blocks | fields | int |
compressed_read_buffer_bytes | fields | int |
context_lock | fields | int |
created_read_buffer_ordinary | fields | int |
created_write_buffer_ordinary | fields | int |
disk_read_elapsed_microseconds | fields | int |
disk_write_elapsed_microseconds | fields | int |
file_open | fields | int |
function_execute | fields | int |
inserted_bytes | fields | int |
inserted_rows | fields | int |
io_buffer_alloc_bytes | fields | int |
io_buffer_allocs | fields | int |
merge | fields | int |
merge_tree_data_writer_blocks | fields | int |
merge_tree_data_writer_blocks_already_sorted | fields | int |
merge_tree_data_writer_compressed_bytes | fields | int |
merge_tree_data_writer_rows | fields | int |
merge_tree_data_writer_uncompressed_bytes | fields | int |
merged_rows | fields | int |
merged_uncompressed_bytes | fields | int |
merges_time_milliseconds | fields | int |
os_read_chars | fields | int |
os_write_bytes | fields | int |
os_write_chars | fields | int |
oscpu_virtual_time_microseconds | fields | int |
query | fields | int |
read_buffer_from_file_descriptor_read | fields | int |
read_buffer_from_file_descriptor_read_bytes | fields | int |
read_compressed_bytes | fields | int |
real_time_microseconds | fields | int |
rw_lock_acquired_read_locks | fields | int |
select_query | fields | int |
soft_page_faults | fields | int |
system_time_microseconds | fields | int |
user_time_microseconds | fields | int |
write_buffer_from_file_descriptor_write | fields | int |
write_buffer_from_file_descriptor_write_bytes | fields | int |
See system.events for details.
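ClickHouse itself reports event names in CamelCase (for example `SelectQuery`), while the fields listed above are in snake_case. The following sketch shows one way to map between the two. The regular expressions are an assumption chosen to reproduce the names in the table, not necessarily the conversion the collector uses.

```python
import re

# Split an acronym from the capitalized word that follows it (e.g. "OSCPU" + "Virtual").
_ACRONYM = re.compile(r"([A-Z]+)([A-Z][a-z])")
# Split a lowercase letter or digit from the uppercase letter that follows it.
_BOUNDARY = re.compile(r"([a-z\d])([A-Z])")


def to_snake_case(name):
    """Convert a CamelCase ClickHouse name to a snake_case field name."""
    name = _ACRONYM.sub(r"\1_\2", name)
    name = _BOUNDARY.sub(r"\1_\2", name)
    return name.lower()


if __name__ == "__main__":
    for event in ("SelectQuery", "OSCPUVirtualTimeMicroseconds", "IOBufferAllocBytes"):
        print(event, "->", to_snake_case(event))
```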
- Metric set clickhouse_metrics
Metric | Kind | Data type |
---|---|---|
source | tags | string |
cluster | tags | string |
shard_num | tags | string |
background_buffer_flush_schedule_pool_task | fields | int |
background_distributed_schedule_pool_task | fields | int |
background_move_pool_task | fields | int |
background_pool_task | fields | int |
background_schedule_pool_task | fields | int |
cache_dictionary_update_queue_batches | fields | int |
cache_dictionary_update_queue_keys | fields | int |
context_lock_wait | fields | int |
delayed_inserts | fields | int |
dict_cache_requests | fields | int |
disk_space_reserved_for_merge | fields | int |
distributed_files_to_insert | fields | int |
distributed_send | fields | int |
ephemeral_node | fields | int |
global_thread | fields | int |
global_thread_active | fields | int |
http_connection | fields | int |
interserver_connection | fields | int |
local_thread | fields | int |
local_thread_active | fields | int |
memory_tracking | fields | int |
memory_tracking_for_merges | fields | int |
memory_tracking_in_background_buffer_flush_schedule_pool | fields | int |
memory_tracking_in_background_distributed_schedule_pool | fields | int |
memory_tracking_in_background_move_processing_pool | fields | int |
memory_tracking_in_background_processing_pool | fields | int |
memory_tracking_in_background_schedule_pool | fields | int |
merge | fields | int |
my_sql_connection | fields | int |
open_file_for_read | fields | int |
open_file_for_write | fields | int |
part_mutation | fields | int |
postgre_sql_connection | fields | int |
query | fields | int |
query_preempted | fields | int |
query_thread | fields | int |
read | fields | int |
readonly_replica | fields | int |
replicated_checks | fields | int |
replicated_fetch | fields | int |
replicated_send | fields | int |
revision | fields | int |
rw_lock_active_readers | fields | int |
rw_lock_active_writers | fields | int |
rw_lock_waiting_readers | fields | int |
rw_lock_waiting_writers | fields | int |
send_external_tables | fields | int |
send_scalars | fields | int |
storage_buffer_bytes | fields | int |
storage_buffer_rows | fields | int |
tcp_connection | fields | int |
version_integer | fields | int |
write | fields | int |
zoo_keeper_request | fields | int |
zoo_keeper_session | fields | int |
zoo_keeper_watch | fields | int |
See system.metrics for details.
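Because the collector scrapes metrics through the ClickHouse HTTP interface, the same data can be inspected by hand. The sketch below only builds the request URL and headers for reading system.metrics and sends nothing. The function name is hypothetical, and passing credentials through the `X-ClickHouse-User` and `X-ClickHouse-Key` headers is one of several options the HTTP interface offers.

```python
from urllib.parse import urlencode


def build_metrics_request(server, username="default", password=""):
    """Return (url, headers) for reading system.metrics over HTTP.

    Hypothetical helper for manual inspection; it does not send the request.
    """
    sql = "SELECT metric, value FROM system.metrics FORMAT JSON"
    url = f"{server.rstrip('/')}/?{urlencode({'query': sql})}"
    headers = {"X-ClickHouse-User": username, "X-ClickHouse-Key": password}
    return url, headers


if __name__ == "__main__":
    url, headers = build_metrics_request("http://127.0.0.1:8123")
    print(url)
    print(headers)
```

The returned pair can be handed to any HTTP client, for example `urllib.request.Request(url, headers=headers)`.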
- Metric set clickhouse_asynchronous_metrics
Metric | Kind | Data type |
---|---|---|
source | tags | string |
cluster | tags | string |
shard_num | tags | string |
compiled_expression_cache_count | fields | int |
cpu_frequency_m_hz_0 | fields | int |
cpu_frequency_m_hz_1 | fields | int |
cpu_frequency_m_hz_2 | fields | int |
cpu_frequency_m_hz_3 | fields | int |
cpu_frequency_m_hz_4 | fields | int |
cpu_frequency_m_hz_5 | fields | int |
cpu_frequency_m_hz_6 | fields | int |
jemalloc.active | fields | int |
jemalloc.allocated | fields | int |
jemalloc.arenas.all.dirty_purged | fields | int |
jemalloc.arenas.all.muzzy_purged | fields | int |
jemalloc.arenas.all.pactive | fields | int |
jemalloc.arenas.all.pdirty | fields | int |
jemalloc.arenas.all.pmuzzy | fields | int |
jemalloc.background_thread.num_runs | fields | int |
jemalloc.background_thread.num_threads | fields | int |
jemalloc.background_thread.run_intervals | fields | int |
jemalloc.epoch | fields | int |
jemalloc.mapped | fields | int |
jemalloc.metadata | fields | int |
jemalloc.metadata_thp | fields | int |
jemalloc.resident | fields | int |
jemalloc.retained | fields | int |
mark_cache_bytes | fields | int |
mark_cache_files | fields | int |
max_part_count_for_partition | fields | int |
memory_code | fields | int |
memory_data_and_stack | fields | int |
memory_resident | fields | int |
memory_shared | fields | int |
memory_virtual | fields | int |
number_of_databases | fields | int |
number_of_tables | fields | int |
replicas_max_absolute_delay | fields | int |
replicas_max_inserts_in_queue | fields | int |
replicas_max_merges_in_queue | fields | int |
replicas_max_queue_size | fields | int |
replicas_max_relative_delay | fields | int |
replicas_sum_inserts_in_queue | fields | int |
replicas_sum_merges_in_queue | fields | int |
replicas_sum_queue_size | fields | int |
uncompressed_cache_bytes | fields | int |
uncompressed_cache_cells | fields | int |
uptime | fields | int |
See system.asynchronous_metrics for details.
- Metric set clickhouse_tables
Metric | Kind | Data type |
---|---|---|
source | tags | string |
cluster | tags | string |
shard_num | tags | string |
database | tags | string |
table | tags | string |
bytes | fields | int |
parts | fields | int |
rows | fields | int |
- Metric set clickhouse_zookeeper
Metric | Kind | Data type |
---|---|---|
source | tags | string |
cluster | tags | string |
shard_num | tags | string |
root_nodes | fields | int |
root_nodes: the number of nodes in system.zookeeper where path = /.
- Metric set clickhouse_replication_queue
Metric | Kind | Data type |
---|---|---|
source | tags | string |
cluster | tags | string |
shard_num | tags | string |
too_many_tries_replicas | fields | int |
too_many_tries_replicas: the number of replicas in system.replication_queue whose num_tries is greater than 1.
- Metric set clickhouse_detached_parts
Metric | Kind | Data type |
---|---|---|
source | tags | string |
cluster | tags | string |
shard_num | tags | string |
detached_parts | fields | int |
detached_parts: the total number of detached parts across all tables and databases, taken from system.detached_parts.
- Metric set clickhouse_dictionaries
Metric | Kind | Data type |
---|---|---|
source | tags | string |
cluster | tags | string |
shard_num | tags | string |
dict_origin | tags | string |
is_loaded | fields | int |
bytes_allocated | fields | int |
dict_origin: the XML file name when the dictionary was created from *_dictionary.xml, or database.table when the dictionary was created from DDL.
is_loaded: 0 when the dictionary data failed to load, 1 when it loaded successfully; see system.dictionaries for details.
bytes_allocated: the number of bytes allocated in RAM after the dictionary is loaded.
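As a rough illustration of how these three values could be derived from one row of system.dictionaries, consider the sketch below. The row is modeled as a plain dict with the `origin`, `status`, and `bytes_allocated` columns. Treating only the exact status `LOADED` as loaded is an assumption, and the function name is hypothetical.

```python
def dictionary_metrics(row):
    """Map one system.dictionaries row (as a dict) to the documented values.

    Assumption: only the exact status "LOADED" counts as loaded.
    """
    return {
        "dict_origin": row["origin"],
        "is_loaded": 1 if row["status"] == "LOADED" else 0,
        "bytes_allocated": int(row["bytes_allocated"]),
    }


if __name__ == "__main__":
    # Made-up sample row, for illustration only.
    sample = {"origin": "db.my_dict", "status": "LOADED", "bytes_allocated": "4096"}
    print(dictionary_metrics(sample))
```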
- Metric set clickhouse_mutations
Metric | Kind | Data type |
---|---|---|
source | tags | string |
cluster | tags | string |
shard_num | tags | string |
running | fields | int |
failed | fields | int |
completed | fields | int |
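running: the number of mutations that have not finished yet.
failed: the number of mutations that failed.
completed: the number of mutations that completed successfully.
See system.mutations for details.
- Metric set clickhouse_disks
Metric | Kind | Data type |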
---|---|---|
source | tags | string |
cluster | tags | string |
shard_num | tags | string |
name | tags | string |
path | tags | string |
free_space_percent | fields | int |
keep_free_space_percent | fields | int |
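free_space_percent: the free disk space in bytes as a percentage of the total disk space in bytes.
keep_free_space_percent: the disk space in bytes that must be kept free, as a percentage of the total disk space in bytes.
See system.disks for details.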
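The two percentages can be worked out from the `free_space`, `total_space`, and `keep_free_space` columns of system.disks. The sketch below is a minimal version of that arithmetic. Truncating to an integer is an assumption made to match the int type in the table, and the function name is hypothetical.

```python
def disk_space_percents(free_space, total_space, keep_free_space):
    """Return (free_space_percent, keep_free_space_percent) as integers.

    All arguments are byte counts. The results are truncated, which is an
    assumption about how the int-typed fields are produced.
    """
    if total_space <= 0:
        raise ValueError("total_space must be positive")
    free_percent = 100 * free_space // total_space
    keep_percent = 100 * keep_free_space // total_space
    return free_percent, keep_percent


if __name__ == "__main__":
    # A 1000-byte disk with 250 bytes free and 50 bytes reserved.
    print(disk_space_percents(250, 1000, 50))
```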
- Metric set clickhouse_processes
Metric | Kind | Data type |
---|---|---|
source | tags | string |
cluster | tags | string |
shard_num | tags | string |
percentile_50 | fields | int |
percentile_90 | fields | int |
longest_running | fields | int |
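percentile_50: the 50th percentile (quantile 0.5) of the elapsed field of running processes.
percentile_90: the 90th percentile (quantile 0.9) of the elapsed field of running processes.
longest_running: the maximum value of the elapsed field of running processes.
See system.processes for details.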
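To make the three values concrete, the sketch below computes them from a list of elapsed times. It uses exact linear interpolation between sorted values, which is an assumption. ClickHouse's own quantile function is approximate, so the reported numbers may differ slightly. The function names are hypothetical.

```python
def quantile(values, q):
    """Exact quantile with linear interpolation between sorted values."""
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    position = q * (len(ordered) - 1)
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = position - lower
    return ordered[lower] + fraction * (ordered[upper] - ordered[lower])


def summarize_elapsed(elapsed):
    """Summarize the elapsed times of running processes."""
    return {
        "percentile_50": quantile(elapsed, 0.5),
        "percentile_90": quantile(elapsed, 0.9),
        "longest_running": max(elapsed),
    }


if __name__ == "__main__":
    print(summarize_elapsed([5, 1, 3, 2, 4]))
```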
- Metric set clickhouse_text_log
Metric | Kind | Data type |
---|---|---|
source | tags | string |
cluster | tags | string |
shard_num | tags | string |
level | tags | string |
messages_last_10_min | fields | int |
level: the message level; only messages with a level less than or equal to Notice are collected. See system.text_log for details.
Sample output
clickhouse_asynchronous_metrics,cluster=test_cluster_two_shards_localhost,host=ubuntu-server,name=admin,shard_num=1,source=localhost cpu_frequency_m_hz_0=2194i,cpu_frequency_m_hz_6=2194i,replicas_max_absolute_delay=0i,cpu_frequency_m_hz_2=2194i,cpu_frequency_m_hz_1=2194i,number_of_databases=3i,replicas_max_merges_in_queue=0i,memory_virtual=1326022656i,jemalloc.arenas.all.dirty_purged=87415i,replicas_max_inserts_in_queue=0i,memory_data_and_stack=673378304i,number_of_tables=57i,cpu_frequency_m_hz_5=2194i,replicas_max_queue_size=0i,mark_cache_bytes=0i,memory_resident=170627072i,replicas_sum_merges_in_queue=0i,replicas_sum_inserts_in_queue=0i,jemalloc.arenas.all.pmuzzy=1i,jemalloc.mapped=125575168i,jemalloc.epoch=12i,jemalloc.background_thread.num_threads=0i,jemalloc.arenas.all.muzzy_purged=81497i,mark_cache_files=0i,jemalloc.retained=56877056i,jemalloc.allocated=46833320i,replicas_sum_queue_size=0i,memory_code=320724992i,uncompressed_cache_bytes=0i,jemalloc.background_thread.num_runs=0i,jemalloc.active=51195904i,max_part_count_for_partition=5i,memory_shared=89858048i,uncompressed_cache_cells=0i,cpu_frequency_m_hz_4=2194i,jemalloc.resident=104243200i,replicas_max_relative_delay=0i,jemalloc.arenas.all.pdirty=11502i,jemalloc.background_thread.run_intervals=0i,jemalloc.metadata=7318064i,uptime=615i,jemalloc.metadata_thp=0i,compiled_expression_cache_count=0i,cpu_frequency_m_hz_3=2194i,jemalloc.arenas.all.pactive=12499i 1597644720000000000
clickhouse_metrics,cluster=test_cluster_two_shards_localhost,host=ubuntu-server,name=admin,shard_num=1,source=localhost memory_tracking_in_background_distributed_schedule_pool=0i,local_thread=0i,cache_dictionary_update_queue_keys=0i,disk_space_reserved_for_merge=0i,replicated_checks=0i,background_pool_task=0i,tcp_connection=0i,send_external_tables=0i,global_thread_active=46i,replicated_send=0i,background_buffer_flush_schedule_pool_task=0i,revision=54436i,open_file_for_write=0i,ephemeral_node=0i,delayed_inserts=0i,postgre_sql_connection=0i,context_lock_wait=0i,interserver_connection=0i,open_file_for_read=7i,memory_tracking_in_background_schedule_pool=0i,storage_buffer_bytes=0i,query_preempted=0i,memory_tracking_in_background_move_processing_pool=0i,rw_lock_waiting_writers=0i,rw_lock_active_writers=0i,merge=0i,cache_dictionary_update_queue_batches=0i,readonly_replica=0i,distributed_send=0i,send_scalars=0i,storage_buffer_rows=0i,dict_cache_requests=0i,memory_tracking_for_merges=1108200i,zoo_keeper_request=0i,local_thread_active=0i,query=1i,http_connection=1i,read=2i,query_thread=0i,memory_tracking_in_background_processing_pool=1108200i,version_integer=20006003i,global_thread=48i,distributed_files_to_insert=0i,part_mutation=0i,my_sql_connection=0i,memory_tracking=170627072i,zoo_keeper_session=0i,replicated_fetch=0i,background_move_pool_task=0i,background_schedule_pool_task=0i,memory_tracking_in_background_buffer_flush_schedule_pool=0i,background_distributed_schedule_pool_task=0i,write=0i,zoo_keeper_watch=0i,rw_lock_waiting_readers=0i,rw_lock_active_readers=1i 1597644720000000000
clickhouse_events,cluster=test_cluster_two_shards_localhost,host=ubuntu-server,name=admin,shard_num=1,source=localhost select_query=11i,read_buffer_from_file_descriptor_read=13084i,read_compressed_bytes=977530i,created_read_buffer_ordinary=216i,merged_rows=5512i,merge_tree_data_writer_rows=1097i,os_write_bytes=4096i,query=11i,disk_write_elapsed_microseconds=10946i,rw_lock_acquired_read_locks=1410i,read_buffer_from_file_descriptor_read_bytes=1810450i,created_write_buffer_ordinary=963i,merge_tree_data_writer_uncompressed_bytes=1246838i,system_time_microseconds=673i,arena_alloc_chunks=3i,disk_read_elapsed_microseconds=570682766i,merge_tree_data_writer_compressed_bytes=1422177i,context_lock=1448i,os_read_chars=4083i,function_execute=147i,merged_uncompressed_bytes=10053922i,oscpu_virtual_time_microseconds=4101i,os_write_chars=10966i,io_buffer_alloc_bytes=556479885i,arena_alloc_bytes=12288i,merge_tree_data_writer_blocks_already_sorted=102i,io_buffer_allocs=2431i,compressed_read_buffer_blocks=23214i,compressed_read_buffer_bytes=10051706i,merge_tree_data_writer_blocks=102i,soft_page_faults=123i,write_buffer_from_file_descriptor_write=1276i,write_buffer_from_file_descriptor_write_bytes=2944754i,inserted_rows=1097i,inserted_bytes=1246838i,merge=216i,merges_time_milliseconds=156i,real_time_microseconds=11650i,user_time_microseconds=3444i,file_open=1206i 1597644720000000000
clickhouse_tables,cluster=test_cluster_two_shards_localhost,database=system,host=ubuntu-server,name=admin,shard_num=1,source=localhost,table=trace_log bytes=1735i,parts=2i,rows=8i 1597644720000000000
clickhouse_tables,cluster=test_cluster_two_shards_localhost,database=system,host=ubuntu-server,name=admin,shard_num=1,source=localhost,table=metric_log bytes=99366i,parts=5i,rows=639i 1597644720000000000
clickhouse_tables,cluster=test_cluster_two_shards_localhost,database=system,host=ubuntu-server,name=admin,shard_num=1,source=localhost,table=asynchronous_metric_log rows=450i,bytes=5720i,parts=5i 1597644720000000000
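The sample output is in line protocol: a metric set name with comma-separated tags, then the fields, then a nanosecond timestamp. A simplified parser such as the one below can help when checking collected data by hand. It is a sketch that assumes no escaped spaces or commas and no quoted string fields, which holds for the sample lines above but not for line protocol in general.

```python
def parse_line(line):
    """Parse one simple line-protocol line into its four parts.

    Simplified sketch: escaping and quoted string fields are not handled.
    """
    head, field_part, timestamp = line.rsplit(" ", 2)
    measurement, *tag_pairs = head.split(",")
    tags = dict(pair.split("=", 1) for pair in tag_pairs)
    fields = {}
    for pair in field_part.split(","):
        key, value = pair.split("=", 1)
        # Integer fields carry an "i" suffix; anything else is read as a float.
        fields[key] = int(value[:-1]) if value.endswith("i") else float(value)
    return measurement, tags, fields, int(timestamp)


if __name__ == "__main__":
    sample = (
        "clickhouse_tables,cluster=test_cluster_two_shards_localhost,database=system,"
        "host=ubuntu-server,name=admin,shard_num=1,source=localhost,table=trace_log "
        "bytes=1735i,parts=2i,rows=8i 1597644720000000000"
    )
    print(parse_line(sample))
```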