# Kubo metrics
By default, Kubo exposes a Prometheus endpoint at `http://127.0.0.1:5001/debug/metrics/prometheus`.
It includes the default Prometheus Go client metrics plus the Kubo-specific metrics listed below.
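For a quick look at what the endpoint returns, you can fetch it and filter for a metric of interest. Below is a minimal sketch in Go, assuming a local daemon listening on the default API address; the metric prefix is just an example:

```go
package main

import (
	"bufio"
	"fmt"
	"log"
	"net/http"
	"strings"
)

func main() {
	// Assumes a local Kubo daemon listening on the default API address.
	resp, err := http.Get("http://127.0.0.1:5001/debug/metrics/prometheus")
	if err != nil {
		log.Fatal(err)
	}
	defer resp.Body.Close()

	// Print only the lines belonging to one metric family (example prefix).
	scanner := bufio.NewScanner(resp.Body)
	for scanner.Scan() {
		if line := scanner.Text(); strings.HasPrefix(line, "ipfs_http_gw_responses_total") {
			fmt.Println(line)
		}
	}
	if err := scanner.Err(); err != nil {
		log.Fatal(err)
	}
}
```

The same endpoint can be scraped directly by a Prometheus server; the sketch above is only for ad-hoc inspection.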
> [!WARNING]
> This documentation is incomplete. For an up-to-date list of metrics available at daemon startup, see `test/sharness/t0119-prometheus-data/prometheus_metrics_added_by_measure_profile`.
> Additional metrics may appear during runtime as some components (like boxo/gateway) register metrics only after their first event occurs (e.g., HTTP request/response).
## DHT RPC
Metrics from go-libp2p-kad-dht for DHT RPC operations:
### Inbound RPC metrics
- `rpc_inbound_messages_total` - Counter: total messages received per RPC
- `rpc_inbound_message_errors_total` - Counter: total errors for received messages
- `rpc_inbound_bytes_[bucket|sum|count]` - Histogram: distribution of received bytes per RPC
- `rpc_inbound_request_latency_[bucket|sum|count]` - Histogram: latency distribution for inbound RPCs
### Outbound RPC metrics
- `rpc_outbound_messages_total` - Counter: total messages sent per RPC
- `rpc_outbound_message_errors_total` - Counter: total errors for sent messages
- `rpc_outbound_requests_total` - Counter: total requests sent
- `rpc_outbound_request_errors_total` - Counter: total errors for sent requests
- `rpc_outbound_bytes_[bucket|sum|count]` - Histogram: distribution of sent bytes per RPC
- `rpc_outbound_request_latency_[bucket|sum|count]` - Histogram: latency distribution for outbound RPCs
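To reduce a latency histogram to a single number, divide its `_sum` by its `_count`. A sketch of that calculation, assuming the metric names above and using the `github.com/prometheus/common/expfmt` text parser (which groups `_bucket`/`_sum`/`_count` series under the histogram's base name):

```go
package main

import (
	"fmt"
	"log"
	"net/http"

	"github.com/prometheus/common/expfmt"
)

func main() {
	resp, err := http.Get("http://127.0.0.1:5001/debug/metrics/prometheus")
	if err != nil {
		log.Fatal(err)
	}
	defer resp.Body.Close()

	var parser expfmt.TextParser
	families, err := parser.TextToMetricFamilies(resp.Body)
	if err != nil {
		log.Fatal(err)
	}

	// Histogram families are keyed by base name, without _bucket/_sum/_count.
	mf, ok := families["rpc_inbound_request_latency"]
	if !ok || len(mf.GetMetric()) == 0 {
		log.Fatal("metric not found (has the DHT processed any RPCs yet?)")
	}
	h := mf.GetMetric()[0].GetHistogram()
	if h.GetSampleCount() == 0 {
		log.Fatal("no samples yet")
	}
	fmt.Printf("mean inbound RPC latency: %f\n",
		h.GetSampleSum()/float64(h.GetSampleCount()))
}
```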
## Provide
### Legacy Provider
Metrics for the legacy provider system when `Provide.DHT.SweepEnabled=false`:
- `provider_reprovider_provide_count` - Counter: total successful provide operations since node startup
- `provider_reprovider_reprovide_count` - Counter: total reprovide sweep operations since node startup
### DHT Provider
Metrics for the DHT provider system when `Provide.DHT.SweepEnabled=true`:
- `total_provide_count_total` - Counter: total successful provide operations since node startup (includes both one-time provides and periodic provides done on `Provide.DHT.Interval`)
> [!NOTE]
> These metrics are exposed by go-libp2p-kad-dht. You can enable debug logging for DHT provider activity with `GOLOG_LOG_LEVEL=dht/provider=debug`.
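One way to gauge provide throughput is to sample the counter twice and look at the delta. A rough sketch, assuming `Provide.DHT.SweepEnabled=true`, a local daemon on the default API address, and the metric name above (the `readProvideCount` helper is hypothetical and simply takes the first matching series):

```go
package main

import (
	"bufio"
	"fmt"
	"log"
	"net/http"
	"strconv"
	"strings"
	"time"
)

// readProvideCount scrapes the metrics endpoint and returns the current
// value of the first total_provide_count_total series it finds.
func readProvideCount() (float64, error) {
	resp, err := http.Get("http://127.0.0.1:5001/debug/metrics/prometheus")
	if err != nil {
		return 0, err
	}
	defer resp.Body.Close()

	scanner := bufio.NewScanner(resp.Body)
	for scanner.Scan() {
		line := scanner.Text()
		// Sample lines look like: name{labels} value
		if strings.HasPrefix(line, "total_provide_count_total") {
			fields := strings.Fields(line)
			return strconv.ParseFloat(fields[len(fields)-1], 64)
		}
	}
	if err := scanner.Err(); err != nil {
		return 0, err
	}
	return 0, fmt.Errorf("total_provide_count_total not found")
}

func main() {
	before, err := readProvideCount()
	if err != nil {
		log.Fatal(err)
	}
	time.Sleep(time.Minute)
	after, err := readProvideCount()
	if err != nil {
		log.Fatal(err)
	}
	fmt.Printf("provides in the last minute: %.0f\n", after-before)
}
```

In practice a Prometheus server with a `rate()` query does this more robustly; the sketch is only meant to illustrate what the counter measures.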
## Gateway (boxo/gateway)
> [!TIP]
> These metrics are limited to IPFS Gateway endpoints. For general HTTP metrics across all endpoints, consider using a reverse proxy.
Gateway metrics appear after the first HTTP request is processed:
### HTTP metrics
- `ipfs_http_gw_responses_total{code}` - Counter: total HTTP responses by status code
- `ipfs_http_gw_retrieval_timeouts_total{code,truncated}` - Counter: requests that timed out during content retrieval
- `ipfs_http_gw_concurrent_requests` - Gauge: number of requests currently being processed
### Blockstore cache metrics
- `ipfs_http_blockstore_cache_hit` - Counter: global block cache hits
- `ipfs_http_blockstore_cache_requests` - Counter: global block cache requests
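These two counters combine into a cache hit ratio. A sketch of that calculation with the `expfmt` parser, assuming the names above and remembering that gateway metrics only appear after the first request:

```go
package main

import (
	"fmt"
	"log"
	"net/http"

	"github.com/prometheus/common/expfmt"
)

func main() {
	resp, err := http.Get("http://127.0.0.1:5001/debug/metrics/prometheus")
	if err != nil {
		log.Fatal(err)
	}
	defer resp.Body.Close()

	var parser expfmt.TextParser
	families, err := parser.TextToMetricFamilies(resp.Body)
	if err != nil {
		log.Fatal(err)
	}

	// counter reads the first series of a counter family, or 0 if absent.
	counter := func(name string) float64 {
		mf, ok := families[name]
		if !ok || len(mf.GetMetric()) == 0 {
			return 0
		}
		return mf.GetMetric()[0].GetCounter().GetValue()
	}

	hits := counter("ipfs_http_blockstore_cache_hit")
	requests := counter("ipfs_http_blockstore_cache_requests")
	if requests == 0 {
		log.Fatal("no cache requests recorded yet")
	}
	fmt.Printf("block cache hit ratio: %.1f%%\n", 100*hits/requests)
}
```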
### Backend metrics
- `ipfs_gw_backend_api_call_duration_seconds_[bucket|sum|count]{backend_method}` - Histogram: time spent in IPFSBackend API calls
## Generic HTTP Servers
> [!TIP]
> The metrics below are not very useful and exist mostly for historical reasons. If you need non-gateway HTTP metrics, it's better to put a reverse proxy in front of Kubo and use its metrics.
### Core HTTP metrics (`ipfs_http_*`)
Prometheus metrics for the HTTP API exposed at port 5001:
- `ipfs_http_requests_total{method,code,handler}` - Counter: total HTTP requests (legacy; gateway traffic is now covered by the boxo/gateway metrics above)
- `ipfs_http_request_duration_seconds[_sum|_count]{handler}` - Summary: request processing duration
- `ipfs_http_request_size_bytes[_sum|_count]{handler}` - Summary: request body sizes
- `ipfs_http_response_size_bytes[_sum|_count]{handler}` - Summary: response body sizes
### HTTP Server metrics (`http_server_*`)
Additional HTTP instrumentation for all handlers (Gateway, API commands, etc.):
- `http_server_request_body_size_bytes_[bucket|count|sum]` - Histogram: distribution of request body sizes
- `http_server_request_duration_seconds_[bucket|count|sum]` - Histogram: distribution of request processing times
- `http_server_response_body_size_bytes_[bucket|count|sum]` - Histogram: distribution of response body sizes
These metrics are automatically added to Gateway handlers, Hostname Gateway, Libp2p Gateway, and API command handlers.
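Since these are plain histograms, each one's mean is again `_sum` divided by `_count`. A sketch that reports the mean of every `http_server_*` histogram, under the same assumptions as the earlier examples (local daemon, default API address, `expfmt` parser):

```go
package main

import (
	"fmt"
	"log"
	"net/http"
	"strings"

	"github.com/prometheus/common/expfmt"
)

func main() {
	resp, err := http.Get("http://127.0.0.1:5001/debug/metrics/prometheus")
	if err != nil {
		log.Fatal(err)
	}
	defer resp.Body.Close()

	var parser expfmt.TextParser
	families, err := parser.TextToMetricFamilies(resp.Body)
	if err != nil {
		log.Fatal(err)
	}

	// Report mean = sum/count for every http_server_* histogram series.
	for name, mf := range families {
		if !strings.HasPrefix(name, "http_server_") {
			continue
		}
		for _, m := range mf.GetMetric() {
			h := m.GetHistogram()
			if h == nil || h.GetSampleCount() == 0 {
				continue
			}
			fmt.Printf("%s: mean %.4f over %d samples\n",
				name, h.GetSampleSum()/float64(h.GetSampleCount()), h.GetSampleCount())
		}
	}
}
```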
## OpenTelemetry Metadata
Kubo uses Prometheus for metrics collection for historical reasons, but OpenTelemetry metrics are automatically exposed through the same Prometheus endpoint. These metadata metrics provide context about the instrumentation:
- `otel_scope_info` - Information about instrumentation libraries producing metrics
- `target_info` - Service metadata including version and instance information