kubo/metrics.md at 34debcbcb4e8e2737a9b073ec864f2df0b92eed9

mirror of https://github.com/ipfs/kubo.git synced 2026-02-21 10:27:46 +08:00

Marcin Rataj 71e883440e

refactor(config): migration 17-to-18 to unify Provider/Reprovider into Provide.DHT (#10951)

* refactor: consolidate Provider/Reprovider into unified Provide config

- merge Provider and Reprovider configs into single Provide section
- add fs-repo-17-to-18 migration for config consolidation
- improve migration ergonomics with common package utilities
- convert deprecated "flat" strategy to "all" during migration
- improve Provide docs

* docs: add total_provide_count metric guidance

- document how to monitor provide success rates via prometheus metrics
- add performance comparison section to changelog
- explain how to evaluate sweep vs legacy provider effectiveness

* fix: add OpenTelemetry meter provider for metrics

- set up meter provider with Prometheus exporter in daemon
- enables metrics from external libs like go-libp2p-kad-dht
- fixes missing total_provide_count_total when SweepEnabled=true
- update docs to reflect actual metric names

---------

Co-authored-by: gammazero <11790789+gammazero@users.noreply.github.com>
Co-authored-by: guillaumemichel <guillaume@michel.id>
Co-authored-by: Daniel Norman <1992255+2color@users.noreply.github.com>
Co-authored-by: Hector Sanjuan <code@hector.link>

2025-09-18 22:17:43 +02:00

Kubo metrics

By default, a Prometheus endpoint is exposed by Kubo at http://127.0.0.1:5001/debug/metrics/prometheus.

It includes the default Prometheus Go client metrics and the Kubo-specific metrics listed below.
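As a rough illustration of consuming this endpoint, the sketch below downloads the text exposition and splits it into samples. It is a simplified parser written for this example, not an official client: the helper names `fetch_metrics_text` and `parse_samples` are hypothetical, and label sets are kept as raw text rather than parsed.

```python
import urllib.request

# Default endpoint from the text above; adjust if the RPC API listens elsewhere.
METRICS_URL = "http://127.0.0.1:5001/debug/metrics/prometheus"


def fetch_metrics_text(url=METRICS_URL, timeout=5.0):
    """Download the raw Prometheus text exposition from a running daemon."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")


def parse_samples(text):
    """Return a list of (name, labels_text, value) tuples.

    Simplified on purpose: comment lines (# HELP / # TYPE) are skipped,
    label sets are returned as unparsed text, and optional timestamps
    after the value are ignored.
    """
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        if "{" in line:
            name, rest = line.split("{", 1)
            # Split at the last '}' so label values containing '}' survive.
            labels, _, rest = rest.rpartition("}")
        else:
            name, rest = line.split(None, 1)
            labels = ""
        value = float(rest.split()[0])
        samples.append((name, labels, value))
    return samples
```

Against a running daemon, `parse_samples(fetch_metrics_text())` would return every sample currently exposed. For production monitoring, a real Prometheus server or an established client library is likely a better fit than this hand-rolled parser.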

- DHT RPC
  - Inbound RPC metrics
  - Outbound RPC metrics
- Provide
  - Legacy Provider
  - DHT Provider
- Gateway (boxo/gateway)
  - HTTP metrics
  - Blockstore cache metrics
  - Backend metrics
- Generic HTTP Servers
  - Core HTTP metrics
  - HTTP Server metrics
- OpenTelemetry Metadata

Warning

This documentation is incomplete. For an up-to-date list of metrics available at daemon startup, see test/sharness/t0119-prometheus-data/prometheus_metrics_added_by_measure_profile.

Additional metrics may appear during runtime as some components (like boxo/gateway) register metrics only after their first event occurs (e.g., HTTP request/response).
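One way to observe this lazy registration is to compare the metric names in two scrapes, one taken right after startup and one taken after some traffic. The sketch below assumes both scrapes are already available as exposition text; the function names are hypothetical.

```python
def metric_names(text):
    """Collect the distinct metric names found in Prometheus exposition text."""
    names = set()
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # The name ends at the first '{' (labels) or at the first whitespace.
        names.add(line.split("{", 1)[0].split()[0])
    return names


def newly_registered(before_text, after_text):
    """Return names present in the later scrape but absent from the earlier one."""
    return sorted(metric_names(after_text) - metric_names(before_text))
```

Note that histogram series show up under their suffixed names (`_bucket`, `_sum`, `_count`), since this sketch compares raw sample names.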

DHT RPC

Metrics from go-libp2p-kad-dht for DHT RPC operations:

Inbound RPC metrics

rpc_inbound_messages_total - Counter: total messages received per RPC
rpc_inbound_message_errors_total - Counter: total errors for received messages
rpc_inbound_bytes_[bucket|sum|count] - Histogram: distribution of received bytes per RPC
rpc_inbound_request_latency_[bucket|sum|count] - Histogram: latency distribution for inbound RPCs

Outbound RPC metrics

rpc_outbound_messages_total - Counter: total messages sent per RPC
rpc_outbound_message_errors_total - Counter: total errors for sent messages
rpc_outbound_requests_total - Counter: total requests sent
rpc_outbound_request_errors_total - Counter: total errors for sent requests
rpc_outbound_bytes_[bucket|sum|count] - Histogram: distribution of sent bytes per RPC
rpc_outbound_request_latency_[bucket|sum|count] - Histogram: latency distribution for outbound RPCs
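The counters and histograms above can be combined into a few derived figures, such as an error ratio or a mean computed from a histogram's `_sum` and `_count` series. The sketch below assumes the caller has already summed each metric over all of its label sets; the function names and the input dictionary are hypothetical, and the latency is reported in whatever unit the metric itself uses.

```python
def safe_ratio(numerator, denominator):
    """Return numerator / denominator, or None when the denominator is zero."""
    return None if denominator == 0 else numerator / denominator


def summarize_outbound(values):
    """Derive summary figures from outbound DHT RPC metrics.

    `values` maps a metric name to its value summed over all label sets.
    """
    return {
        # Share of outbound requests that ended in an error.
        "request_error_ratio": safe_ratio(
            values["rpc_outbound_request_errors_total"],
            values["rpc_outbound_requests_total"],
        ),
        # Histogram mean: sum of observations divided by their count.
        "mean_request_latency": safe_ratio(
            values["rpc_outbound_request_latency_sum"],
            values["rpc_outbound_request_latency_count"],
        ),
        "mean_bytes_per_observation": safe_ratio(
            values["rpc_outbound_bytes_sum"],
            values["rpc_outbound_bytes_count"],
        ),
    }
```

Because all inputs are cumulative since startup, the results describe the whole lifetime of the node. Feeding in differences between two scrapes instead gives figures for that interval only.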

Provide

Legacy Provider

Metrics for the legacy provider system when Provide.DHT.SweepEnabled=false:

provider_reprovider_provide_count - Counter: total successful provide operations since node startup
provider_reprovider_reprovide_count - Counter: total reprovide sweep operations since node startup

DHT Provider

Metrics for the DHT provider system when Provide.DHT.SweepEnabled=true:

total_provide_count_total - Counter: total successful provide operations since node startup (includes both one-time provides and periodic provides done on Provide.DHT.Interval)

Note

These metrics are exposed by go-libp2p-kad-dht. You can enable debug logging for DHT provider activity with GOLOG_LOG_LEVEL=dht/provider=debug.
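Since the provide counters only grow between restarts, a provide rate can be estimated from two scrapes taken some time apart. The following is a minimal sketch with hypothetical function names; it treats any drop in the counter as a node restart, which matches the "since node startup" semantics described above.

```python
def counter_increase(prev, curr):
    """Increase of a cumulative counter between two scrapes.

    If the value dropped, the node is assumed to have restarted and the
    counter to have started again from zero.
    """
    if curr < prev:
        return curr
    return curr - prev


def provides_per_second(prev, curr, elapsed_seconds):
    """Average provide rate over the interval between two scrapes."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return counter_increase(prev, curr) / elapsed_seconds
```

For example, two readings of `total_provide_count_total` of 100 and 160 taken 60 seconds apart correspond to an average of one provide per second. The same helpers apply to `provider_reprovider_provide_count` when the legacy provider is in use.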

Gateway (`boxo/gateway`)

Tip

These metrics are limited to IPFS Gateway endpoints. For general HTTP metrics across all endpoints, consider using a reverse proxy.

Gateway metrics appear after the first HTTP request is processed:

HTTP metrics

ipfs_http_gw_responses_total{code} - Counter: total HTTP responses by status code
ipfs_http_gw_retrieval_timeouts_total{code,truncated} - Counter: requests that timed out during content retrieval
ipfs_http_gw_concurrent_requests - Gauge: number of requests currently being processed

Blockstore cache metrics

ipfs_http_blockstore_cache_hit - Counter: global block cache hits
ipfs_http_blockstore_cache_requests - Counter: global block cache requests
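The two counters above can be turned into a cache hit ratio. Because both are cumulative, the sketch below works on the difference between two scrapes so that the ratio reflects recent traffic only. The function name and the tuple layout are assumptions made for this example.

```python
def cache_hit_ratio(earlier, later):
    """Block cache hit ratio for the interval between two scrapes.

    Each argument is a (hits, requests) tuple read from
    ipfs_http_blockstore_cache_hit and ipfs_http_blockstore_cache_requests.
    Returns None when no cache requests happened in the interval.
    """
    hits = later[0] - earlier[0]
    requests = later[1] - earlier[1]
    if requests <= 0:
        return None
    return hits / requests
```

Passing `(0, 0)` as the earlier reading yields the ratio since the counters were first registered.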

Backend metrics

ipfs_gw_backend_api_call_duration_seconds_[bucket|sum|count]{backend_method} - Histogram: time spent in IPFSBackend API calls
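Histogram `_bucket` series hold cumulative counts per upper bound (the standard Prometheus `le` label), so percentiles have to be estimated. The sketch below uses linear interpolation inside the matching bucket, similar in spirit to PromQL's `histogram_quantile`, but it is an independent simplification and not that function's exact algorithm. It assumes the lowest bucket starts at 0; the bucket boundaries in the usage example are made up.

```python
import math


def histogram_quantile(q, buckets):
    """Estimate the q-quantile from cumulative histogram buckets.

    `buckets` is a list of (upper_bound, cumulative_count) pairs and must
    include the +Inf bucket. Returns NaN for an empty histogram.
    """
    if not 0 < q <= 1:
        raise ValueError("q must be in the range (0, 1]")
    buckets = sorted(buckets)
    total = buckets[-1][1]
    if total == 0:
        return math.nan
    rank = q * total
    prev_bound, prev_count = 0.0, 0.0
    for bound, count in buckets:
        if count >= rank:
            if math.isinf(bound):
                # The quantile falls in the +Inf bucket: the best available
                # estimate is the highest finite bound.
                return prev_bound
            # Interpolate linearly between the bucket's lower and upper bound.
            fraction = (rank - prev_count) / (count - prev_count)
            return prev_bound + (bound - prev_bound) * fraction
        prev_bound, prev_count = bound, count
    return prev_bound
```

With hypothetical buckets `[(0.1, 10), (0.5, 60), (1.0, 90), (math.inf, 100)]`, the median estimate lands inside the 0.1 to 0.5 second bucket. The accuracy of such estimates is bounded by the bucket widths, so they are best read as approximations.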

Generic HTTP Servers

Tip

The metrics below are not very useful and exist mostly for historical reasons. If you need non-gateway HTTP metrics, it's better to put a reverse proxy in front of Kubo and use its metrics.

Core HTTP metrics (`ipfs_http_*`)

Prometheus metrics for the HTTP API exposed at port 5001:

ipfs_http_requests_total{method,code,handler} - Counter: total HTTP requests (legacy; newer metrics for gateway traffic are provided by boxo/gateway)
ipfs_http_request_duration_seconds[_sum|_count]{handler} - Summary: request processing duration
ipfs_http_request_size_bytes[_sum|_count]{handler} - Summary: request body sizes
ipfs_http_response_size_bytes[_sum|_count]{handler} - Summary: response body sizes

HTTP Server metrics (`http_server_*`)

Additional HTTP instrumentation for all handlers (Gateway, API commands, etc.):

http_server_request_body_size_bytes_[bucket|count|sum] - Histogram: distribution of request body sizes
http_server_request_duration_seconds_[bucket|count|sum] - Histogram: distribution of request processing times
http_server_response_body_size_bytes_[bucket|count|sum] - Histogram: distribution of response body sizes

These metrics are automatically added to Gateway handlers, Hostname Gateway, Libp2p Gateway, and API command handlers.

OpenTelemetry Metadata

Kubo uses Prometheus for metrics collection for historical reasons, but OpenTelemetry metrics are automatically exposed through the same Prometheus endpoint. These metadata metrics provide context about the instrumentation:

otel_scope_info - Information about instrumentation libraries producing metrics
target_info - Service metadata including version and instance information
