* refactor: consolidate Provider/Reprovider into unified Provide config
- merge Provider and Reprovider configs into single Provide section
- add fs-repo-17-to-18 migration for config consolidation
- improve migration ergonomics with common package utilities
- convert deprecated "flat" strategy to "all" during migration
- improve Provide docs
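
As an illustration of the consolidation described above, here is a minimal Go sketch. It is not the fs-repo-17-to-18 migration code: `migrateStrategy` and `mergeProvide` are hypothetical names, and the assumption that the strategy is read from `Reprovider.Strategy` and written to `Provide.Strategy` is made only for illustration.

```go
package main

// migrateStrategy maps the deprecated "flat" value to "all" and leaves
// every other value untouched.
func migrateStrategy(s string) string {
	if s == "flat" {
		return "all"
	}
	return s
}

// mergeProvide replaces the legacy "Provider" and "Reprovider" sections
// of a decoded JSON config with a single "Provide" section.
// ASSUMPTION: only the Strategy field is carried over here; a real
// migration would move every supported field.
func mergeProvide(cfg map[string]any) {
	provide := map[string]any{}
	if rp, ok := cfg["Reprovider"].(map[string]any); ok {
		if s, ok := rp["Strategy"].(string); ok {
			provide["Strategy"] = migrateStrategy(s)
		}
	}
	delete(cfg, "Provider")
	delete(cfg, "Reprovider")
	cfg["Provide"] = provide
}
```

Operating on a generic map rather than typed structs keeps unknown fields out of the way, which is one common choice for config migrations that must tolerate older layouts.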
* docs: add total_provide_count metric guidance
- document how to monitor provide success rates via prometheus metrics
- add performance comparison section to changelog
- explain how to evaluate sweep vs legacy provider effectiveness
* fix: add OpenTelemetry meter provider for metrics
- set up meter provider with Prometheus exporter in daemon
- enables metrics from external libs like go-libp2p-kad-dht
- fixes missing total_provide_count_total when SweepEnabled=true
- update docs to reflect actual metric names
---------
Co-authored-by: gammazero <11790789+gammazero@users.noreply.github.com>
Co-authored-by: guillaumemichel <guillaume@michel.id>
Co-authored-by: Daniel Norman <1992255+2color@users.noreply.github.com>
Co-authored-by: Hector Sanjuan <code@hector.link>
* telemetry: use systemd-detect-virt for container/vm detection
The current VM detection is not very accurate, and systemd-detect-virt does exactly
what's needed across a myriad of virtualization platforms.
The downside is that we run a system command, which is uglier and might
trip antivirus software.
* telemetry: improve vm/container detection with pure go
replace systemd-detect-virt with file-based detection to avoid:
- security risks from executing external binaries
- unnecessary repeated detection (now cached with sync.Once)
- missing detection on non-systemd systems
removes false positives:
- cpu hypervisor flag (indicates capability, not guest status)
- generic dmi strings that match physical hardware
- overlay filesystem check (used by immutable distros)
Co-authored-by: Andrew Gillis <11790789+gammazero@users.noreply.github.com>
Co-authored-by: Marcin Rataj <lidel@lidel.org>
* Initial pass at Telemetry plugin
Currently, IP Shipyard, with the help of Probelab, monitors and extracts
Amino/IPFS public network metrics with the use of DHT crawlers and
bootstrappers (via peerlog plugin). For example, we log all peer IDs seen and
their AgentVersion/Addresses obtained from the `identify` protocol, which
provides insights into protocol usage, total number of peers etc.
We would like to increase the ability to obtain more insights from the network
by collecting some more information in the future, but also to give users more
control over this collection (i.e. opt-out). The information collected will
not allow unique identification of anyone and is only used for aggregation.
Now, this PR explores a way of moving in this direction:
* A new "telemetry" fx plugin is in charge of dealing with telemetry
* The fx plugin lets us hook into the setup phase to make decisions and take actions:
  * We can inspect whether we are using Private Networks before the libp2p.Host has been initialized.
  * We can send telemetry after the libp2p Host is initialized.
* Everything is self-contained. Custom builds can remove the plugin altogether without needing to surgically edit the code.
As for behaviour:
* The user can opt-in/out via EnvVar, file in the repo path or plugin configuration.
* Users on private networks or with custom bootstrappers are detected, shown a wall of text explaining why we need telemetry, and invited to opt in. If they give no input, they are opted out after a timeout. Their preferences are stored.
* Users on standard settings are opted-in by default. This is the status quo in Kubo already, except they don't get a chance to opt out.
The telemetry libp2p protocol is yet to be defined, but expect something similar to identify, with a protobuf being pushed to bootstrappers or to a specific telemetry node that we define. In the case of pnets, this will be done with a temporary peer.
* checkpoint
* telemetry plugin: second pass
* On first run it generates a UUID and shows a message to the user.
* UUID is persisted to "telemetry_uuid"
* Sends telemetry 1 minute after boot and every 24h
* LogEvent holds all the telemetry that is sent
* Opt-out possible via env-var or plugin configuration
* Telemetry: add changelog and environment variable documentation
* docs: improved daemon message
making it more obvious nothing was sent yet
and that the user has 15m to opt out
plus some debug logs that confirm opt-out
* refactor: rename IPFS_TELEMETRY_MODE to IPFS_TELEMETRY
* fix: add User-Agent header to telemetry requests
---------
Co-authored-by: Andrew Gillis <11790789+gammazero@users.noreply.github.com>
Co-authored-by: Marcin Rataj <lidel@lidel.org>