* provider: protect libp2p connections
Use latest kad-dht version, introducing connection protection and
retention of addresses in peerstore during provide operations.
* depend on kad-dht master
* fix: add MFS operation limit for --flush=false
adds a global counter that tracks consecutive MFS operations performed
with --flush=false and fails with clear error after limit is reached.
this prevents unbounded memory growth while avoiding the data corruption
risks of auto-flushing.
- adds Internal.MFSNoFlushLimit config
- operations fail with actionable error at limit
- counter resets on successful flush or any --flush=true operation
- operations with --flush=true reset and don't count
this commit removes automatic flush from https://github.com/ipfs/kubo/pull/10971
and instead errors to encourage users of --flush=false to develop a habit
of calling 'ipfs files flush' periodically.
boxo will no longer auto-flush (https://github.com/ipfs/boxo/pull/1041) to
avoid corruption issues, and kubo applies the limit to 'ipfs files' commands
instead.
closes #10842
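The counter behaviour described above could look roughly like the sketch below. It is illustrative only: `noFlushCounter`, `begin`, `reset` and the error text are hypothetical names, not the identifiers used in kubo, and the real implementation is wired into the 'ipfs files' commands rather than called directly.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// errNoFlushLimit is a hypothetical error value; the real message in kubo differs.
var errNoFlushLimit = errors.New("too many --flush=false operations: run 'ipfs files flush'")

// noFlushCounter tracks consecutive operations performed with --flush=false.
type noFlushCounter struct {
	mu    sync.Mutex
	count int
	limit int // 0 disables the check, mirroring the limit=0 case
}

// begin is called before an MFS operation. Operations with flush=true
// reset the counter and are never counted.
func (c *noFlushCounter) begin(flush bool) error {
	c.mu.Lock()
	defer c.mu.Unlock()
	if flush {
		c.count = 0
		return nil
	}
	if c.limit > 0 && c.count >= c.limit {
		return errNoFlushLimit
	}
	c.count++
	return nil
}

// reset is called after a successful explicit flush.
func (c *noFlushCounter) reset() {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.count = 0
}

func main() {
	c := &noFlushCounter{limit: 3}
	for i := 1; i <= 4; i++ {
		fmt.Printf("op %d with --flush=false: %v\n", i, c.begin(false))
	}
	c.reset() // stands in for 'ipfs files flush'
	fmt.Println("after explicit flush:", c.begin(false))
}
```

Failing at the limit instead of flushing keeps the decision of when to flush with the caller, which is the point of this change.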
* test: add tests for MFSNoFlushLimit
tests verify the new Internal.MFSNoFlushLimit config option:
- default limit of 256 operations
- custom limit configuration
- counter reset on flush=true
- counter reset on explicit flush command
- limit=0 disables the feature
- multiple MFS command types count towards limit
* docs: explain why MFS operations fail instead of auto-flushing
addresses feedback from https://github.com/ipfs/kubo/pull/10985#pullrequestreview-3256250970
- clarify that automatic flushing at limit was considered but rejected
- explain the data corruption risks of auto-flushing
- guide users who want auto-flush to use --flush=true (default)
- document benefits of explicit failure for batch operations
* Filestore: provide Filestore nodes
When strategy is set to "all" (the blockstore does all the providing when a
block is written), no providing was happening to Filestore blocks that were
not written to the underlying blockstore (so, the DAG leaves, as they live in
the filesystem directly). This fixes that.
* docs: clarify filestore and urlstore fix in changelog
both filestore (local file references) and urlstore (HTTP/HTTPS URL
references) blocks are now properly provided shortly after initial add
* fix: SweepingProvider shouldn't error when missing DHT
* fix: prevent panic when SweepingProvider has no DHT
when SweepingProvider is enabled but no DHT is available (e.g., Routing.Type=none),
the daemon would panic with a nil pointer dereference in ResettableKeystore.ResetCids.
this fix:
- returns NoopProvider when no DHT implementation is available
- skips keystore initialization for NoopProvider to avoid unnecessary operations
- allows nodes to run without DHT when using HTTP-only routing or offline mode
the panic occurred because initKeyStore tried to access a nil keystore when
SweepingProvider returned nil for the keystore parameter. by checking if the
provider is NoopProvider and skipping keystore operations, we avoid the panic
while maintaining correct behavior for all other provider types.
cc #10974 #10975
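A minimal sketch of the pattern, assuming a much smaller provider interface than the real one. All types and function names below (`provider`, `noopProvider`, `sweepingProvider`, `newProvider`, `needsKeystore`) are stand-ins for illustration and do not match kubo's actual signatures.

```go
package main

import "fmt"

// dht is a stand-in for a DHT client; nil means no DHT is available.
type dht struct{ announced []string }

// provider is a reduced, hypothetical interface.
type provider interface {
	Provide(cid string) error
}

// noopProvider accepts every call and does nothing.
type noopProvider struct{}

func (noopProvider) Provide(string) error { return nil }

// sweepingProvider announces through the DHT.
type sweepingProvider struct{ d *dht }

func (p *sweepingProvider) Provide(cid string) error {
	p.d.announced = append(p.d.announced, cid)
	return nil
}

// newProvider returns a usable value in every case, so callers never
// receive a nil provider they might dereference.
func newProvider(d *dht) provider {
	if d == nil {
		return noopProvider{}
	}
	return &sweepingProvider{d: d}
}

// needsKeystore reports whether keystore setup should run for p.
func needsKeystore(p provider) bool {
	_, noop := p.(noopProvider)
	return !noop
}

func main() {
	for _, d := range []*dht{nil, {}} {
		p := newProvider(d)
		fmt.Printf("%T needsKeystore=%v err=%v\n", p, needsKeystore(p), p.Provide("cid-1"))
	}
}
```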
---------
Co-authored-by: Marcin Rataj <lidel@lidel.org>
* feat: allow custom http provide when offline
* refactor: improve offline HTTP provider handling and tests
- fixed comment/function name mismatch
- added mock server test for HTTP provide success
- clarified test names for offline scenarios
* test: simplify single-node provider tests
use h.NewNode().Init() instead of NewNodes(1) for cleaner test setup
* fix: allow SweepingProvider to work with HTTP-only routing
when no DHT is available but HTTP routers are configured for providing,
return NoopProvider instead of failing. this allows the daemon to start
and HTTP-based providing to work through the routing system.
moved HTTP provider detection to config package as HasHTTPProviderConfigured()
for better code organization and reusability.
this fix is important as SweepingProvider will become the new default in the future.
---------
Co-authored-by: Marcin Rataj <lidel@lidel.org>
* docs: improve slow reprovide warning messages
simplify warning text and provide actionable solutions in order of preference
* feat(config): add validation for Provide.DHT settings
- validate interval doesn't exceed DHT record validity (48h)
- validate worker counts and other parameters are within valid ranges
- improve slow reprovide warning messages to reference config parameter
- add tests for all validation cases
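For illustration, the interval and worker checks might be sketched as follows. Only the 48h record validity comes from this commit; the function name `validateProvideDHT`, the parameter set, and the lower bound on workers are assumptions.

```go
package main

import (
	"fmt"
	"time"
)

// dhtRecordValidity is the DHT record validity mentioned in the commit (48h).
const dhtRecordValidity = 48 * time.Hour

// validateProvideDHT is a hypothetical helper, not kubo's actual function.
// The minimum of one worker is an assumed range, chosen for the example.
func validateProvideDHT(interval time.Duration, maxWorkers int) error {
	if interval < 0 {
		return fmt.Errorf("interval must not be negative, got %s", interval)
	}
	if interval > dhtRecordValidity {
		return fmt.Errorf("interval %s exceeds DHT record validity of %s", interval, dhtRecordValidity)
	}
	if maxWorkers < 1 {
		return fmt.Errorf("worker count must be at least 1, got %d", maxWorkers)
	}
	return nil
}

func main() {
	fmt.Println(validateProvideDHT(22*time.Hour, 16))
	fmt.Println(validateProvideDHT(72*time.Hour, 16))
	fmt.Println(validateProvideDHT(22*time.Hour, 0))
}
```

Records that are reprovided less often than they expire would leave content unfindable for part of each cycle, which is why the interval is capped at the validity period.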
* docs: add reprovide cycle visualization
shows traffic patterns of legacy vs sweep vs accelerated DHT
* ci: optimize build workflows
- use go version from go.mod instead of hardcoding
- group platforms by OS for parallel builds
- remove legacy try-build targets
* fix: checkout before setup-go in all workflows
setup-go needs go.mod to be present, so checkout must happen first
* chore: remove deprecated // +build syntax
go 1.17+ uses //go:build, the old syntax is no longer needed
* simplify: remove nofuse tag from CI workflows
- workflows now rely on platform build constraints
- keep make nofuse target for manual builds
- remove unused appveyor.yml
* ci: remove legacy travis variable and fix gateway-conformance
- remove TRAVIS env variable from 4 workflows
- fix gateway-conformance checkout path to match working-directory
- replace deprecated cache-go-action with built-in setup-go caching
* fix: prevent --flush=false in 'ipfs files rm' command
the 'ipfs files rm' command always flushes for safety to ensure
data integrity. this change adds an explicit error when users
try to pass --flush=false, improving ux and preventing confusion.
related to #10842
* fix: add MFS cache size limit to prevent unbounded growth
- add Internal.MFSAutoflushThreshold config (experimental)
- directories auto-flush when cache exceeds threshold with --flush=false
- prevents high memory usage issue from #10842
- default: 256 entries per directory (matching HAMT shard size)
- set to 0 to restore old behavior (risky, may cause errors)
Closes #10842
* refactor: consolidate Provider/Reprovider into unified Provide config
- merge Provider and Reprovider configs into single Provide section
- add fs-repo-17-to-18 migration for config consolidation
- improve migration ergonomics with common package utilities
- convert deprecated "flat" strategy to "all" during migration
- improve Provide docs
* docs: add total_provide_count metric guidance
- document how to monitor provide success rates via prometheus metrics
- add performance comparison section to changelog
- explain how to evaluate sweep vs legacy provider effectiveness
* fix: add OpenTelemetry meter provider for metrics
- set up meter provider with Prometheus exporter in daemon
- enables metrics from external libs like go-libp2p-kad-dht
- fixes missing total_provide_count_total when SweepEnabled=true
- update docs to reflect actual metric names
---------
Co-authored-by: gammazero <11790789+gammazero@users.noreply.github.com>
Co-authored-by: guillaumemichel <guillaume@michel.id>
Co-authored-by: Daniel Norman <1992255+2color@users.noreply.github.com>
Co-authored-by: Hector Sanjuan <code@hector.link>
* reprovide sweep draft
* update reprovider dep
* go mod tidy
* fix provider type
* change router type
* dual reprovider
* revert to provider.System
* back to start
* SweepingReprovider test
* fix nil pointer deref
* noop provider for nil dht
* disabled initial network estimation
* another iteration
* suppress missing self addrs err
* silence empty rt err on lan dht
* comments
* new attempt at integrating
* reverting changes in core/node/libp2p/routing.go
* removing SweepingProvider
* make reprovider optional
* add noop reprovider
* update KeyChanFunc type alias
* restore boxo KeyChanFunc
* fix missing KeyChanFunc
* test(sharness): PARALLEL=1 and timeout 30m
running sequentially to see where timeout occurs
* initialize MHStore
* revert workflow debug
* config
* config docs
* merged IpfsNode provider and reprovider
* move Provider interface from kad-dht to node
* moved Provider interface from kad-dht to kubo/core/node
* mod_tidy
* Add Clear to Provider interface
* use latest kad-dht commit
* make linter happy
* updated boxo provide interface
* boxo PR fix
* using latest kad-dht commit
* use latest boxo release
* fix fx
* fx cyclic deps
* fix merge issues
* extended tests
* don't provide LAN DHT
* docs
* restore dual dht provider
* don't start provider before it is online
* address linter
* dual/provider fix
* add delay in provider tests for dht bootstrap
* add OfflineDelay parameter to config
* remove increase number of workers in test
* improved keystore gc process
* fix: replace incorrect logger import in coreapi
replaced github.com/labstack/gommon/log with the standard
github.com/ipfs/go-log/v2 logger used throughout kubo.
removed unused labstack dependency from go.mod files.
* fix: remove duplicate WithDefault call in provider config
* fix: use correct option method for burst workers
* fix: improve error messages for experimental sweeping provider
updated error messages to clearly indicate when commands are unavailable
due to experimental sweeping provider being enabled via Reprovider.Sweep.Enabled=true
* docs: remove obsolete KeyStoreGCInterval config
removed from config.md as option no longer exists (removed in b540fba1a)
updated keystore description to reflect gc happens at reprovide interval
* docs: add TODO placeholder changelog for experimental sweeping DHT provider
using v0.38-TODO.md name to avoid merge conflicts with master branch
and allow CI tests to run. will be renamed to v0.38.md once config
migration is added to the PR
* fix: provideKeysRec go routine
* clear keystore on close
* fix: datastore prefix
* fix: improve error handling in provideKeysRec
- close errCh channel to distinguish between nil and pending errors
- check for pending errors when provided.New closes
- handle context cancellation during error send
- prevent race condition where errors could be silently lost
this ensures DAG walk errors are always propagated correctly
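The channel protocol described above can be sketched as below. This is a simplified stand-in, not the actual provideKeysRec code: `walk`, `collect`, and the `failAt` parameter are hypothetical, and a slice of strings replaces the DAG walk.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// walk emits keys on out and reports at most one error on errCh.
// failAt is the index at which the simulated walk fails (-1 for never).
func walk(ctx context.Context, keys []string, failAt int) (<-chan string, <-chan error) {
	out := make(chan string)
	errCh := make(chan error, 1)
	go func() {
		// Deferred calls run in reverse order: out is closed first, then
		// errCh, so any error is already queued when errCh closes.
		defer close(errCh)
		defer close(out)
		for i, k := range keys {
			if i == failAt {
				select {
				case errCh <- errors.New("walk failed at " + k):
				case <-ctx.Done(): // do not block on the send after cancellation
				}
				return
			}
			select {
			case out <- k:
			case <-ctx.Done():
				return
			}
		}
	}()
	return out, errCh
}

// collect drains out, then checks errCh. A closed, empty errCh means the
// walk finished without error; a received value is a pending error.
func collect(keys []string, failAt int) ([]string, error) {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()
	out, errCh := walk(ctx, keys, failAt)
	var got []string
	for k := range out {
		got = append(got, k)
	}
	if err, ok := <-errCh; ok {
		return got, err
	}
	return got, nil
}

func main() {
	fmt.Println(collect([]string{"a", "b", "c"}, -1))
	fmt.Println(collect([]string{"a", "b", "c"}, 1))
}
```

Because the consumer reads errCh only after out is closed, and errCh is closed only after any error has been queued, an error cannot be lost between the two channels.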
* address gammazero's review
* rename BurstProvider to LegacyProvider
* use latest provider/keystore
* boxo: make mfs StartProviding async
* bump boxo
* chore: update boxo to f2b4e12fb9a8ac138ccb82aae3b51ec51d9f631c
- updated boxo dependency to specified commit
- updated go.mod and go.sum files across all modules
* use latest kad-dht/boxo
* Buffered SweepingProvider wrapper
* use latest kad-dht commit
* allow no DHT router
* use latest kad-dht & boxo
---------
Co-authored-by: Marcin Rataj <lidel@lidel.org>
Co-authored-by: gammazero <11790789+gammazero@users.noreply.github.com>
validates Import configuration fields to prevent invalid values:
- CidVersion: must be 0 or 1
- UnixFSFileMaxLinks: must be positive
- UnixFSDirectoryMaxLinks: must be non-negative
- UnixFSHAMTDirectoryMaxFanout: power of 2, multiple of 8, ≤ 1024
- BatchMaxNodes/BatchMaxSize: must be positive
- UnixFSChunker: validates format patterns
- HashFunction: must be allowed by verifcid
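As an example of two of these rules, the CidVersion and HAMT fanout checks could be written as in the sketch below. The function names and error messages are hypothetical; only the constraints themselves (0 or 1; power of 2, multiple of 8, at most 1024) come from the list above.

```go
package main

import "fmt"

// validateCidVersion accepts only CID versions 0 and 1.
func validateCidVersion(v int) error {
	if v != 0 && v != 1 {
		return fmt.Errorf("CidVersion must be 0 or 1, got %d", v)
	}
	return nil
}

// validateHAMTFanout enforces: power of 2, multiple of 8, at most 1024.
func validateHAMTFanout(n int) error {
	switch {
	case n <= 0 || n&(n-1) != 0: // a power of 2 has exactly one bit set
		return fmt.Errorf("fanout %d is not a power of 2", n)
	case n%8 != 0:
		return fmt.Errorf("fanout %d is not a multiple of 8", n)
	case n > 1024:
		return fmt.Errorf("fanout %d exceeds 1024", n)
	}
	return nil
}

func main() {
	for _, n := range []int{256, 4, 24, 2048} {
		fmt.Printf("fanout %d: %v\n", n, validateHAMTFanout(n))
	}
	fmt.Println("cid version 2:", validateCidVersion(2))
}
```

Taken together, the three fanout rules leave 8, 16, 32, 64, 128, 256, 512 and 1024 as the only accepted values.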
* Reprovider strategy: rename "flat" to "all".
Value "flat" now parses to "all". Behaviour from "all" removed.
Fixes #10864, which has a detailed explanation.
* core/node/provider.go: remove unused function mfsRootProvider
It was used in the "all" strategy.
* docs: improve reprovider.strategy=all changelog framing
- highlight memory efficiency improvements
- clarify this removes v0.28 workaround
- update config.md memory requirements
- fix announce-on profile typo
* feat: deprecate Reprovider.Strategy=flat
- add deprecation warning in daemon.go when flat strategy is detected
- document that flat is deprecated in ParseReproviderStrategy comment
- add explicit test case for flat -> all mapping
- flat continues to work but users are warned to migrate to all
---------
Co-authored-by: Marcin Rataj <lidel@lidel.org>
* fix(relay): feed connected peers to AutoRelay discovery
Feed all connected swarm peers to AutoRelay as potential relay
candidates. This allows peers from HTTP routing and manual connections
to serve as relays, not just DHT-discovered peers.
Fixes #10899
* docs: changelog
* Provide according to strategy
Updates boxo to a version with the changes from https://github.com/ipfs/boxo/pull/976, which decentralize the providing responsibilities (from a central providing.Exchange to blockstore, pinner, mfs).
The changes consist of initializing the Pinner, MFS and the blockstore with the provider.System, which is created first.
Since the provider.System is created first, the reproviding KeyChanFunc is set
later when we can create it once we have the Pinner, MFS and the blockstore.
Some additional work applies to the Add() workflow. Normally, blocks would get provided at the Blockstore or the Pinner, but when adding blocks AND a "pinned" strategy is used, the blockstore does not provide, and the
pinner does not traverse the DAG (and thus doesn't provide either), so we need to provide directly from the Adder. This is resolved by wrapping the DAGService in a "providingDAGService" which provides every added block, when using the "pinned" strategy.
Running `ipfs --offline add` while an ONLINE daemon is running now announces blocks per the chosen strategy; previously it did not announce them. This is documented in the changelog. A couple of releases ago, adding with `ipfs --offline add` was faster, but this is no longer the case, so we incur no penalty by honoring the fact that the daemon is online and follows a providing strategy.
Co-authored-by: gammazero <11790789+gammazero@users.noreply.github.com>
Co-authored-by: Marcin Rataj <lidel@lidel.org>
* refactor: remove goprocess
The `goprocess` package is no longer needed. It can be replaced by modern `context` and `context.AfterFunc`.
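A minimal sketch of tying a service's teardown to context cancellation with `context.AfterFunc` (available since Go 1.21). The `service` type and `demo` function are hypothetical and only illustrate the standard library call; they are not taken from kubo.

```go
package main

import (
	"context"
	"fmt"
	"sync"
)

// service is a stand-in for a component that must be closed on shutdown.
type service struct {
	once   sync.Once
	closed chan struct{}
}

func newService() *service { return &service{closed: make(chan struct{})} }

// Close is idempotent, so it is safe to call both explicitly and via AfterFunc.
func (s *service) Close() { s.once.Do(func() { close(s.closed) }) }

// demo registers Close to run once ctx is done, cancels ctx, and waits.
// It reports whether the registered function had already been started.
func demo() bool {
	ctx, cancel := context.WithCancel(context.Background())
	svc := newService()
	stop := context.AfterFunc(ctx, svc.Close) // runs svc.Close in its own goroutine
	cancel()
	<-svc.closed
	// stop returns false once the function has been started.
	return !stop()
}

func main() {
	fmt.Println("closed after cancellation:", demo())
}
```

Since the registered function runs in its own goroutine, a caller that needs to know when teardown has finished still has to wait on something explicit, as `demo` does with the `closed` channel.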
* mod tidy
* log unmount errors on shutdown
* Do not log non-mounted errors on shutdown
* Use WaitGroup associated with IPFS node to wait for services to shut down
* Prefer explicit Close to context.AfterFunc
* Do not use node-level WaitGroup
* Unmount for non-supported platforms
* fix return values
* test: daemon shuts down gracefully
make sure ongoing operations don't block shutdown
* test(cli): add TestFUSE
* test: smarter RequiresFUSE
opportunistically run FUSE tests if env has fusermount
and TEST_FUSE was not explicitly set
* docs: changelog
---------
Co-authored-by: gammazero <gammazero@users.noreply.github.com>
Co-authored-by: Marcin Rataj <lidel@lidel.org>
No behaviour changes.
Currently we are using ProvideManyRouter for Bitswap, which is only meant to
use ContentDiscovery. This makes things more clear in that there is a
designated ContentDiscovery instance.
After boxo v0.33.1, this is a recommended step to fix http retrieval bugs.
Having a single ConnectEventManager prevents misdirected operations in the
network.Router to change the Connectedness state in a way that the counterpart
(httpnet or bsnet) can later correct.
* provider: clear reprovide queue when reprovide strategy changes
When the currently configured reprovide strategy does not match the previous strategy read from the datastore, then clear the reprovide queue and update the reprovide strategy that is stored in the datastore.
Depends on https://github.com/ipfs/boxo/pull/978
Closes #10829
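The check can be pictured with the sketch below. A plain map stands in for the datastore, and the key name, the `queue` type and `syncStrategy` are hypothetical. Treating a missing stored strategy as a mismatch is an assumption of this example, not something the commit states.

```go
package main

import "fmt"

// strategyKey is a hypothetical datastore key.
const strategyKey = "/reprovide/strategy"

// queue is a stand-in for the reprovide queue.
type queue struct{ items []string }

// Clear empties the queue and returns the number of removed items.
func (q *queue) Clear() int {
	n := len(q.items)
	q.items = nil
	return n
}

// syncStrategy compares the configured strategy with the stored one.
// On mismatch it clears the queue and stores the new strategy.
func syncStrategy(ds map[string]string, q *queue, configured string) (cleared int, changed bool) {
	if prev, ok := ds[strategyKey]; ok && prev == configured {
		return 0, false
	}
	cleared = q.Clear()
	ds[strategyKey] = configured
	return cleared, true
}

func main() {
	ds := map[string]string{strategyKey: "all"}
	q := &queue{items: []string{"cid-1", "cid-2"}}
	fmt.Println(syncStrategy(ds, q, "all"))    // same strategy: queue untouched
	fmt.Println(syncStrategy(ds, q, "pinned")) // changed: queue cleared
}
```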
* Update docs/changelogs/v0.36.md
Co-authored-by: Guillaume Michel <guillaumemichel@users.noreply.github.com>
* update log message
* update boxo
* Move change log to v0.37.md
* Add `provide clear` command to clear provide queue
The `provide clear` command clears all items from the provide queue and prints out the number of items removed from the queue. The `quiet` option tells the command not to print output.
* refactor(cmds): ipfs provide clear
moving to new namespace to avoid conflicts, and also document other
commands
* docs: clarify Reprovider.Strategy
* chore: remove undesired md link
this wires up https://github.com/ipfs/boxo/pull/971
to make sure explicitly allowlisted hosts have
their own metric label
if we ever need more flexibility here, this can be exposed as
a separate config
* Configure bitswap broadcast reduction
Add new config items to `Internal.Bitswap` to allow configuration of bitswap broadcast reduction behavior. Broadcast reduction behavior is enabled by default, and uses settings that should be suitable for most installations of kubo.
* update sharness metrics test
* Explicit defaults for broadcast reduction configuration
* Update docs/config.md
* chore: update to go-log/v2
go-log v2 has been out for quite a while now and it is time to deprecate v1.
Replace all use of go-log with go-log/v2
Makes /api/v0/log/tail useful over HTTP
Updates dependencies that have moved to go-log/v2
Removes support for ContextWithLoggable as this is not needed for tracing-like functionality
- Replaces: PR #8765
- Closes issue #8753
- Closes issue #9245
- Closes issue #10809
Other fixes:
* update go-ipfs-cmds
* update http logs test
* fix test
* Read/send one line of log data at a time
* Update -log-level docs
* fix(config): explicit Provider.Enabled flag
Adds missing config option described in
https://github.com/ipfs/kubo/issues/10803
* refactor: remove Experimental.StrategicProviding
removing experiment, replaced with Provider.Enabled
* test(cli): routing [re]provide
updated and added tests for manually triggering provide and reprovide
and making them respect global configuration flag to avoid
inconsistent behaviors
* docs: improve DelegatedRouters
* refactor: default DefaultProviderWorkerCount=16
- simplified default for both
- 16 is safer for non-accelerated DHT client
- accelerated DHT performs better without limit anyway
- updated docs
* Feat: http retrieval as experimental feature
This introduces the http-retrieval capability as an experimental feature.
It can be enabled in the configuration `Experimental.HTTPRetrieval.Enabled = true`.
Documentation and changelog to be added later.
* refactor: HTTPRetrieval.Enabled as Flag
* docs(config): HTTPRetrieval section
* refactor: reusable MockHTTPContentRouter
* feat: HTTPRetrieval.TLSInsecureSkipVerify
allows self-signed certificates in tests
* feat(config): HTTPRetrieval.MaxBlockSize
* test: end-to-end HTTPRetrieval.Enabled
this spawns two http services on localhost:
1. HTTP router that returns HTTP provider when /routing/v1/providers/cid is queried
2. HTTP provider that returns a block when /ipfs/cid is queried
3. Configures Kubo to use (1) instead of cid.contact
this seems to work: running the test with DEBUG=true shows (1) was queried
for the test CID and returned the multiaddr of (2), but Kubo never requested
the test CID block from (2). needs investigation
* fix: enable /routing/v1/peers for non-cid.contact
we artificially limited every delegated routing endpoint because of
cid.contact being limited to one endpoint
* feat: Routing.DelegatedRouters
make it easy to override the hardcoded implicit HTTP router URL
without having to set the entire custom Router.Routers and
Router.Methods
(http_retrieval_client_test.go still needs to be fixed in future commit)
* test: flag remaining work
* docs: review feedback
* refactor: providerQueryMgr with bitswapNetworks
this fixes two regressions:
(1) introduced in https://github.com/ipfs/kubo/issues/10717
where we only used bitswapLib2p query manager
(this is why E2E did not act on http provider)
(2) introduced in https://github.com/ipfs/kubo/pull/10765
where it was not possible to set binary peerID in IgnoreProviders
(we changed to []string)
* refactor: Bitswap.Libp2pEnabled
replaces Bitswap.Enabled with Bitswap.Libp2pEnabled
adds tests that confirm it is possible to disable libp2p bitswap fully
and only keep http in client mode
also, removes the need for passing empty blockstore in client-only mode
* docs: changelog
---------
Co-authored-by: Marcin Rataj <lidel@lidel.org>
* feat: add Bitswap configuration and related tests
* fix: update Bitswap function to use 'provide' parameter for server enablement
* docs: update changelog for Bitswap functionality changes
* fix: update Bitswap server enablement logic and improve related tests
* fix: rename BitswapConfig to Bitswap and update references
* docs: config and changelog
* fix: `ipfs cat` panic when `Bitswap.Enabled=false`
Fixes panic described in:
https://github.com/ipfs/kubo/pull/10782#discussion_r2069116219
---------
Co-authored-by: gystemd <gystemd@gmail.com>
Co-authored-by: gammazero <11790789+gammazero@users.noreply.github.com>
Co-authored-by: Giulio Piva <giulio.piva@dedicated.world>
Co-authored-by: Marcin Rataj <lidel@lidel.org>
* adjust ipfs stats provide
* update boxo dep
* bump boxo
* fixing tests
* docs/chore: mark stat reprovide as experimental
* docs: Provider.Strategy
explicitly document it is not used - without this legacy users will have
it in their config and be very confused
---------
Co-authored-by: Marcin Rataj <lidel@lidel.org>
Adds `Routing.IgnoreProviders`.
This requires initializing a custom providerQueryManager and using it instead
of the default created internally in Bitswap. Since the default is created
with some internal default configuration options (MaxProviders), this hardcodes it.
Fixes #10596.
The reproviding process can take long. Currently, each CID to be provided is
obtained by making a query to the pinner and reading one by one as the CIDs
get provided.
While this query is ongoing, the pinner holds a Read mutex on the pinset.
If a pin-add-request arrives, a goroutine will start waiting for a Write mutex
on the pinset. From that point, no new Read mutexes can be taken until the writer
can proceed and finishes.
However, no one can proceed because the read mutex is still held while the
reproviding is ongoing.
The fix is mostly in Boxo, where we add a "buffered" provider which reads the
CIDs into memory so that they can be provided at their own pace without making
everyone wait.
The consequence is that we will need more RAM. Rule of thumb is 1GiB extra per 20M CIDs to be reprovided.
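The idea can be sketched as follows: copy the keys while holding the read lock, release it, and only then do the slow work. The `pinset` type and `reprovide` function are simplified stand-ins, not the Boxo implementation. As a rough check on the rule of thumb, 1GiB spread over 20M CIDs is about 54 bytes per CID.

```go
package main

import (
	"fmt"
	"sync"
)

// pinset is a stand-in for the pinner's internal state.
type pinset struct {
	mu   sync.RWMutex
	cids []string
}

// snapshot copies the pinned CIDs under the read lock and releases it at once.
func (p *pinset) snapshot() []string {
	p.mu.RLock()
	defer p.mu.RUnlock()
	return append([]string(nil), p.cids...)
}

// add takes the write lock, as a pin-add request would.
func (p *pinset) add(c string) {
	p.mu.Lock()
	defer p.mu.Unlock()
	p.cids = append(p.cids, c)
}

// reprovide works from the in-memory buffer, so no lock is held while the
// (potentially slow) provide callback runs.
func reprovide(p *pinset, provide func(cid string)) int {
	buf := p.snapshot()
	for _, c := range buf {
		provide(c)
	}
	return len(buf)
}

func main() {
	p := &pinset{cids: []string{"cid-1", "cid-2"}}
	// Adding a pin during reproviding would deadlock if the read lock
	// were still held; with the buffer it proceeds immediately.
	n := reprovide(p, func(c string) { p.add("new-" + c) })
	fmt.Println("provided:", n, "pins now:", len(p.snapshot()))
}
```

The trade-off is the one stated above: the buffer costs memory proportional to the number of CIDs, in exchange for never holding the pinset lock across network operations.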
* use go-datastore without go-process
* update go-ds-xxx dependencies
* update go-libp2p-kad-dht
* bitswap api changes
* Do not use multiple multi-error packages, pick one
* update boxo
* update expected metrics