* feat(config): add Gateway.MaxRequestDuration option
exposes the previously hardcoded 1 hour gateway request deadline as a
configurable option, allowing operators to adjust it to fit deployment
needs. protects gateway from edge cases and slow client attacks.
boxo: https://github.com/ipfs/boxo/pull/1079
* test(gateway): add MaxRequestDuration integration test
verifies config is wired correctly and 504 is returned when exceeded
* docs: add MaxRequestDuration to gateway production guide
---------
Co-authored-by: Andrew Gillis <11790789+gammazero@users.noreply.github.com>
* datastore: upgrade go-ds-flatfs to v0.6.0
See: https://github.com/ipfs/go-ds-flatfs/pull/142
* docs(changelog): add go-ds-flatfs atomic batch writes
*documents the new flatfs batch implementation that uses atomic
operations via temp directory, preventing orphan blocks on interrupted
imports and reducing memory usage.
* includes improved tests, batch cleanup fixes, and docs
* docs(changelog): reframe go-ds-flatfs entry for users
focus on user benefits instead of implementation details
* docs: mark custom routing as experimental
reorganize Routing.Type section for clarity, group production and
experimental options, consolidate DHT explanation, add limitations
section to delegated-routing.md documenting that HTTP-only routing
cannot provide content reliably
* chore(config): reorder Routing sections and improve callout formatting
move DelegatedRouters after Type, add config option names to CAUTION headers
* docs: address reviewer feedback on config.md
- clarify that `auto` can be combined with custom URLs in `Routing.DelegatedRouters`
- rename headers for consistency: `Routing.Routers.[name].Type`, `Routing.Routers.[name].Parameters`, `Routing.Methods`
- replace deprecated Strategic Providing reference with `Provide.*` config
- remove outdated caveat about 0.39 sweep limitation
- wording: "likely suffer" → "will be most affected"
* docs: remove redundant Summary section from delegated-routing.md
the IMPORTANT callout and Motivation section already cover what users
need to know. historical version info was noise for researchers trying
to configure custom routing.
addresses reviewer feedback from #11111.
---------
Co-authored-by: Daniel Norman <2color@users.noreply.github.com>
Co-authored-by: Andrew Gillis <11790789+gammazero@users.noreply.github.com>
* feat(p2p): add --foreground flag to listen and forward commands
adds `-f/--foreground` option that keeps the command running until
interrupted (SIGTERM/Ctrl+C) or closed via `ipfs p2p close`. the
listener/forwarder is automatically removed when the command exits.
useful for systemd services and scripts that need cleanup on exit.
* docs: add p2p-tunnels.md with systemd examples
- add dedicated docs/p2p-tunnels.md covering:
- why p2p tunnels (NAT traversal, no public IP needed)
- quick start with netcat
- background and foreground modes
- systemd integration with path-based activation
- security considerations and troubleshooting
- document Experimental.Libp2pStreamMounting in docs/config.md
- simplify docs/experimental-features.md, link to new doc
- add "Learn more" links to ipfs p2p listen/forward --help
- update changelog entry with doc link
- add cross-reference in misc/README.md
* chore: reference kubo#5460 for p2p config
Ref. https://github.com/ipfs/kubo/issues/5460
* fix(daemon): write api/gateway files only after HTTP server is ready
fixes race condition where $IPFS_PATH/api and $IPFS_PATH/gateway files
were written before the HTTP servers were ready to accept connections.
this caused issues for tools like systemd path units that immediately
try to connect when these files appear.
changes:
- add corehttp.ServeWithReady() that signals when server is ready
- wait for ready signal before writing address files
- use sync.WaitGroup.Go() (Go 1.25) for cleaner goroutine management
- add TestAddressFileReady to verify both api and gateway files
* fix(daemon): buffer errc channel and wait for all listeners
- buffer error channel with len(listeners) to prevent deadlock when
multiple servers write errors simultaneously
- wait for ALL listeners to be ready before writing api/gateway file,
not just the first one
Feedback-from: https://github.com/ipfs/kubo/pull/11099#pullrequestreview-3593885839
* docs(changelog): improve p2p tunnel section clarity
reframe to lead with user benefit and add example output
* docs(p2p): remove obsolete race condition caveat
the "First launch fails but restarts work" troubleshooting section
described a race where the api file was written before the daemon was
ready. this was fixed in 80b703a which ensures api/gateway files are
only written after HTTP servers are ready to accept connections.
---------
Co-authored-by: Andrew Gillis <11790789+gammazero@users.noreply.github.com>
- add TTY auto-detection for progress display (matching `dag export`)
- use single-line progress with carriage return instead of flooding
- show human-readable sizes alongside raw bytes in summary
- update --progress flag to be auto-detected by default
progress format: `Fetched/Processed N blocks, M bytes (X MB)`
summary format: `Total Size: 99 (99 B)`
* fix(routing): use LegacyProvider for HTTP-only custom routing
when `Routing.Type=custom` with only HTTP routers and no DHT,
fall back to LegacyProvider instead of SweepingProvider.
SweepingProvider requires a DHT client which is unavailable in
HTTP-only configurations, causing it to return NoopProvider and
breaking provider record announcements to HTTP routers.
fixes#11089
* test(routing): verify provide stat works with HTTP-only routing
* docs(config): clarify SweepEnabled fallback for HTTP-only routing
---------
Co-authored-by: Andrew Gillis <11790789+gammazero@users.noreply.github.com>
* ci: parallelize gotest by separating test/cli into own job
split the Go Test workflow into two parallel jobs:
- `unit-tests`: runs unit tests (excluding test/cli)
- `cli-tests`: runs test/cli end-to-end tests
test/cli takes ~3 minutes (~50% of total gotest time), so running
it in parallel should reduce wall-clock CI time by ~1.5-2.5 minutes.
both jobs produce JUnit XML and HTML reports for consistent debugging.
* ci(gotest): reduce noise on test timeout panics
add GOTRACEBACK=single to show only one goroutine stack instead of all
when a test timeout panic occurs. this makes CI output much cleaner
when tests hang.
* fix(ci): prevent stderr from corrupting test JSON output
- remove 2>&1 which mixed "go: downloading" stderr messages into JSON
- add JSON validation before parsing
- print failed test names for easier debugging
* ci(gotest): use gotestsum for human-readable test output
- replace per-package coverage loop with single gotestsum invocation
- both unit-tests and cli-tests now show human-readable output
- simplified coverage collection (single -coverprofile, no gocovmerge)
- clarified step names to indicate they run tests
* ci: fix codecov uploads by adding token
- add CODECOV_TOKEN to gotest.yml and sharness.yml
- update codecov-action to v5.5.2
- add fail_ci_if_error: false for robustness
codecov stopped receiving coverage data ~1 year ago when they
started requiring tokens for public repos
* refactor(make): add test_unit and test_cli targets
- add `make test_unit` for unit tests with coverage (used by CI)
- add `make test_cli` for CLI integration tests (used by CI)
- only disable colors when CI env var is set (local dev gets colors)
- remove legacy targets: test_go_test, test_go_short, test_go_race, test_go_expensive
- update gotest.yml to use make targets instead of inline commands
- add test artifacts to .gitignore
* fix(ci): move client/rpc tests to cli-tests job
client/rpc tests use test/cli/harness which requires the ipfs binary.
Move them from test_unit to test_cli where the binary is built.
also:
- update gotestsum to v1.13.0
- simplify workflow step names
* fix(ci): use build tags when listing test packages
go list needs build tags to properly exclude packages like fuse/mfs
when running with TEST_FUSE=0 (nofuse tag).
* fix(ci): move test/integration to cli-tests job
test/integration tests need the ipfs binary, move them from test_unit
to test_cli.
* fix(test): fix flaky kubo-as-a-library and GetClosestPeers tests
kubo-as-a-library: use `Bootstrap()` instead of raw `Swarm().Connect()`
to fix race condition between swarm connection and bitswap peer
discovery. `Bootstrap()` properly integrates peers into the routing
system, ensuring bitswap learns about connected peers synchronously.
GetClosestPeers: simplify retry logic using `EventuallyWithT` with
10-minute timeout. tests all 4 routing types (`auto`, `autoclient`,
`dht`, `dhtclient`) against real bootstrap peers with patient polling.
* fix(example): use bidirectional Swarm().Connect() for reliable bitswap
- connect nodes bidirectionally (A→B and B→A) to simulate mutual peering
- mutual peering protects connection from resource manager culling
- use port 0 for random available ports (avoids CI conflicts)
- enable LoopbackAddressesOnLanDHT for local testing
- move retry logic to test file using require.Eventually
* fix(ci): add test_examples target and parallel example-tests job
- add `make test_examples` target to mk/golang.mk for consistency with test_unit/test_cli
- move example tests to separate parallel CI job (example-tests)
- example: use Bootstrap() with autoconf.FallbackBootstrapPeers for reliable bitswap
- example: increase context timeout to 10 minutes
- test: add 60s per-request timeout to GetClosestPeers (server has 30s routing timeout)
- test: reduce EventuallyWithT to 3 minutes (locally passes in under 1 minute)
* fix(ci): improve test targets, exclusion patterns, and artifact naming
- define COVERPKG_EXCLUDE and UNIT_EXCLUDE as documented variables
- use grep -vE with single regex instead of multiple grep -v calls
- add mkdir -p before rm to ensure directories exist
- add DEPS_GO dependency to test_cli target
- make CLI test timeout configurable via TEST_CLI_TIMEOUT (default 10m)
- fix test_examples cleanup on failure using subshell
- reduce GetClosestPeers test wait time from 3m to 2m
- rename artifacts to match job names: unit-tests-{junit,html}, cli-tests-{junit,html}
- update cli-tests upload-artifact from v5 to v6
* fix(ci): fix unit test exclusion and speed up example test
- fix UNIT_EXCLUDE regex to match client/rpc at end of path
- remove public bootstrap peers from example (only connect to nodeA)
- example test now runs in ~3s instead of timing out
* fix(test): fix flaky TestAddMultipleGCLive race condition
added time.Sleep after spawning GC goroutines to ensure they reach
GCLock() before the test proceeds. without this, the adder's
maybePauseForGC() might check GCRequested() before GC has even
requested the lock, causing the lock to not be released and GC to
block indefinitely.
this matches the existing pattern in TestAddGCLive which already
had this sleep.
also replaced context.Background() with t.Context() in both
TestAddMultipleGCLive and TestAddGCLive for proper test lifecycle
management.
* fix(example): use test harness settings for reliable CI
the kubo-as-a-library example was flaky on CI. applied test-harness-like
settings that match what transports_test.go uses:
- TCP-only on 127.0.0.1 with random port (no QUIC/UDP)
- explicitly disable non-TCP transports (QUIC, Relay, WebTransport, etc)
- use NilRouterOption (no routing) since we connect peers directly
- bitswap works with directly connected peers without DHT lookups
- 2-minute context timeout
- streaming output in test for debugging
addresses https://discuss.ipfs.tech/t/19933
- add docs/developer-guide.md with prerequisites, build, test, and troubleshooting
- link from README.md, docs/README.md, and CONTRIBUTING.md
- document test suite differences (unit vs e2e, test/cli vs test/sharness)
- include tips for running specific tests during development
* fix: update go-libp2p to v0.46.0
- reduced WebRTC log noise (go-libp2p#3426)
- fixed mDNS discovery on Windows/macOS (go-libp2p#3434)
- includes quic-go v0.57.1 (v0.56.0 + v0.57.0)
* fix(example): kubo-as-a-library test timeout
- use custom ports (4010/4011) to avoid conflicts with default 4001
- add 2-minute context timeout to fail fast
- get peer addresses dynamically instead of hardcoding wrong port
- wait for peer connection synchronously instead of fire-and-forget
- update comments to reference autoconf.FallbackBootstrapPeers
* chore: update p2p-forge to v0.7.0
* fix(test): wait for DHT readiness in GetClosestPeers test
the test was failing for `routing_type=auto` because it only waited for
swarm connections but not for the DHT routing table to be populated.
added a separate probe loop that waits for GetClosestPeers to succeed
before running the actual test assertions.
document known 0.39 limitation where sweep provider may fail to
estimate DHT size when accelerated client is still crawling the
network, resulting in single-region mode without efficiency gains.
also remove accelerated client recommendation from changelog
since it may mislead users into enabling both together.
- rewrite overview to lead with user value (self-hosting on consumer hardware)
- reorder highlights: provider features together, then UPnP, then housekeeping
- simplify titles (drop "Amino", "Fixed", verbose descriptions)
- link to Shipyard's sweep provider blogpost
This allows Kubo to respond to the GetClosestPeers() http routing v1 endpoint
as spec'ed here: https://github.com/ipfs/specs/pull/476
It is based on work from https://github.com/ipfs/boxo/pull/1021
We let IpfsNode implmement the contentRouter.Client interface with the new
method. We use our WAN-DHT to get the closest peers.
Additionally, Routing V1 HTTP API is exposed by default which enables light clients in browsers to use Kubo Gateway as delegated routing backend
Co-authored-by: Marcin Rataj <lidel@lidel.org>
* fix(add): respect Provide config in fast-provide-root
fast-provide-root should honor the same config settings as the regular
provide system:
- skip when Provide.Enabled is false
- skip when Provide.DHT.Interval is 0
- respect Provide.Strategy (all/pinned/roots/mfs/combinations)
This ensures fast-provide only runs when appropriate based on user
configuration and the nature of the content being added (pinned vs
unpinned, added to MFS or not).
* feat(config): options to adjust global defaults
Add Import.FastProvideRoot and Import.FastProvideWait configuration options
to control default behavior of fast-provide-root and fast-provide-wait flags
in ipfs add command. Users can now set global defaults in config while
maintaining per-command flag overrides.
- Add Import.FastProvideRoot (default: true)
- Add Import.FastProvideWait (default: false)
- Add ResolveBoolFromConfig helper for config resolution
- Update docs with configuration details
- Add log-based tests verifying actual behavior
* refactor: extract fast-provide logic into reusable functions
Extract fast-provide logic from add command into reusable components:
- Add config.ShouldProvideForStrategy helper for strategy matching
- Add ExecuteFastProvide function reusable across add and dag import commands
- Move DefaultFastProvideTimeout constant to config/provide.go
- Simplify add.go from 72 lines to 6 lines for fast-provide
- Move fast-provide tests to dedicated TestAddFastProvide function
Benefits:
- cleaner API: callers only pass content characteristics
- all strategy logic centralized in one place
- better separation of concerns
- easier to add fast-provide to other commands in future
* feat(dag): add fast-provide support for dag import
Adds --fast-provide-root and --fast-provide-wait flags to `ipfs dag import`,
mirroring the fast-provide functionality available in `ipfs add`.
Changes:
- Add --fast-provide-root and --fast-provide-wait flags to dag import command
- Implement fast-provide logic for all root CIDs in imported CAR files
- Works even when --pin-roots=false (strategy checked internally)
- Share ExecuteFastProvide implementation between add and dag import
- Move ExecuteFastProvide to cmdenv package to avoid import cycles
- Add logging when fast-provide is disabled
- Conditional error handling: return error when wait=true, warn when wait=false
- Update config docs to mention both ipfs add and ipfs dag import
- Update changelog to use "provide" terminology and include dag import examples
- Add comprehensive test coverage (TestDagImportFastProvide with 6 test cases)
The fast-provide feature allows immediate DHT announcement of root CIDs
for faster content discovery, bypassing the regular background queue.
* docs: improve fast-provide documentation
Refine documentation to better explain fast-provide and sweep provider working
together, and highlight the performance improvement.
Changelog:
- add fast-provide to sweep provider features list
- explain performance improvement: root CIDs discoverable in <1s vs 30+ seconds
- note this uses optimistic DHT operations (faster with sweep provider)
- simplify examples, point to --help for details
Config docs:
- fix: --fast-provide-roots should be --fast-provide-root (singular)
- clarify Import.FastProvideRoot focuses on root CIDs while sweep handles all blocks
- simplify Import.FastProvideWait description
Command help:
- ipfs add: explain sweep provider context upfront
- ipfs dag import: add fast-provide explanation section
- both explain the split: fast-provide for roots, sweep for all blocks
* test: add tests for ShouldProvideForStrategy
add tests covering all provide strategy combinations with focus on
bitflag OR logic (the else-if bug fix). organized by behavior:
- all strategy always provides
- single strategies match only their flag
- combined strategies use OR logic
- zero strategy never provides
* refactor: error cmd on error and wait=true
change ExecuteFastProvide() to return error, enabling proper error
propagation when --fast-provide-wait=true. in sync mode, provide
failures now error the command as expected. in async mode (default),
always returns nil with errors logged in background goroutine.
also remove duplicate ExecuteFastProvide() from provide.go (75 lines),
keeping single implementation in cmdenv/env.go for reuse across add
and dag import commands.
call sites simplified:
- add.go: check and propagate error from ExecuteFastProvide
- dag/import.go: return error from ForEach callback, remove confusing
conditional error handling
semantics:
- precondition skips (DHT unavailable, etc): return nil (not failure)
- async mode (wait=false): return nil, log errors in goroutine
- sync mode (wait=true): return wrapped error on provide failure
* feat: fast provide
* Check error from provideRoot
* do not provide if nil router
* fix(commands): prevent panic from typed nil DHTClient interface
Fixes panic when ipfsNode.DHTClient is a non-nil interface containing a
nil pointer value (typed nil). This happened when Routing.Type=delegated
or when using HTTP-only routing without DHT.
The panic occurred because:
- Go interfaces can be non-nil while containing nil pointer values
- Simple `if DHTClient == nil` checks pass, but calling methods panics
- Example: `(*ddht.DHT)(nil)` stored in interface passes nil check
Solution:
- Add HasActiveDHTClient() method to check both interface and concrete value
- Update all 7 call sites to use proper check before DHT operations
- Rename provideRoot → provideCIDSync for clarity
- Add structured logging with "fast-provide" prefix for easier filtering
- Add tests covering nil cases and valid DHT configurations
Fixes: https://github.com/ipfs/kubo/pull/11046#issuecomment-3525313349
* feat(add): split fast-provide into two flags for async/sync control
Renames --fast-provide to --fast-provide-root and adds --fast-provide-wait
to give users control over synchronous vs asynchronous providing behavior.
Changes:
- --fast-provide-root (default: true): enables immediate root CID providing
- --fast-provide-wait (default: false): controls whether to block until complete
- Default behavior: async provide (fast, non-blocking)
- Opt-in: --fast-provide-wait for guaranteed discoverability (slower, blocking)
- Can disable with --fast-provide-root=false to rely on background reproviding
Implementation:
- Async mode: launches goroutine with detached context for fire-and-forget
- Added 10 second timeout to prevent hanging on network issues
- Timeout aligns with other kubo operations (ping, DNS resolve, p2p)
- Sufficient for DHT with sweep provider or accelerated client
- Sync mode: blocks on provideCIDSync until completion (uses req.Context)
- Improved structured logging with "fast-provide-root:" prefix
- Removed redundant "root CID" from messages (already in prefix)
- Clear async/sync distinction in log messages
- Added FAST PROVIDE OPTIMIZATION section to ipfs add --help explaining:
- The problem: background queue takes time, content not immediately discoverable
- The solution: extra immediate announcement of just the root CID
- The benefit: peers can find content right away while queue handles rest
- Usage: async by default, --fast-provide-wait for guaranteed completion
Changelog:
- Added highlight section for fast root CID providing feature
- Updated TOC and overview
- Included usage examples with clear comments explaining each mode
- Emphasized this is extra announcement independent of background queue
The feature works best with sweep provider and accelerated DHT client
where provide operations are significantly faster.
* fix(add): respect Provide config in fast-provide-root
fast-provide-root should honor the same config settings as the regular
provide system:
- skip when Provide.Enabled is false
- skip when Provide.DHT.Interval is 0
- respect Provide.Strategy (all/pinned/roots/mfs/combinations)
This ensures fast-provide only runs when appropriate based on user
configuration and the nature of the content being added (pinned vs
unpinned, added to MFS or not).
* Update core/commands/add.go
---------
Co-authored-by: gammazero <11790789+gammazero@users.noreply.github.com>
Co-authored-by: Marcin Rataj <lidel@lidel.org>
* telemetry: collect provideDHTSweepEnabled
Fixes#11055.
* telemetry: track custom Provide.DHT.Interval and MaxWorkers
collects whether users customize Interval and MaxWorkers from defaults
to help identify if defaults need adjustment
* docs: improve telemetry documentation structure and clarity
restructure docs/telemetry.md into meaningful sections (routing & discovery,
content providing, network configuration), add exact config field paths for all
tracked settings, and establish code as source of truth by linking from LogEvent
struct while removing redundant field comments
---------
Co-authored-by: Marcin Rataj <lidel@lidel.org>
adds Gateway.MaxRangeRequestFileSize configuration to protect against CDN bugs
where range requests over certain sizes return entire files instead of requested
byte ranges, causing unexpected bandwidth costs.
- default: 0 (no limit)
- returns 501 Not Implemented for oversized range requests
- protects against CDNs like Cloudflare that ignore range requests over 5GiB
also introduces OptionalBytes type to reduce code duplication when handling
byte-size configuration values, replacing manual string parsing with humanize.ParseBytes.
migrates existing byte-size configs to use this new type.
Fixes: https://github.com/ipfs/boxo/issues/856
add "Understanding the Metrics" section explaining three types:
- per-worker rates (multiply by active workers for total throughput)
- per-region averages (do NOT multiply by worker count)
- system totals (cumulative across all workers)
enhance metric descriptions with:
- explicit calculation examples showing which worker counts to use
- warnings about when NOT to multiply by worker count
- cross-references to relevant sections
add "Capacity Planning" section with:
- step-by-step throughput capacity calculations
- diagnostic guidance for common scenarios
- worked examples for estimating required vs actual capacity
addresses confusion from PR #11034 comments about when to multiply
metrics by worker count and how to interpret per-worker rates
addresses stream frame memory pooling issue where StreamFrame objects
weren't properly returned to sync.Pool during stream cancellation
see quic-go/quic-go#5327
* chore(deps): update go-libp2p to v0.44.0
- includes self-healing UPnP port mappings after router restarts
- update go-netroute to v0.3.0
- update quic-go to v0.55.0
- add changelog entry for UPnP fix
* docs: improve provide and UPnP clarity in changelog and docs
- add alert polling rationale to changelog
- add UPnP config note with default clarification
- clarify sweep timing and prefix length explanations
- add concrete examples for time offset and record holders
- improve workers stats formatting
- add See Also section to provide-stats.md
* docs: add RISC-V prebuilt binaries to changelog and README
- highlight linux-riscv64 availability with open hardware context
- update README with arm64 builds, remove 32-bit examples
* feat: provide stats
* added N/A
* format
* workers stats alignment
* ipfs provide stat --all --compact
* consolidating compact stat
* update column alignment
* flags combinations errors
* command description
* change schedule AvgPrefixLen to float
* changelog
* alignments
* provide stat description draft
* rephrased provide-stats.md
* linking provide-stats.md from command description
* documentation test
* fix: refactor provide stat command type handling
- add extractSweepingProvider() helper to reduce nested type switching
- extract lowWorkerThreshold constant for worker availability check
- fix --lan error handling to work with buffered providers
* docs: add clarifying comments
* fix(commands): improve provide stat compact mode
- prevent panic when both columns are empty
- fix column alignment with UTF-8 characters
- only track col0MaxWidth for first column (as intended)
* test: add tests for ipfs provide stat command
- test basic functionality, flags, JSON output
- test legacy provider behavior
- test integration with content scheduling
- test disabled provider configurations
- add parseSweepStats helper with t.Helper()
* docs: improve provide command help text
- update tagline to "Control and monitor content providing"
- simplify help descriptions
- make error messages more consistent
- update tests to match new error messages
* metrics rename
```
Next reprovide at:
Next prefix:
```
updated to:
```
Next region prefix:
Next region reprovide:
```
* docs: improve Provide system documentation clarity
Enhance documentation for the Provide system to better explain how provider
records work and the differences between sweep and legacy modes.
Changes to docs/config.md:
- Provide section: add clear explanation of provider records and their role
- Provide.DHT: add provider record lifecycle and two provider systems overview
- Provide.DHT.Interval: explain relationship to expiration, contrast sweep vs legacy behavior
- Provide.DHT.SweepEnabled: rewrite to explain legacy problem, sweep solution, and efficiency gains
- Monitoring section: prioritize command-line tools (ipfs provide stat) before Prometheus
Changes to core/commands/provide.go:
- ipfs provide stat help: add explanation of provider records, TTL expiration, and how sweep batching works
Changes to docs/changelogs/v0.39.md:
- Add context about why stats matter for monitoring provider health
- Emphasize real-time monitoring workflow with watch command
- Explain what users can observe (rates, queues, worker availability)
* depend on latest kad-dht master
* docs: nits
---------
Co-authored-by: Marcin Rataj <lidel@lidel.org>