This was done taking a fresh look for someone learning about
the resource manager being enabled by default.
Moved and expanded documentation to the right places for more visibility.
Added an initial changelog entry.
This PR adds several new functionalities to make easier the usage of ResourceManager:
- Now resource manager logs when resources are exceeded are on ERROR instead of warning.
- The resources exceeded error now shows what kind of limit was reached and the scope.
- When there was no limit exceeded, we print a message for the user saying that limits are not exceeded anymore.
- Added `swarm limit all` command to show all set limits with the same format as `swarm stats all`
- Added `min-used-limit-perc` option to `swarm stats all` to only show stats that are above a specific percentage
- Simplify a lot default values.
- **Enable ResourceManager by default.**
Output example:
```
2022-11-09T10:51:40.565+0100 ERROR resourcemanager libp2p/rcmgr_logging.go:59 Consider inspecting logs and raising the resource manager limits. Documentation: https://github.com/ipfs/kubo/blob/master/docs/config.md#swarmresourcemgr
2022-11-09T10:51:50.565+0100 ERROR resourcemanager libp2p/rcmgr_logging.go:55 Resource limits were exceeded 483095 times with error "transient: cannot reserve inbound stream: resource limit exceeded".
2022-11-09T10:51:50.565+0100 ERROR resourcemanager libp2p/rcmgr_logging.go:59 Consider inspecting logs and raising the resource manager limits. Documentation: https://github.com/ipfs/kubo/blob/master/docs/config.md#swarmresourcemgr
2022-11-09T10:52:00.565+0100 ERROR resourcemanager libp2p/rcmgr_logging.go:55 Resource limits were exceeded 455294 times with error "transient: cannot reserve inbound stream: resource limit exceeded".
2022-11-09T10:52:00.565+0100 ERROR resourcemanager libp2p/rcmgr_logging.go:59 Consider inspecting logs and raising the resource manager limits. Documentation: https://github.com/ipfs/kubo/blob/master/docs/config.md#swarmresourcemgr
2022-11-09T10:52:10.565+0100 ERROR resourcemanager libp2p/rcmgr_logging.go:55 Resource limits were exceeded 471384 times with error "transient: cannot reserve inbound stream: resource limit exceeded".
2022-11-09T10:52:10.565+0100 ERROR resourcemanager libp2p/rcmgr_logging.go:59 Consider inspecting logs and raising the resource manager limits. Documentation: https://github.com/ipfs/kubo/blob/master/docs/config.md#swarmresourcemgr
2022-11-09T10:52:20.565+0100 ERROR resourcemanager libp2p/rcmgr_logging.go:55 Resource limits were exceeded 8 times with error "peer:12D3KooWKqcaBtcmZKLKCCoDPBuA6AXGJMNrLQUPPMsA5Q6D1eG6: cannot reserve inbound stream: resource limit exceeded".
2022-11-09T10:52:20.565+0100 ERROR resourcemanager libp2p/rcmgr_logging.go:55 Resource limits were exceeded 192 times with error "peer:12D3KooWPjetWPGQUih9LZTGHdyAM9fKaXtUxDyBhA93E3JAWCXj: cannot reserve inbound stream: resource limit exceeded".
2022-11-09T10:52:20.565+0100 ERROR resourcemanager libp2p/rcmgr_logging.go:55 Resource limits were exceeded 469746 times with error "transient: cannot reserve inbound stream: resource limit exceeded".
2022-11-09T10:52:20.565+0100 ERROR resourcemanager libp2p/rcmgr_logging.go:59 Consider inspecting logs and raising the resource manager limits. Documentation: https://github.com/ipfs/kubo/blob/master/docs/config.md#swarmresourcemgr
2022-11-09T10:52:30.565+0100 ERROR resourcemanager libp2p/rcmgr_logging.go:55 Resource limits were exceeded 484137 times with error "transient: cannot reserve inbound stream: resource limit exceeded".
2022-11-09T10:52:30.565+0100 ERROR resourcemanager libp2p/rcmgr_logging.go:55 Resource limits were exceeded 29 times with error "peer:12D3KooWPjetWPGQUih9LZTGHdyAM9fKaXtUxDyBhA93E3JAWCXj: cannot reserve inbound stream: resource limit exceeded".
2022-11-09T10:52:30.565+0100 ERROR resourcemanager libp2p/rcmgr_logging.go:59 Consider inspecting logs and raising the resource manager limits. Documentation: https://github.com/ipfs/kubo/blob/master/docs/config.md#swarmresourcemgr
2022-11-09T10:52:40.565+0100 ERROR resourcemanager libp2p/rcmgr_logging.go:55 Resource limits were exceeded 468843 times with error "transient: cannot reserve inbound stream: resource limit exceeded".
2022-11-09T10:52:40.566+0100 ERROR resourcemanager libp2p/rcmgr_logging.go:59 Consider inspecting logs and raising the resource manager limits. Documentation: https://github.com/ipfs/kubo/blob/master/docs/config.md#swarmresourcemgr
2022-11-09T10:52:50.566+0100 ERROR resourcemanager libp2p/rcmgr_logging.go:55 Resource limits were exceeded 366638 times with error "transient: cannot reserve inbound stream: resource limit exceeded".
2022-11-09T10:52:50.566+0100 ERROR resourcemanager libp2p/rcmgr_logging.go:59 Consider inspecting logs and raising the resource manager limits. Documentation: https://github.com/ipfs/kubo/blob/master/docs/config.md#swarmresourcemgr
2022-11-09T10:53:00.566+0100 ERROR resourcemanager libp2p/rcmgr_logging.go:55 Resource limits were exceeded 405526 times with error "transient: cannot reserve inbound stream: resource limit exceeded".
2022-11-09T10:53:00.566+0100 ERROR resourcemanager libp2p/rcmgr_logging.go:55 Resource limits were exceeded 107 times with error "peer:12D3KooWQZQCwevTDGhkE9iGYk5sBzWRDUSX68oyrcfM9tXyrs2Q: cannot reserve inbound stream: resource limit exceeded".
2022-11-09T10:53:00.566+0100 ERROR resourcemanager libp2p/rcmgr_logging.go:59 Consider inspecting logs and raising the resource manager limits. Documentation: https://github.com/ipfs/kubo/blob/master/docs/config.md#swarmresourcemgr
2022-11-09T10:53:10.566+0100 ERROR resourcemanager libp2p/rcmgr_logging.go:55 Resource limits were exceeded 336923 times with error "transient: cannot reserve inbound stream: resource limit exceeded".
2022-11-09T10:53:10.566+0100 ERROR resourcemanager libp2p/rcmgr_logging.go:59 Consider inspecting logs and raising the resource manager limits. Documentation: https://github.com/ipfs/kubo/blob/master/docs/config.md#swarmresourcemgr
2022-11-09T10:53:20.565+0100 ERROR resourcemanager libp2p/rcmgr_logging.go:55 Resource limits were exceeded 71 times with error "transient: cannot reserve inbound stream: resource limit exceeded".
2022-11-09T10:53:20.565+0100 ERROR resourcemanager libp2p/rcmgr_logging.go:59 Consider inspecting logs and raising the resource manager limits. Documentation: https://github.com/ipfs/kubo/blob/master/docs/config.md#swarmresourcemgr
2022-11-09T10:53:30.565+0100 ERROR resourcemanager libp2p/rcmgr_logging.go:64 Resrouce limits are no longer being exceeded.
```
## Validation tests
- Accelerated DHT client runs with no errors when ResourceManager is active. No problems were observed.
- Running an attack with 200 connections and 1M streams using yamux protocol. Node was usable during the attack. With ResourceManager deactivated, the node was killed by the OS because of the amount of memory consumed.
- Actions done when the attack was active:
- Add files
- Force a reprovide
- Use the gateway to resolve an IPNS address.
It closes#9001
It closes#9351
It closes#9322
New multi-router configuration system based on https://hackmd.io/G1KRDEX5T3qyfoBMkIrBew#Methods
- Added a new routing type: "custom"
- Added specific struct types for different Routers (instead of map[string]interface{})
- Added `Duration` config type, to make easier time string parsing
- Added config documentation.
- Use the latest go-delegated-routing library version with GET support.
- Added changelog notes for this feature.
It:
- closes#9157
- closes#9079
- closes#9186
* correct data type for Reprovider.Interval
Signed-off-by: Oleg Silkin <16809287+osilkin98@users.noreply.github.com>
Co-authored-by: Jorropo <jorropo.pgm@gmail.com>
* Add reference to Experimental config doc
this clarifies that `Experimental` is in fact a top level configuration key, and links to the most current documentation
Co-authored-by: Adin Schmahmann <adin.schmahmann@gmail.com>
* Delegated Routing.
Implementation of Reframe specs (https://github.com/ipfs/specs/blob/master/REFRAME.md) using go-delegated-routing library.
* Requested changes.
* Init using op string
* Separate possible ContentRouters for TopicDiscovery.
If we don't do this, we have a ciclic dependency creating TieredRouter.
Now we can create first all possible content routers, and after that,
create Routers.
* Set dht default routing type
* Add tests and remove uneeded code
* Add documentation.
* docs: Routing.Routers
* Requested changes.
Signed-off-by: Antonio Navarro Perez <antnavper@gmail.com>
* Add some documentation on new fx functions.
* Add changelog entry and integration tests
* test: sharness for 'dht' in 'routing' commands
Since 'routing' is currently the same as 'dht' (minus query command)
we need to test both, that way we won't have unnoticed divergence
in the default behavior.
* test(sharness): delegated routing via reframe URL
* Add more tests for delegated routing.
* If any put operation fails, the tiered router will fail.
* refactor: Routing.Routers: Parameters.Endpoint
As agreed in https://github.com/ipfs/kubo/pull/8997#issuecomment-1175684716
* Try to improve CHANGELOG entry.
* chore: update reframe spec link
* Update go-delegated-routing dependency
* Fix config error test
* use new changelog format
* Remove port conflict
* go mod tidy
* ProviderManyWrapper to ProviderMany
* Update docs/changelogs/v0.14.md
Co-authored-by: Adin Schmahmann <adin.schmahmann@gmail.com>
Co-authored-by: Marcin Rataj <lidel@lidel.org>
Co-authored-by: Adin Schmahmann <adin.schmahmann@gmail.com>
* fix: remove mdns_legacy
We've been running both implementations for a long, long time.
It is time to remove legacy version and lower the number of LAN packets
IPFS node produces.
See https://github.com/ipfs/go-ipfs/pull/9048#discussion_r906814717
for the Interval removal rational.
* feat: disable resource manager by default
We are disabling this by default for v0.13 as we work to improve the
UX around Resource Manager. It is still usable and can be enabled in
the IPFS config with "ipfs config --bool Swarm.ResourceMgr.Enabled true".
We intend to enable Resource Manager by default in a subsequent
release.
* docs(config): Swarm.ResourceMgr disabled by default
Co-authored-by: Marcin Rataj <lidel@lidel.org>
* fix(core/gateway): option to limit directory size listing
* feat(gw): HTMLDirListingLimit
This is alternative take on the way we limit the HTML listing output.
Instead of a hard cut-off, we list up to HTMLDirListingLimit.
When a directory has more items than HTMLDirListingLimit we show
additional header and footer informing user that only $HTMLDirListingLimit
items are listed. This is a better UX.
* fix: 0 disables Gateway.HTMLDirListingLimit
* refactor: Gateway.FastDirIndexThreshold
see explainer in docs/config.md
* refactor: prealoc slices
* docs: Gateway.FastDirIndexThreshold
* refactor: core/corehttp/gateway_handler.go
https://github.com/ipfs/go-ipfs/pull/8853#discussion_r851437088
* docs: apply suggestions from code review
Co-authored-by: Alan Shaw <alan.shaw@protocol.ai>
Co-authored-by: Marcin Rataj <lidel@lidel.org>
Co-authored-by: Alan Shaw <alan.shaw@protocol.ai>
* update go-libp2p to v0.19.0
* chore: go-namesys v0.5.0
* refactor(config): cleanup relay handling
* docs(config): document updated defaults
* fix(tests): panic during sharness
* fix: t0160-resolve.sh
See https://github.com/ipfs/go-namesys/pull/32
* fix: t0182-circuit-relay.sh
* test: transport encryption
Old tests were no longer working because go-libp2p 0.19 removed
the undocumented 'ls' pseudoprotocol.
This replaces these tests with handshake attempt (name is echoed back on
OK or 'na' is returned when protocol is not available) for tls and noise
variants + adds explicit test that safeguards us against enabling
plaintext by default by a mistake.
* fix: ./t0182-circuit-relay.sh
test is flaky, for now we just restart the testbed when we get
NO_RESERVATION error
* refactor: AutoRelayFeeder with exp. backoff
It starts at feeding peers ever 15s, then backs off each time
until it is done once an hour
Should be acceptable until we have smarter mechanism in go-lib2p 0.20
* feat(AutoRelay): prioritize Peering.Peers
This ensures we feed trusted Peering.Peers in addition to any peers
discovered over DHT.
* docs(CHANGELOG): document breaking changes
Co-authored-by: Marcin Rataj <lidel@lidel.org>
Co-authored-by: Gus Eggert <gus@gus.dev>
* update go-libp2p to v0.18.0
* initialize the resource manager
* add resource manager stats/limit commands
* load limit file when building resource manager
* log absent limit file
* write rcmgr to file when IPFS_DEBUG_RCMGR is set
* fix: mark swarm limit|stats as experimental
* feat(cfg): opt-in Swarm.ResourceMgr
This ensures we can safely test the resource manager without impacting
default behavior.
- Resource manager is disabled by default
- Default for Swarm.ResourceMgr.Enabled is false for now
- Swarm.ResourceMgr.Limits allows user to tweak limits per specific
scope in a way that is persisted across restarts
- 'ipfs swarm limit system' outputs human-readable json
- 'ipfs swarm limit system new-limits.json' sets new runtime limits
(but does not change Swarm.ResourceMgr.Limits in the config)
Conventions to make libp2p devs life easier:
- 'IPFS_RCMGR=1 ipfs daemon' overrides the config and enables resource manager
- 'limit.json' overrides implicit defaults from libp2p (if present)
* docs(config): small tweaks
* fix: skip libp2p.ResourceManager if disabled
This ensures 'ipfs swarm limit|stats' work only when enabled.
* fix: use NullResourceManager when disabled
This reverts commit b19f7c9eca.
after clarification feedback from
https://github.com/ipfs/go-ipfs/pull/8680#discussion_r841680182
* style: rename IPFS_RCMGR to LIBP2P_RCMGR
preexisting libp2p toggles use LIBP2P_ prefix
* test: Swarm.ResourceMgr
* fix: location of opt-in limit.json and rcmgr.json.gz
Places these files inside of IPFS_PATH
* Update docs/config.md
* feat: expose rcmgr metrics when enabled (#8785)
* add metrics for the resource manager
* export protocol and service name in Prometheus metrics
* fix: expose rcmgr metrics only when enabled
Co-authored-by: Marcin Rataj <lidel@lidel.org>
* refactor: rcmgr_metrics.go
* refactor: rcmgr_defaults.go
This file defines implicit limit defaults used when Swarm.ResourceMgr.Enabled
We keep vendored copy to ensure go-ipfs is not impacted when go-libp2p
decides to change defaults in any of the future releases.
* refactor: adjustedDefaultLimits
Cleans up the way we initialize defaults and adds a fix for case
when connection manager runs with high limits.
It also hides `Swarm.ResourceMgr.Limits` until we have a better
understanding what syntax makes sense.
* chore: cleanup after a review
* fix: restore go-ipld-prime v0.14.2
* fix: restore go-ds-flatfs v0.5.1
Co-authored-by: Lucas Molas <schomatis@gmail.com>
Co-authored-by: Marcin Rataj <lidel@lidel.org>
This should fix the issue of users thinking badger
is "no-brainer faster choice" and then running into problems.
Co-authored-by: Johnny <9611008+johnnymatthews@users.noreply.github.com>
* feat: use Swarm.EnableHolePunching flag within libp2p
* docs: Swarm.EnableHolePunching
Co-authored-by: Marcin Rataj <lidel@lidel.org>
Co-authored-by: Adin Schmahmann <adin.schmahmann@gmail.com>
* Allow pubsub and namesys-pubsub to be enabled via config
Signed-off-by: Joe Holden <jwh@zorins.us>
Co-authored-by: Marcin Rataj <lidel@lidel.org>
Co-authored-by: Adin Schmahmann <adin.schmahmann@gmail.com>
* plumb through go-datastore context changes
* update go-libp2p to v0.16.0
* use LIBP2P_TCP_REUSEPORT instead of IPFS_REUSEPORT
* use relay config
* making deprecation notice match the go-ipfs-config key
* docs(config): circuit relay v2
* docs(config): fix links and headers
* feat(config): Internal.Libp2pForceReachability
This switches to config that supports setting and reading
Internal.Libp2pForceReachability OptionalString flag
* use configuration option for static relays
* chore: go-ipfs-config v0.18.0
https://github.com/ipfs/go-ipfs-config/releases/tag/v0.18.0
* feat: circuit v1 migration prompt when Swarm.EnableRelayHop is set (#8559)
* exit when Swarm.EnableRelayHop is set
* docs: Experimental.ShardingEnabled migration
This ensures existing users of global sharding experiment get notified
that the flag no longer works + that autosharding happens automatically.
For people who NEED to keep the old behavior (eg. have no time to
migrate today) there is a note about restoring it with
`UnixFSShardingSizeThreshold`.
* chore: add dag-jose code to the cid command output
* add support for setting automatic unixfs sharding threshold from the config
* test: have tests use low cutoff for sharding to mimic old behavior
* test: change error message to match the current error
* test: Add automatic sharding/unsharding tests (#8547)
* test: refactored naming in the sharding sharness tests to make more sense
* ci: set interop test executor to convenience image for Go1.16 + Node
* ci: use interop master
Co-authored-by: Marcin Rataj <lidel@lidel.org>
Co-authored-by: Marten Seemann <martenseemann@gmail.com>
Co-authored-by: Marcin Rataj <lidel@lidel.org>
Co-authored-by: Gus Eggert <gus@gus.dev>
Co-authored-by: Lucas Molas <schomatis@gmail.com>
* feat: extract Bitswap fx initialization to its own file
* chore: bump go-bitswap dependency
* feat: bump go-ipfs-config dependency and utilize the new Internal.Bitswap configuration options. Add documentation around the new OptionalInteger config type as well as the Internal.Bitswap options.
* docs(docs/config.md): move the table of contents towards the top of the document and update it
Co-authored-by: Petar Maymounkov <petarm@gmail.com>
Co-authored-by: Marcin Rataj <lidel@lidel.org>
Co-authored-by: Gus Eggert <877588+guseggert@users.noreply.github.com>
* Enable downloading migrations over IPFS
There are now options in the config file that control how migrations are downloaded. This includes enabling downloading migrations using IPFS by (when migrations are required) spinning up a temporary node for fetching the migrations before running them. There is also a config option to decide what to do with the migrations binaries once they are downloaded (e.g. cache or pin them in your node, or just throw out the data).
Co-authored-by: Steven Allen <steven@stebalien.com>
- document implicit defaults for eth and crypto TLDs
- be very honest and link to ToS of cloudflare-eth.com
- provide suggestion that one should run own resolver if they really care about decentralization
We've had a reliable and enabled by default TLS implementation since
0.4.23 (over a year ago) and turned off SECIO in September of last year.
We might as well remove support entirely in the next release and
encourage users to upgrade their networks.
Noise is faster, anyways.
We used Clear-Site-Data to cushion transition period for local gateway
exposed at http://localhost while we were still figuring out
security-related details.
In the final implementation subdomain gateways are not tied to a
hostname explicitly, which removes the risk of cookies leaking,
removing the need for the header.
Turns out it causes issues for Firefox users, so let's just remove it.
Closes https://github.com/ipfs-shipyard/ipfs-companion/issues/977
Added support for remote pinning services
A pinning service is a service that accepts CIDs from a user in order to host the data associated with them.
The spec for these services is defined at https://github.com/ipfs/pinning-services-api-spec
Support is available via the `ipfs pin remote` CLI and the corresponding HTTP API
Co-authored-by: Petar Maymounkov <petarm@gmail.com>
Co-authored-by: Marcin Rataj <lidel@lidel.org>
Co-authored-by: Adin Schmahmann <adin.schmahmann@gmail.com>
Add support for one or more wildcards in the hostname definition
of a public gateway. This is useful for example to support easily
multiples environment.
Wildcarded hostname are set in the config as for example "*.domain.tld".
MVP for #6097
This feature will repeatedly reconnect (with a randomized exponential backoff)
to peers in a set of "peered" peers.
In the future, this should be extended to:
1. Include a CLI for modifying this list at runtime.
2. Include additional options for peers we want to _protect_ but not connect to.
3. Allow configuring timeouts, backoff, etc.
4. Allow groups? Possibly through textile threads.
5. Allow for runtime-only peering rules.
6. Different reconnect policies.
But this MVP should be a significant step forward.
1. Enable AutoNATService on _all_ nodes by default. If it's an issue, we can
disable it in RC3 but this will give us the best testing results.
2. Expose options to configure AutoNAT rate limiting.
Updated docs to reflect that Routing is not a child of Discovery
Ref. 664ccd976e/routing.go (L6)
License: MIT
Signed-off-by: Marcin Rataj <lidel@lidel.org>
Clarifying that the Announce array is only used when non-empty
and will override the default inferred announce addresses.
License: MIT
Signed-off-by: John Reed <john@re2d.xyz>
config.md: updating the Addresses section
Adding documentation on Announce and NoAnnounce.
Fixing typo in API default value.
License: MIT
Signed-off-by: John Reed <john@re2d.xyz>