mirror of https://github.com/ipfs/kubo.git
docs: add sweeping provide worker count recommendation (#11001)
Add recommendation for worker count for the sweeping provide system for
users with millions of CIDs.
(cherry picked from commit cf8194a8d1)
parent 97a51fb7c1
commit 48cb03c3f4
@ -1633,12 +1633,14 @@ this limit in the configuration.
**Why operations fail instead of auto-flushing:** Automatic flushing once the limit
is reached was considered but rejected because it can lead to data corruption issues
that are difficult to debug. When the system decides to flush without user knowledge, it can:

- Create partial states that violate user expectations about atomicity
- Interfere with concurrent operations in unexpected ways
- Make debugging and recovery much harder when issues occur

By failing explicitly, users maintain control over when their data is persisted,
allowing them to:

- Batch related operations together before flushing
- Handle errors predictably at natural transaction boundaries
- Understand exactly when and why their data is written to disk
@ -1647,6 +1649,7 @@ If you expect automatic flushing behavior, simply use the default `--flush=true`
(or omit the flag entirely) instead of `--flush=false`.

**⚠️ WARNING:** Increasing this limit or disabling it (setting to 0) can lead to:

- **Out-of-memory errors (OOM)** - Each unflushed operation consumes memory
- **Data loss** - If the daemon crashes before flushing, all unflushed changes are lost
- **Degraded performance** - Large unflushed caches slow down MFS operations
@ -2033,6 +2036,19 @@ connections this setting can generate.
> At the same time, mind that raising this value too high may lead to increased load.
> Proceed with caution, ensure proper hardware and networking are in place.

> [!TIP]
> **When `SweepEnabled` is true:** Users providing millions of CIDs or more
> should increase the worker count accordingly. Underprovisioning can lead to
> slow provides (burst workers) and inability to keep up with content
> reproviding (periodic workers). For nodes with sufficient resources (CPU,
> bandwidth, number of connections), dedicating `1024` for [periodic
> workers](#providedhtdedicatedperiodicworkers) and `512` for [burst
> workers](#providedhtdedicatedburstworkers), and `2048` [max
> workers](#providedhtmaxworkers) should be adequate even for the largest
> users. The system will only use workers as needed - unused resources won't be
> consumed. Ensure you adjust the swarm [connection manager](#swarmconnmgr) and
> [resource manager](#swarmresourcemgr) configuration accordingly.

Default: `16`

Type: `optionalInteger` (non-negative; `0` means unlimited number of workers)
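As a rough illustration of the tip above, the sketch below builds the suggested worker counts as a JSON fragment. The nesting under `Provide.DHT` is inferred from the anchor names used in this section (for example `#providedhtmaxworkers`), and `recommended_provide_config` is a hypothetical helper written for this example, not part of Kubo. The numbers only make sense for nodes with the CPU, bandwidth, and connection budget the tip describes.

```python
import json


def recommended_provide_config():
    """Return the worker counts suggested in the tip for very large providers.

    Hypothetical helper: the key layout is assumed from the section anchors.
    """
    return {
        "Provide": {
            "DHT": {
                "MaxWorkers": 2048,
                "DedicatedPeriodicWorkers": 1024,
                "DedicatedBurstWorkers": 512,
            }
        }
    }


if __name__ == "__main__":
    # Print the fragment so it can be compared against an existing config file.
    print(json.dumps(recommended_provide_config(), indent=2))
```

The fragment covers the worker settings only. The connection manager and resource manager limits mentioned in the tip still need to be adjusted separately.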
@ -2098,6 +2114,11 @@ number of workers will be dedicated to the periodic region reprovide only. The s
Any remaining workers (MaxWorkers - DedicatedPeriodicWorkers - DedicatedBurstWorkers)
form a shared pool that can be used for either type of work as needed.
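
The shared pool size follows directly from the three settings. The sketch below works through that subtraction for the documented defaults (`16`, `2`, `1`) and for the larger values suggested in the `MaxWorkers` tip. `shared_pool_size` is a hypothetical helper for illustration only. Returning `None` when `MaxWorkers` is `0` and rejecting over-committed dedicated workers are assumptions of this sketch, not documented Kubo behavior.

```python
def shared_pool_size(max_workers, dedicated_periodic, dedicated_burst):
    """Workers left for either kind of work, or None when MaxWorkers is 0."""
    if max_workers == 0:
        # Assumption: 0 means unlimited workers, so there is no fixed-size pool.
        return None
    shared = max_workers - dedicated_periodic - dedicated_burst
    if shared < 0:
        # Assumption: this sketch treats over-commitment as a misconfiguration.
        raise ValueError("dedicated workers exceed MaxWorkers")
    return shared


print(shared_pool_size(16, 2, 1))         # defaults: 16 - 2 - 1 = 13
print(shared_pool_size(2048, 1024, 512))  # values from the tip: 512
```

With the defaults, most workers sit in the shared pool. With the tip's values, three quarters of the workers are reserved and a quarter remains shared.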

> [!NOTE]
> If the provider system isn't able to keep up with reproviding all your
> content within the [Provide.DHT.Interval](#providedhtinterval), consider
> increasing this value.

Default: `2`

Type: `optionalInteger` (`0` means there are no dedicated workers, but the
@ -2121,6 +2142,10 @@ number of workers will be dedicated to burst provides only. In addition to
these, if there are available workers in the pool, they can also be used for
burst provides.

> [!NOTE]
> If CIDs aren't provided quickly enough to your taste, and you can afford more
> CPU and bandwidth, consider increasing this value.

Default: `1`

Type: `optionalInteger` (`0` means there are no dedicated workers, but the