docs: add sweeping provide worker count recommendation (#11001)

Add a worker count recommendation for the sweeping provide system, aimed at users with millions of CIDs.
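
For reference, the values recommended in the new tip would look roughly like the fragment below. This is only a sketch: the key paths are inferred from the anchors in the diff (`#providedhtmaxworkers` and so on), and placing `SweepEnabled` under `Provide.DHT` is an assumption rather than something this change states.

```json
{
  "Provide": {
    "DHT": {
      "SweepEnabled": true,
      "MaxWorkers": 2048,
      "DedicatedPeriodicWorkers": 1024,
      "DedicatedBurstWorkers": 512
    }
  }
}
```

With these numbers, the shared pool described in the docs (MaxWorkers - DedicatedPeriodicWorkers - DedicatedBurstWorkers) comes to 2048 - 1024 - 512 = 512 workers usable for either kind of work.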
2026-02-21 10:27:46 +08:00 · 2025-10-01 16:01:27 +02:00 · 2025-10-01 16:01:27 +02:00 · cf8194a8d1
commit cf8194a8d1
parent 42a4935abf
1 changed files with 25 additions and 0 deletions
--- a/docs/config.md
+++ b/docs/config.md
@@ -1633,12 +1633,14 @@ this limit in the configuration.
 **Why operations fail instead of auto-flushing:** Automatic flushing once the limit
 is reached was considered but rejected because it can lead to data corruption issues
 that are difficult to debug. When the system decides to flush without user knowledge, it can:
+
 - Create partial states that violate user expectations about atomicity
 - Interfere with concurrent operations in unexpected ways
 - Make debugging and recovery much harder when issues occur

 By failing explicitly, users maintain control over when their data is persisted,
 allowing them to:
+
 - Batch related operations together before flushing
 - Handle errors predictably at natural transaction boundaries
 - Understand exactly when and why their data is written to disk
@@ -1647,6 +1649,7 @@ If you expect automatic flushing behavior, simply use the default `--flush=true`
 (or omit the flag entirely) instead of `--flush=false`.

 **⚠️ WARNING:** Increasing this limit or disabling it (setting to 0) can lead to:
+
 - **Out-of-memory errors (OOM)** - Each unflushed operation consumes memory
 - **Data loss** - If the daemon crashes before flushing, all unflushed changes are lost
 - **Degraded performance** - Large unflushed caches slow down MFS operations
@@ -2033,6 +2036,19 @@ connections this setting can generate.
 > At the same time, mind that raising this value too high may lead to increased load.
 > Proceed with caution, ensure proper hardware and networking are in place.

+> [!TIP]
+> **When `SweepEnabled` is true:** Users providing millions of CIDs or more
+> should increase the worker count accordingly. Underprovisioning can lead to
+> slow provides (burst workers) and inability to keep up with content
+> reproviding (periodic workers). For nodes with sufficient resources (CPU,
+> bandwidth, number of connections), dedicating `1024` [periodic
+> workers](#providedhtdedicatedperiodicworkers) and `512` [burst
+> workers](#providedhtdedicatedburstworkers), with `2048` [max
+> workers](#providedhtmaxworkers), should be adequate even for the largest
+> users. The system will only use workers as needed - unused resources won't be
+> consumed. Ensure you adjust the swarm [connection manager](#swarmconnmgr) and
+> [resource manager](#swarmresourcemgr) configuration accordingly.
+
 Default: `16`

 Type: `optionalInteger` (non-negative; `0` means unlimited number of workers)
@@ -2098,6 +2114,11 @@ number of workers will be dedicated to the periodic region reprovide only. The s
 Any remaining workers (MaxWorkers - DedicatedPeriodicWorkers - DedicatedBurstWorkers)
 form a shared pool that can be used for either type of work as needed.

+> [!NOTE]
+> If the provider system isn't able to keep up with reproviding all your
+> content within the [Provide.DHT.Interval](#providedhtinterval), consider
+> increasing this value.
+
 Default: `2`

 Type: `optionalInteger` (`0` means there are no dedicated workers, but the
@@ -2121,6 +2142,10 @@ number of workers will be dedicated to burst provides only. In addition to
 these, if there are available workers in the pool, they can also be used for
 burst provides.

+> [!NOTE]
+> If CIDs aren't provided quickly enough for your needs and you can afford more
+> CPU and bandwidth, consider increasing this value.
+
 Default: `1`

 Type: `optionalInteger` (`0` means there are no dedicated workers, but the