* fix: remove timeout on default DHT operations
This removes the timeout by default for DHT operations. In particular
this causes issues with ProvideMany requests which can take an
indeterminate amount of time, but really these should just respect
context timeouts by default. Users can still specify timeouts here if
they want, but by default they will be set to "0" which means "no
timeout".
This is unlikely to break existing users of custom routing, because
there was previously no utility in configuring a router with timeout=0
because that would cause the router to immediately fail, so it is
unlikely (and incorrect) if anybody was using timeout=0.
* fix: remove 5m timeout on ProvideManyRouter
For context see
5fda291b66
---------
Co-authored-by: Marcin Rataj <lidel@lidel.org>
This is pretty common when working through PRs and ends up causing
tons of in-flight GitHub Actions workflows running because they aren't
currently canceled when a new commit is added. This will cancel
previous runs if a new commit is added on a branch (which is the
behavior we had on CircleCI).
This also means that rb-pinning-service-api is no longer required for
running remote pinning tests. This alone saves at least 3 minutes in
test runtime in CI because we don't need to checkout the repo, build
the Docker image, run it, etc.
Instead this implements a simple pinning service in Go that the test
runs in-process, with a callback that can be used to control the async
behavior of the pinning service (e.g. simulate work happening
asynchronously like transitioning from "queued" -> "pinning" ->
"pinned").
This also adds an environment variable to Kubo to control the MFS
remote pin polling interval, so that we don't have to wait 30 seconds
in the test for MFS changes to be repinned. This is purely for tests
so I don't think we should document this.
This entire test suite runs in around 2.5 sec on my laptop, compared to
the existing 3+ minutes in CI.
The test trims all whitespace bytes from the output of 'ipfs cat' but
if the random bytes end in a whitespace char then it trims that too,
resulting in random test failure.
Instead this updates the test harness to only trim a single trailing
newline char, so that it doesn't end up chomping legitimate output.
This fixes a deadlock introduced in 1457b4fd4a.
We can't use the coreapi here because it will try to take the PinLock (RLock) again, so revert this small part of 1457b4fd4a.
This used cause a deadlock when concurrently running `ipfs dag import` concurrently with the GC.
The bug is that `ipfs dag import` takes an RLock with the PinLock.
*the cars are imported, leaving a wide window of time*
Then GC Takes a Lock on that same RWMutex while taking the GC Lock (it blocks because it waits for the RLock to be released).
Then the car imports are finished and `ipfs dag import` tries to aqcuire the PinLock (doing an RLock) again in `Api().Pin`.
However at this point the RWMutex is starved, the runtime put a fence in front of RLocks if a Lock has been waiting for too lock (else you could have an endless stream of RLock / RUnlock forever delaying a Lock to ever go through).
The issue is that `ipfs dag import`'s original RLock which is blocking everyone will be released once it returns, which only happens when `Api().Pin` completes.
So we have a deadlock (ABA kind ?), because `ipfs dag import` waits on the GC Lock, which waits on `ipfs dag import`.
Calling the Pinner directly does not acquire the PinLock again, and thus does not have this issue.
- don't bypass the CoreApi
- don't use a goroutine and return channel for `importWorker`, when what's happening is really just a synchronous call
- only `PinLock()` when we are going to pin
- use `cid.Set` instead of an explicit map
- fail the request early if any pinning fail, no need to try to pin more if the request failed already
In practice there are cases when fx.Replace doesn't work as expected,
so this updates the documentation to recommend using fx.Decorate
instead which is a bit easier to use.