mirror of
https://github.com/ipfs/kubo.git
synced 2026-02-22 02:47:48 +08:00
* feat(config): Import.* and unixfs-v1-2025 profile
implements IPIP-499: add config options for controlling UnixFS DAG
determinism and introduces `unixfs-v1-2025` and `unixfs-v0-2015`
profiles for cross-implementation CID reproducibility.
changes:
- add Import.* fields: HAMTDirectorySizeEstimation, SymlinkMode,
DAGLayout, IncludeEmptyDirectories, IncludeHidden
- add validation for all Import.* config values
- add unixfs-v1-2025 profile (recommended for new data)
- add unixfs-v0-2015 profile (alias: legacy-cid-v0)
- remove deprecated test-cid-v1 and test-cid-v1-wide profiles
- wire Import.HAMTSizeEstimationMode() to boxo globals
- update go.mod to use boxo with SizeEstimationMode support
ref: https://specs.ipfs.tech/ipips/ipip-0499/
* feat(add): add --dereference-symlinks, --empty-dirs, --hidden CLI flags
add CLI flags for controlling file collection behavior during ipfs add:
- `--dereference-symlinks`: recursively resolve symlinks to their target
content (replaces deprecated --dereference-args which only worked on
CLI arguments). wired through go-ipfs-cmds to boxo's SerialFileOptions.
- `--empty-dirs` / `-E`: include empty directories (default: true)
- `--hidden` / `-H`: include hidden files (default: false)
these flags are CLI-only and not wired to Import.* config options because
go-ipfs-cmds library handles input file filtering before the directory
tree is passed to kubo. removed unused Import.UnixFSSymlinkMode config
option that was defined but never actually read by the CLI.
also:
- wire --trickle to Import.UnixFSDAGLayout config default
- update go-ipfs-cmds to v0.15.1-0.20260117043932-17687e216294
- add SYMLINK HANDLING section to ipfs add help text
- add CLI tests for all three flags
ref: https://github.com/ipfs/specs/pull/499
* test(add): add CID profile tests and wire SizeEstimationMode
add comprehensive test suite for UnixFS CID determinism per IPIP-499:
- verify exact HAMT threshold boundary for both estimation modes:
- v0-2015 (links): sum(name_len + cid_len) == 262144
- v1-2025 (block): serialized block size == 262144
- verify HAMT triggers at threshold + 1 byte for both profiles
- add all deterministic CIDs for cross-implementation testing
also wires SizeEstimationMode through CLI/API, allowing
Import.UnixFSHAMTSizeEstimation config to take effect.
bumps boxo to ipfs/boxo@6707376 which aligns HAMT threshold with
JS implementation (uses > instead of >=), fixing CID determinism
at the exact 256 KiB boundary.
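The boundary behavior described above can be sketched as follows. This is a minimal illustration, not boxo's implementation: the `link` type and the `linksEstimate` and `shouldShard` helpers are hypothetical names. Only the 262144-byte threshold, the `sum(name_len + cid_len)` estimate, and the strict `>` comparison come from the text.

```go
package main

import "fmt"

// hamtThreshold is the 256 KiB boundary (262144 bytes) quoted in the commit message.
const hamtThreshold = 256 * 1024

// link is a hypothetical stand-in for one directory entry.
type link struct {
	nameLen int // length of the entry name in bytes
	cidLen  int // length of the binary CID in bytes
}

// linksEstimate sums name_len + cid_len over all entries ("links" mode).
func linksEstimate(links []link) int {
	total := 0
	for _, l := range links {
		total += l.nameLen + l.cidLen
	}
	return total
}

// shouldShard uses a strict > comparison, so a directory whose size is
// exactly at the threshold stays a basic directory.
func shouldShard(size int) bool {
	return size > hamtThreshold
}

func main() {
	fmt.Println(shouldShard(hamtThreshold))     // exactly at the threshold
	fmt.Println(shouldShard(hamtThreshold + 1)) // one byte over
}
```

With `>=` instead of `>`, the first call would report sharding at exactly 262144 bytes, which is the mismatch the boxo bump removes.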
* feat(add): --dereference-symlinks now resolves all symlinks
Previously, resolving symlinks required two flags:
- --dereference-args: resolved symlinks passed as CLI arguments
- --dereference-symlinks: resolved symlinks inside directories
Now --dereference-symlinks handles both cases. Users only need one flag
to fully dereference symlinks when adding files to IPFS.
The deprecated --dereference-args still works for backwards compatibility
but is no longer necessary.
* chore: update boxo and improve changelog
- update boxo to ebdaf07c (nil filter fix, thread-safety docs)
- simplify changelog for IPIP-499 section
- shorten test names, move context to comments
* chore: update boxo to 5cf22196
* chore: apply suggestions from code review
Co-authored-by: Andrew Gillis <11790789+gammazero@users.noreply.github.com>
* test(add): verify balanced DAG layout produces uniform leaf depth
add test that confirms kubo uses balanced layout (all leaves at same
depth) rather than balanced-packed (varying depths). creates 45MiB file
to trigger multi-level DAG and walks it to verify leaf depth uniformity.
includes trickle subtest to validate test logic can detect varying depths.
supports CAR export via DAG_LAYOUT_CAR_OUTPUT env var for test vectors.
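The leaf-depth check described above can be sketched roughly as below. The `node` type and both helpers are hypothetical in-memory stand-ins, not the types used by the real test, which walks an actual UnixFS DAG.

```go
package main

import "fmt"

// node is a hypothetical stand-in for a DAG node; a node without
// children is treated as a leaf.
type node struct {
	children []*node
}

// leafDepths counts how many leaves are found at each depth below n.
func leafDepths(n *node, depth int, seen map[int]int) {
	if len(n.children) == 0 {
		seen[depth]++
		return
	}
	for _, c := range n.children {
		leafDepths(c, depth+1, seen)
	}
}

// uniformLeafDepth reports whether every leaf sits at the same depth.
func uniformLeafDepth(root *node) bool {
	seen := map[int]int{}
	leafDepths(root, 0, seen)
	return len(seen) == 1
}

func main() {
	leaf := func() *node { return &node{} }

	// All leaves at depth 2, as expected from a balanced layout.
	balanced := &node{children: []*node{
		{children: []*node{leaf(), leaf()}},
		{children: []*node{leaf(), leaf()}},
	}}

	// Leaves at depth 1 and depth 2, the kind of shape a trickle layout produces.
	uneven := &node{children: []*node{
		leaf(),
		{children: []*node{leaf(), leaf()}},
	}}

	fmt.Println(uniformLeafDepth(balanced), uniformLeafDepth(uneven))
}
```

Running the same check against a tree with mixed depths mirrors the trickle subtest: it confirms the walker can actually detect non-uniform depth.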
* chore(deps): update boxo to 6141039ad8ef
switches to 6141039ad8
changes since 5cf22196ad0b:
- refactor(unixfs): use arithmetic for exact block size calculation
- refactor(unixfs): unify size tracking and make SizeEstimationMode immutable
- feat(unixfs): optimize SizeEstimationBlock and add mode/mtime tests
also clarifies that directory sharding globals affect both `ipfs add` and MFS.
* test(cli): improve HAMT threshold tests with exact +1 byte verification
- add UnixFSDataType() helper to directly check UnixFS type via protobuf
- refactor threshold tests to use exact +1 byte calculations instead of +1 file
- verify directory type directly (ft.TDirectory vs ft.THAMTShard) instead of
inferring from link count
- clean up helper function signatures by removing unused cidLength parameter
* test(cli): consolidate profile tests into cid_profiles_test.go
remove duplicate profile threshold tests from add_test.go since they
are fully covered by the data-driven tests in cid_profiles_test.go.
changes:
- improve test names to describe what threshold is being tested
- add inline documentation explaining each test's purpose
- add byte-precise helper IPFSAddDeterministicBytes for threshold tests
- remove ~200 lines of duplicated test code from add_test.go
- keep non-profile tests (pinning, symlinks, hidden files) in add_test.go
* chore: update to rebased boxo and go-ipfs-cmds PRs
* docs: add HAMT threshold fix details to changelog
* feat(mfs): use Import config for CID version and hash function
make MFS commands (files cp, files write, files mkdir, files chcid)
respect Import.CidVersion and Import.HashFunction config settings
when CLI options are not explicitly provided.
also add tests for:
- files write respects Import.UnixFSRawLeaves=true
- single-block file: files write produces same CID as ipfs add
- updated comments clarifying CID parity with ipfs add
* feat(files): wire Import.UnixFSChunker and UnixFSDirectoryMaxLinks to MFS
`ipfs files` commands now respect these Import.* config options:
- UnixFSChunker: configures chunk size for `files write`
- UnixFSDirectoryMaxLinks: triggers HAMT sharding in `files mkdir`
- UnixFSHAMTDirectorySizeEstimation: controls size estimation mode
previously, MFS used hardcoded defaults ignoring user config.
changes:
- config/import.go: add UnixFSSplitterFunc() returning chunk.SplitterGen
- core/node/core.go: pass chunker, maxLinks, sizeEstimationMode to
mfs.NewRoot() via new boxo RootOption API
- core/commands/files.go: pass maxLinks and sizeEstimationMode to
mfs.Mkdir() and ensureContainingDirectoryExists(); document that
UnixFSFileMaxLinks doesn't apply to files write (trickle DAG limitation)
- test/cli/files_test.go: add tests for UnixFSDirectoryMaxLinks and
UnixFSChunker, including CID parity test with `ipfs add --trickle`
related: boxo@54e044f1b265
* feat(files): wire Import.UnixFSHAMTDirectoryMaxFanout and UnixFSHAMTDirectorySizeThreshold
wire remaining HAMT config options to MFS root:
- Import.UnixFSHAMTDirectoryMaxFanout via mfs.WithMaxHAMTFanout
- Import.UnixFSHAMTDirectorySizeThreshold via mfs.WithHAMTShardingSize
add CLI tests:
- files mkdir respects Import.UnixFSHAMTDirectoryMaxFanout
- files mkdir respects Import.UnixFSHAMTDirectorySizeThreshold
- config change takes effect after daemon restart
add UnixFSHAMTFanout() helper to test harness
update boxo to ac97424d99ab90e097fc7c36f285988b596b6f05
* fix(mfs): single-block files in CIDv1 dirs now produce raw CIDs
problem: `ipfs files write` in CIDv1 directories wrapped single-block
files in dag-pb even when raw-leaves was enabled, producing different
CIDs than `ipfs add --raw-leaves` for the same content.
fix: boxo now collapses single-block ProtoNode wrappers (with no
metadata) to RawNode in DagModifier.GetNode(). files with mtime/mode
stay as dag-pb since raw blocks cannot store UnixFS metadata.
also fixes sparse file writes where writing past EOF would lose data
because expandSparse didn't update the internal node pointer.
updates boxo to v0.36.1-0.20260203003133-7884ae23aaff
updates t0250-files-api.sh test hashes to match new behavior
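The single-block collapse rule described above can be written as a small predicate. This is only a sketch: `fileInfo` and `collapsesToRaw` are hypothetical names, and requiring raw leaves to be enabled is an assumption drawn from the problem statement rather than from boxo's code.

```go
package main

import "fmt"

// fileInfo is a hypothetical summary of a file written through MFS.
type fileInfo struct {
	blocks   int  // number of data blocks
	hasMtime bool // UnixFS mtime metadata present
	hasMode  bool // UnixFS mode metadata present
}

// collapsesToRaw reports whether the dag-pb wrapper can be dropped in
// favor of a raw block. Files carrying mtime or mode stay as dag-pb,
// because raw blocks cannot store UnixFS metadata.
// Assumption: the collapse only applies when raw leaves are enabled.
func collapsesToRaw(f fileInfo, rawLeaves bool) bool {
	return rawLeaves && f.blocks == 1 && !f.hasMtime && !f.hasMode
}

func main() {
	fmt.Println(collapsesToRaw(fileInfo{blocks: 1}, true))
	fmt.Println(collapsesToRaw(fileInfo{blocks: 1, hasMtime: true}, true))
	fmt.Println(collapsesToRaw(fileInfo{blocks: 3}, true))
}
```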
* chore(test): use Go 1.22+ range-over-int syntax
* chore: update boxo to c6829fe26860
- fix typo in files write help text
- update boxo with CI fixes (gofumpt, race condition in test)
* chore: update go-ipfs-cmds to 192ec9d15c1f
includes binary content types fix: gzip, zip, vnd.ipld.car, vnd.ipld.raw,
vnd.ipfs.ipns-record
* chore: update boxo to 0a22cde9225c
includes refactor of maxLinks check in addLinkChild (review feedback).
* ci: fix helia-interop and improve caching
skip '@helia/mfs - should have the same CID after creating a file' test
until helia implements IPIP-499 (tracking: https://github.com/ipfs/helia/issues/941)
the test fails because kubo now collapses single-block files to raw CIDs
while helia explicitly uses reduceSingleLeafToSelf: false
changes:
- run aegir directly instead of helia-interop binary (binary ignores --grep flags)
- cache node_modules keyed by @helia/interop version from npm registry
- skip npm install on cache hit (matches ipfs-webui caching pattern)
* chore: update boxo to 1e30b954
includes latest upstream changes from boxo main
* chore: update go-ipfs-cmds to 1b2a641ed6f6
* chore: update boxo to f188f79fd412
switches to boxo@main after merging https://github.com/ipfs/boxo/pull/1088
* chore: update go-ipfs-cmds to af9bcbaf5709
switches to go-ipfs-cmds@master after merging https://github.com/ipfs/go-ipfs-cmds/pull/315
---------
Co-authored-by: Andrew Gillis <11790789+gammazero@users.noreply.github.com>
278 lines
9.8 KiB
Go
package node

import (
	"context"
	"errors"
	"fmt"

	"github.com/ipfs/boxo/blockservice"
	blockstore "github.com/ipfs/boxo/blockstore"
	exchange "github.com/ipfs/boxo/exchange"
	offline "github.com/ipfs/boxo/exchange/offline"
	"github.com/ipfs/boxo/fetcher"
	bsfetcher "github.com/ipfs/boxo/fetcher/impl/blockservice"
	"github.com/ipfs/boxo/filestore"
	"github.com/ipfs/boxo/ipld/merkledag"
	"github.com/ipfs/boxo/ipld/unixfs"
	"github.com/ipfs/boxo/mfs"
	pathresolver "github.com/ipfs/boxo/path/resolver"
	pin "github.com/ipfs/boxo/pinning/pinner"
	"github.com/ipfs/boxo/pinning/pinner/dspinner"
	"github.com/ipfs/go-cid"
	"github.com/ipfs/go-datastore"
	format "github.com/ipfs/go-ipld-format"
	"github.com/ipfs/go-unixfsnode"
	dagpb "github.com/ipld/go-codec-dagpb"
	"go.uber.org/fx"

	"github.com/ipfs/kubo/config"
	"github.com/ipfs/kubo/core/node/helpers"
	"github.com/ipfs/kubo/repo"
)

// FilesRootDatastoreKey is the datastore key for the MFS files root CID.
var FilesRootDatastoreKey = datastore.NewKey("/local/filesroot")

// BlockService creates new blockservice which provides an interface to fetch content-addressable blocks
func BlockService(cfg *config.Config) func(lc fx.Lifecycle, bs blockstore.Blockstore, rem exchange.Interface) blockservice.BlockService {
	return func(lc fx.Lifecycle, bs blockstore.Blockstore, rem exchange.Interface) blockservice.BlockService {
		bsvc := blockservice.New(bs, rem,
			blockservice.WriteThrough(cfg.Datastore.WriteThrough.WithDefault(config.DefaultWriteThrough)),
		)

		lc.Append(fx.Hook{
			OnStop: func(ctx context.Context) error {
				return bsvc.Close()
			},
		})

		return bsvc
	}
}

// Pinning creates new pinner which tells GC which blocks should be kept
func Pinning(strategy string) func(bstore blockstore.Blockstore, ds format.DAGService, repo repo.Repo, prov DHTProvider) (pin.Pinner, error) {
	// Parse strategy at function creation time (not inside the returned function)
	// This happens before the provider is created, which is why we pass the strategy
	// string and parse it here, rather than using fx-provided ProvidingStrategy.
	strategyFlag := config.ParseProvideStrategy(strategy)

	return func(bstore blockstore.Blockstore,
		ds format.DAGService,
		repo repo.Repo,
		prov DHTProvider,
	) (pin.Pinner, error) {
		rootDS := repo.Datastore()

		syncFn := func(ctx context.Context) error {
			if err := rootDS.Sync(ctx, blockstore.BlockPrefix); err != nil {
				return err
			}
			return rootDS.Sync(ctx, filestore.FilestorePrefix)
		}
		syncDs := &syncDagService{ds, syncFn}

		ctx := context.TODO()

		var opts []dspinner.Option
		roots := (strategyFlag & config.ProvideStrategyRoots) != 0
		pinned := (strategyFlag & config.ProvideStrategyPinned) != 0

		// Important: Only one of WithPinnedProvider or WithRootsProvider should be active.
		// Having both would cause duplicate root advertisements since "pinned" includes all
		// pinned content (roots + children), while "roots" is just the root CIDs.
		// We prioritize "pinned" if both are somehow set (though this shouldn't happen
		// with proper strategy parsing).
		if pinned {
			opts = append(opts, dspinner.WithPinnedProvider(prov))
		} else if roots {
			opts = append(opts, dspinner.WithRootsProvider(prov))
		}

		pinning, err := dspinner.New(ctx, rootDS, syncDs, opts...)
		if err != nil {
			return nil, err
		}

		return pinning, nil
	}
}

var (
	_ merkledag.SessionMaker = new(syncDagService)
	_ format.DAGService      = new(syncDagService)
)

// syncDagService is used by the Pinner to ensure data gets persisted to the underlying datastore
type syncDagService struct {
	format.DAGService
	syncFn func(context.Context) error
}

func (s *syncDagService) Sync(ctx context.Context) error {
	return s.syncFn(ctx)
}

func (s *syncDagService) Session(ctx context.Context) format.NodeGetter {
	return merkledag.NewSession(ctx, s.DAGService)
}

// FetchersOut allows injection of fetchers.
type FetchersOut struct {
	fx.Out
	IPLDFetcher          fetcher.Factory `name:"ipldFetcher"`
	UnixfsFetcher        fetcher.Factory `name:"unixfsFetcher"`
	OfflineIPLDFetcher   fetcher.Factory `name:"offlineIpldFetcher"`
	OfflineUnixfsFetcher fetcher.Factory `name:"offlineUnixfsFetcher"`
}

// FetchersIn allows using fetchers for other dependencies.
type FetchersIn struct {
	fx.In
	IPLDFetcher          fetcher.Factory `name:"ipldFetcher"`
	UnixfsFetcher        fetcher.Factory `name:"unixfsFetcher"`
	OfflineIPLDFetcher   fetcher.Factory `name:"offlineIpldFetcher"`
	OfflineUnixfsFetcher fetcher.Factory `name:"offlineUnixfsFetcher"`
}

// FetcherConfig returns a fetcher config that can build new fetcher instances
func FetcherConfig(bs blockservice.BlockService) FetchersOut {
	ipldFetcher := bsfetcher.NewFetcherConfig(bs)
	ipldFetcher.PrototypeChooser = dagpb.AddSupportToChooser(bsfetcher.DefaultPrototypeChooser)
	unixFSFetcher := ipldFetcher.WithReifier(unixfsnode.Reify)

	// Construct offline versions which we can safely use in contexts where
	// path resolution should not fetch new blocks via exchange.
	offlineBs := blockservice.New(bs.Blockstore(), offline.Exchange(bs.Blockstore()))
	offlineIpldFetcher := bsfetcher.NewFetcherConfig(offlineBs)
	offlineIpldFetcher.SkipNotFound = true // carries onto the UnixFSFetcher below
	offlineIpldFetcher.PrototypeChooser = dagpb.AddSupportToChooser(bsfetcher.DefaultPrototypeChooser)
	offlineUnixFSFetcher := offlineIpldFetcher.WithReifier(unixfsnode.Reify)

	return FetchersOut{
		IPLDFetcher:          ipldFetcher,
		UnixfsFetcher:        unixFSFetcher,
		OfflineIPLDFetcher:   offlineIpldFetcher,
		OfflineUnixfsFetcher: offlineUnixFSFetcher,
	}
}

// PathResolversOut allows injection of path resolvers
type PathResolversOut struct {
	fx.Out
	IPLDPathResolver          pathresolver.Resolver `name:"ipldPathResolver"`
	UnixFSPathResolver        pathresolver.Resolver `name:"unixFSPathResolver"`
	OfflineIPLDPathResolver   pathresolver.Resolver `name:"offlineIpldPathResolver"`
	OfflineUnixFSPathResolver pathresolver.Resolver `name:"offlineUnixFSPathResolver"`
}

// PathResolverConfig creates path resolvers with the given fetchers.
func PathResolverConfig(fetchers FetchersIn) PathResolversOut {
	return PathResolversOut{
		IPLDPathResolver:          pathresolver.NewBasicResolver(fetchers.IPLDFetcher),
		UnixFSPathResolver:        pathresolver.NewBasicResolver(fetchers.UnixfsFetcher),
		OfflineIPLDPathResolver:   pathresolver.NewBasicResolver(fetchers.OfflineIPLDFetcher),
		OfflineUnixFSPathResolver: pathresolver.NewBasicResolver(fetchers.OfflineUnixfsFetcher),
	}
}

// Dag creates new DAGService
func Dag(bs blockservice.BlockService) format.DAGService {
	return merkledag.NewDAGService(bs)
}

// Files loads persisted MFS root
func Files(strategy string) func(mctx helpers.MetricsCtx, lc fx.Lifecycle, repo repo.Repo, dag format.DAGService, bs blockstore.Blockstore, prov DHTProvider) (*mfs.Root, error) {
	return func(mctx helpers.MetricsCtx, lc fx.Lifecycle, repo repo.Repo, dag format.DAGService, bs blockstore.Blockstore, prov DHTProvider) (*mfs.Root, error) {
		pf := func(ctx context.Context, c cid.Cid) error {
			rootDS := repo.Datastore()
			if err := rootDS.Sync(ctx, blockstore.BlockPrefix); err != nil {
				return err
			}
			if err := rootDS.Sync(ctx, filestore.FilestorePrefix); err != nil {
				return err
			}

			if err := rootDS.Put(ctx, FilesRootDatastoreKey, c.Bytes()); err != nil {
				return err
			}
			return rootDS.Sync(ctx, FilesRootDatastoreKey)
		}

		var nd *merkledag.ProtoNode
		ctx := helpers.LifecycleCtx(mctx, lc)
		val, err := repo.Datastore().Get(ctx, FilesRootDatastoreKey)

		switch {
		case errors.Is(err, datastore.ErrNotFound):
			nd = unixfs.EmptyDirNode()
			err := dag.Add(ctx, nd)
			if err != nil {
				return nil, fmt.Errorf("failure writing filesroot to dagstore: %s", err)
			}
		case err == nil:
			c, err := cid.Cast(val)
			if err != nil {
				return nil, err
			}

			offlineDag := merkledag.NewDAGService(blockservice.New(bs, offline.Exchange(bs)))
			rnd, err := offlineDag.Get(ctx, c)
			if err != nil {
				return nil, fmt.Errorf("error loading filesroot from dagservice: %s", err)
			}

			pbnd, ok := rnd.(*merkledag.ProtoNode)
			if !ok {
				return nil, merkledag.ErrNotProtobuf
			}

			nd = pbnd
		default:
			return nil, err
		}

		// MFS (Mutable File System) provider integration: Only pass the provider
		// to MFS when the strategy includes "mfs". MFS will call StartProviding()
		// on every DAGService.Add() operation, which is sufficient for the "mfs"
		// strategy - it ensures all MFS content gets announced as it's added or
		// modified. For non-mfs strategies, we set provider to nil to avoid
		// unnecessary providing.
		strategyFlag := config.ParseProvideStrategy(strategy)
		if strategyFlag&config.ProvideStrategyMFS == 0 {
			prov = nil
		}

		// Get configured settings from Import config
		cfg, err := repo.Config()
		if err != nil {
			return nil, fmt.Errorf("failed to get config: %w", err)
		}
		chunkerGen := cfg.Import.UnixFSSplitterFunc()
		maxDirLinks := int(cfg.Import.UnixFSDirectoryMaxLinks.WithDefault(config.DefaultUnixFSDirectoryMaxLinks))
		maxHAMTFanout := int(cfg.Import.UnixFSHAMTDirectoryMaxFanout.WithDefault(config.DefaultUnixFSHAMTDirectoryMaxFanout))
		hamtShardingSize := int(cfg.Import.UnixFSHAMTDirectorySizeThreshold.WithDefault(config.DefaultUnixFSHAMTDirectorySizeThreshold))
		sizeEstimationMode := cfg.Import.HAMTSizeEstimationMode()

		root, err := mfs.NewRoot(ctx, dag, nd, pf, prov,
			mfs.WithChunker(chunkerGen),
			mfs.WithMaxLinks(maxDirLinks),
			mfs.WithMaxHAMTFanout(maxHAMTFanout),
			mfs.WithHAMTShardingSize(hamtShardingSize),
			mfs.WithSizeEstimationMode(sizeEstimationMode),
		)
		if err != nil {
			return nil, fmt.Errorf("failed to initialize MFS root from %s stored at %s: %w. "+
				"If corrupted, use 'ipfs files chroot' to reset (see --help)", nd.Cid(), FilesRootDatastoreKey, err)
		}

		lc.Append(fx.Hook{
			OnStop: func(ctx context.Context) error {
				return root.Close()
			},
		})

		return root, err
	}
}