Commit Graph

94 Commits

Author SHA1 Message Date
Cassandra Heart
f2fa7bf57f
v2.1.0.6 (#465) 2025-11-13 04:57:52 -06:00
Cassandra Heart
c797d482f9
v2.1.0.5 (#457)
* wip: conversion of hotstuff from flow into Q-oriented model

* bulk of tests

* remaining non-integration tests

* add integration test, adjust log interface, small tweaks

* further adjustments, restore full pacemaker shape

* add component lifecycle management+supervisor

* further refinements

* resolve timeout hanging

* mostly finalized state for consensus

* bulk of engine swap out

* lifecycle-ify most types

* wiring nearly complete, missing needed hooks for proposals

* plugged in, vetting message validation paths

* global consensus, plugged in and verified

* app shard now wired in too

* do not decode empty keys.yml (#456)

* remove obsolete engine.maxFrames config parameter (#454)

* default to Info log level unless debug is enabled (#453)

* respect config's "logging" section params, remove obsolete single-file logging (#452)

* Trivial code cleanup aiming to reduce Go compiler warnings (#451)

* simplify range traversal

* simplify channel read for single select case

* delete rand.Seed() deprecated in Go 1.20 and no-op as of Go 1.24

* simplify range traversal

* simplify channel read for single select case

* remove redundant type from array

* simplify range traversal

* simplify channel read for single select case

* RC slate

* finalize 2.1.0.5

* Update comments in StrictMonotonicCounter

Fix comment formatting and clarify description.

---------

Co-authored-by: Black Swan <3999712+blacks1ne@users.noreply.github.com>
2025-11-11 05:00:17 -06:00
Cassandra Heart
eb0b54241d
v2.1.0.3 (#449) 2025-10-23 22:43:17 -05:00
Cassandra Heart
53f7c2b5c9
v2.1.0.2 (#442)
* v2.1.0.2

* restore tweaks to simlibp2p

* fix: nil ref on size calc

* fix: panic should induce shutdown from event_distributor

* fix: friendlier initialization that requires less manual kickstarting for test/devnets

* fix: fewer available shards than provers should choose shard length

* fix: update stored worker registry, improve logging for debug mode

* fix: shut the fuck up, peer log

* qol: log value should be snake cased

* fix: non-archive snap sync issues

* fix: separate X448/Decaf448 signed keys, add onion key to registry

* fix: overflow arithmetic on frame number comparison

* fix: worker registration should be idempotent if inputs are same, otherwise permit updated records

* fix: remove global prover state from size calculation

* fix: divide by zero case

* fix: eager prover

* fix: broadcast listener default

* qol: diagnostic data for peer authenticator

* fix: master/worker connectivity issue in sparse networks

tight coupling of peer and workers can sometimes interfere if mesh is sparse, so give workers a pseudoidentity but publish messages with the proper peer key

* fix: reorder steps of join creation

* fix: join verify frame source + ensure domain is properly padded (unnecessary but good for consistency)

* fix: add delegate to protobuf <-> reified join conversion

* fix: preempt prover from planning with no workers

* fix: use the unallocated workers to generate a proof

* qol: underflow causes join fail in first ten frames on test/devnets

* qol: small logging tweaks for easier log correlation in debug mode

* qol: use fisher-yates shuffle to ensure prover allocations are evenly distributed when scores are equal

* qol: separate decisional logic on post-enrollment confirmation into consensus engine, proposer, and worker manager where relevant, refactor out scoring

* reuse shard descriptors for both join planning and confirm/reject decisions

* fix: add missing interface method and amend test blossomsub to use new peer id basis

* fix: only check allocations if they exist

* fix: pomw mint proof data needs to be hierarchically under global intrinsic domain

* staging temporary state under diagnostics

* fix: first phase of distributed lock refactoring

* fix: compute intrinsic locking

* fix: hypergraph intrinsic locking

* fix: token intrinsic locking

* fix: update execution engines to support new locking model

* fix: adjust tests with new execution shape

* fix: weave in lock/unlock semantics to liveness provider

* fix lock fallthrough, add missing allocation update

* qol: additional logging for diagnostics, also testnet/devnet handling for confirmations

* fix: establish grace period on halt scenario to permit recovery

* fix: support test/devnet defaults for coverage scenarios

* fix: nil ref on consensus halts for non-archive nodes

* fix: remove unnecessary prefix from prover ref

* add test coverage for fork choice behaviors and replay – once passing, blocker (2) is resolved

* fix: no fork replay on repeat for non-archive nodes, snap now behaves correctly

* rollup of pre-liveness check lock interactions

* ahead of tests, get the protobuf/metrics-related changes out so teams can prepare

* add test coverage for distributed lock behaviors – once passing, blocker (3) is resolved

* fix: blocker (3)

* Dev docs improvements (#445)

* Make install deps script more robust

* Improve testing instructions

* Worker node should stop upon OS SIGINT/SIGTERM signal (#447)

* move pebble close to Stop()

* move deferred Stop() to Start()

* add core id to worker stop log message

* create done os signal channel and stop worker upon message to it

---------

Co-authored-by: Cassandra Heart <7929478+CassOnMars@users.noreply.github.com>

---------

Co-authored-by: Daz <daz_the_corgi@proton.me>
Co-authored-by: Black Swan <3999712+blacks1ne@users.noreply.github.com>
2025-10-23 01:03:06 -05:00
Cassandra Heart
dbd95bd9e9
v2.1.0 (#439)
* v2.1.0 [omit consensus and adjacent] - this commit will be amended with the full release after the file copy is complete

* 2.1.0 main node rollup
2025-09-30 02:48:15 -05:00
petricadaipegsp
b728d8d76f
Centralize configuration defaults and upgrade message limits (#410)
* Apply config defaults early

* Apply engine config defaults early

* Apply P2P config defaults early

* Remove default duplicates

* Fix casing

* Add sync message size configuration
2024-12-10 19:10:49 -06:00
petricadaipegsp
667b2aa2bc
Increase gossip history and length (#401)
* Increase gossip history and length

* Increase peer outbound queue size
2024-12-03 05:00:48 -06:00
petricadaipegsp
63394edc9d
Increase subscription buffer size (#400) 2024-12-03 04:26:19 -06:00
petricadaipegsp
1b78d758f5
Prefer connected peers for sync (#395)
* Add externally reachable data peer flag

* Announce node reachability

* Go through candidates based on reachability
2024-12-01 15:07:08 -06:00
petricadaipegsp
4be1888496
Separate dialing from retrieval (#398) 2024-12-01 15:02:07 -06:00
Cassandra Heart
4753178026
deadlock 2024-11-27 00:43:07 -06:00
Cassandra Heart
9b95541be6
resolve race condition 2024-11-27 00:37:07 -06:00
Cassandra Heart
ebc7474946
use absolute 2024-11-27 00:29:55 -06:00
Cassandra Heart
ab2484206d
have to actually run the decay 2024-11-26 23:55:45 -06:00
Cassandra Heart
0242eafa3e
add decay, make validation check a little smarter 2024-11-26 23:45:20 -06:00
petricadaipegsp
f07d855970
blossomsub: Reintroduce GossipFactor (#383) 2024-11-24 17:04:33 -06:00
petricadaipegsp
a543a607be
IDONTWANT Support (#376)
* blossomsub: Remove unused mutex

* blossomsub: Add RPC queue

* blossomsub: Use RPC queue

* blossomsub: Add IDONTWANT control message to protos

* blossomsub: Add IDONTWANT tracing support

* blossomsub: Add pre-validation

* blossomsub: Add IDONTWANT feature flag

* blossomsub: Add IDONTWANT parameters

* blossomsub: Add IDONTWANT observability

* blossomsub: Send IDONTWANT control messages

* blossomsub: Handle IDONTWANT control messages

* blossomsub: Clear maps efficiently

* blossomsub: Increase IDONTWANT parameter defaults

* blossomsub: Do not send IDONTWANT to original sender

* blossomsub: Add IDONTWANT unit tests
2024-11-23 17:15:41 -06:00
petricadaipegsp
b798de5871
Trigger sync on ahead peer (#366) 2024-11-20 17:12:57 -06:00
petricadaipegsp
883f0605ae
Enable AutoNATv1 and NATPortMap (#372) 2024-11-20 17:08:19 -06:00
petricadaipegsp
803cf4b7b3
Close direct channels if the connection is fresh (#371) 2024-11-20 17:07:28 -06:00
petricadaipegsp
cbc405a3a0
Refactor peer pinging to target individual connections (#370) 2024-11-20 17:05:10 -06:00
petricadaipegsp
bc05a4d7b9
Adaptive reserved cores (#363)
* Add adaptive data worker count

* Use runtime worker count for validation workers

* Reserve cores for networking during transition application

* Automatically set GOGC and GOMEMLIMIT
2024-11-19 16:51:14 -06:00
petricadaipegsp
d6234aa328
Avoid BlossomSubRouter race condition (#364) 2024-11-19 04:42:29 -06:00
petricadaipegsp
49566c2280
Add additional P2P configuration (#352)
* Add peer discovery configuration

* Add peer monitor configuration

* Add message validation configuration

---------

Co-authored-by: Cassandra Heart <7929478+CassOnMars@users.noreply.github.com>
2024-11-16 17:54:34 -06:00
petricadaipegsp
80c7ec2889
Add initial Prometheus support (#353)
* Add Prometheus server

* Add Prometheus gRPC metrics

* Add BlossomSub metrics

---------

Co-authored-by: Cassandra Heart <7929478+CassOnMars@users.noreply.github.com>
2024-11-16 17:53:19 -06:00
petricadaipegsp
7819548b6f
Do not engage in PubSub with the bootstrappers (#355) 2024-11-16 17:51:31 -06:00
petricadaipegsp
2780b643d8
Fix BlossomSub router tracing (#343) 2024-11-13 11:36:21 -06:00
petricadaipegsp
db28f1b81e
Remove vendored gostream (#347)
* Remove vendored go-libp2p-gostream

* Remove error wrapping
2024-11-11 15:05:45 -06:00
petricadaipegsp
3dbe0723bd
Add message validators (#346) 2024-11-11 14:10:00 -06:00
Cassandra Heart
4238b3ff5a
initial testnet v2.0.3-p2 2024-11-11 03:34:28 -06:00
Cassandra Heart
67d454acb9
add light prover support 2024-11-09 14:46:53 -06:00
Cassandra Heart
7ac7fc2b67
v2.0.3-b4 2024-11-07 18:03:50 -06:00
Cassandra Heart
1361eeda8c
no parallelism for peer scan 2024-11-07 02:30:16 -06:00
Cassandra Heart
7ca0c9bd37
handle testnet 2024-11-07 01:55:03 -06:00
petricadaipegsp
30a821da09
Fix ping period (#331)
* Fix ping period

* Add missing wait group wait
2024-11-04 23:49:01 -06:00
Cassandra Heart
f50dda6848
everyone's a server on testnet 2024-11-04 21:55:19 -06:00
Cassandra Heart
ee8b344dde
more adjustments 2024-11-04 21:10:21 -06:00
petricadaipegsp
e23ad7869c
Trigger automatic peer discovery on frame stall (#328) 2024-11-04 19:25:30 -06:00
petricadaipegsp
7889f76a7e
Lookup peers via DHT (#329) 2024-11-04 19:24:27 -06:00
petricadaipegsp
8ee28eb2a7
On demand bootstrap reconnection (#327)
* Aggressive bootstrap reconnection

* Reconnect bootstraps on demand
2024-11-03 22:02:30 -06:00
petricadaipegsp
f848088c0c
Fix merge conflict (#323) 2024-11-01 15:34:19 -05:00
Cassandra Heart
4b61a00095
restore prover rings 2024-10-31 23:44:23 -05:00
Cassandra Heart
9201ccbcd9
merge conflict? 2024-10-31 19:20:57 -05:00
Cassandra Heart
ad55d280f8
a little more logic around connection management 2024-10-31 19:11:39 -05:00
Cassandra Heart
3dd9a0c5f3
get develop caught up (#322)
* Update qcommander.sh bootstrap (#304)

* v2.0.1 (#308)

* roll up v2.0.1-b2 to develop

* b2-fixed

* adjust return data of fast sync so it doesn't return the earliest frame

* -b3

* fix: announce peer based on leading frame, not initial frame; fix: looping bug

* fix: last batch fails due to underflow; qol: make logging chattier

* -b4

* resolve frame cache issue

* fix: mint loop + re-migrate

* fix: register execution panic

* fix: mint loop, other side

* fix: handle unexpected return of nil status

* final -b4

* handle subtle change to migration

* qol: add heuristic to handle corruption scenario

* bump genesis

* qol: use separate channel for worker

* final parameterization, parallelize streams

* deprecate signers 10, 11, 14, 17

* adjust signatory check size to match rotated out signers

* V2.0.2.3 (#321)

* roll up v2.0.1-b2 to develop

* b2-fixed

* adjust return data of fast sync so it doesn't return the earliest frame

* -b3

* fix: announce peer based on leading frame, not initial frame; fix: looping bug

* fix: last batch fails due to underflow; qol: make logging chattier

* -b4

* resolve frame cache issue

* fix: mint loop + re-migrate

* fix: register execution panic

* fix: mint loop, other side

* fix: handle unexpected return of nil status

* final -b4

* handle subtle change to migration

* qol: add heuristic to handle corruption scenario

* bump genesis

* qol: use separate channel for worker

* final parameterization, parallelize streams

* Add direct peers to blossomsub (#309)

Co-authored-by: Tyler Sturos <tyler.john@qcommander.sh>

* chore(docker): add ca-certificates to fix x509 error. (#307)

* Update qcommander.sh bootstrap (#304)

* chore(docker): add ca-certificates to fix x509 error.

---------

Co-authored-by: Tyler Sturos <55340199+tjsturos@users.noreply.github.com>

* deprecate signers 10, 11, 14, 17

* adjust signatory check size to match rotated out signers

* qol: sync by rebroadcast

* upgrade version

* more small adjustments

* wait a little longer

* fix: don't use iterator for frame directly until iterator is fixed

* change iterator, genesis for testnet

* adjust to previous sync handling

* adjust: don't grab the very latest while it's already being broadcasted

* ok, ready for testnet

* handle rebroadcast quirks

* more adjustments from testing

* faster

* temporarily bulk process on frame candidates

* resolve separate frames

* don't loop

* make worker reset resume to check where it should continue

* move window

* reduce signature count now that supermajority signed last

* resolve bottlenecks

* remove GOMAXPROCS limit for now

* revisions for v2.0.2.1

* bump version

* bulk import

* reintroduce sync

* small adjustments to make life better

* check bitmask for peers and keep alive

* adjust reconnect

* ensure peer doesn't fall off address list

* adjust blossomsub to background discovery

* bump version

* remove dev check

* remove debug log line

* further adjustments

* a little more logic around connection management

* v2.0.2.3

* Fix peer discovery (#319)

* Fix peer discovery

* Make peer discovery connections parallel

* Monitor peers via pings (#317)

* Support QUILIBRIUM_SIGNATURE_CHECK in client (#314)

* Ensure direct peers are not pruned by resource limits (#315)

* Support pprof profiling via HTTP (#313)

* Fix CPU profiling

* Add pprof server support

* Additional peering connection improvements (#320)

* Lookup peers if not enough external peers are available

* Make bootstrap peer discovery sensitive to a lack of bootstrappers

---------

Co-authored-by: Tyler Sturos <55340199+tjsturos@users.noreply.github.com>
Co-authored-by: Tyler Sturos <tyler.john@qcommander.sh>
Co-authored-by: linquanisaac <33619994+linquanisaac@users.noreply.github.com>
Co-authored-by: petricadaipegsp <155911522+petricadaipegsp@users.noreply.github.com>

---------

Co-authored-by: Tyler Sturos <55340199+tjsturos@users.noreply.github.com>
Co-authored-by: Tyler Sturos <tyler.john@qcommander.sh>
Co-authored-by: linquanisaac <33619994+linquanisaac@users.noreply.github.com>
Co-authored-by: petricadaipegsp <155911522+petricadaipegsp@users.noreply.github.com>
2024-10-31 16:46:58 -05:00
Cassandra Heart
262cf5271d
adjust reconnect 2024-10-27 00:55:31 -05:00
Cassandra Heart
b8973df266
check bitmask for peers and keep alive 2024-10-27 00:51:35 -05:00
Cassandra Heart
c0396f57a9
revisions for v2.0.2.1 2024-10-26 03:32:35 -05:00
Cassandra Heart
d57757730d
upgrade version 2024-10-24 21:54:51 -05:00
Tyler Sturos
470d7f6ee4
Add direct peers to blossomsub (#309)
Co-authored-by: Tyler Sturos <tyler.john@qcommander.sh>
2024-10-24 16:59:34 -05:00