# Quilibrium Architecture
## Overview
Quilibrium is a distributed protocol that leverages advanced cryptographic
techniques, including multi-party computation (MPC), for privacy-preserving
compute. The system operates on a sharded network architecture in which the
main process runs global consensus while data worker processes run app-level
consensus for their assigned shards, each maintaining its own hypergraph
state, storage, and networking stack.
## System Architecture
### High-Level Architecture
```
┌─────────────────────────────────────────────────────────────────┐
│                       Client Applications                       │
│                (CLI, RPC Clients, Applications)                 │
└───────────────────────────────┬─────────────────────────────────┘
┌───────────────────────────────┴─────────────────────────────────┐
│                            RPC Layer                            │
│             (gRPC/REST APIs, IPC/P2P Communication)             │
└───────────────────────────────┬─────────────────────────────────┘
┌───────────────────────────────┴─────────────────────────────────┐
│                   Main Node Process (Core 0)                    │
│  ┌──────────────┐  ┌─────────────┐  ┌─────────────────────┐     │
│  │    Global    │  │   Global    │  │     P2P Network     │     │
│  │  Consensus   │  │  Execution  │  │    (BlossomSub)     │     │
│  │    Engine    │  │   Engine    │  │                     │     │
│  └──────────────┘  └─────────────┘  └─────────────────────┘     │
│  ┌──────────────┐  ┌─────────────┐  ┌─────────────────────┐     │
│  │   Storage    │  │    Rust     │  │  Global Hypergraph  │     │
│  │  (PebbleDB)  │  │  Libraries  │  │        State        │     │
│  └──────────────┘  └─────────────┘  └─────────────────────┘     │
└───────────────────────────────┬─────────────────────────────────┘
                                │ IPC/P2P
┌───────────────────────────────┴─────────────────────────────────┐
│                  Data Worker Process (Core 1)                   │
│  ┌──────────────┐  ┌─────────────┐  ┌─────────────────────┐     │
│  │     App      │  │Token/Hyper- │  │     P2P Network     │     │
│  │  Consensus   │  │graph/Compute│  │    (BlossomSub)     │     │
│  │    Engine    │  │  Execution  │  │                     │     │
│  └──────────────┘  └─────────────┘  └─────────────────────┘     │
│  ┌──────────────┐  ┌─────────────┐  ┌─────────────────────┐     │
│  │   Storage    │  │    Rust     │  │  Shard Hypergraph   │     │
│  │  (PebbleDB)  │  │  Libraries  │  │        State        │     │
│  └──────────────┘  └─────────────┘  └─────────────────────┘     │
└─────────────────────────────────────────────────────────────────┘
                    [Repeat for Core 2, 3, ...]
```
### Core Components
#### 1. Node Module (`node/`)
The main entry point, containing the core system implementation. The node uses
a multi-process architecture in which each process has its own complete stack.
**Process Types:**
##### Main Process (Core 0)
Runs the global consensus and coordination:
- **Global Consensus Engine**: System-wide consensus and coordination
- **Global Execution Engine**: System-level operations
- **P2P Networking**: Full BlossomSub stack for global communication
- **Storage (PebbleDB)**: Persistent storage for global state
- **Rust Library Bindings**: Access to cryptographic operations
- **Global Hypergraph State**: System-wide state management
##### Data Worker Processes (Cores 1+)
Each worker manages specific shards:
- **App Consensus Engine**: Shard-level consensus
- **Execution Engines**: Token, Hypergraph, and Compute engines
- **P2P Networking**: Full BlossomSub stack for shard communication
- **Storage (PebbleDB)**: Persistent storage for shard state
- **Rust Library Bindings**: Access to cryptographic operations
- **Shard Hypergraph State**: Shard-specific state management
**Application Layer** (`app/`):
- `Node`: Main node implementation (runs in main process)
- `DataWorkerNode`: Worker node implementation (runs in worker processes)
**Key Subsystems:**
##### Consensus Engines (`node/consensus/`)
Two distinct consensus implementations:
**Global Consensus** (`node/global/`) - Main Process:
- System-wide coordination
- Cross-shard transaction ordering
- Global state transitions
- Network-wide fork choice
**App Consensus** (`node/app/`) - Worker Processes:
- Shard-specific consensus
- Local transaction ordering
- Shard state transitions
- Integration with global consensus
**Shared Components**:
- `DifficultyAdjuster`: Dynamic difficulty adjustment
- `DynamicFeeManager`: Dynamic Fee Market management
- `ProverRegistry`: Enrolled prover registry
- `SignerRegistry`: Block signer registry
- Time Reels: VDF-based time consensus
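A rough sketch of how the two tiers might be modeled (names and signatures here
are hypothetical; the real interfaces live in `types/consensus/`):
```go
// Hypothetical sketch of the two consensus tiers; the real interfaces
// in types/consensus/ differ in names and detail.
package consensus

import "context"

// Frame stands in for a consensus frame (global or app-shard).
type Frame struct {
	Number     uint64
	ParentHash []byte
	Payload    []byte
}

// GlobalConsensusEngine coordinates system-wide state transitions and
// cross-shard ordering from the main process.
type GlobalConsensusEngine interface {
	Start(ctx context.Context) error
	Stop(ctx context.Context) error
	ProposeFrame(ctx context.Context, frame *Frame) error
}

// AppConsensusEngine orders transactions within one shard from a worker
// process and anchors its frames to global consensus.
type AppConsensusEngine interface {
	Start(ctx context.Context) error
	Stop(ctx context.Context) error
	OrderTransaction(ctx context.Context, tx []byte) error
}
```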
##### Execution Engines (`node/execution/`)
Distribution across processes:
**Global Execution Engine** - Main Process:
- System-wide operations
- Cross-shard coordination
- Global state updates
**App Execution Engines** - Worker Processes:
- **Token Engine**: QUIL token operations within shards
- **Hypergraph Engine**: CRDT-based state management for shards
- **Compute Engine**: MPC-based privacy-preserving computation
##### P2P Networking (`node/p2p/`)
Each process (main and workers) has its own complete P2P stack:
- **BlossomSub**: Custom pub/sub protocol
- **Peer Management**: Discovery and connections
- **Public Channels**: Point-to-point authenticated message routing
- **Private Channels**: Onion-routed authenticated message channels
- **DHT Integration**: Distributed peer discovery
The main process handles the global bitmask; workers handle shard-specific
bitmasks.
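For illustration, the per-process pub/sub pattern looks roughly like the
following, written against the generic libp2p pubsub API; BlossomSub itself is
bitmask-oriented, so its actual calls in `go-libp2p-blossomsub` differ:
```go
// Illustrative only: generic libp2p pubsub-style usage. BlossomSub
// subscribes on bitmasks rather than string topics.
package main

import (
	"context"
	"fmt"

	"github.com/libp2p/go-libp2p"
	pubsub "github.com/libp2p/go-libp2p-pubsub"
)

func main() {
	ctx := context.Background()

	// Each process (main or worker) creates its own host and router.
	host, err := libp2p.New()
	if err != nil {
		panic(err)
	}
	ps, err := pubsub.NewGossipSub(ctx, host)
	if err != nil {
		panic(err)
	}

	// A worker would join its shard-specific topic; the main process
	// joins the global topic.
	topic, err := ps.Join("shard-0001")
	if err != nil {
		panic(err)
	}
	sub, err := topic.Subscribe()
	if err != nil {
		panic(err)
	}

	go func() {
		for {
			msg, err := sub.Next(ctx)
			if err != nil {
				return
			}
			fmt.Printf("received %d bytes from %s\n", len(msg.Data), msg.ReceivedFrom)
		}
	}()

	_ = topic.Publish(ctx, []byte("hello shard"))
	select {}
}
```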
##### Storage Layer (`node/store/`)
Each process maintains its own PebbleDB instance:
- **Clock Store**: Frame-based ordering
- **Coin Store**: Locally managed token state
- **Data Proof Store**: Cryptographic proofs
- **Hypergraph Store**: Hypergraph state persistence
- **Peer Store**: Network peer information
Storage is partitioned by process responsibility (global vs shard).
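A minimal sketch of the per-process storage pattern, assuming the standard
pebble API (the key name below is illustrative):
```go
// Minimal sketch of per-process PebbleDB usage; the node wraps this
// behind the stores in node/store.
package main

import (
	"fmt"
	"log"

	"github.com/cockroachdb/pebble"
)

func main() {
	// Each process opens its own database directory, so global and
	// shard state never share an instance.
	db, err := pebble.Open("store/core-1", &pebble.Options{})
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	// Stores partition the keyspace by responsibility, e.g. clock frames.
	key := []byte("clock/frame/0000000001")
	if err := db.Set(key, []byte("frame-bytes"), pebble.Sync); err != nil {
		log.Fatal(err)
	}

	value, closer, err := db.Get(key)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Printf("read %d bytes\n", len(value))
	closer.Close()
}
```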
##### Hypergraph (`hypergraph/`)
Distributed graph structure:
- **Global Hypergraph**: Maintained by main process
- **Shard Hypergraphs**: Maintained by respective workers
- **CRDT Semantics**: Conflict-free updates
- **Components**:
- `Hypergraph`: Core graph structure
- `Vertex`: Node with associated data
- `Hyperedge`: Multi-way relationships
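A hypothetical sketch of these shapes (the real types in `hypergraph/` carry
CRDT metadata and proof material beyond this):
```go
// Hypothetical shapes for the hypergraph primitives.
package hypergraph

// Vertex is a node with associated data.
type Vertex struct {
	ID   [32]byte
	Data []byte
}

// Hyperedge is a multi-way relationship: it can join any number of
// vertices at once, unlike the two endpoints of an ordinary edge.
type Hyperedge struct {
	ID       [32]byte
	Vertices [][32]byte // IDs of every vertex the edge spans
}

// Hypergraph is the core structure holding both collections.
type Hypergraph struct {
	Vertices   map[[32]byte]*Vertex
	Hyperedges map[[32]byte]*Hyperedge
}

// New returns an empty hypergraph ready for inserts.
func New() *Hypergraph {
	return &Hypergraph{
		Vertices:   make(map[[32]byte]*Vertex),
		Hyperedges: make(map[[32]byte]*Hyperedge),
	}
}

// AddVertex and AddHyperedge insert by ID; a CRDT variant would merge
// rather than overwrite.
func (h *Hypergraph) AddVertex(v *Vertex)       { h.Vertices[v.ID] = v }
func (h *Hypergraph) AddHyperedge(e *Hyperedge) { h.Hyperedges[e.ID] = e }
```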
##### Cryptographic Layer (`types/crypto/`)
Shared across all processes via Rust bindings (`bls48581`):
- **Proof Trees**: Verkle tree structures
- **Channel Crypto**: Secure E2EE communication
#### 2. Client Module (`client/`)
Command-line interface for user interactions:
- **Node Management**: Install, update, service control
- **Token Operations**: Balance, transfer, mint operations
- **Hypergraph Operations**: Deploy, store, update, query operations
- **Compute Operations**: Deploy, execute, commit operations
- **Cross-Mint**: Token bridging
- **Configuration**: Network and RPC settings
#### 3. Rust Crates (`crates/`)
High-performance cryptographic libraries used by all processes:
**Core Cryptography**:
- **bls48581**: BLS signatures
- **bulletproofs**: Transaction privacy via zero-knowledge proofs
- **vdf**: Verifiable Delay Functions for time consensus
- **verenc**: Verifiable encryption schemes
**MPC Components**:
- **ferret**: Oblivious transfer protocols
- **channel**: Secure communication protocols
- **rpm**: Randomized Permutation Matrices
**Supporting**:
- **classgroup**: Class group VDF implementation
- WASM bindings for various libraries
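The Go/Rust boundary follows the usual CGO pattern. The sketch below is
generic: the exported function, library name, and buffer size are invented for
illustration, since each crate generates its own bindings:
```go
// Illustrative CGO bridge only: bls_sign, the linked library name, and
// the buffer size are hypothetical. Real bindings are generated per
// crate (e.g. crates/bls48581) with crate-specific exports.
package bls48581

/*
#cgo LDFLAGS: -lbls48581
#include <stdint.h>
int64_t bls_sign(const uint8_t* msg, uint64_t len, uint8_t* out);
*/
import "C"
import "unsafe"

// Sign calls across the C ABI into the Rust library.
func Sign(msg []byte) []byte {
	if len(msg) == 0 {
		return nil
	}
	out := make([]byte, 128) // hypothetical maximum signature size
	n := C.bls_sign(
		(*C.uint8_t)(unsafe.Pointer(&msg[0])),
		C.uint64_t(len(msg)),
		(*C.uint8_t)(unsafe.Pointer(&out[0])),
	)
	return out[:int(n)]
}
```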
#### 4. Supporting Libraries
**MPC and Privacy**:
- `bedlam`: Garbled circuit compiler/evaluator
- `emp-ot`, `emp-tool`: Efficient MPC primitives
- `nekryptology`: Additional cryptographic tools
**Networking**:
- `go-libp2p-blossomsub`: Custom gossip for sharding
- `go-libp2p-kad-dht`: DHT implementation
- `go-libp2p`: Core P2P functionality
**Storage**:
- `node/pebble`: High-performance key-value store
- `protobufs`: Protocol definitions
#### 5. Interfaces (`types/`)
**Interfaces**:
- `types/consensus/`: Consensus-oriented interfaces:
- **AppConsensusEngine**: The app-shard-level consensus engine
- **GlobalConsensusEngine**: The global-level consensus engine
- **DifficultyAdjuster**: Manages the next difficulty following the prior
frame
- **EventDistributor**: The central nervous system for internal event control
flows; connects time reel events and consensus engine events, and emits
ordered control events. Manages halts, resumes, and state rewind/replay
events.
- **DynamicFeeManager**: Tracks and calculates dynamic fee market multipliers
- **RewardIssuance**: The issuance policy of the network
- **ProverRegistry**: Tracks prover information from ongoing hypergraph state
transitions
- **SignerRegistry**: Tracks general signer information, including private
messaging channel keys
- **AppFrameValidator**: Validator path for app shard frames
- **GlobalFrameValidator**: Validator path for global frames
- `types/crypto/`:
- **BlsAggregateOutput**: Methods for handling aggregate signatures
- **BlsKeygenOutput**: Methods for handling BLS keygen outputs
- **BulletproofProver**: Provides construction and verification of outputs
related to bulletproofs
- **FrameProver**: Methods for generating proofs on frames
- **InclusionProver**: Methods for generating KZG proofs
- **Multiproof**: Methods for multiproofs
- **BlsConstructor**: Methods for creating BLS keys
- **DecafConstructor**: Methods for creating Decaf 448 keys
- **DecafAgreement**: Methods specific to key agreement with Decaf 448 keys
- **Signer**: Generic signer interface to unify all managed key types
- **Agreement**: Generic key agreement interface to unify all DH agreed keys
- **VerifiableEncryptor**: Methods for producing verifiable encryption proofs
- **VerEncProof**: Methods for handling the proof outputs
- **VerEnc**: Methods for handling the compressed proof outputs for storage
- `types/execution/intrinsics/`: Intrinsics and intrinsic operation interfaces:
- **Intrinsic**: Methods for managing intrinsics
- **IntrinsicOperation**: Methods for performing operations of the intrinsics
- `types/execution/state/`: State management related interfaces:
- **State**: Methods for initializing, modifying, and deleting state, and
retrieving state transition outputs
- `types/execution/`: Execution engine encapsulation
- **ShardExecutionEngine**: Shard-specific execution container methods
- `types/hypergraph/`: Hypergraph-oriented interfaces
- `types/keys/`: Key management interfaces
- `types/p2p/`: PubSub related interfaces
- `types/store/`: KVDB store related interfaces
**Mock Implementations**:
- **MockBlsConstructor**: BLS Key Constructor, used by key manager
- **MockBLSSigner**: BLS Keypair, used in many places
- **MockBulletproofProver**: Bulletproof prover, used by token intrinsic
- **MockDecafConstructor**: Decaf 448 Key Constructor, used by key manager and
token intrinsic
- **MockVertex, MockHyperedge, MockHypergraph**: Hypergraph components
- **MockInclusionProver**: KZG prover
- **MockMultiproof**: Multiproof output
- **MockKeyManager**: Key manager, used everywhere
- **MockPubSub**: PubSub-based message system, used by consensus engines
- **MockVerifiableEncryptor, MockVerEncProof, MockVerEnc**: Verifiable
encryption, used in intrinsics, hypergraph storage
**Tries**:
All trie-related constructions are contained within `types/` for easy WASM
generation.
### Architectural Patterns
#### 1. Shared-Nothing Process Architecture
Each process is self-contained with its own:
- Consensus engine (global or app)
- Execution engine(s)
- P2P networking stack
- Storage layer
- Cryptographic libraries
Benefits:
- Process independence
- Failure isolation
- Horizontal scalability
#### 2. Sharded State Management
State is partitioned across processes:
- Main process: Global state and coordination
- Worker processes: Shard-specific state
- CRDT hypergraph for convergence
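A toy example of why CRDT semantics give convergence, using a grow-only set
(the hypergraph's CRDTs are richer, but the merge property is the same):
```go
// Toy grow-only set CRDT: merging replicas in any order yields the same
// state, which is the property the sharded hypergraph relies on.
package main

import "fmt"

type GSet map[string]struct{}

func (s GSet) Add(v string) { s[v] = struct{}{} }

// Merge is the CRDT join: set union, which is commutative, associative,
// and idempotent, so replicas converge regardless of delivery order.
func (s GSet) Merge(other GSet) {
	for v := range other {
		s[v] = struct{}{}
	}
}

func main() {
	a := GSet{}
	b := GSet{}
	a.Add("vertex-1")
	b.Add("vertex-2")

	a.Merge(b) // a now holds both updates
	b.Merge(a) // b converges to the same state
	fmt.Println(len(a) == len(b)) // true
}
```
Because the merge is a join, global and shard state can synchronize without
coordination between processes.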
#### 3. Multi-Party Computation (MPC)
Privacy-preserving computation:
- Garbled circuits via bedlam
- Oblivious transfer for private inputs
- General-purpose privacy beyond just transactions
#### 4. Layered Consensus
Two-tier consensus system:
- Global consensus for system-wide coordination
- App consensus for shard-level operations
- Time Reel VDF for temporal ordering
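As a toy stand-in for the idea behind VDF-based ordering, consider sequential
hashing: each step depends on the last, so producing the output takes real
elapsed time. A real VDF, like the class-group construction in `crates/vdf`,
additionally makes verification fast:
```go
// Toy stand-in for VDF-based temporal ordering: repeated hashing forces
// sequential work, so the output attests that time elapsed. Unlike a
// real VDF, verification here is as slow as evaluation.
package main

import (
	"crypto/sha256"
	"fmt"
)

func sequentialWork(seed []byte, iterations int) []byte {
	h := sha256.Sum256(seed)
	for i := 1; i < iterations; i++ {
		// The chain cannot be parallelized: step i needs step i-1.
		h = sha256.Sum256(h[:])
	}
	return h[:]
}

func main() {
	out := sequentialWork([]byte("frame-0"), 1_000_000)
	fmt.Printf("%x\n", out[:8])
}
```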
#### 5. IPC Communication
Inter-process communication design:
- Structured message passing
- Minimal shared state
### Security Architecture
#### 1. Privacy Layers
**Transaction Privacy**:
- Bulletproofs for confidential transactions
- Zero-knowledge range proofs and sum checks
**Computation Privacy**:
- MPC for general computations
- Garbled circuits for function privacy
- Oblivious transfer for input privacy
**Data Privacy**:
- Verifiable encryption for settled state
#### 2. Process Isolation
- Independent process spaces
- Limited IPC surface
- Resource isolation per process
#### 3. Network Security
- Peer reputation scoring
- Sybil resistance
- Eclipse attack prevention
- Shard-aware routing
#### 4. Binary Verification
- Multi-signature verification
- Threshold signing schemes
- Automated signature updates
### System Boundaries
#### Process Boundaries
- Main process: Global operations
- Worker processes: Shard operations
- Clear IPC interfaces between processes
#### Consensus Boundaries
- Global consensus: Cross-shard coordination
- App consensus: Intra-shard operations
- Well-defined interaction protocols
#### State Boundaries
- Global state in main process
- Shard state in worker processes
- CRDT merge protocols for synchronization
#### Network Boundaries
- Each process has independent P2P stack
- Topic-based message routing
- Shard-aware peer connections
### Build and Deployment
#### Build System
- Hybrid Go/Rust via CGO
- Platform-specific optimizations
- Static linking for distribution
#### Configuration
- Per-process configuration
- Environment variable support
- YAML configuration files
#### Monitoring
- Prometheus metrics per process
- Grafana dashboards
- Structured logging
## Design Rationale
### 1. Independent Process Stacks
- **Decision**: Each process has full stack
- **Rationale**: Maximum isolation, independent scaling, failure resilience
### 2. Two-Tier Consensus
- **Decision**: Separate global and app consensus
- **Rationale**: Scalability through sharding while maintaining global
coordination
### 3. MPC for General Privacy
- **Decision**: MPC beyond just transactions
- **Rationale**: Flexible privacy for arbitrary computations
### 4. CRDT Hypergraph
- **Decision**: CRDT-based state model
- **Rationale**: Natural convergence in distributed sharded system
### 5. Rust Cryptography
- **Decision**: Rust for crypto operations
- **Rationale**: Performance and memory safety for critical operations
## Development Guidelines
### Architecture Principles
- Maintain process independence
- Respect consensus boundaries
- Minimize cross-shard communication
- Preserve privacy guarantees
### Code Organization
- Clear separation by process type
- Shared interfaces and types
- Process-specific implementations
### Testing Strategy
- Unit tests per component
- Integration tests per process
- Cross-process interaction tests
- MPC protocol verification
## Frequently Asked Questions (FAQ) with Code References
### Getting Started
**Q: Where is the main entry point?**
- Main entry: `node/main.go`
- The `main()` function handles initialization, spawns worker processes, and
manages the lifecycle
**Q: How does the multi-process architecture work?**
- Process spawning: `node/main.go` (see `spawnDataWorkers` function)
- The main process runs on core 0 and spawns workers for cores 1+ using
`exec.Command`
- Each worker receives `--core` parameter to identify its role
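A hedged sketch of that spawning pattern (only the `--core` flag is taken from
the points above; everything else is illustrative, and the real restart policy
in `node/main.go` differs):
```go
// Illustrative spawning loop in the spirit of spawnDataWorkers.
package main

import (
	"log"
	"os"
	"os/exec"
	"runtime"
	"strconv"
)

func spawnDataWorkers() {
	// Core 0 is this process; every remaining core gets a worker.
	for core := 1; core < runtime.NumCPU(); core++ {
		go func(core int) {
			for {
				// Re-exec the same binary, telling it which role to take.
				cmd := exec.Command(os.Args[0], "--core", strconv.Itoa(core))
				cmd.Stdout = os.Stdout
				cmd.Stderr = os.Stderr
				if err := cmd.Run(); err != nil {
					log.Printf("worker %d exited: %v; restarting", core, err)
				}
			}
		}(core)
	}
}

func main() {
	// A real worker invocation would parse --core and skip spawning.
	spawnDataWorkers()
	select {} // the main process would run global consensus here
}
```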
**Q: How do I run the node?**
- Build: `./node/build.sh` (creates static binary)
- Run: `./node/node` (starts main process which spawns workers)
- Configuration: `config.yml` or environment variables
### Architecture & Design
**Q: How do processes communicate (IPC)?**
- Uses IPC over private pubsub topics for structured message passing
- Main process acts as coordinator, workers connect via IPC
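A loose sketch of the message-passing shape (the envelope fields are invented
for illustration; the real messages are defined in `protobufs/`):
```go
// Illustrative IPC envelope passed between the main process and a
// worker over a private pubsub topic.
package main

import (
	"encoding/json"
	"fmt"
)

type WorkerMessage struct {
	Core    int    `json:"core"`    // which worker this addresses
	Kind    string `json:"kind"`    // e.g. "halt", "resume", "frame"
	Payload []byte `json:"payload"` // opaque, kind-specific body
}

func main() {
	msg := WorkerMessage{Core: 1, Kind: "resume"}
	raw, _ := json.Marshal(msg)
	// In the node this would be published to the worker's private
	// topic rather than printed.
	fmt.Println(string(raw))

	var decoded WorkerMessage
	_ = json.Unmarshal(raw, &decoded)
	fmt.Println(decoded.Kind)
}
```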
**Q: Where is consensus implemented?**
- App Consensus (workers): `node/consensus/app/app_consensus_engine.go`
- Factory: `node/consensus/app/factory.go`
- Consensus interface: `node/consensus/consensus_engine.go`
**Q: Where is the hypergraph implementation?**
- Main interface: `hypergraph/hypergraph.go`
- Core components:
- Vertex: `hypergraph/vertex.go`
- Hyperedge: `hypergraph/hyperedge.go`
- Atoms: `hypergraph/atom.go`
- Proofs: `hypergraph/proofs.go`
### Networking & P2P
**Q: How is P2P networking initialized?**
- BlossomSub: `node/p2p/blossomsub.go` (`NewBlossomSub`)
- Worker P2P ports: Base port + worker index
- Each process has independent P2P stack
**Q: How does cross-shard communication work?**
- Through main process coordination via P2P
- Topic-based routing in BlossomSub
- Shard-aware peer connections
- The majority of cross-shard communication requires only proof data, keeping
shards conflict-free
### Execution & State
**Q: Where are execution engines created?**
- Factory: `node/execution/engines/factory.go` (`CreateExecutionEngine`)
- Engine types: Global, Compute, Token, Hypergraph
- Batch creation: `CreateAllEngines`
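A hypothetical sketch of the factory shape (the engine names are from this
document; the signatures are illustrative, and the real constructors in
`node/execution/engines/factory.go` differ):
```go
// Illustrative factory in the spirit of CreateExecutionEngine.
package engines

import "fmt"

type EngineType int

const (
	Global EngineType = iota
	Compute
	Token
	Hypergraph
)

type ExecutionEngine interface {
	Name() string
	Start() error
	Stop() error
}

// stubEngine stands in for the concrete engine implementations.
type stubEngine struct{ name string }

func (s *stubEngine) Name() string { return s.name }
func (s *stubEngine) Start() error { return nil }
func (s *stubEngine) Stop() error  { return nil }

func CreateExecutionEngine(t EngineType) (ExecutionEngine, error) {
	switch t {
	case Global:
		return &stubEngine{"global"}, nil
	case Compute:
		return &stubEngine{"compute"}, nil
	case Token:
		return &stubEngine{"token"}, nil
	case Hypergraph:
		return &stubEngine{"hypergraph"}, nil
	default:
		return nil, fmt.Errorf("unknown engine type %d", t)
	}
}
```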
**Q: How is state stored?**
- PebbleDB wrapper: `node/store/pebble.go`
- Clock store: `node/store/clock.go`
- Localized Coin store: `node/store/coin.go`
- Each process has its own DB instance
### Cryptography & Privacy
**Q: Where is MPC/garbled circuits used?**
- Bedlam compiler: `bedlam/` directory
- Compute intrinsic: `node/execution/intrinsics/compute/compute_deploy.go`
- Client deployment initializes the garbler
**Q: Where are the cryptographic libraries?**
- Rust crates: `crates/` directory
- BLS: `crates/bls48581/`
- Bulletproofs: `crates/bulletproofs/`
- VDF: `crates/vdf/`
- Go bindings: Generated via CGO in each crate
**Q: How are bulletproofs integrated?**
- Rust implementation: `crates/bulletproofs/`
- Used for transaction privacy (amount hiding)
- Go bindings in `bulletproofs/` directory
### Configuration & Operations
**Q: How do I configure the node?**
- Main config: `config/config.go` (`Config` struct)
- Engine config: `config/engine.go`
- Data worker config
- Load with YAML file or environment variables
**Q: What are the RPC/API endpoints?**
- Node RPC: `node/rpc/node_rpc_server.go`
- gRPC definitions: `protobufs/node.proto`
**Q: How do I monitor the node?**
- Prometheus metrics throughout codebase
- Grafana dashboards: `dashboards/grafana/`
- Structured logging with configurable levels
### Development
**Q: How is the project built?**
- Node build: `node/build.sh` (static linking, platform detection)
- Client build: `client/build.sh`
- Rust integration: Various `generate.sh` scripts
- Docker: Multiple Dockerfiles for different scenarios
**Q: What happens when a worker crashes?**
- Main process monitors worker processes
- Automatic restart logic in `spawnDataWorkers`
- Process isolation prevents cascade failures
**Q: How do I add a new execution engine?**
- Implement the interface in `node/execution/`
- Add to factory in `node/execution/engines/factory.go`
- Register intrinsics as needed
### Common Code Paths
**Transaction Flow:**
1. RPC endpoint receives transaction
2. Routed to appropriate shard via worker
3. App consensus orders transaction
4. Execution engine processes
5. State updated in hypergraph
6. Proof generated and stored
**Consensus Flow:**
1. Time reel provides ordering
2. App consensus in workers for shards
3. Global consensus coordinates cross-shard
4. Fork choice based on VDF proofs
**P2P Message Flow:**
1. BlossomSub receives message
2. Topic routing to appropriate handler
3. Shard-specific messages to workers
4. Global messages to main process