Application owns a ledger-forming component, a p2p “overlay” component for connecting to peers and flooding messages between them, a set-synchronization component for arriving at likely-in-sync candidate transaction sets, a transaction-processing component for applying a consensus transaction set to the ledger, a crypto component for confirming signatures and hashing results, and a database component for persisting ledger changes. Two slightly-obscurely-named components are:
“BucketList”, stored in the directory “bucket”: the in-memory and on-disk
linear history and ledger form that is hashed. A specific arrangement of
concatenations-of-XDR. Organized around “temporal buckets”: entries tend
to stay in buckets grouped by how frequently they change.
SCP – “Stellar Consensus Protocol”, the component implementing the consensus algorithm.
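A minimal sketch of this ownership structure, using illustrative placeholder class names rather than the actual stellar-core types:

```cpp
// Sketch of the ownership structure described above; class names are
// placeholders for illustration, not the actual stellar-core headers.
#include <memory>

struct LedgerManager {};        // forms the ledger
struct OverlayManager {};       // p2p overlay: peer connections, message flooding
struct SetSynchronizer {};      // converges on likely-in-sync candidate tx sets
struct TransactionProcessor {}; // applies a consensus tx set to the ledger
struct CryptoModule {};         // signature checks and hashing
struct Database {};             // persists ledger changes
struct BucketList {};           // temporally-bucketed, hashed history/ledger form
struct SCP {};                  // Stellar Consensus Protocol implementation

struct Application
{
    std::unique_ptr<LedgerManager> ledger;
    std::unique_ptr<OverlayManager> overlay;
    std::unique_ptr<SetSynchronizer> setSync;
    std::unique_ptr<TransactionProcessor> txProcessor;
    std::unique_ptr<CryptoModule> crypto;
    std::unique_ptr<Database> db;
    std::unique_ptr<BucketList> buckets;
    std::unique_ptr<SCP> scp;
};
```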
Single main thread doing async I/O and forming consensus; multiple worker threads doing computation (primarily memcpy, serialization, hashing). No multithreading on the core I/O or consensus logic.
No secondary process-supervision process, no autonomous threads, no complex shutdown requests. Shutdown generally just means destroying the application object (joining worker threads is the only wait condition during shutdown).
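A sketch of that threading and shutdown shape, assuming plain std::thread workers and hypothetical names (the real implementation differs): the main thread posts CPU-bound jobs, and destroying the pool is the only shutdown step that waits.

```cpp
// Sketch: one main thread runs async I/O and consensus; workers only run
// CPU-bound jobs (hashing, serialization, memcpy). Destroying the pool joins
// the workers -- no supervision process, no autonomous threads.
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

class WorkerPool
{
public:
    explicit WorkerPool(size_t n)
    {
        for (size_t i = 0; i < n; ++i)
            mThreads.emplace_back([this] { run(); });
    }

    // Shutdown is just destruction: signal the workers and join them.
    ~WorkerPool()
    {
        {
            std::lock_guard<std::mutex> lock(mMutex);
            mDone = true;
        }
        mCv.notify_all();
        for (auto& t : mThreads)
            t.join();
    }

    // Called from the main thread to hand off a CPU-bound job.
    void post(std::function<void()> job)
    {
        {
            std::lock_guard<std::mutex> lock(mMutex);
            mJobs.push(std::move(job));
        }
        mCv.notify_one();
    }

private:
    void run()
    {
        for (;;)
        {
            std::function<void()> job;
            {
                std::unique_lock<std::mutex> lock(mMutex);
                mCv.wait(lock, [this] { return mDone || !mJobs.empty(); });
                if (mDone && mJobs.empty())
                    return;
                job = std::move(mJobs.front());
                mJobs.pop();
            }
            job();
        }
    }

    std::mutex mMutex;
    std::condition_variable mCv;
    std::queue<std::function<void()>> mJobs;
    bool mDone{false};
    std::vector<std::thread> mThreads;
};
```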
Virtualized time, so that the server can be cranked forward at fast simulated speed or given simulated time delays during testing. No real-time timeouts (except the one that synchronizes virtual and real time, in production).
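A sketch of the virtualized-time idea, with a hypothetical interface rather than the actual stellar-core clock: the same code path reads either real time or a simulated time that tests can crank forward.

```cpp
// Sketch of virtualized time: in REAL_TIME mode the clock tracks the system
// clock; in VIRTUAL_TIME mode tests advance it instantly, so nothing waits on
// wall-clock timeouts. Hypothetical interface, not the actual stellar-core one.
#include <chrono>

class VirtualClock
{
public:
    enum Mode { REAL_TIME, VIRTUAL_TIME };

    explicit VirtualClock(Mode mode) : mMode(mode) {}

    std::chrono::steady_clock::time_point now() const
    {
        return mMode == REAL_TIME ? std::chrono::steady_clock::now()
                                  : mVirtualNow;
    }

    // In tests, crank simulated time forward without sleeping; timers whose
    // deadlines have passed would be fired at this point.
    void advance(std::chrono::milliseconds delta)
    {
        if (mMode == VIRTUAL_TIME)
            mVirtualNow += delta;
    }

private:
    Mode mMode;
    std::chrono::steady_clock::time_point mVirtualNow{};
};
```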
Storage is split into two pieces: one bulk/cold Bucket-based store (history) kept in flat files, and one hot/indexed store (SQL DB). Both are kept primarily off the validator nodes.
No direct service of public HTTP requests. HTTP and websocket frontends are on separate public/frontend servers.
Sufficiently few globals (logging, CSPRNG) that one can run multiple application instances in-process and connect them together for loopback testing.
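A sketch of what in-process loopback testing can look like, building on the hypothetical Application and VirtualClock sketches above; connectLoopback() and the configuration details are placeholders, not the actual test harness.

```cpp
// Sketch of in-process loopback testing: because global state is limited to
// logging and the CSPRNG, two application instances can share one process and
// one cranked-forward virtual clock.
#include <chrono>
#include <memory>

void loopbackTest()
{
    // One clock, advanced in virtual time, shared by both instances.
    VirtualClock clock(VirtualClock::VIRTUAL_TIME);

    // Two full application instances in the same process.
    auto app1 = std::make_shared<Application>();
    auto app2 = std::make_shared<Application>();

    // Wire their overlays together directly instead of opening real sockets,
    // then crank simulated time until the consensus rounds of interest finish.
    // connectLoopback(*app1, *app2);   // hypothetical helper
    clock.advance(std::chrono::milliseconds(5000)); // no real-time waiting
}
```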
No use of boost. Use C++11 when possible, task-specific libraries when required.
No use of a custom serialization format, nor embedding in protobufs. Uses a single, standard XDR encoding for the canonical (hashed) format, history, and inter-node messaging.
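A sketch of the single-encoding idea, assuming an xdrpp-style xdr::xdr_to_opaque helper and some SHA-256 implementation; the names are assumptions, not necessarily the stellar-core helpers.

```cpp
// Sketch: the canonical (hashed) form is just the standard XDR encoding of the
// message itself; the same bytes go to history archives and between nodes.
#include <cstdint>
#include <vector>
#include <xdrpp/marshal.h>   // assumed xdrpp header providing xdr_to_opaque

// Stand-in declaration for whatever SHA-256 the crypto component provides.
std::vector<uint8_t> sha256(std::vector<uint8_t> const& bytes);

template <typename T>
std::vector<uint8_t> canonicalHash(T const& msg)
{
    // Serialize once, to the standard XDR byte string, and hash that.
    return sha256(xdr::xdr_to_opaque(msg));
}
```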
No use of custom datatypes (No custom time epochs, currency codes, decimal floating point, etc.)
Validators are kept as simple as possible and offload as much responsibility as they can to other parts of the system. In particular, validators do not store or serve long-term history archives; they do not operate a transactional (on-disk) store for the “current state of the ledger”; and they do not serve public HTTP requests directly. These roles are offloaded to servers that are better suited to the tasks and for which existing/better software stacks are available; validators should have an “even” and predictable system-load profile. Validators are also kept as stateless as possible, within disk and memory constraints.
Set of core validator nodes. Running stellar-core only. Tasked with:
SQL DB nodes. One per validator (or one plus failover, however we make an SQL server sufficiently safe, e.g. RDS). Directly associated with that validator. Tasked with:
Set of public HTTP nodes. Not running stellar-core. Running apache/nginx/node/HTTP stack of choice. Flexible. Tasked with:
History archives. Long term blob storage in S3/GCS/Azure. Tasked with:
Observation channel for validators notifying public HTTP nodes of tx results. TBD. Simplest technique is to use the LISTEN/NOTIFY machinery built into postgres/libpq, though that commits us to postgres pretty firmly. If unacceptable, use an external message queue. This is just to trigger wakeups on public HTTP nodes awaiting tx results. Worst case / failure mode, they can timeout/poll. Messages are idempotent, content-free pings.
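A minimal sketch of the LISTEN/NOTIFY option on the public-HTTP-node side, using standard libpq calls; the channel name “tx_result” and the connection string are assumptions.

```cpp
// Sketch of the LISTEN/NOTIFY wakeup path: the validator's DB session would
// run "NOTIFY tx_result;" after committing results; the HTTP node just wakes
// up and re-checks. Pings are idempotent and carry no payload.
#include <libpq-fe.h>
#include <sys/select.h>
#include <cstdio>

int main()
{
    PGconn* conn = PQconnectdb("dbname=stellar"); // hypothetical connection string
    if (PQstatus(conn) != CONNECTION_OK)
        return 1;

    // Subscribe to the content-free pings.
    PQclear(PQexec(conn, "LISTEN tx_result"));

    for (;;)
    {
        // Block until the connection's socket is readable; in the worst case /
        // failure mode this select() gets a timeout and the node polls instead.
        int sock = PQsocket(conn);
        fd_set fds;
        FD_ZERO(&fds);
        FD_SET(sock, &fds);
        select(sock + 1, &fds, nullptr, nullptr, nullptr);

        PQconsumeInput(conn);
        while (PGnotify* note = PQnotifies(conn))
        {
            // Idempotent wakeup: no payload, just re-check pending tx results.
            std::printf("wakeup on channel %s\n", note->relname);
            PQfreemem(note);
        }
    }
}
```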
(optional): Set of public validator nodes. Running stellar-core only. Tasked with: