Reducing Block Time (and Resource Usage) with Buffered Signatures
A few months ago, we launched Alto, a minimal (and wicked fast) blockchain for continuously benchmarking the Commonware Library. Today, I'm thrilled to share that this benchmarking drove a 20% reduction in block time (to ~200ms), a 20% reduction in block finality (to ~300ms), and a 65% reduction in CPU usage.
It turns out procrastinating, even in the world of consensus, can be a great strategy.
Laying the Foundation
Last August, we released the first Commonware Library primitive: p2p::authenticated. Unlike traditional p2p libraries that specialize in gossip-based messaging over a subset of peers (typically identified by randomly generated IDs), p2p::authenticated provides point-to-point messaging between a swarm of fully-connected and authenticated peers (identified by some public key on an externally synchronized list, like a staking registry).
 
The consensus primitives we released in December (consensus::simplex) and January (consensus::threshold-simplex) built on top of p2p::authenticated but didn't do anything particularly clever with it (the point-to-point messaging alone was enough of an improvement to reach the ~250ms block time in the original Alto launch).
So, what would something clever look like?
Buffered Signature Verification
Seeking to progress through consensus as fast as possible, most consensus implementations verify signatures as soon as they are available (often in parallel to avoid slowing down the consensus state machine). This eager verification makes intuitive sense—you want to either enter a new view or finalize a block as soon as possible (and verifying a signature is a prerequisite for either).
When peers are tasked with forwarding messages in traditional p2p, the recipient of a message can never be sure what they have (and what they have from whom) until verification. A forwarding peer could relay a message honestly (valid signature), tamper with a signature over a message that someone did send (invalid signature), or even fabricate messages to impersonate a different peer (invalid signature).
 
With p2p::authenticated, we have the necessary functionality to do something novel: dedicated peer slots. Instead of verifying each peer signature individually as it arrives, we now buffer messages in slots dedicated to each authenticated peer. When we collect a quorum (2f+1) of signatures over some message, we perform a single multi-signature verification (with plans to support batch verification if not using BLS12-381 signatures) instead of verifying each signature individually, dramatically reducing the total time spent on signature verification.
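The buffering described above can be sketched roughly as follows. This is a minimal illustration, not the Commonware implementation: `SlotBuffer`, `quorum`, and `VerifyAll` are hypothetical names, the verifier is a stand-in callback rather than a real BLS12-381 check, and the sketch assumes a participant set of size n = 3f+1.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Set, Tuple

# Hypothetical verifier: True only if every (peer, signature) pair is valid
# for `message`. A real one would run a single aggregate (or batch) check.
VerifyAll = Callable[[bytes, List[Tuple[str, bytes]]], bool]


def quorum(n: int) -> int:
    """Return 2f+1, assuming n = 3f+1 participants (an assumption of this sketch)."""
    f = (n - 1) // 3
    return 2 * f + 1


@dataclass
class SlotBuffer:
    peers: Set[str]          # authenticated peer identities
    verify_all: VerifyAll    # stand-in for aggregate verification
    slots: Dict[bytes, Dict[str, bytes]] = field(default_factory=dict)
    verified: Set[bytes] = field(default_factory=set)

    def add(self, peer: str, message: bytes, signature: bytes) -> Optional[bool]:
        """Buffer a signature; return None while buffering, else the quorum check result."""
        if peer not in self.peers:
            return None  # not on the authenticated list, nothing is stored
        if message in self.verified:
            return True  # quorum already verified for this message
        slot = self.slots.setdefault(message, {})
        # One slot per peer per message: the first signature received is kept.
        slot.setdefault(peer, signature)
        if len(slot) < quorum(len(self.peers)):
            return None  # keep procrastinating, no verification yet
        ok = self.verify_all(message, sorted(slot.items()))
        if ok:
            self.verified.add(message)
        return ok
```

With four peers, the first two signatures over a message are only stored; the third reaches the quorum of 3 and triggers the one and only verification call for that message.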
 
Handling Invalid Signatures
Now that we aren't verifying each message when it arrives over the wire, however, one bad apple can spoil the whole bunch.
Fortunately, we have one more capability in p2p::authenticated to make this safe: identity blocking. When verification fails, a binary search is run over all buffered signatures to identify the offender(s). In the worst case (where f signatures are invalid), this can lead to more verifications than the original eager approach in a given view. However, once identified as malicious, that peer is blocked by its staking identity, unable to send more invalid messages to us from any network address until our node's binary restarts. Without this blocking, it would be trivial for a set of malicious peers to undermine this optimization (and make it worse than the original eager approach on every view). With identity-based blocking, however, bisection will only be triggered on a very small number of views over the course of a typical 7-day epoch (3,000,000 views). Specifically, a validator set of 1000 (with at most f = 333 malicious validators, each blocked after its first invalid signature) could cause disruption in at most ~333 views (0.011% of views).
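One way to picture the bisection and blocking steps is the sketch below. It is an illustration under stated assumptions rather than the library's code: `find_invalid` and `handle_failed_batch` are hypothetical names, `verify_all` is a stand-in for an aggregate check that passes only if every signature in the batch is valid, and peers are identified by plain strings.

```python
from typing import Callable, List, Sequence, Set, Tuple

Pair = Tuple[str, bytes]  # (peer identity, signature)


def find_invalid(
    pairs: Sequence[Pair],
    verify_all: Callable[[Sequence[Pair]], bool],
) -> List[str]:
    """Return peers with invalid signatures by recursively halving failing batches."""
    if not pairs or verify_all(pairs):
        return []  # the whole batch is valid, nothing to search
    if len(pairs) == 1:
        return [pairs[0][0]]  # a failing batch of one pins down the offender
    mid = len(pairs) // 2
    return find_invalid(pairs[:mid], verify_all) + find_invalid(pairs[mid:], verify_all)


def handle_failed_batch(
    pairs: Sequence[Pair],
    verify_all: Callable[[Sequence[Pair]], bool],
    blocked: Set[str],
) -> List[Pair]:
    """Block every offender by identity and return the signatures that remain usable."""
    blocked.update(find_invalid(pairs, verify_all))
    return [pair for pair in pairs if pair[0] not in blocked]
```

A single bad signature among many costs a logarithmic number of extra verifications, while many bad signatures push the cost toward (and past) one verification per signature, which is the worst case described above. Because each offender lands in `blocked` the first time it is caught, it cannot trigger the search again.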
 
Reducing Block Time (and Resource Usage)
By transitioning from individual signature verifications to aggregate signature verification, the performance of Alto improved significantly (while using fewer resources):
- Block time: 255ms → 200ms (20% reduction)
- Finality: 375ms → 300ms (20% reduction)
- CPU usage: 26% → 9% on a c7g.xlarge (65% reduction)
The footprint of p2p::authenticated and consensus::threshold-simplex is now less than half a core, leaving plenty of compute available for your application to do more of whatever it does best. Reproduce our results (and start building your own blockchain) with the free and open-source code today.
Consensus is often the main attraction. At Commonware, we've been working hard to ensure it is a sideshow.