A Critique of Awemany's Critique of Canonical Transaction Ordering
Awemany's recent paper has been used by many people as evidence that ABC's canonical transaction ordering is misguided and rushed. I think Awemany is a very smart and capable developer, but I think he makes some mistakes in his analysis which I will highlight here.

Before I begin, I want the reader to understand the motivation for canonical ordering. The primary issue is that the current ordering consensus rule is burdensome for implementations seeking to parallelize much of the work of validating and creating blocks. As Shammah Chancellor mentioned in his recent article, while it's true that Moore's law suggests that computers are getting faster, the way they are getting faster is by adding more CPU cores, not by increasing the speed of each individual core. All current Bitcoin and Bitcoin Cash implementations are written in such a way that most of the work is done on a single CPU core, meaning the software is not taking advantage of the underlying resources available to it, and it's also not well positioned to take advantage of Moore's law. Simply buying more expensive hardware will not improve the performance of a Bitcoin node: that money will buy you more cores, but the software simply will not use them. Craig Wright's alleged 2015 experiment showing 2.6M transactions per second was run on a machine with 265,450 cores. This is funny because while you can spend an arm and a leg on a machine with 265,450 cores, the software will still continue to use only one.

Yes, you can parallelize part of the work of validating a block if you keep the existing ordering rule, but you will never be able to parallelize all of it, as you'll always need to do at least some work serially to validate the current consensus rule (more on this below). So the primary argument coming from ABC is that the current ordering rule needs to go.
For the purpose of parallelization, simply removing the ordering rule and allowing transactions to be in any order would work. But if you've already accepted the removal of the existing ordering rule, it's a much smaller mental leap to see that introducing a new, canonical ordering would provide additional benefits (like Graphene and exclusion proofs) beyond what you would get with a no-ordering rule. So with that said, let's turn to Awemany's arguments. He writes:

"This completely overlooks the key reason why blocks are ordered the way they are and is the root of most of the problems with the arguments that follow from here. They are clearly NOT ordered like this because of an accident, they are ordered like this because it is the natural order. It is the order by which transactions can be generated, if one would have an omniscient view. [...] If you think of Bitcoin as an incentivized timestamping system, then all data that comes into the system naturally follows what is named TTOR by the authors (and what is named natural order herein). Blocks serve as structured time stamps of this data and giving a unique order to what uncertainty remains in the system due to physical laws and its consequences (propagation delays etc.)."

I think it's very debatable that blocks are not currently ordered the way they are by accident, given that there are zero lines of code dedicated to validating the order. Rather, the order is a byproduct of Satoshi's single-threaded software design from version 0.1. Awemany's own discussion of block timestamps highlights an inconsistency in the current rules: the block's timestamp says "all these transactions occurred simultaneously from the point of view of the network," while the ordering rule says "transaction B came after transaction A." In either case, even if we accept the argument that the current ordering rule is "natural" (which is a very misleading description), it does not follow that natural ordering is necessarily good.
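To make the "exclusion proof" benefit concrete, here is a minimal sketch in Python of a canonical rule of the kind ABC proposed: keep the coinbase first and sort every other transaction lexicographically by txid. Once a block is sorted this way, a receiver can check whether a given txid is present, or provably absent, with a simple binary search. The helper names and the byte-string "transactions" are illustrative, not taken from any real implementation:

```python
import bisect
import hashlib

def txid(raw_tx: bytes) -> bytes:
    """Double-SHA256 of the serialized transaction (Bitcoin's txid,
    ignoring the byte-reversal convention used for display)."""
    return hashlib.sha256(hashlib.sha256(raw_tx).digest()).digest()

def canonical_order(raw_txs):
    """Keep the coinbase first, then sort every other transaction
    lexicographically by txid."""
    coinbase, rest = raw_txs[0], raw_txs[1:]
    return [coinbase] + sorted(rest, key=txid)

def possibly_contains(sorted_txids, candidate):
    """Membership test against a canonically ordered txid list.
    Because the list is sorted, a mismatch at the insertion point
    doubles as a compact proof of exclusion."""
    i = bisect.bisect_left(sorted_txids, candidate)
    return i < len(sorted_txids) and sorted_txids[i] == candidate
```

Under TTOR, no such check is possible without scanning the whole block, since a transaction can sit anywhere the partial order allows.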
Awemany continues:

"Updating the UTXO set can only happen by following the partial order of transaction dependence. This is completely independent of a parallel or a serial single core implementation, it always has to happen with honoring the partial order of transactions! [...] The partial order is, just like, for example, the preconditions when running a Makefile, essential to be followed when updating the UTXO set."

This is wrong. ABC has demonstrated conclusively that you do not need to follow the partial order to update the UTXO set. You simply loop through all the outputs in the block and add them, then loop through all the inputs and delete them. The order does not matter. Even Thomas Zander's Flowee code, which does parallelization, more or less uses this same technique. If you keep the ordering rule, then you have to spend a bunch of extra CPU cycles doing preprocessing to sort the transactions into two buckets: those that can be processed in parallel, and those that have to be processed serially to validate the ordering consensus rule. By removing the ordering rule, you eliminate the need to sort transactions into these two buckets, as all transactions can be processed in parallel. Awemany goes on:

"A miner can reorder under the relatively weak constraint of the natural (partial) order, as it is required for validation. This would greatly reduce the extra cost to propagate ordering, such as done in the current BU Graphene implementation and could be done without a change of the validation rules."
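The two-pass update described above can be sketched in a few lines. This is a toy model, assuming simplified transactions represented as dicts with a txid, a list of spent outpoints, and an output count; a real node would also verify that every spent outpoint actually exists and would check signatures, all of which is omitted here:

```python
def apply_block(utxo_set: set, block_txs) -> None:
    """Update a UTXO set without honoring transaction order.

    Each tx is a dict: {"txid": bytes, "inputs": [(txid, n), ...],
    "n_outputs": int} -- a simplified stand-in for a real transaction.
    """
    # Pass 1: add every output created in the block, including
    # outputs that are spent later (or earlier!) in the same block.
    for tx in block_txs:
        for n in range(tx["n_outputs"]):
            utxo_set.add((tx["txid"], n))
    # Pass 2: delete every outpoint spent by the block. Because pass 1
    # has already added all in-block outputs, a child transaction can
    # appear before its parent without breaking anything.
    for tx in block_txs:
        for outpoint in tx["inputs"]:
            utxo_set.discard(outpoint)
```

Neither pass depends on the position of any transaction, so each pass can be sharded across however many cores the machine has.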
Let's remember what I said in the opening: the primary goal is to remove the ordering requirement to improve parallelization. Once you recognize this is beneficial, it's a small mental leap to realize that introducing a new canonical ordering rule would provide secondary benefits like Graphene. What we're talking about here is a secondary benefit, not the primary one. Further, while you can do Graphene under the current ordering rule, it's not as efficient and requires more data to be transmitted than it would under a canonical ordering rule. Awemany objects:

"One might want to point out here that a Graphene block still following the natural order has a key advantage for the receiving end that is dropped completely in this analysis: He gets the block in an order he can validate it in, without further processing or reconstruction necessary."

We've already shown above that you do not need the block to be in the natural order to validate it. He continues:

"The successful propagation by natural order comes along with a successful validation in natural order by the network. When transactions go into the mempool, they can be validated for all the required, transaction-specific rule adherence at that point in time already. Later on, the authors talk about having a smooth and continuous use of computing and network resources being desirable for transaction processing (see below). But this is also the easiest to achieve when the transactions that come into the network naturally follow a partial order as necessary for validation! Resorting them when blocks come in is busy work that isn't necessary with the current approach."

As stated several times above, it is not necessary to re-sort transactions for validation, since validation can be done in any order. If you insist on having your node implementation validate transactions in a single thread, then yes, you will need to re-sort them. But that is precisely what ABC is trying to improve on.
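To put a rough number on the "more data" point: transmitting one specific ordering of n items costs about log2(n!) bits, which a shared canonical order reduces to zero because both sides can reconstruct it independently. A back-of-envelope calculation (the figure of one million transactions is illustrative only):

```python
import math

def order_info_bits(n: int) -> float:
    """Bits needed to single out one particular ordering of n
    transactions: log2(n!), computed via lgamma to avoid huge ints."""
    return math.lgamma(n + 1) / math.log(2)

# For an illustrative block of 1,000,000 transactions, the ordering
# information alone comes to roughly 2.3 MB -- data a Graphene block
# must convey under the current rule but can drop entirely when the
# order is canonical.
ordering_bytes = order_info_bits(1_000_000) / 8
```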
Finally, Awemany concludes:

"I think the above demonstrates why we should all really ask ourselves whether we want to go forward with this in November, or postpone this to have a second look. Especially the in-the-field observation by Tom Zander that at least one current implementation suffers from a reduction in validation speed concerns me greatly and tells me that it is rather a good idea to wait a bit more with a further analysis and proper addressing of the above criticisms."

I've had a look at Zander's code. His implementation in Flowee is slower precisely because of the current ordering requirement. As I mentioned above, his node loops through the outputs and adds them to the UTXO set, then loops through its "to delete" UTXOs. This is more or less similar to how ABC plans to approach block validation. But Zander's node is constrained by the requirement to pre-sort transactions into two buckets: those that can be processed in parallel, and those that have to be processed serially. Not only is he spending CPU cycles doing this, but a percentage of the block will always need to be validated serially. If we're talking about a 1GB block, we could expect somewhere around 10,000 to 20,000 transactions to need serial processing. That is the equivalent of waiting for ten 1MB blocks to be validated on a single core before the full block validation can finish.

Further, it's pretty absurd to say that canonical ordering would make his implementation slower. That would only be the case if he insisted on keeping the two-bucket approach. Then, yes, he would need to re-sort in order to validate using his current workflow. But it would make no sense to do that, as it's the two-bucket approach that is slowing down his implementation in the first place! It would make much more sense to just validate all transactions in parallel. This is what CTOR allows for.
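For concreteness, here is a hypothetical sketch of the two-bucket pre-sort described above; the dict-based transaction format is invented for illustration and is not Flowee's actual code. Transactions spending only outputs created outside the block go to the parallel bucket, while any transaction spending an in-block output lands in the serial bucket:

```python
def two_bucket_split(block_txs):
    """Partition a block under the current ordering rule: transactions
    with no in-block parent can be checked in parallel; transactions
    spending outputs created inside the same block must be handled
    serially to verify the ordering consensus rule.

    Tx format (simplified): {"txid": bytes, "inputs": [(txid, n), ...]}.
    """
    in_block = {tx["txid"] for tx in block_txs}
    parallel, serial = [], []
    for tx in block_txs:
        if any(parent in in_block for parent, _ in tx["inputs"]):
            serial.append(tx)
        else:
            parallel.append(tx)
    return parallel, serial
```

The split itself is a full scan of the block, and the serial bucket then bottlenecks on a single core. Remove the ordering rule and both costs disappear: every transaction can go straight to a worker thread.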