The infinite stream of transactions attack that isn't
There has been some talk by Amaury Séchet, the lead developer of Bitcoin Cash's reference implementation, about the bitcoin block header not including a transaction count value covered by Proof-of-Work. He claims that one would be needed though as there is a denial-of-service attack vector as soon as the block size gets increased:
This isn't the first time he mentioned the idea of extending the block header or embedding the transaction count in the coinbase transaction. Back when there was the Bangkok meeting (on August 30th, 2018) he presented this change as part of the roadmap to the miners using the described attack vector as reasoning. Amaury's presentation was followed by Dr. Craig Wright calling it "bullshit" and leaving the room. This news spread quickly across the media, and many people concluded at this point that a split was unavoidable, and at the end it did happen.
Three months later, David Jerry (@digitsu) wrote an article where he recaps on what happened during the meeting. David also touches on the topic of how having the transaction count covered by PoW is not necessary. I believe that it's a bit difficult for the non-technical reader to understand and also didn't really delve into the details (though that wasn't the goal of his article). Thus I am trying to describe step-by-step (without much math) what the alleged attack vector is and how nodes can protect against it without changing anything about bitcoin's core design.
The attack vector
First of all, let's have a look at how the attack described by Amaury would look like in practice. The idea is that as soon as a new block is getting mined by a miner, an attacker relays the block header to you but instead of sending the real transaction ids she sends a bunch of fake transaction ids. The argument being made is that you as the receiver don't know for sure the number of transaction ids that you are going to receive (as the attacker can tell you any number) hence you happily accept transaction ids for as long as the attacker sends some:
How would this have to look like on the network level? We have a miner finding a new block and then an attacker sitting between you and the miner relaying a modified block:
The attentive reader might already have noticed that this isn't how the bitcoin network actually looks like: Connectivity in bitcoin is incentivised, as Dr. Wright likes to point out. Miners form what is known as a small-world network where every miner is connected to every other miner, you never receive a block at second hand. Miners are incentivised to do so: If you find a new block, you want the majority of the hashpower to receive your block as fast as possible to minimise the risk of your block getting orphaned and you in turn losing block rewards. In the same way, you want to receive a new block as fast as possible to be one of the first to start building on this new block. If you are more interested in bitcoin's network structure, I can recommend you reading Alex Fauvel's (@2ndEntropy) awesome "Sybil and Satoshi" article.
Can the attack still be performed with respect to bitcoin's small-world network structure? Let's have a look:
Now when the miner finds the block, you will receive the block from the miner before an attacker can send you a manipulated version. If the attacker sends you a manipulated block, it's the second time you receive it (but with a different transaction count this time). You can simply detect that it has been manipulated from the original one and close the connection with the attacker.
We have shown that the described attack can not actually work in practice, but let's dive in a bit deeper and show that this attack can not even work neglecting all the incentives around bitcoin as an economic system.
Merkle tree magic
I'm assuming that you know the basics of how bitcoin's merkle tree works. Read my previous article "Scaling bitcoin's merkle tree" to gather some basic understanding about this data structure.
Let us first remember how a merkle tree looks like and define some technical terms around this data structure:
The merkle tree is called a tree because - well it looks like a tree when you flip it upside down: You start with the root, followed by several nodes and then you got the leaves. In bitcoin our root is the merkle root hash in the block header and every leaf represents one transaction id. The height / depth is the distance from the root to the leaf (2 in this picture) and the leaf count is the number of leaves. When drawing some of these merkle trees, you'll quickly notice that when you double the number of leaves, your tree grows in height by only 1. If you take half the number of leaves, you reduce the height by 1. A tree where every human on this planet gets one leaf would be only 30 in height. In red I have highlighted the merkle tree proof for the rightmost leaf. These are the only two values one needs to know to verify that the leaf is part of the tree. The length of the merkle tree proof is equal to the tree's height.
Knowing these basics, let us play Sherlock and see how much information we can gather about a tree only knowing the root with as little information as possible:
Where to start? Well, first let us get one leaf to know how high the tree is. Let's simply ask for the transaction behind the first leaf and a merkle tree proof for this leaf:
Perfect! Now, let us:
- Hash the coinbase transaction,
- append the first part of the merkle tree proof on the right,
- hash it again,
- append the second part of the merkle tree proof on the right,
- hash it again,
- append the third part of the merkle tree proof on the right
- and hash it again.
We ended up with our merkle root hash, good. Since height of a tree and length of the merkle tree proof are equal, we can conclude that our tree has a height of 3. We can simply calculate that a tree with a height of 3 has up to 8 leaves. But not less than 5 because a tree with 4 leaves would have a height of 2.
This gives us a pretty good idea about what number of leaves we can expect. But what about trees larger than that one, e.g. a tree with a height of 26? Is it still accurate to say: The tree contains between 33 million and 67 million leaves? We can do better!
So we've got the leftmost leaf to determine the height. If we have the rightmost leaf, it should be pretty easy to tell how many transactions are in the block. So let us request the rightmost leaf:
The first merkle tree proof is still highlighted in red while the second one is now colored in orange for distinction. Great, let's check the merkle proof for this leaf. We append the first part of the merkle tree proof on the right, hash it, append the second part of the merkle tree on the right, hash it again and prepend the third part of the merkle tree proof and hash it again. Awesome, we ended up with the root hash. That means our block has 5 transactions, right? Well, can we be sure that this really is the rightmost leaf? We might have received any leaf inside the block.
Actually, it's possible. To understand this, we need to have a short look at how bitcoin hashes trees where you don't have a nice number of leaves, i.e. 2, 4, 8, 16, 32, ... leaves (power of two). Basically how the software work is that when there is an odd number of leaves/nodes in any single layer of the tree, the last leaf/node gets duplicated:
Many people consider this a vulnerability in bitcoin's merkle tree (e.g. CVE-2012-2459), which is trivial to protect against, but in fact it enables us to develop a simple algorithm to check whether a leaf is the rightmost one in the tree. This algorithm is based on the fact that for each node between the merkle root and the last leaf, one node can never appear on the left side of the hashing unless it simultaneously acts as the right side as well (i.e. it has been copied). This means there are only ever two possible constellations for every single node between the last leaf and the merkle root:
- The node is on the left side of the hashing and a copy of said node is on the right side
- The node is on the right side of the hashing and any value except a copy of the node itself can be on the left
How can we apply this to our picture from above?
Starting from the bottom going to the top:
a) Compare the leaf with our first part of the merkle tree proof. If they are not the same hashes, our leaf is not the rightmost leaf as there is another one that is right from it, namely the one that has been supplied to us as first part merkle tree proof. If they are the same, concatenate them, hash them and go to step b. b) Compare the node with our second part of the merkle tree proof. If they are not the same, there must be something hidden below the node that has been supplied to us as the second part of the merkle tree proof. If they are the same, concatenate them, hash them and go to step c. c) Nothing to check here as the node appeared on the right side which is perfectly valid. We can concatenate the left and the right, hash that and if we end up with the same root hash as in the block header, we just verified that the provided leaf is indeed the rightmost leaf. It's the fifth and last one, therefore our tree has five leaves.
What have we shown?
We have shown that it's possible to prove the number of leaves in a merkle tree without revealing the whole content. By simply leveraging compact merkle tree proofs. There is no need to change the current protocol: We don't need to put the number of transactions in the block header (which would break any mining ASIC) or into the coinbase. The merkle tree root hash already covers the exact transaction count today!
How would miners apply this in practice?
Miners can require each other to announce a block the following way: Besides announcing the block header, they must attach the following data:
- The coinbase transaction (this protects against us against a second-preimage attack)
- A merkle tree proof for the coinbase transaction
- The id of the last transaction in the block
- A merkle tree proof for the last transaction in the block
This allows the receiver to verify the number of transactions in the block.
Some people might be wondering why we don't simply reduce the data to the last transaction (instead of last transaction id) and the corresponding merkle tree proof. The simple reason is that the coinbase transaction is most probably smaller than the last transaction in the block.
Now, test it yourself!
First, the program will ask you for a block height to enter. Go to your favourite block explorer and pick a random block, here is one.
The program will request the header for the specified block and ask for a transaction id in that block next. Let us choose the coinbase transaction: 55b8aa3851ac45fe1ec0af8f5cb191b7dd3ad1cd72c227552600278d4e36ee7f
After hitting enter the program will test the merkle tree proof for the coinbase transaction and also check whether a transaction is either the first or the last one in the block. The coinbase transaction is the first one, of course.
Now let us pick the last transaction from the block: 32164b26980f2a104ad2c09096a2da2dc678be571900e6f8c130e727f9136854 The program again checks the merkle tree proof for given transaction id as well as the position. It also finds out that the transaction is the last one in the block using the method described above:
Since our last transaction is #60, our block must include 60 transactions. The block explorer confirms this:
You can check any transaction anywhere inbetween the first and the last one and you'll quickly notice that the program does not identify them as being either the first or the last one!