OP_RETURN is an op code that has long been known as a mechanism for embedding data in the blockchain in a way that allows miners to ignore it if they want to, so as to save on storage if they deem it necessary. It has a long and checkered history that involves the usual story of Core adding limits to 'protect' users from themselves. These limits have a very significant place in bitcoin's history, causing many serious projects and people to abandon Bitcoin BTC. Ethereum itself only exists today because this limit drove Vitalik away from bitcoin to seek an alternative. As far as I know from the publicly available narrative, they could have and would have done everything they did on the BTC blockchain if they'd been allowed.
Of course in Bitcoin SV our driving principle is to return to the original protocol, which includes removing limits on things like OP_RETURN and allowing economics to govern usage. So it was our goal from day one of the Bitcoin SV project to unshackle the data storage use case.
A bit of background
On November 18th Neil Kyuupichan submitted a pull request to eliminate the restriction on OP_RETURN outputs in bitcoin-sv: https://github.com/bitcoin-sv/bitcoin-sv/pull/16
Removing this restriction was part of the roadmap for Bitcoin-SV from day one. It was just a matter of time. There are many scaling issues to be addressed and we have a huge list of them to feed into our development backlog. In planning out our year the key challenge has been deciding what order to deal with them in.
There are 2 key issues that we are hearing massive demand for from the ecosystem:
- Removing the limit of 25 unconfirmed ancestors for a transaction which is problematic for many current on chain apps
- Removing OP_RETURN limits to enable larger amounts of data to be stored in transactions which has immediate use cases from tokens to general on-chain content and metanet
Point 1 is blocked in the short term because it has serious performance implications and as you can see from Daniel’s post today we’ve been heavily focussed on dealing with scaling issues within Bitcoin SV. We are working on fixing the code that blocks this limit from being removed but it will take time. We will have more news on this soon…
Point 2 we would like to incorporate into Bitcoin SV ASAP. However due to limits on how many hours we can do risk analysis and code in a day (and Christmas) we won’t be able to put Neil’s PR through our QA process in time for our next software release planned for 11th Feb. We intend to put it in the next release and I’ll probably get in trouble from Daniel for saying this, but I’m hoping we can get it out about 6 weeks later.
The limit explained (a bit)
The current limit on OP_RETURN data is actually a soft limit that miners are free to change. It defaults to 223 bytes but miners can raise it. I hadn’t previously considered asking miners to raise this limit manually because I had misunderstood 2 key points about how bitcoin works. The first point is a simple assumption that was just plain wrong and I don’t really know why I made it: that you could only make one data push operation after the OP_RETURN. I also made the assumption that 1 minute blocks would be better than 10 minute blocks back in 2011. We all start out as idiots and get a tiny bit wiser over time.
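To make the "more than one push after the OP_RETURN" point concrete, here is a minimal Python sketch that builds an OP_RETURN script from several data chunks and compares its size with the 223 byte default. The opcode values are the standard bitcoin script ones; the function names are made up for illustration, and I am assuming the default is measured over the whole output script rather than just the payload.

```python
OP_RETURN = 0x6A
OP_PUSHDATA1, OP_PUSHDATA2, OP_PUSHDATA4 = 0x4C, 0x4D, 0x4E
DEFAULT_RELAY_LIMIT = 223  # default quoted in the post; assumed to cover the whole script


def push(data: bytes) -> bytes:
    """Encode one data push using the smallest push opcode that fits."""
    n = len(data)
    if n <= 75:
        return bytes([n]) + data  # opcodes 0x00-0x4b push that many bytes directly
    if n <= 0xFF:
        return bytes([OP_PUSHDATA1, n]) + data
    if n <= 0xFFFF:
        return bytes([OP_PUSHDATA2]) + n.to_bytes(2, "little") + data
    return bytes([OP_PUSHDATA4]) + n.to_bytes(4, "little") + data


def op_return_script(*chunks: bytes) -> bytes:
    """OP_RETURN followed by one push per chunk (hypothetical helper)."""
    return bytes([OP_RETURN]) + b"".join(push(c) for c in chunks)


def within_relay_limit(script: bytes, limit: int = DEFAULT_RELAY_LIMIT) -> bool:
    """Policy check only: a miner can configure a different limit."""
    return len(script) <= limit


script = op_return_script(b"hello", b"world")  # two pushes after one OP_RETURN
print(script.hex())                            # 6a0568656c6c6f05776f726c64
print(within_relay_limit(script))              # True
print(within_relay_limit(op_return_script(b"x" * 300)))  # False under the default
```

Because this is a policy setting rather than a consensus rule, raising `limit` changes what a node will relay or mine without changing which blocks are valid.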
The second is a bit more in depth and worth working through the detail because it helps understand a few things about the intricacies of bitcoin script validation. This will lead you down a rabbit hole so bear with me. But it is helpful to first frame the lifecycle of a script in two stages. First the script becomes part of the bitcoin ledger when it is created. Second it ceases to have meaning when it is spent.
How I screwed up
There is a hard consensus rule that limits the size of data a bitcoin script can work with to 520 bytes. OP_RETURN outputs typically use a bitcoin script op code called OP_PUSHDATA to add data into the script, and that data is what you want to store, so I had assumed this 520 byte limit would apply. I’ve done plenty of blockchain archaeology over the years, replaying the whole blockchain and picking bits of interesting data out of transactions to analyse patterns, quantify historical activity (e.g. seeing if certain op codes have been used) and find patterns of behaviour. In many of these archaeological expeditions I’ve come across scripts that are complete nonsense. Not just scripts that won’t ever work but scripts that can’t even be parsed. Literally just garbage data. It occurred to me a few times that maybe bitcoin doesn’t ever try to run these scripts but I didn’t make a key connection. Bitcoin never runs an output script until you try to spend it. Of course when you think it through that’s the time when you’d expect it to run, but it’s not immediately obvious why transaction validation wouldn’t check this.
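For anyone wondering what "can't even be parsed" means in practice, the sketch below splits a raw script into opcodes and pushed data, and gives up when a push claims more bytes than the script actually contains. It is an illustration only: `parse_script` is a hypothetical helper, not code from any node implementation.

```python
from typing import List, Optional, Tuple

# Number of length bytes that follow each OP_PUSHDATA opcode.
LENGTH_BYTES = {0x4C: 1, 0x4D: 2, 0x4E: 4}


def parse_script(script: bytes) -> Optional[List[Tuple[int, bytes]]]:
    """Return (opcode, pushed_data) pairs, or None if a push runs past the end."""
    ops = []
    i = 0
    while i < len(script):
        opcode = script[i]
        i += 1
        if opcode <= 0x4B:
            size = opcode  # direct push of `opcode` bytes
        elif opcode in LENGTH_BYTES:
            width = LENGTH_BYTES[opcode]
            if i + width > len(script):
                return None  # length field itself is truncated
            size = int.from_bytes(script[i:i + width], "little")
            i += width
        else:
            ops.append((opcode, b""))  # any non-push opcode carries no data
            continue
        if i + size > len(script):
            return None  # push claims more data than is present
        ops.append((opcode, script[i:i + size]))
        i += size
    return ops


print(parse_script(bytes.fromhex("6a0568656c6c6f")))  # OP_RETURN, then a 5 byte push
print(parse_script(bytes.fromhex("05aabb")))          # None: claims 5 bytes, has 2
```

An output script like the second one can sit in the ledger quite happily, because nothing tries to parse or run it until someone attempts to spend the output.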
Well the answer is more obvious once you think of it with the right terminology. We commonly refer to scriptPubKey (the output script) and scriptSig (which you provide to spend it). Some people call them the locking script and the unlocking script respectively. Another way to think of them is the puzzle script and the solution script. The puzzle script is created when the UTXO is created and the solution script is provided when it is spent. In a standard P2PKH script the puzzle is “provide a valid signature using a public key that hashes to this value”. The solution script is that signature and public key. Another (completely insecure) example would be “provide a number that when added to 2 equals 5”. The solution script would be a “3”. But the point of bitcoin is that usually you can’t work out the solution by looking at the puzzle unless you know a secret (the private key) that no one else knows. Otherwise we’d all be stealing each other’s bitcoins in an endless cycle.
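The insecure "2 plus what equals 5" puzzle can be sketched in a few lines of Python. This is a toy stack machine, not the real script interpreter: scripts are plain lists of tokens, only OP_ADD and OP_EQUAL are supported, and `run` and `can_spend` are names invented for the example.

```python
def run(script, stack):
    """Run a list of tokens on a stack: ints are pushed, strings are opcodes."""
    for token in script:
        if isinstance(token, int):
            stack.append(token)
        elif token == "OP_ADD":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif token == "OP_EQUAL":
            b, a = stack.pop(), stack.pop()
            stack.append(1 if a == b else 0)
        else:
            raise ValueError(f"unsupported opcode: {token}")
    return stack


def can_spend(solution, puzzle):
    """Run the solution script, then the puzzle script on the same stack."""
    try:
        stack = run(puzzle, run(solution, []))
    except IndexError:
        return False  # the puzzle needed an item the solution never supplied
    return bool(stack) and stack[-1] != 0


PUZZLE = [2, "OP_ADD", 5, "OP_EQUAL"]  # "provide a number that when added to 2 equals 5"

print(can_spend([3], PUZZLE))  # True
print(can_spend([4], PUZZLE))  # False
```

Note that the puzzle is only ever evaluated inside `can_spend`, which mirrors the point above: the puzzle script just sits there until somebody turns up with a solution.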
Brain fixed at last
Before trolls ask, yes I knew that you can’t successfully run a puzzle script without the solution. What I had assumed, without ever thinking it through, was that bitcoin does some other checks for validity of output scripts before accepting a transaction. What I learned was to stop assuming where consensus rules are concerned. In fact it is quite possible to make completely invalid output scripts and where OP_RETURN is concerned this has one important implication. The 520 byte limit is never checked after an OP_RETURN and therefore not enforced. The key reason for this is that OP_RETURN fails the script automatically when executed, so the output can never be spent. This part isn't news to me and is the common understanding of how and why OP_RETURN works as a data carrier. But, in the absence of any other checks at the time the output is created, nothing after the OP_RETURN ever gets tested for validity. That has a few implications. Firstly the 520 byte limit is never tested. Secondly, NO consensus rules at all are checked on the bytes that follow the OP_RETURN. Not at the time the script is created, not at the time someone attempts to spend it, never…
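Here is a rough Python model of that reasoning. It is a toy evaluator under my own simplifying assumptions: it understands only direct pushes, OP_PUSHDATA2 and OP_RETURN, and it applies the 520 byte check at the moment a push is reached during execution. `evaluate` is a hypothetical name and this is not how any node is actually implemented.

```python
MAX_SCRIPT_ELEMENT_SIZE = 520  # the 520 byte element limit discussed above
OP_RETURN = 0x6A
OP_PUSHDATA2 = 0x4D


def evaluate(script: bytes) -> str:
    """Walk a script the way execution would and report what stops it first."""
    i = 0
    while i < len(script):
        opcode = script[i]
        i += 1
        if opcode == OP_RETURN:
            # Execution ends here, so the bytes that follow are never looked at.
            return "failed: OP_RETURN"
        if opcode <= 0x4B:
            size = opcode
        elif opcode == OP_PUSHDATA2:
            size = int.from_bytes(script[i:i + 2], "little")
            i += 2
        else:
            return "failed: opcode not supported in this sketch"
        if size > MAX_SCRIPT_ELEMENT_SIZE:
            return "failed: push too large"
        i += size
    return "ok"


# A single 600 byte push, over the 520 byte limit.
big_push = bytes([OP_PUSHDATA2]) + (600).to_bytes(2, "little") + b"\x00" * 600

print(evaluate(b"\x02hi"))                     # ok
print(evaluate(big_push))                      # failed: push too large
print(evaluate(bytes([OP_RETURN]) + big_push)) # failed: OP_RETURN
```

In the last case the size check is never reached, and since an output script is not executed when the output is created, it is not reached then either. That leaves the relay policy limit as the only thing standing in the way.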
This little revelation dawned on me in the last few days and I shared it with other BSV devs and asked them to confirm my interpretation of the code. The result was awesome. We don’t have to wait for a code change to raise this limit to a level people were actually asking for. We can unfuck this right away if we can just get the miners to agree. So a flurry of emails to miners ensued to explain why we could do this now and that I would personally be willing to make 8c worth of transactions to enrich their coffers. Presumably 8c wasn't enough to convince them and they understood the long term revenue implications. And now here we are. To my knowledge at least 99% of the hashrate has now raised the limit to 100kb. This is enough for a decent sized web page. I posted a test transaction this morning: https://whatsonchain.com/tx/21a5c896f23bea81ae5018dfeb8801ddc68691d0186a7e2d8c011e65e0a396d9 and it was quickly spotted by @_unwriter who quickly turned it into this homage to the looking glass: https://alice.bitdb.network/
This is a post full of mea culpas and here I declare I fucked up again… Due to a coding error I posted UTF8 data, and most block explorers can currently only handle ASCII, so a lot of the text displayed as garbage characters. I also split this 17kb transaction into multiple 900 byte chunks. @_unwriter noticed, and so I thought we needed to demonstrate that you can actually break the PUSHDATA rules and do it in a single PUSHDATA operation in one transaction (this time the complete version of the prequel "Alice in Wonderland"): https://whatsonchain.com/tx/ef21e71d00b9fce174222e679640b09e29ac8a55f321c93e64b16cc3109959f8
Ecosystem is rocking this change
The folks at whatsonchain.com were very accommodating and updated their code within a few hours to display the output data as UTF8. I believe they have added other updates to intelligently handle other known forms of OP_RETURN data like memo.cash.
Currently not all nodes support this, and those that don’t won’t relay your transactions, but there is a groundswell of support with people making nodes OP_RETURN friendly all over the place. Support will be much more widespread in days and should be widespread enough to make the change effectively complete within weeks.
We’re about to take the red pill and see how deep the rabbit hole of on chain data storage really goes. Tokenized is ready for this, yours.org is ready for this, @_unwriter and his plethora of tools are ready for this as are others. Calvin Ayre has previously spoken of the ‘Cambrian explosion’ of creativity and I believe that unbounded-by-anything-except-fees data storage will be a key trigger for this. The unfuckening continues, one limit at a time.