# A simple doublespend-relay mining probability simulator

To explore one very specific question, whether widespread double-spend (DS) relay makes miners more likely to mine the double spend, I wrote a crude simulator. It does not account for much of the variation in a real network, but I think it's a place to start.

Apologies for writing in Matlab, but this is the only thing I'm good enough at to cook up in an afternoon. It *should* run on Octave, but let me know if it doesn't.

The preliminary results might surprise you! Whether relaying immediately or with "trickle" logic, I found that DS relay does not seem to increase the risk of tx2 being mined over tx1. It might even reduce that risk a tiny bit, though the margin is too thin to be significant.

I encourage people to give me ideas on how to improve this simulator and to try running it themselves. One glaring problem is that inventory states are not communicated between the two ends of a connection, but I think it's good enough for this purpose. Who knows, a keen observer might notice something in my code that renders the results dubious, and I welcome that. I figured we should do more data analysis instead of yelling at each other on Reddit and in endless GitHub threads.

**Comments**

Since nodes are processed sequentially, node 1 has an advantage even with no delay, because it is always processed before node 2 (the source of the double spend). The size of that advantage depends on the ratio of connections to nodes: the more connections, or the fewer nodes, the closer the result gets to 100%.

    [0.57, 0.05] = ds_simulator(100, 100, 5, 5, 0, false, true)
    [0.62, 0.12] = ds_simulator(100, 100, 5, 20, 0, false, true)
    [0.74, 0.07] = ds_simulator(100, 100, 5, 40, 0, false, true)
    [0.87, 0.04] = ds_simulator(100, 100, 5, 60, 0, false, true)
    [0.96, 0.01] = ds_simulator(100, 100, 5, 80, 0, false, true)
    [0.99, 0.00] = ds_simulator(100, 100, 5, 99, 0, false, true)

This bias is clearest when all nodes are miners and fully connected (last line), where the expected result is around 50%. Simply permuting the order of the nodes in each propagation round gives completely different results, more in line with intuition. Each loop ends with either tx1 or tx2 on all nodes but one, so the standard deviation is high.

[0.60, 0.54] = ds_simulator_perm(100, 100, 5, 99, 0, false, true)
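The ordering effect can be reproduced with a much smaller toy model. The sketch below is not the original `ds_simulator`. It assumes a complete graph with no delay, where node 1 starts with tx1, node 2 starts with tx2, and a processed node pushes its transaction to every peer that has not seen one yet. All variable names are illustrative.

```matlab
% Toy model only (an assumption, not the original simulator):
% complete graph, no delay, node 1 holds tx1, node 2 holds tx2.
n = 100;        % nodes, each connected to the other 99
loops = 1000;   % independent trials
frac_seq  = zeros(1, loops);   % share of nodes on tx1, fixed order 1..n
frac_perm = zeros(1, loops);   % share of nodes on tx1, random order

for t = 1:loops
  for mode = 1:2
    if mode == 1
      order = 1:n;           % sequential processing
    else
      order = randperm(n);   % processing order shuffled per trial
    end
    state = zeros(1, n);     % 0 = nothing seen, 1 = tx1 first, 2 = tx2 first
    state(1) = 1;
    state(2) = 2;
    for k = order
      if state(k) ~= 0
        % Node k relays to every peer that has not seen a transaction.
        state(state == 0) = state(k);
      end
    end
    if mode == 1
      frac_seq(t) = mean(state == 1);
    else
      frac_perm(t) = mean(state == 1);
    end
  end
end

fprintf('sequential: [%.2f, %.2f]\n', mean(frac_seq), std(frac_seq));
fprintf('permuted:   [%.2f, %.2f]\n', mean(frac_perm), std(frac_perm));
```

With a fixed order every trial ends at 0.99 with zero deviation, because node 1 always relays first. With a random order each trial ends at either 0.99 or 0.01, depending on which of the two sources is processed first, so the mean lands near 0.5 with a standard deviation near 0.49.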

Other simulations...

    [0.40, 0.14] = ds_simulator_perm(100, 5, 5, 5, 0, false, true)
    [0.48, 0.18] = ds_simulator_perm(100, 5, 5, 5, 0, false, false)
    [0.88, 0.18] = ds_simulator_perm(100, 5, 5, 5, 1, false, true)
    [0.96, 0.09] = ds_simulator_perm(100, 5, 5, 5, 1, false, false)

___
Note 1: This model assumes that all miners have the same hash rate, so the percentage of miners seeing tx2 first represents the probability of a successful double spend. I believe no simulation should use more than 5 miners.
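To make the equal-hash-rate assumption in Note 1 concrete, here is a minimal sketch with made-up numbers. The outcome vector and the skewed hash shares are hypothetical and do not come from any simulator run. It shows that the fraction of miners seeing tx2 first equals the double-spend probability only when the shares are equal.

```matlab
% Hypothetical outcome for 5 miners: true means the miner saw tx2 first.
saw_tx2_first = [false true false false true];

% Equal hash rate: the probability is just the fraction of miners on tx2.
equal_share = ones(1, 5) / 5;
p_ds_equal  = sum(equal_share(saw_tx2_first));   % same as mean(saw_tx2_first)

% Unequal hash rate (illustrative values): weight each miner by its share.
skewed_share = [0.5 0.2 0.1 0.1 0.1];
p_ds_skewed  = sum(skewed_share(saw_tx2_first));

fprintf('equal shares:  %.2f\n', p_ds_equal);
fprintf('skewed shares: %.2f\n', p_ds_skewed);
```

In this example two of five miners see tx2 first, which gives 0.40 under equal shares but 0.30 under the skewed shares.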

Note 2: In Octave there seems to be no unique(A, 'stable'). You can replace it with sunique(A) as defined below.

    function output = sunique(input)
      [~, i, ~] = unique(input, 'first');
      output = input(sort(i));
    end
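As a quick check of what this replacement does, the same two steps can be run inline on a small vector. The input values here are arbitrary.

```matlab
% Order-preserving unique, written inline with the same two steps as sunique.
A = [3 1 3 2 1];
[~, i] = unique(A, 'first');   % index of the first occurrence of each value
B = A(sort(i));                % restore the original order of appearance

disp(B)   % displays 3 1 2
```

Plain `unique(A)` would return the values sorted as `1 2 3`. Sorting the first-occurrence indices instead keeps each value at the position where it first appeared.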

@Cláudio Gil right, there can probably be some shuffling of connections; with a very high number of nodes and relatively few connections (the 2000/8 that I used) it doesn't matter very much, though.

If we want to simulate a 100/99 ("complete graph") scenario, we'll probably need a mechanism to shuffle the connections among nodes... which is not conceptually hard, but not on my urgent to-do list yet.
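One possible way to shuffle connections is sketched below. It assumes each node simply picks a fixed number of distinct random peers, and it says nothing about how the actual simulator stores its topology. The names `peers` and `conns` are hypothetical.

```matlab
% Sketch only: give every node `conns` distinct random peers, never itself.
n = 100;      % nodes
conns = 8;    % connections per node (assumed outbound only)
peers = zeros(n, conns);

for k = 1:n
  others = [1:k-1, k+1:n];              % every node except k
  shuffled = others(randperm(n - 1));   % random order of candidate peers
  peers(k, :) = shuffled(1:conns);      % keep the first `conns` of them
end
```

Setting `conns = n - 1` gives every node all other nodes as peers, in a random order, which is the complete-graph case.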