** Update **
So for the last few re-releases the common theme has been:
- No signs of a problem with ABN weight, GSC client-side transactions, or creation of the POG points
- No signs of problems with actual POBH heat mining (i.e. no forks resulting from POW)
- No signs of problems on the *monthly* superblock budget system or its creation of budget data
- Every re-release and enhancement so far has resulted from a problem related to the Daily superblock contract
(not a problem with the actual contract, but a problem with the sancs not *finding* the contract: some find it, some lose it)
So I've been trying to home in on this for the last 2 days. At first I believed that running only 3 sancs meant that one sanc out of 3 was not syncing properly, so we added rules to strengthen the minimum voting requirements for a superblock to pass. Yesterday, however, all 3 sancs could see the votes on the superblock (we had 4), but only 2 of them had the actual gobject in memory (that is obviously a problem).
Let me explain this a different way, so we can all understand how a soft consensus can still fork the chain in testnet.
Sanc #1 and Sanc #3 had the gobject in memory (this is the GSC contract with its payments and addresses). Sancs 1, 2, and 3 all had the *vote* for the gobject in memory, and they all agreed the superblock was good and should pass.
Here is what happened on block 6990:
Sanc #1 found both the gobject and the trigger votes, and it said: Valid, accepted
Sanc #2 found the votes only, and said "valid superblock at this height" but actually rejected the block because: IsBlockValueValid, too few superblock payments <-- this is because it was missing the gobject with the hex data
So let's see what happens: Sanc #2 goes off on its own chain now, while Sanc 1 and 3 continue on the chain with the higher work. This actually splits the testnet network, because the sanctuaries now vote on separate chains (which causes mnpayments to decrease); i.e. because our mnpayments rely on chain heights, this becomes a hard split. In a small testnet network we can't have this, so we have to fix the root of the problem.
This ends up causing a hard fork, because #2 will never correct its view of this block until it is completely resynced, as it has now marked the block as failed.
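To make the failure mode concrete, here is a minimal toy model of the block 6990 split. It is not BiblePay's actual validation code: the `Sanc` class and `check_superblock` function are hypothetical names made up for illustration, and Sanc #3 is assumed to hold both the votes and the gobject, since it stayed on the same chain as Sanc #1.

```python
from dataclasses import dataclass


@dataclass
class Sanc:
    name: str
    has_trigger_votes: bool  # votes saying a superblock is due at this height
    has_gobject: bool        # the GSC contract itself (payments and addresses)


def check_superblock(sanc):
    """Return (accepted, reason) for a block at a superblock height.

    Toy rule: votes only tell the sanc that a superblock belongs here;
    the payment list it checks the block against comes from the gobject.
    """
    if not sanc.has_trigger_votes:
        return (False, "no superblock expected at this height")
    if not sanc.has_gobject:
        # Knows a superblock is due, but cannot account for its payments.
        return (False, "too few superblock payments")
    return (True, "valid, accepted")


# State of the three sancs as described for block 6990
# (Sanc #3's state is an assumption, see the lead-in).
sancs = [
    Sanc("sanc1", has_trigger_votes=True, has_gobject=True),
    Sanc("sanc2", has_trigger_votes=True, has_gobject=False),
    Sanc("sanc3", has_trigger_votes=True, has_gobject=True),
]

verdicts = {s.name: check_superblock(s) for s in sancs}
for name, (accepted, reason) in verdicts.items():
    print(name, "accepts" if accepted else "rejects", "-", reason)
```

All three sancs agree on the vote, yet one of them still rejects the block, which is why agreement on the votes alone is not enough to keep the chain together.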
First, let's stop testing again until I can release a plan to fix this.
(It does not appear to be watchman).
It appears to be more of a timing issue. In Dash, the gobjects are synced over a longer time span (as we normally have a month to get all the data for the monthly votes), but in our case we are trying to sync (with 2 minute blocks) all the superblock data within 190 blocks (i.e. < 3 hours). So I can see that if one sanc misses a gobject, it's going to make a bad call on the next superblock.
So, my plan next is this:
1. I will investigate all code that syncs gobjects, see if we are lacking something (i.e. timing issues), and beef this up
2. I will do something special to allow us to monitor the next daily GSC superblock, like a command-line command that will alert us that a node is going to make a bad call on the next block
3. We will make the sanc go out of its way to manually pull the data it needs to make a good call on the block
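The monitoring idea in steps 2 and 3 could look roughly like the sketch below. Everything here is hypothetical: `SancView`, `readiness_report`, `sancs_at_risk`, and the `min_votes` threshold are illustrative names and values, not an existing BiblePay command or RPC. The point is only that a sanc is "ready" for the next superblock when it holds enough trigger votes *and* the gobject itself.

```python
from dataclasses import dataclass


@dataclass
class SancView:
    name: str
    trigger_votes: int  # votes this sanc holds for the next superblock
    has_gobject: bool   # whether the GSC contract gobject is in memory


def readiness_report(views, min_votes):
    """Map each sanc name to a list of problems (empty list means ready)."""
    report = {}
    for view in views:
        problems = []
        if view.trigger_votes < min_votes:
            problems.append(
                f"only {view.trigger_votes} of {min_votes} required votes"
            )
        if not view.has_gobject:
            problems.append("missing gobject, should pull it from peers")
        report[view.name] = problems
    return report


def sancs_at_risk(report):
    """Names of sancs likely to make a bad call on the next superblock."""
    return [name for name, problems in report.items() if problems]


# Example modeled on yesterday's state: every sanc saw the 4 votes,
# but one sanc did not have the gobject. min_votes=2 is an assumed value.
views = [
    SancView("sanc1", trigger_votes=4, has_gobject=True),
    SancView("sanc2", trigger_votes=4, has_gobject=False),
    SancView("sanc3", trigger_votes=4, has_gobject=True),
]
report = readiness_report(views, min_votes=2)
for name in sancs_at_risk(report):
    print(f"WARNING {name}: " + "; ".join(report[name]))
```

Run a few blocks ahead of each daily superblock height, a check like this would flag the sanc that needs to pull the missing gobject before it marks the block as failed.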
Then we regroup here, do an in-depth analysis of the actual superblock heights as they are about to tick by, and post on the forum what the state of each sanc is as they occur.
Please hang on for the next version. It might take a day, or it might take less, depending on how complex the timing issue is.