Fix reconciliation algo
For months now, the reconciliation algo has been plagued by bugs
surrounding the ethereumjs/trie library. We've opened many tickets on
GitHub:

- #146
- #131
- ethereumjs/ethereumjs-monorepo#3264
- ethereumjs/ethereumjs-monorepo#3645

The pattern was always the same: the value returned by `trie.root()`
could not be found using `trie.checkRoot`, which seemed almost like a
contradiction, especially when calling `await
trie.checkRoot(trie.root())`.

We had initially introduced checkpointing of the trie because of a
rather theoretical problem: what would happen if, during the
reconciliation, the trie is updated while it is also sending level
comparisons to a peer. So for us, checkpointing was primarily a way to
implement atomicity when storing data. We wanted to store the remote
trie's leaves in batches so as not to interrupt the algorithm that
compares the trie's levels.

At the same time, the insertion of new leaves into such a trie is costly
as a big part of its hashes have to be recomputed to arrive at a new
root.

However, I think what has happened with our implementation of the
sync.put method is that the checkpointing led to the trie writes not
being processed sequentially, which in turn led to all sorts of
problems in the reconciliation.
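
The effect of non-sequential writes can be illustrated without the trie
library. The following is a minimal sketch, not the code from this
repository: `slowWrite` is a hypothetical stand-in for an asynchronous
store write with variable latency, and `createSerialQueue` is an
illustrative helper that forces writes to be applied in submission
order.

```javascript
// Hypothetical stand-in for an async store write with variable latency.
function slowWrite(log, key, delayMs) {
  return new Promise((resolve) =>
    setTimeout(() => {
      log.push(key);
      resolve(key);
    }, delayMs)
  );
}

// Illustrative helper (an assumption, not part of ethereumjs/trie):
// chains each task onto the previous one so that writes are applied
// strictly in the order they were submitted.
function createSerialQueue() {
  let tail = Promise.resolve();
  return function enqueue(task) {
    const result = tail.then(() => task());
    // Keep the chain usable even if a task rejects.
    tail = result.catch(() => {});
    return result;
  };
}

async function demo() {
  // Fired concurrently: the faster write lands first, regardless of
  // the order in which the writes were issued.
  const concurrentLog = [];
  await Promise.all([
    slowWrite(concurrentLog, "upvote", 20),
    slowWrite(concurrentLog, "comment", 1),
  ]);

  // Routed through the queue: the second write only starts after the
  // first one has finished.
  const serialLog = [];
  const enqueue = createSerialQueue();
  await Promise.all([
    enqueue(() => slowWrite(serialLog, "upvote", 20)),
    enqueue(() => slowWrite(serialLog, "comment", 1)),
  ]);

  return { concurrentLog, serialLog };
}
```

Calling `demo()` resolves to both logs, so the two orderings can be
compared side by side. The trade-off is the one described in this
message: serializing writes gives up throughput in exchange for a
predictable order.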

The reconciliation is purposefully built so that it first synchronizes
old leaves and only then new leaves. While a working reconciliation
doesn't have any issues with storing comments, a fundamentally
asynchronous reconciliation will attempt to store comments for which
the original upvote hasn't been stored yet, leading to the message not
being processed initially.
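
The ordering dependency can be sketched as follows. This is an
illustration under stated assumptions: the message shape
(`type`, `id`, `parent`) and the rejection rule in `applyMessages` are
hypothetical and only model the idea that a comment needs its upvote to
be present first.

```javascript
// Hypothetical message shape: { type, id, parent }, where a comment's
// `parent` is the id of the upvote it replies to.
function applyMessages(messages) {
  const stored = new Set();
  const rejected = [];
  for (const message of messages) {
    // Assumed rule: a comment is only accepted once its parent upvote
    // has already been stored.
    if (message.type === "comment" && !stored.has(message.parent)) {
      rejected.push(message.id);
      continue;
    }
    stored.add(message.id);
  }
  return { stored: [...stored], rejected };
}

const upvote = { type: "upvote", id: "story-1", parent: null };
const comment = { type: "comment", id: "comment-1", parent: "story-1" };

// Old leaves first: both messages are accepted.
const inOrder = applyMessages([upvote, comment]);

// Comment arrives before its upvote: the comment is rejected.
const outOfOrder = applyMessages([comment, upvote]);
```

With the same two messages, only the processing order decides whether
the comment is stored.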

Another big problem ended up being that the ethereumjs/trie library
isn't mature with regard to handling the application shutting down, and
so a lot of the above-mentioned issues actually describe the
ethereumjs/trie library reaching a non-recoverable state.

Funnily enough, however, all it took to fix all of the above problems
was to remove all notions of checkpointing and commits. While it does
make the reconciliation algorithm MUCH slower (because it is now
synchronous), it also made it much more reliable and almost free of
errors during interaction.
TimDaub committed Sep 11, 2024
1 parent 8d5ca9c commit 7b5eef2
Showing 1 changed file with 0 additions and 1 deletion.
1 change: 0 additions & 1 deletion src/store.mjs
@@ -285,7 +285,6 @@ async function atomicPut(trie, message, identity, accounts, delegations) {
// NOTE on 2024-09-10: Not sure why we're nesting try catchs here, I'm pretty
// sure this wasn't initially intended so if we end up touching this part of
// the code again, it may make sense to remove it.
let enhancedMessage;
try {
await trie.put(Buffer.from(index, "hex"), canonical);
try {