Commit ce0d464

fmoletta and Oppen authored
fix(l1): resume sync upon node restart (#2415)
**Motivation**

After PR #2303, sync cycles are restarted automatically by the same process that started the sync, which removes the notion of "pending syncs". This works until the node is restarted mid-sync: nothing checks whether a sync was in progress before the restart, so the node keeps operating, unaware, on the invalid state left by the unfinished sync. This PR fixes that by resuming the sync process as soon as the sync manager is created.

**Description**

* When creating the `SyncManager`, check whether checkpoints from an active sync process are left over, and start the next sync cycle if needed.
* In `SyncManager::start_sync`, check that the latest FCU head is not the default before starting a cycle, and wait for the next FCU update if needed.

Closes #issue_number

---------

Co-authored-by: Mario Rugiero <mrugiero@gmail.com>
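
The startup check described above can be sketched in isolation. This is a minimal, std-only illustration and not the ethrex code: `CheckpointStore`, `MemoryStore`, `Hash`, and `should_resume_sync` are hypothetical names, and the `StoreError` here is a local stand-in for the real error type. Only the `get_header_download_checkpoint().is_ok_and(|res| res.is_some())` pattern comes from the commit.

```rust
/// Local stand-in for the store's error type (assumption, not the real `StoreError`).
#[derive(Debug)]
pub struct StoreError;

/// Hypothetical 32-byte hash standing in for `H256`.
pub type Hash = [u8; 32];

/// The only part of the store that the resume check needs (hypothetical trait).
pub trait CheckpointStore {
    fn get_header_download_checkpoint(&self) -> Result<Option<Hash>, StoreError>;
}

/// Returns true only if the store could be read AND it holds a checkpoint.
/// A read error counts as "nothing to resume", which is what `is_ok_and` gives us.
pub fn should_resume_sync<S: CheckpointStore>(store: &S) -> bool {
    store
        .get_header_download_checkpoint()
        .is_ok_and(|checkpoint| checkpoint.is_some())
}

/// In-memory store used purely for illustration.
pub struct MemoryStore {
    pub checkpoint: Option<Hash>,
    pub fail_reads: bool,
}

impl CheckpointStore for MemoryStore {
    fn get_header_download_checkpoint(&self) -> Result<Option<Hash>, StoreError> {
        if self.fail_reads {
            Err(StoreError)
        } else {
            Ok(self.checkpoint)
        }
    }
}
```

One consequence of this pattern worth noting: a failed checkpoint read is silently treated the same as a fresh node, so an interrupted sync would not be resumed if the store is unreadable at startup.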
1 parent f3e432b commit ce0d464

1 file changed, +22 -4 lines changed


crates/networking/p2p/sync_manager.rs

Lines changed: 22 additions & 4 deletions
@@ -6,9 +6,12 @@ use std::sync::{
 use ethrex_blockchain::Blockchain;
 use ethrex_common::H256;
 use ethrex_storage::{error::StoreError, Store};
-use tokio::sync::Mutex;
+use tokio::{
+    sync::Mutex,
+    time::{sleep, Duration},
+};
 use tokio_util::sync::CancellationToken;
-use tracing::warn;
+use tracing::{info, warn};
 
 use crate::{
     kademlia::KademliaTable,
@@ -46,12 +49,21 @@ impl SyncManager {
             cancel_token,
             blockchain,
         )));
-        Self {
+        let sync_manager = Self {
             snap_enabled,
             syncer,
             last_fcu_head: Arc::new(Mutex::new(H256::zero())),
-            store,
+            store: store.clone(),
+        };
+        // If the node was in the middle of a sync and then re-started we must resume syncing
+        // Otherwise we will incorrectly assume the node is already synced and work on invalid state
+        if store
+            .get_header_download_checkpoint()
+            .is_ok_and(|res| res.is_some())
+        {
+            sync_manager.start_sync();
         }
+        sync_manager
     }
 
     /// Creates a dummy SyncManager for tests where syncing is not needed
@@ -109,6 +121,12 @@ impl SyncManager {
                     };
                     *sync_head
                 };
+                // Edge case: If we are resuming a sync process after a node restart, wait until the next fcu to start
+                if sync_head.is_zero() {
+                    info!("Resuming sync after node restart, waiting for next FCU");
+                    sleep(Duration::from_secs(5)).await;
+                    continue;
+                }
                 // Start the sync cycle
                 syncer
                     .start_sync(current_head, sync_head, store.clone())
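
The wait-for-FCU edge case in the last hunk can be sketched on its own as a polling loop. This is a simplified, blocking illustration under stated assumptions, not the code from the commit: it uses `std::sync::Mutex` and `std::thread::sleep` instead of the tokio equivalents, `Hash`, `is_zero`, and `wait_for_fcu_head` are hypothetical names, and the `max_polls` bound is an addition for testability (the loop in the diff simply sleeps 5 seconds and retries).

```rust
use std::sync::Mutex;
use std::thread;
use std::time::Duration;

/// Hypothetical 32-byte hash standing in for `H256`.
pub type Hash = [u8; 32];

/// True when the hash is still the default (all zeros), i.e. no FCU has arrived yet.
pub fn is_zero(hash: &Hash) -> bool {
    hash.iter().all(|byte| *byte == 0)
}

/// Polls the shared head until it is non-zero, sleeping between attempts.
/// Returns `None` if no head was set within `max_polls` attempts
/// (the bound is an assumption added for this sketch).
pub fn wait_for_fcu_head(
    last_fcu_head: &Mutex<Hash>,
    poll_interval: Duration,
    max_polls: usize,
) -> Option<Hash> {
    for _ in 0..max_polls {
        // Copy the value out so the lock is released before sleeping.
        let head = *last_fcu_head.lock().unwrap();
        if !is_zero(&head) {
            return Some(head);
        }
        thread::sleep(poll_interval);
    }
    None
}
```

Copying the head out of the guard before sleeping matters here: holding the lock across the sleep would block the writer that delivers the next FCU head.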
