Commit 9a771ea

feat(l1): add --input_dir and --output_dir options to archive sync (#3962)
**Motivation**

Allow writing the state dump to files during archive sync, and using those files for later archive syncs without needing an active archive node.

This PR adds the following flags:

* `--ipc_path`: replaces the previously required unnamed `archive_node_ipc` argument
* `--output_dir`: writes the state data received during the sync to the given directory
* `--input_dir`: fetches state data from a previous archive sync execution instead of an archive node
* `--no_sync`: skips the state rebuild; only usable with `--output_dir`, to speed up writing state data when syncing the current node is not needed

**Description**

* Adds new CLI flags `--ipc_path`, `--input_dir`, `--output_dir`, `--no_sync`
* Abstracts archive sync main behaviour into structs to accommodate new features:
  * `DumpReader`: reads state data from either an IPC connection or a directory
  * `DumpProcessor`: processes state data by using it to rebuild the state, writing it to a file, or both
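The flag rules above can be sketched as a small validation step. This is only an illustration, not the code in this PR: `Flags`, `Source`, `Plan`, and `plan` are hypothetical names, and treating `--ipc_path` and `--input_dir` as mutually exclusive is an assumption (the description only says `--input_dir` is used instead of an archive node). The one rule taken directly from the description is that `--no_sync` requires `--output_dir`.

```rust
use std::path::PathBuf;

/// Hypothetical mirror of the new flags; not the tool's actual struct.
#[derive(Debug, Default)]
struct Flags {
    ipc_path: Option<PathBuf>,
    input_dir: Option<PathBuf>,
    output_dir: Option<PathBuf>,
    no_sync: bool,
}

/// Where state data is read from (the role the PR gives to `DumpReader`).
#[derive(Debug, PartialEq)]
enum Source {
    Ipc(PathBuf),
    Dir(PathBuf),
}

/// What is done with the data (the role the PR gives to `DumpProcessor`).
#[derive(Debug, PartialEq)]
struct Plan {
    source: Source,
    rebuild_state: bool,
    write_to: Option<PathBuf>,
}

fn plan(flags: Flags) -> Result<Plan, String> {
    let source = match (flags.ipc_path, flags.input_dir) {
        (Some(ipc), None) => Source::Ipc(ipc),
        (None, Some(dir)) => Source::Dir(dir),
        // Assumption: exactly one data source must be given.
        (Some(_), Some(_)) => {
            return Err("pass either --ipc_path or --input_dir, not both".into())
        }
        (None, None) => return Err("pass --ipc_path or --input_dir".into()),
    };
    // From the PR description: --no_sync is only usable with --output_dir.
    if flags.no_sync && flags.output_dir.is_none() {
        return Err("--no_sync requires --output_dir".into());
    }
    Ok(Plan {
        source,
        rebuild_state: !flags.no_sync,
        write_to: flags.output_dir,
    })
}
```

Calling `plan` with only `input_dir` set yields a plan that rebuilds state from the directory and writes nothing, which corresponds to the second step of the README workflow.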
1 parent e4d9716 commit 9a771ea

4 files changed: +418 -131 lines changed
crates/common/serde_utils.rs

Lines changed: 18 additions & 8 deletions
@@ -7,6 +7,24 @@ pub mod u256 {
     use ethereum_types::U256;
     use serde_json::Number;
 
+    pub mod dec_str {
+        use super::*;
+        pub fn deserialize<'de, D>(d: D) -> Result<U256, D::Error>
+        where
+            D: Deserializer<'de>,
+        {
+            let value = String::deserialize(d)?;
+            U256::from_dec_str(&value).map_err(|e| D::Error::custom(e.to_string()))
+        }
+
+        pub fn serialize<S>(value: &U256, serializer: S) -> Result<S::Ok, S::Error>
+        where
+            S: Serializer,
+        {
+            serializer.serialize_str(&value.to_string())
+        }
+    }
+
     pub fn deser_number<'de, D>(d: D) -> Result<U256, D::Error>
     where
         D: Deserializer<'de>,
@@ -33,14 +51,6 @@ pub mod u256 {
         }
     }
 
-    pub fn deser_dec_str<'de, D>(d: D) -> Result<U256, D::Error>
-    where
-        D: Deserializer<'de>,
-    {
-        let value = String::deserialize(d)?;
-        U256::from_dec_str(&value).map_err(|e| D::Error::custom(e.to_string()))
-    }
-
     pub fn deser_hex_str<'de, D>(d: D) -> Result<U256, D::Error>
     where
         D: Deserializer<'de>,

crates/networking/rpc/clients/beacon/types.rs

Lines changed: 1 addition & 1 deletion
@@ -15,7 +15,7 @@ pub struct GetBlockResponseData {
 // Actual response has many more fields, but we only care about `slot` for now
 #[derive(Deserialize, Debug)]
 pub struct GetBlockResponseMessage {
-    #[serde(deserialize_with = "ethrex_common::serde_utils::u256::deser_dec_str")]
+    #[serde(with = "ethrex_common::serde_utils::u256::dec_str")]
     pub slot: U256,
 }
 

tooling/archive_sync/README.md

Lines changed: 19 additions & 1 deletion
@@ -54,9 +54,27 @@ This executable takes 3 arguments:
 With these arguments you can run the following command from this directory:
 
 ```bash
-cargo run --release IPC_PATH BLOCK_NUMBER
+cargo run --release BLOCK_NUMBER --ipc_path IPC_PATH
 ```
 
 And adding `--datadir DATADIR` if you want to use a custom directory
 
 While archive sync is faster than the alternatives (snap, full) it can still take a long time on later blocks of large chains
+
+## Usage without an active archive node connection
+
+We can avoid relying on an active archive node connection once we have already performed the first sync by writing the state dump to a directory. Note that this will still require an active archive node for the first step.
+
+### Step 1: Run the `archive_sync` executable as usual with `--output_dir` flag
+
+```bash
+cargo run --release BLOCK_NUMBER --ipc_path IPC_PATH --output_dir STATE_DUMP_DIR
+```
+
+If we don't need the node to be synced (for example if we plan to move the state dump to another server after the sync) we can also add the flag `--no_sync` to skip the state sync and only write the state data to files.
+
+### Step 2: Run the `archive_sync` executable with `--input_dir` instead of `--ipc-path`
+
+```bash
+cargo run --release BLOCK_NUMBER --input_dir STATE_DUMP_DIR
+```
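The two README steps amount to persisting each piece of state data as it arrives and replaying the pieces later in the same order. The sketch below illustrates that idea with the standard library only. The on-disk layout is not shown on this page, so the one-file-per-chunk scheme, the `chunk_NNNNNN.bin` file names, and the functions `chunk_path`, `write_dump`, and `read_dump` are all hypothetical.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical naming scheme: one file per chunk, zero-padded so that
/// file names sort in chunk order.
fn chunk_path(dir: &Path, index: usize) -> PathBuf {
    dir.join(format!("chunk_{index:06}.bin"))
}

/// Step 1 analogue (`--output_dir`): persist each received chunk to `dir`.
fn write_dump(dir: &Path, chunks: &[Vec<u8>]) -> io::Result<()> {
    fs::create_dir_all(dir)?;
    for (index, chunk) in chunks.iter().enumerate() {
        fs::write(chunk_path(dir, index), chunk)?;
    }
    Ok(())
}

/// Step 2 analogue (`--input_dir`): read chunks back in order, stopping at
/// the first index that has no file.
fn read_dump(dir: &Path) -> io::Result<Vec<Vec<u8>>> {
    let mut chunks = Vec::new();
    loop {
        let path = chunk_path(dir, chunks.len());
        if !path.exists() {
            return Ok(chunks);
        }
        chunks.push(fs::read(path)?);
    }
}
```

Because reading only needs the directory, the dump written in the first step can be copied to another machine and replayed there, which is the scenario the `--no_sync` flag is described for.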
