Aurelie Herbelot edited this page Jan 23, 2021 · 2 revisions

Benchmarking information

Processing .wet files

For reference, sequentially processing a single Common Crawl file on one core takes around 20 minutes on a not-too-old laptop. Expect an output file of roughly 150 MB uncompressed.

On the Raspberry Pi 4, processing a single file takes around 27 minutes. This figure includes a 10 ms sleep after each document to spare the CPU, which then runs at around 80% capacity.
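
The throttling described above can be sketched as follows. This is a minimal illustration, not the project's actual code: the function name `process_throttled`, the `handle` callback, and the `pause_s` parameter are all hypothetical, and the documents are assumed to be already extracted from the .wet file as strings.

```python
import time
from typing import Callable, Iterable, Tuple


def process_throttled(docs: Iterable[str],
                      handle: Callable[[str], None],
                      pause_s: float = 0.010) -> Tuple[int, float]:
    """Process documents one at a time, sleeping pause_s seconds after each.

    Returns the number of documents handled and the elapsed wall-clock time.
    """
    start = time.perf_counter()
    count = 0
    for doc in docs:
        handle(doc)          # hypothetical per-document processing step
        time.sleep(pause_s)  # short pause so the CPU is not pinned at 100%
        count += 1
    return count, time.perf_counter() - start


if __name__ == "__main__":
    seen = []
    n, elapsed = process_throttled(["doc one", "doc two", "doc three"], seen.append)
    print(f"processed {n} documents in {elapsed:.3f}s")
```

The pause adds a fixed cost per document, so the total overhead grows with the number of documents in the file. Setting `pause_s` to 0 disables the throttle, which may be preferable on a machine with adequate cooling.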
