Commit e2cd0e0

chore: Merge pull request #7 from commoncrawl/dev
chore: Merges fixes, features and refactors for version 0.5.2

Fixes issue #6 (feat: Add a User Agent to cc-downloader)

- Introduces refactors so that linter check are all passed
- Introduces a rust workflow for ensuring that the code compiles and test are passed in the dev and main branches
- Introduces changes to the contributing policy so that PRs are merged to the dev branch
- Introduces slight updates to the documentation

2 parents ac7af85 + 35db859 commit e2cd0e0

7 files changed: +203 -95 lines changed

.github/workflows/rust.yml

Lines changed: 25 additions & 0 deletions
@@ -0,0 +1,25 @@
+name: Rust
+
+on:
+  push:
+    branches: [ "main", "dev" ]
+  pull_request:
+    branches: [ "main", "dev" ]
+
+env:
+  CARGO_TERM_COLOR: always
+
+jobs:
+  build:
+
+    runs-on: ubuntu-latest
+
+    steps:
+    - uses: actions/checkout@v4
+    - name: Build
+      run: cargo build --verbose
+    - name: Run tests
+      run: cargo test --verbose
+    - name: Run clippy
+      run: cargo clippy --verbose
+

CONTRIBUTING.md

Lines changed: 8 additions & 6 deletions
@@ -1,4 +1,5 @@
 # How to contribute to cc-downloader?
+
 [![Contributor Covenant](https://img.shields.io/badge/Contributor%20Covenant-2.0-4baaaa.svg)](CODE_OF_CONDUCT.md)

 `cc-downloader` is an open source project, so all contributions and suggestions are welcome.
@@ -12,6 +13,7 @@ In order to facilitate healthy, constructive behavior in an open and inclusive c
 our [code of conduct](CODE_OF_CONDUCT.md).

 ## How to work on an open Issue?
+
 You have the list of open Issues at: [https://github.com/commoncrawl/cc-downloader/issues](https://github.com/commoncrawl/cc-downloader/issues)

 Some of them may have the label `help wanted`: that means that any contributor is welcomed!
@@ -36,13 +38,14 @@ If you would like to work on any of the open Issues:
 git remote add upstream git@github.com:commoncrawl/cc-downloader.git
 ```

-3. Create a new branch to hold your development changes:
+3. Switch to the `dev` branch and then create a new branch to hold your development changes:

 ```bash
+git checkout dev
 git checkout -b a-descriptive-name-for-my-changes
 ```

-**do not** work on the `main` branch.
+**do not** work on the `main` or `dev` branches.

 4. Develop the features on your branch.

@@ -58,17 +61,16 @@ If you would like to work on any of the open Issues:

 ```bash
 git fetch upstream
-git rebase upstream/main
+git rebase upstream/dev
 ```

-9. Once you are satisfied, push the changes to your fork repo using:
+6. Once you are satisfied, push the changes to your fork repo using:

 ```bash
 git push -u origin a-descriptive-name-for-my-changes
 ```

-Go the webpage of your fork on GitHub. Click on "Pull request" to send your to the project maintainers for review.
-
+Go the webpage of your fork on GitHub. Click on "Pull request" to send your to the project maintainers for review, and select the `dev` branch as the brach you'd like to merge your changes into.

 Thank you for your contribution!

Cargo.toml

Lines changed: 13 additions & 5 deletions
@@ -1,6 +1,6 @@
 [package]
 name = "cc-downloader"
-version = "0.5.1"
+version = "0.5.2"
 edition = "2021"
 authors = ["Pedro Ortiz Suarez <pedro@commoncrawl.org>"]
 description = "A polite and user-friendly downloader for Common Crawl data."
@@ -12,16 +12,24 @@ repository = "https://github.com/commoncrawl/cc-downloader"
 documentation = "https://docs.rs/cc-downloader"

 [dependencies]
-clap = { version = "4.5.23", features = ["derive"] }
+clap = { version = "4.5.29", features = ["derive"] }
 flate2 = "1.0.35"
 futures = "0.3.31"
-indicatif = "0.17.9"
-reqwest = { version = "0.12.9", default-features = false, features = [
+indicatif = "0.17.11"
+reqwest = { version = "0.12.12", default-features = false, features = [
     "stream",
     "rustls-tls",
 ] }
 reqwest-middleware = "0.4.0"
 reqwest-retry = "0.7.0"
-tokio = { version = "1.42.0", features = ["full"] }
+tokio = { version = "1.43.0", features = ["full"] }
 tokio-util = { version = "0.7.13", features = ["compat"] }
 url = "2.5.4"
+
+[dev-dependencies]
+serde = { version = "1.0.217", features = ["derive"] }
+reqwest = { version = "0.12.12", default-features = false, features = [
+    "stream",
+    "rustls-tls",
+    "json",
+] }
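
The commit message ties this release to issue #6 ("Add a User Agent to cc-downloader"), but the source change for it is not among the files shown here. As a rough illustration only, the sketch below shows one common way to assemble a User-Agent string from the package name and version declared in this manifest. The `user_agent` helper, the `name/version (+url)` layout, and the choice of contact URL are all assumptions made for this example, not the project's actual implementation.

```rust
/// Hypothetical helper (not taken from cc-downloader): builds a
/// User-Agent string in the conventional `name/version (+url)` layout.
fn user_agent(name: &str, version: &str, contact_url: &str) -> String {
    format!("{name}/{version} (+{contact_url})")
}

fn main() {
    // Name, version and repository URL are the values from Cargo.toml above.
    let ua = user_agent(
        "cc-downloader",
        "0.5.2",
        "https://github.com/commoncrawl/cc-downloader",
    );
    // prints "cc-downloader/0.5.2 (+https://github.com/commoncrawl/cc-downloader)"
    println!("{ua}");
}
```

Inside a Cargo build, the literals could instead come from `env!("CARGO_PKG_NAME")` and `env!("CARGO_PKG_VERSION")`, so the string would follow the `version` field on each bump. The resulting value could then be passed to `reqwest::ClientBuilder::user_agent`. Whether cc-downloader does either of these is not visible in this diff.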

README.md

Lines changed: 1 addition & 1 deletion
@@ -5,7 +5,7 @@ This is an experimental polite downloader for Common Crawl data writter in `rust
 ## Todo

 - [ ] Add Python bindings
-- [ ] Add tests
+- [ ] Add more tests
 - [ ] Handle unrecoverable errors

 ## Installation

SECURITY.md

Lines changed: 1 addition & 1 deletion
@@ -11,4 +11,4 @@ Only the latest minor version is being supported

 ## Reporting a Vulnerability

-To report a security vulnerability, please contact: info@commoncrawl.org
+To report a security vulnerability, please contact: info[at]commoncrawl[dot]org
