Collaborator

@2color 2color commented May 19, 2025

What

This adds a new option to delete existing uploads in the Storacha space that the storacha-proof grants access to.

Why

Storacha doesn't deduplicate across uploads, so every upload uses space for the full size of its CAR. If each build is large, this quickly adds up to a lot of space.

Open question

  • Since we use the same proof/DID for uploads, the delegation may need additional permissions when it is created. The README.md currently recommends: w3 delegation create did:key:DID_OF_KEY -c space/blob/add -c space/index/add -c filecoin/offer -c upload/add --base64
    • We need to add space/blob/remove and upload/remove to make deletion possible.
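
A rough sketch of what the extended delegation command might look like, assuming the two extra capabilities are passed as additional -c flags in the same way as the existing ones. The helper name delegation_cmd is hypothetical, and did:key:DID_OF_KEY is the README's placeholder. The function only prints the command and does not run it:

```shell
#!/bin/sh
# Hypothetical helper: prints (does not execute) a delegation command that
# includes the remove capabilities alongside the ones the README lists.
delegation_cmd() {
  did="$1"
  set --   # reuse the positional parameters as a list of flags
  for cap in space/blob/add space/index/add filecoin/offer upload/add \
             space/blob/remove upload/remove; do
    set -- "$@" -c "$cap"
  done
  echo "w3 delegation create $did $* --base64"
}

delegation_cmd "did:key:DID_OF_KEY"
```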

@2color 2color requested review from lidel and Copilot May 19, 2025 08:50

@Copilot Copilot AI left a comment

Pull Request Overview

This PR adds an option to delete old uploads from Storacha before new builds are uploaded. Key changes include:

  • Introducing a new input "storacha-delete-old-builds" in action.yml.
  • Adding a new step in action.yml to delete old builds using the Storacha CLI.
  • Updating the README to document the new input.

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

File         Description
action.yml   Added a new input and step to delete old builds from Storacha
README.md    Updated input table with the new "storacha-delete-old-builds" option

@2color
Collaborator Author

2color commented May 26, 2025

I just got confirmation from Storacha that the space/blob/remove and upload/remove capabilities are needed for deletions.

Comment on lines +58 to +59
storacha-delete-old-builds:
  description: 'Delete old builds from Storacha before uploading new one. Use "true" or "false" (as strings). Be careful with this option as it will delete all uploads to the space, not just the ones from this build.'

Is there some key/tag with storacha we could use to ensure only certain items are removed?

Collaborator Author

I don't think Storacha has an API or option to filter entries, but I suppose we could do client-side filtering on the ls results.
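
One possible shape for that client-side filtering, as a sketch only: it assumes the listing is one root CID per line (as the loop in this PR already does) and that the CIDs to protect are kept in a separate file. Both the keep file and the filter_removable name are hypothetical and not part of this PR:

```shell
#!/bin/sh
# Hypothetical helper: reads CIDs on stdin (one per line) and prints only
# those that are NOT listed in the keep file passed as $1.
# -F: fixed strings, -x: whole-line match, -v: invert, -f: patterns from file.
filter_removable() {
  grep -vxF -f "$1" || true   # grep exits 1 when nothing is printed; ignore
}
```

It would sit between the listing and the removal, for example: w3 ls | filter_removable keep.txt | while read -r cid; do w3 rm "$cid"; done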

Contributor

We should replace it with something that "keeps the N latest items" or "purges items older than T".
See #36 (comment)

Comment on lines +190 to +193
w3 ls | while read -r cid; do
  if [[ -n "$cid" ]]; then
    echo "Removing $cid"
    w3 rm "$cid"
Contributor

@lidel lidel Aug 24, 2025

Problem

I worry this will lead to race conditions in the default state, where people use our action without going the extra mile to set up concurrency limits in their workflow.

Running on multiple PRs during a busy day could lead to a state where a recently published preview on one PR is deleted by a build from another PR.

Fix

I think if we are able to keep the last N CIDs, or only delete uploads "older than a week", then we could remove very old ones without risking PRs rug-pulling each other by removing recently created ones.
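
A minimal sketch of the "older than" rule, assuming each upload can be paired with an ISO-8601 UTC creation timestamp (which sorts correctly as plain text). Where those timestamps come from is left open here, and select_older_than is a hypothetical name:

```shell
#!/bin/sh
# Hypothetical helper: reads "<ISO-8601 UTC timestamp> <cid>" lines on stdin
# and prints the CIDs whose timestamp is strictly before the cutoff in $1.
# Appending "" forces a string comparison; LC_ALL=C keeps it byte-wise.
select_older_than() {
  LC_ALL=C awk -v cutoff="$1" '($1 "") < (cutoff "") { print $2 }'
}
```

Only the CIDs printed by this filter would be passed on to w3 rm, so a preview published minutes ago by another PR would be left alone.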

Missing timestamps in w3?

@alanshaw perhaps I'm looking at the sources of https://github.com/storacha/w3cli / https://github.com/storacha/upload-service with sleepy eyes, but the API seems to return timestamps (insertedAt, updatedAt) that the w3 CLI then filters out, even when I pass w3 ls --json (lines 172-175 in lib.js explicitly map only { root, shards }).

Could pin creation timestamps be exposed, or am I missing something?

We need them to

  • identify "old" uploads - timestamps exist in the API but aren't accessible via CLI
  • implement "keep last N by age" - no way to sort by time
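
If timestamps were exposed, "keep last N" could be a small sort-and-skip filter. This is a sketch under that assumption only: the input format ("<ISO-8601 UTC timestamp> <cid>" per line) and the name select_beyond_newest are hypothetical, not something the CLI provides today:

```shell
#!/bin/sh
# Hypothetical helper: reads "<ISO-8601 UTC timestamp> <cid>" lines on stdin,
# sorts them newest first, and prints the CIDs that fall outside the newest
# $1 entries (i.e. the candidates for removal).
select_beyond_newest() {
  LC_ALL=C sort -r | awk -v keep="$1" 'NR > keep { print $2 }'
}
```

With three uploads and N set to 2, only the oldest CID would be printed.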

3 participants