
According to the Iceberg specification, an equality delete file must be applied to a data file when all of the following are true:

1. The data file's data sequence number is strictly less than the delete's data sequence number
2. The data file's partition (both spec id and partition values) is equal to the delete file's partition or the delete file's partition spec is unpartitioned

So we should also add the partition fields to the join key, to ensure that each delete file is applied only to data files located in the same partition.
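
To make the two conditions above concrete, here is a minimal Python sketch of the applicability check. `FileMeta`, its fields, and `equality_delete_applies` are hypothetical names made up for illustration; they are not part of any Iceberg API, and real manifest entries carry much more metadata.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class FileMeta:
    """Hypothetical, simplified stand-in for a manifest entry (not an Iceberg class)."""
    data_sequence_number: int
    spec_id: int
    partition: Tuple = ()                # partition values, () when unpartitioned
    spec_is_unpartitioned: bool = False  # True when the file's partition spec has no fields


def equality_delete_applies(data_file: FileMeta, delete_file: FileMeta) -> bool:
    """Return True when the equality delete file must be applied to the data file."""
    # Condition 1: the data file is strictly older than the delete file.
    is_older = data_file.data_sequence_number < delete_file.data_sequence_number
    # Condition 2: same partition (spec id and values), or the delete is unpartitioned.
    same_partition = (
        data_file.spec_id == delete_file.spec_id
        and data_file.partition == delete_file.partition
    )
    return is_older and (same_partition or delete_file.spec_is_unpartitioned)


data = FileMeta(data_sequence_number=3, spec_id=1, partition=("2024-01-01",))
same_part_delete = FileMeta(data_sequence_number=5, spec_id=1, partition=("2024-01-01",))
other_part_delete = FileMeta(data_sequence_number=5, spec_id=1, partition=("2024-01-02",))

print(equality_delete_applies(data, same_part_delete))   # same partition, newer delete
print(equality_delete_applies(data, other_part_delete))  # different partition
```

If the join key contains only the equality columns, the second case would wrongly delete rows from a data file in another partition, which is why the partition fields belong in the key.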

See the reference here. Hope this helps.

Answer selected by squalud