-
Notifications
You must be signed in to change notification settings - Fork 5.5k
[parquet] Support us precision Time type in batch reader #23542
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[parquet] Support us precision Time type in batch reader #23542
Conversation
2447d11
to
cf488ca
Compare
Nit suggestion for release note entry:
|
72a0341
to
2597100
Compare
presto-parquet/src/main/java/com/facebook/presto/parquet/batchreader/decoders/Decoders.java
Outdated
Show resolved
Hide resolved
presto-iceberg/src/test/java/com/facebook/presto/iceberg/IcebergDistributedTestBase.java
Outdated
Show resolved
Hide resolved
2597100
to
4720f2f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the fix. Lgtm, one little nit.
...ebook/presto/parquet/batchreader/decoders/rle/Int64TimeMicrosRLEDictionaryValuesDecoder.java
Outdated
Show resolved
Hide resolved
9eccb5c
to
a937f57
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ZacBlanco I see the Time decoders are exactly the same with existing Timestamp ones. There's no point to duplicate so much code. Maybe we can rename the existing Timestamp decoders like Int64TimestampMicrosRLEDictionaryValuesDecoder to Int64TimeAndTimestampMicrosRLEDictionaryValuesDecoder?
presto-iceberg/src/test/java/com/facebook/presto/iceberg/IcebergDistributedTestBase.java
Show resolved
Hide resolved
22b54d4
to
45d8680
Compare
Thanks for the review @yingsu00 . I agree - let's try not to duplicate code. I updated the naming of the decoders across the module and updated the conditionals to check |
45d8680
to
370ed94
Compare
Iceberg timestamps are written with microsecond precision as defined by the spec. Before this change, the parquet batch reader optimization did not support reading microsecond-precision `time` types. This led to Presto returning incorrect results. This change adds proper support in the batch reader.
370ed94
to
0f0cedb
Compare
Description
Iceberg timestamps are written with microsecond precision as defined by the spec. Before this change, the parquet batch reader optimization did not properly support reading microsecond-precision
time
types. Presto returned results which differed depending on whether the optimization was enabledThis change adds proper support in the batch reader.
Motivation and Context
Previously Presto would return wrong results in the Iceberg connector when parquet batch read optimizations are enabled.
Behavior before this change:
Behavior after this change:
Impact
Proper
time
type support.Test Plan
Added test to verify equivalent return values with and without the optimization in the Iceberg connector.
Contributor checklist
Release Notes