Skip to content

fix: correctly return number of bytes read from chunked streams #1386

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 5 commits into
base: main
Choose a base branch
from

Conversation

ianbotsf
Copy link
Contributor

Issue #

Fixes #1285

Description of changes

This change updates AwsChunkedSource and AwsChunkedByteReadChannel to return the actual number of bytes read from the underlying stream, not the number of bytes written to the sink. It does this by adding read count tracking to AwsChunkedReader which must be read and reset by the calling source/channel. Also updates various tests to correctly assert the number of bytes read and the number of bytes written.

This PR is an alternate implementation of #1294.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@ianbotsf ianbotsf requested a review from a team as a code owner August 14, 2025 23:12

This comment has been minimized.

@@ -47,11 +47,12 @@ abstract class AwsChunkedByteReadChannelTestBase : AwsChunkedTestBase(AwsChunked
val sink = SdkBuffer()

val bytesRead = awsChunked.readAll(sink)
writeJob.join()
// writeJob.join()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we still need to join the write job?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It doesn't hurt but it's not strictly necessary. The readAll call cannot return until all the data's been consumed which means the write job will have finished. I've uncommented that line just for a sanity check.

"issues": [
"awslabs/smithy-kotlin#1285"
],
"requiresMinorVersionBump": true
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding a note so we don't forget to remove requiresMinorVersionBump

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed!

This comment has been minimized.

This comment has been minimized.

consumeAndWriteChunk(destination)
}
if (rc == -1L) {
check(input.eof) { "Got back -1 bytes but input adapter is not EOF" }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When would the rc be -1 but the input not eof?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is mostly a sanity check. It was useful while debugging the implementation but could provide a valuable fail-fast clue if a future bug involves streams which return -1 bytes read but the stream is not yet closed/exhausted.

if (hasLastChunkBeenSent) {
return -1 // this stream is exhausted
} else {
// Write any remaining data as final chunk(s)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we would only have one final chunk in the buffer at this point not multiple?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right, corrected the comment.

Copy link

Affected Artifacts

Changed in size
Artifact Pull Request (bytes) Latest Release (bytes) Delta (bytes) Delta (percentage)
aws-signing-common-jvm.jar 70,970 70,697 273 0.39%
aws-signing-tests-jvm.jar 461,757 460,112 1,645 0.36%
runtime-core-jvm.jar 835,049 834,754 295 0.04%

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Progress listening with chunked encoding reads more bytes transferred than file size
4 participants