Skip to content

Conversation

vhsu14
Copy link
Contributor

@vhsu14 vhsu14 commented May 8, 2025

Description

This PR depends on #25020. Changes include:

  • Use IDL generated by coordinator and make relevant changes to enable thrift for TaskStatus, TaskUpdateRequest, and TaskInfo on native worker.
  • Remove old pipeline of using python, json, and chevron templates for producing C++ code.

Motivation and Context

We observed that coordinator can spend too much cpu/heap memory on json serde for taskUpdateRequest.

Impact

Test Plan

Verifier

Contributor checklist

  • Please make sure your submission complies with our contributing guide, in particular code style and commit standards.
  • PR description addresses the issue accurately and concisely. If the change is non-trivial, a GitHub Issue is referenced.
  • Documented new properties (with its default value), SQL syntax, functions, or other functionality.
  • If release notes are required, they follow the release notes guidelines.
  • Adequate tests were added if applicable.
  • CI passed.

Release Notes

Please follow release notes guidelines and fill in the release notes below.

== RELEASE NOTES ==

General Changes
* Improve communication between coordinator and worker with thrift serde.

@vhsu14 vhsu14 requested a review from a team as a code owner May 8, 2025 21:40
@facebook-github-bot
Copy link
Collaborator

This pull request was exported from Phabricator. Differential Revision: D72886878

@shangm2 shangm2 changed the title Migrate TaskStatus, TaskUpdateRequest, TaskInfo to Thrift with JSON fields in CPP [native] Migrate TaskStatus, TaskUpdateRequest, TaskInfo to Thrift with JSON fields in CPP May 8, 2025
@vhsu14 vhsu14 force-pushed the export-D72886878 branch 4 times, most recently from 283fdee to 56f1f9e Compare May 9, 2025 02:45
@facebook-github-bot
Copy link
Collaborator

This pull request was exported from Phabricator. Differential Revision: D72886878

vhsu14 added a commit to vhsu14/presto that referenced this pull request May 9, 2025
…ields in CPP (prestodb#25079)

Summary: Pull Request resolved: prestodb#25079

Differential Revision: D72886878
@vhsu14 vhsu14 force-pushed the export-D72886878 branch from fcc64a3 to 6bb7dd4 Compare May 9, 2025 21:59
@facebook-github-bot
Copy link
Collaborator

This pull request was exported from Phabricator. Differential Revision: D72886878

vhsu14 added a commit to vhsu14/presto that referenced this pull request May 10, 2025
…ields in CPP (prestodb#25079)

Summary: Pull Request resolved: prestodb#25079

Differential Revision: D72886878
@vhsu14 vhsu14 force-pushed the export-D72886878 branch from 6bb7dd4 to 93997c0 Compare May 10, 2025 00:23
vhsu14 added a commit to vhsu14/presto that referenced this pull request May 12, 2025
…ields in CPP (prestodb#25079)

Summary: Pull Request resolved: prestodb#25079

Differential Revision: D72886878
@vhsu14 vhsu14 force-pushed the export-D72886878 branch from 93997c0 to 89a3461 Compare May 12, 2025 23:03
@facebook-github-bot
Copy link
Collaborator

This pull request was exported from Phabricator. Differential Revision: D72886878

shangm2 pushed a commit to shangm2/presto that referenced this pull request May 13, 2025
vhsu14 added a commit to vhsu14/presto that referenced this pull request May 13, 2025
…ields in CPP (prestodb#25079)

Summary: Pull Request resolved: prestodb#25079

Differential Revision: D72886878
@vhsu14 vhsu14 force-pushed the export-D72886878 branch from 89a3461 to 69f7779 Compare May 13, 2025 05:39
vhsu14 added a commit to vhsu14/presto that referenced this pull request May 13, 2025
…ields in CPP (prestodb#25079)

Summary: Pull Request resolved: prestodb#25079

Differential Revision: D72886878
@vhsu14 vhsu14 force-pushed the export-D72886878 branch from 69f7779 to fa072a5 Compare May 13, 2025 07:10
@facebook-github-bot
Copy link
Collaborator

This pull request was exported from Phabricator. Differential Revision: D72886878

shangm2 pushed a commit to shangm2/presto that referenced this pull request May 13, 2025
shangm2 pushed a commit to shangm2/presto that referenced this pull request May 13, 2025
@vhsu14 vhsu14 force-pushed the export-D72886878 branch from fa072a5 to 942e648 Compare May 14, 2025 02:22
vhsu14 added a commit to vhsu14/presto that referenced this pull request May 14, 2025
…ields in CPP (prestodb#25079)

Summary: Pull Request resolved: prestodb#25079

Differential Revision: D72886878
@facebook-github-bot
Copy link
Collaborator

This pull request was exported from Phabricator. Differential Revision: D72886878

vhsu14 added a commit to vhsu14/presto that referenced this pull request May 14, 2025
…ields in CPP (prestodb#25079)

Summary: Pull Request resolved: prestodb#25079

Differential Revision: D72886878
@facebook-github-bot
Copy link
Collaborator

This pull request was exported from Phabricator. Differential Revision: D72886878

vhsu14 added a commit to vhsu14/presto that referenced this pull request May 15, 2025
…ields in CPP (prestodb#25079)

Summary: Pull Request resolved: prestodb#25079

Differential Revision: D72886878
@vhsu14 vhsu14 force-pushed the export-D72886878 branch from 2c09f18 to bd1aaaf Compare May 15, 2025 04:12
@facebook-github-bot
Copy link
Collaborator

This pull request was exported from Phabricator. Differential Revision: D72886878

vhsu14 added a commit to vhsu14/presto that referenced this pull request May 15, 2025
…ields in CPP (prestodb#25079)

Summary: Pull Request resolved: prestodb#25079

Differential Revision: D72886878
@vhsu14 vhsu14 force-pushed the export-D72886878 branch from bd1aaaf to 014f577 Compare May 15, 2025 04:16
yuandagits
yuandagits previously approved these changes May 15, 2025

namespace facebook::presto::thrift {

void toThrift(const std::string& proto, std::string& thrift);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are ProtocolToThrift files auto generated?

Copy link
Contributor Author

@vhsu14 vhsu14 May 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no, the changes in ProtocolToThrift files are now hand-written. there used to be a complicated workflow to use python and chevron templates to auto-gen it, but it does not support some functionalities and we removed it as part of this PR.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I think I misread. This class is responsible for creating Thrift objects from presto::protocol objects and the other way around. The thrift objects are auto generated, right?

Copy link
Member

@arhimondr arhimondr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A couple of questions, otherwise looks good to me

auto acceptHeader = headers.getSingleOrEmpty(proxygen::HTTP_HEADER_ACCEPT);
auto useThrift =
const auto& acceptHeader = headers.getSingleOrEmpty(proxygen::HTTP_HEADER_ACCEPT);
const auto sendThrift =
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to do the same for TaskInfo?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Discussed offline. Needs Java changes. Will be added as a follow up


namespace facebook::presto::thrift {

void toThrift(const std::string& proto, std::string& thrift);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I think I misread. This class is responsible for creating Thrift objects from presto::protocol objects and the other way around. The thrift objects are auto generated, right?

enum ErrorType {
USER_ERROR = 0,
INTERNAL_ERROR = 1,
INSUFFICIENT_RESOURCES = 2,
EXTERNAL = 3,
}

enum ErrorCause {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This file is auto generated based on idl, right?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Discussed offline.

The IDL is generated based on Java classes. IDL will be generated automatically once we land changes to Drift. CMake generates thrift:: protocol classes during build process, they are not commited.

As a follow up we will also be generating the ProtocolToThrift.h

…ields in CPP (prestodb#25079)

Summary: Pull Request resolved: prestodb#25079

Reviewed By: yuandagits

Differential Revision: D72886878
@facebook-github-bot
Copy link
Collaborator

This pull request was exported from Phabricator. Differential Revision: D72886878

@tdcmeehan
Copy link
Contributor

@vhsu14 can you help to describe clearly how the code generation works? Based on the PR description, the coordinator somehow does this--can you explain?

@shangm2
Copy link
Contributor

shangm2 commented May 19, 2025

@vhsu14 can you help to describe clearly how the code generation works? Based on the PR description, the coordinator somehow does this--can you explain?

@tdcmeehan we are using the idl generator from drift to generate the idl file from drift annotation. And we use the generate d idl file to generate cpp classes. This step is still manual and we will make it automatic in next step. There are also some issues with current pipeline, the one which use python scripts to do code generation, on how to support optional fields in thrift. And we will fix this in next step as well. Hope this answer your question.

shangm2 added a commit that referenced this pull request May 19, 2025
…elds (#25020)

## Description
1. We are enabling thrift for task update request, and task info for
critical api communication between coordinator and worker. We have two
config toggles for task update request sent to worker and the task info
returned to coordinator,
2. However, there are some classes that are java interface/polymorphic
fields. We keep them as json encoding for now and will migrate them in
the next step.
3. We are also doing proper change for native worker: #25079 

## Motivation and Context
1. We observed that coordinator can spend too much cpu/heap memory on
json serde for taskUpdateRequest.

## Impact
<!---Describe any public API or user-facing feature change or any
performance impact-->

## Test Plan
1. Passed verifier tests

## Contributor checklist

- [ ] Please make sure your submission complies with our [contributing
guide](https://github.com/prestodb/presto/blob/master/CONTRIBUTING.md),
in particular [code
style](https://github.com/prestodb/presto/blob/master/CONTRIBUTING.md#code-style)
and [commit
standards](https://github.com/prestodb/presto/blob/master/CONTRIBUTING.md#commit-standards).
- [ ] PR description addresses the issue accurately and concisely. If
the change is non-trivial, a GitHub Issue is referenced.
- [ ] Documented new properties (with its default value), SQL syntax,
functions, or other functionality.
- [ ] If release notes are required, they follow the [release notes
guidelines](https://github.com/prestodb/presto/wiki/Release-Notes-Guidelines).
- [ ] Adequate tests were added if applicable.
- [ ] CI passed.

## Release Notes
Please follow [release notes
guidelines](https://github.com/prestodb/presto/wiki/Release-Notes-Guidelines)
and fill in the release notes below.

```
== RELEASE NOTES ==

General Changes
* Improve communication between coordinator and worker with thrift serde.
```
@shangm2 shangm2 merged commit 5b4b71c into prestodb:master May 19, 2025
105 of 107 checks passed
@ZacBlanco ZacBlanco mentioned this pull request May 29, 2025
21 tasks
vhsu14 added a commit that referenced this pull request Jun 5, 2025
## Description
<!---Describe your changes in detail-->
In #25079, ProtocolToThrift files
were manually written. This PR generates these files using pythons
scripts and chevron templates.

## Motivation and Context
<!---Why is this change required? What problem does it solve?-->
<!---If it fixes an open issue, please link to the issue here.-->

## Impact
<!---Describe any public API or user-facing feature change or any
performance impact-->

## Test Plan
<!---Please fill in how you tested your change-->
Unit test + Verifier (test_id 225961)
## Contributor checklist

- [ ] Please make sure your submission complies with our [contributing
guide](https://github.com/prestodb/presto/blob/master/CONTRIBUTING.md),
in particular [code
style](https://github.com/prestodb/presto/blob/master/CONTRIBUTING.md#code-style)
and [commit
standards](https://github.com/prestodb/presto/blob/master/CONTRIBUTING.md#commit-standards).
- [ ] PR description addresses the issue accurately and concisely. If
the change is non-trivial, a GitHub Issue is referenced.
- [ ] Documented new properties (with its default value), SQL syntax,
functions, or other functionality.
- [ ] If release notes are required, they follow the [release notes
guidelines](https://github.com/prestodb/presto/wiki/Release-Notes-Guidelines).
- [ ] Adequate tests were added if applicable.
- [ ] CI passed.

## Release Notes
Please follow [release notes
guidelines](https://github.com/prestodb/presto/wiki/Release-Notes-Guidelines)
and fill in the release notes below.

```
== RELEASE NOTES ==
General Changes
* Update ProtocolToThrift files to be generated for cpp thrift serde.
AnuragKDwivedi pushed a commit to AnuragKDwivedi/presto-1 that referenced this pull request Jul 8, 2025
…elds (prestodb#25020)

## Description
1. We are enabling thrift for task update request, and task info for
critical api communication between coordinator and worker. We have two
config toggles for task update request sent to worker and the task info
returned to coordinator,
2. However, there are some classes that are java interface/polymorphic
fields. We keep them as json encoding for now and will migrate them in
the next step.
3. We are also doing proper change for native worker: prestodb#25079 

## Motivation and Context
1. We observed that coordinator can spend too much cpu/heap memory on
json serde for taskUpdateRequest.

## Impact
<!---Describe any public API or user-facing feature change or any
performance impact-->

## Test Plan
1. Passed verifier tests

## Contributor checklist

- [ ] Please make sure your submission complies with our [contributing
guide](https://github.com/prestodb/presto/blob/master/CONTRIBUTING.md),
in particular [code
style](https://github.com/prestodb/presto/blob/master/CONTRIBUTING.md#code-style)
and [commit
standards](https://github.com/prestodb/presto/blob/master/CONTRIBUTING.md#commit-standards).
- [ ] PR description addresses the issue accurately and concisely. If
the change is non-trivial, a GitHub Issue is referenced.
- [ ] Documented new properties (with its default value), SQL syntax,
functions, or other functionality.
- [ ] If release notes are required, they follow the [release notes
guidelines](https://github.com/prestodb/presto/wiki/Release-Notes-Guidelines).
- [ ] Adequate tests were added if applicable.
- [ ] CI passed.

## Release Notes
Please follow [release notes
guidelines](https://github.com/prestodb/presto/wiki/Release-Notes-Guidelines)
and fill in the release notes below.

```
== RELEASE NOTES ==

General Changes
* Improve communication between coordinator and worker with thrift serde.
```
AnuragKDwivedi pushed a commit to AnuragKDwivedi/presto-1 that referenced this pull request Jul 8, 2025
…th JSON fields in CPP (prestodb#25079)

## Description
This PR depends on prestodb#25020.
Changes include:
- Use IDL generated by coordinator and make relevant changes to enable
thrift for TaskStatus, TaskUpdateRequest, and TaskInfo on native worker.
- Remove old pipeline of using python, json, and chevron templates for
producing C++ code.

## Motivation and Context
We observed that coordinator can spend too much cpu/heap memory on json
serde for taskUpdateRequest.

## Impact
<!---Describe any public API or user-facing feature change or any
performance impact-->

## Test Plan
Verifier

## Contributor checklist

- [ ] Please make sure your submission complies with our [contributing
guide](https://github.com/prestodb/presto/blob/master/CONTRIBUTING.md),
in particular [code
style](https://github.com/prestodb/presto/blob/master/CONTRIBUTING.md#code-style)
and [commit
standards](https://github.com/prestodb/presto/blob/master/CONTRIBUTING.md#commit-standards).
- [ ] PR description addresses the issue accurately and concisely. If
the change is non-trivial, a GitHub Issue is referenced.
- [ ] Documented new properties (with its default value), SQL syntax,
functions, or other functionality.
- [ ] If release notes are required, they follow the [release notes
guidelines](https://github.com/prestodb/presto/wiki/Release-Notes-Guidelines).
- [ ] Adequate tests were added if applicable.
- [ ] CI passed.

## Release Notes
Please follow [release notes
guidelines](https://github.com/prestodb/presto/wiki/Release-Notes-Guidelines)
and fill in the release notes below.

```
== RELEASE NOTES ==

General Changes
* Improve communication between coordinator and worker with thrift serde.
```
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants