[Excutor] Increase buffer size to prevent address corruption; add forward metadata debug tool #3404
Merged
Thanks for your contribution!
gongshaotian
previously approved these changes
Aug 15, 2025
LGTM
yuanlehome
previously approved these changes
Aug 15, 2025
9d363a9
EmmonsCurse
reviewed
Aug 15, 2025
gongshaotian
approved these changes
Aug 18, 2025
LGTM
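0. Background: debugging often requires inspecting ForwardMeta, so this PR adds a tool that prints ForwardMeta, making it easy to check each tensor's address, shape, device type, and so on. With a small change it can also print the contents of the Tensors in ForwardMetadata directly. It handles nested cases correctly, such as lists that contain Tensors and member variables whose own members contain Tensors.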

Below is a sample of the printed output:
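To cover this new functionality, a unit test, test/model_executor/test_forward_meta_str.py, was added. The dependency partial_json_parser was also added to scripts/unittest_requirement.txt for the unit-test environment; some existing tests were already failing because they depend on this package and it was missing from that environment.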
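As a rough illustration of the nested traversal described in item 0, here is a minimal, framework-free sketch. It is not the code from this PR: `FakeTensor`, `DemoAttnMeta`, `DemoForwardMeta`, and `describe` are hypothetical names invented for the example, and the "address" is just an integer field. The point is only to show one way to reach tensors inside lists and inside members of members.

```python
from dataclasses import dataclass, fields, is_dataclass
from typing import Any, List


@dataclass
class FakeTensor:
    """Hypothetical stand-in for a framework tensor; not a Paddle class."""
    shape: tuple
    dtype: str
    place: str
    ptr: int  # pretend buffer address


def describe(name: str, value: Any, indent: int = 0) -> List[str]:
    """Return one line per field, descending into lists and nested dataclasses."""
    pad = "  " * indent
    # Check the tensor type first: FakeTensor is itself a dataclass.
    if isinstance(value, FakeTensor):
        return [f"{pad}{name}: Tensor(shape={list(value.shape)}, dtype={value.dtype}, "
                f"place={value.place}, ptr={hex(value.ptr)})"]
    if isinstance(value, (list, tuple)):
        lines = [f"{pad}{name}: list[{len(value)}]"]
        for i, item in enumerate(value):
            lines += describe(f"[{i}]", item, indent + 1)
        return lines
    if is_dataclass(value) and not isinstance(value, type):
        lines = [f"{pad}{name}: {type(value).__name__}"]
        for f in fields(value):
            lines += describe(f.name, getattr(value, f.name), indent + 1)
        return lines
    # Plain Python values (ints, strings, None, ...) are printed as-is.
    return [f"{pad}{name}: {value!r}"]


@dataclass
class DemoAttnMeta:  # hypothetical nested member
    block_tables: FakeTensor


@dataclass
class DemoForwardMeta:  # hypothetical, much smaller than the real ForwardMeta
    cu_seqlens_q: FakeTensor
    caches: List[FakeTensor]
    attn: DemoAttnMeta
    step: int = 0


meta = DemoForwardMeta(
    cu_seqlens_q=FakeTensor((5,), "int32", "gpu:0", 0x7F00),
    caches=[FakeTensor((2, 4), "float16", "gpu:0", 0x8000),
            FakeTensor((2, 4), "float16", "gpu:0", 0x9000)],
    attn=DemoAttnMeta(block_tables=FakeTensor((1, 8), "int32", "gpu:0", 0xA000)),
)
report = "\n".join(describe("forward_meta", meta))
print(report)
```

Printing the address next to the shape and place is what makes this kind of dump useful for spotting buffers that move between steps.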
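1. The ForwardMetadata variables batch_id_per_token, cu_seqlens_q, and cu_seqlens_k use copy_ to keep a fixed address, but their buffers were allocated too small. When the tensor being copied in has a larger shape, the buffer is reallocated at a new address, which defeats the purpose of using copy_. The fix is to allocate a larger buffer from the start.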
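The idea in item 1 can be simulated without any deep-learning framework. The sketch below uses the standard-library `ctypes` module as a stand-in for a device buffer; `StaticBuffer`, `copy_from`, and `max_tokens` are hypothetical names, and the sketch raises an error on oversized input rather than reallocating, so the condition that would move the address is surfaced instead of hidden.

```python
import ctypes


class StaticBuffer:
    """Fixed-capacity int32 buffer whose address never changes after creation."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self._data = (ctypes.c_int32 * capacity)()  # allocated once, zero-filled
        self.length = 0

    @property
    def address(self) -> int:
        return ctypes.addressof(self._data)

    def copy_from(self, values) -> None:
        """Copy values into the existing storage; refuse anything that cannot fit."""
        n = len(values)
        if n > self.capacity:
            raise ValueError(f"need {n} slots but buffer holds only {self.capacity}")
        self._data[:n] = values  # in-place write, no new allocation
        self.length = n

    def view(self) -> list:
        return list(self._data[:self.length])


max_tokens = 8  # hypothetical worst-case token count for one step
buf = StaticBuffer(max_tokens + 1)  # cumulative lengths need one extra slot
addr_before = buf.address

buf.copy_from([0, 3, 5])            # small batch
buf.copy_from([0, 2, 4, 6, 8, 9])   # larger batch, still within capacity
addr_after = buf.address
print(addr_before == addr_after)
```

Sizing the buffer for the worst case costs some memory, but anything that recorded the address earlier can keep using it.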
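2. The ForwardMetadata variable kv_num_blocks is a CPU tensor during prefill but a GPU tensor during decode. The CUDA layer explicitly treats it as a CPU tensor; the cause turned out to be an incorrect place setting in the decode branch, so for now the decode branch is changed to create a CPU tensor as well. This has not affected correctness so far, but any device-type-sensitive operation on kv_num_blocks in the Python layer would crash the program in a way that is hard to diagnose.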
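A mismatch like the one in item 2 can be caught early with a small placement check. The following is only a sketch under stated assumptions: `TensorStub`, `EXPECTED_PLACE`, and `check_places` are hypothetical, and `place` is modelled as a plain string rather than a real framework place object.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class TensorStub:
    """Hypothetical stand-in that records only where a tensor lives."""
    place: str  # "cpu" or "gpu" in this sketch


# Expected placement per metadata field (assumed for illustration).
EXPECTED_PLACE: Dict[str, str] = {"kv_num_blocks": "cpu"}


def check_places(meta: Dict[str, TensorStub], expected: Dict[str, str]) -> List[str]:
    """Return a readable message for every field that is on the wrong device."""
    mismatches = []
    for name, want in expected.items():
        got = meta[name].place
        if got != want:
            mismatches.append(f"{name}: expected {want}, got {got}")
    return mismatches


prefill_meta = {"kv_num_blocks": TensorStub(place="cpu")}
decode_meta = {"kv_num_blocks": TensorStub(place="gpu")}  # the inconsistent case

prefill_issues = check_places(prefill_meta, EXPECTED_PLACE)
decode_issues = check_places(decode_meta, EXPECTED_PLACE)
print(prefill_issues, decode_issues)
```

Running a check of this kind in both the prefill and decode branches turns a silent inconsistency into an explicit message.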
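3. The following related variables in AppendAttentionMetadata
also change address. Some of them are involved in the prefill stage, so this needs to be fixed before prefill can be captured in CUDA Graph. The code is already implemented and is expected to be merged together with the reworked CUDA Graph changes.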