Skip to content

fix(core): avoid blocking THP when send buffer is full #5618

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

romanz
Copy link
Contributor

@romanz romanz commented Aug 22, 2025

The issue can be reproduced using T3W1 over USB, by the following steps:

  1. run trezorctl ping 123 -b &
  2. wait until ping confirmation appears
  3. run killall -9 trezorctl
  4. confirm the ping on the UI

the device should try to retransmit - but since trezorctl is killed, it will timeout after 5 seconds, restart the event loop, and will be ready for the next THP interaction:

[15:54:26.449] 834.713 trezor.wire.thp.channel INFO [WebUSB] (cid: 1871) write message: Success
[15:54:26.451] 834.715 trezor.wire.thp.channel DEBUG [WebUSB] (cid: 1871) encrypt
[15:54:26.453] 834.717 trezor.wire.thp.crypto DEBUG enc (key: 097c33bcf26d9a03ec95938d8d781c5ebc4f7bda3d85c9650b901f0ac05bd211, nonce: 6)
[15:54:26.455] 834.719 trezor.wire.thp.channel DEBUG [WebUSB] (cid: 1871) New nonce_send: 7
[15:54:26.456] 834.720 trezor.wire.thp.channel DEBUG [WebUSB] (cid: 1871) write_encrypted_payload_loop
[15:54:26.695] 834.959 trezor.loop DEBUG finish: <generator object '_write_packets' at 0x201d6320>
[15:54:26.796] 835.060 trezor.wire.thp.channel WARNING Retransmit after 100 ms
[15:54:28.545] 836.809 trezor.loop DEBUG finish: <generator object 'timer_task' at 0x201d6a60>
[15:54:31.796] 840.061 trezor.loop DEBUG finish: <generator object '__iter__' at 0x201d75d0>
[15:54:31.797] 840.061 trezor.wire.thp.channel ERROR Sending is stuck for 5000 ms
[15:54:31.798] 840.062 trezor.wire.thp.session_context ERROR [WebUSB] exception:
[15:54:31.798] Traceback (most recent call last):
[15:54:31.798]   File "trezor/wire/thp/session_context.py", in handle
[15:54:31.798]   File "trezor/wire/message_handler.py", in handle_single_message
[15:54:31.798]   File "trezor/wire/thp/channel.py", in write
[15:54:31.798]   File "trezor/wire/thp/channel.py", in write_encrypted_payload
[15:54:31.798] Timeout: THP retransmission timeout
[15:54:31.798] 840.062 trezor.workflow DEBUG joining 0 workflows
[15:54:31.800] 840.064 trezor.loop DEBUG finish: <generator object '__iter__' at 0x201cb960>
[15:54:31.800] 840.064 trezor.loop DEBUG finish: <generator object 'handle_session' at 0x201c9a00>
[15:54:31.801] 840.065 trezor.loop DEBUG finish: <generator object '__iter__' at 0x201d7a20>
[15:54:31.801] 840.066 trezor.loop DEBUG finish: <generator object 'handle_session' at 0x201ca980>
[15:54:32.058] 840.066 session DEBUG Restarting main loop

Found during debugging #5614 debugging.

@trezor-bot trezor-bot bot added this to Firmware Aug 22, 2025
@github-project-automation github-project-automation bot moved this to 🔎 Needs review in Firmware Aug 22, 2025
Copy link

github-actions bot commented Aug 22, 2025

en main(all)

model device_test click_test persistence_test
T2T1 test(all) main(all) test(all) main(all) test(all) main(all)
T3B1 test(all) main(all) test(all) main(all) test(all) main(all)
T3T1 test(all) main(all) test(all) main(all) test(all) main(all)
T3W1 test(all) main(all) test(all) main(all) test(all) main(all)

Latest CI run: 17175955327

@romanz romanz force-pushed the romanz/thp-write-timeout branch from 4245044 to 1784401 Compare August 23, 2025 12:52
@romanz romanz self-assigned this Aug 23, 2025
@romanz romanz changed the base branch from romanz/thp-errors to main August 23, 2025 12:57
@romanz romanz added the T3W1 label Aug 23, 2025
Stop retransmission if writes are blocked - e.g. due to USB flow control.
It allows restarting the event loop to handle other THP channels.

`loop.race()` is used, since it schedules a single timeout task per message - instead of `loop.wait.timeout_ms`, which schedules a timeout task once per packet.

[no changelog]
@romanz romanz force-pushed the romanz/thp-write-timeout branch from 1784401 to 547cfc6 Compare August 23, 2025 13:21
@romanz romanz marked this pull request as ready for review August 23, 2025 14:00
@romanz romanz requested a review from obrusvit as a code owner August 23, 2025 14:00
@romanz romanz requested review from matejcik and M1nd3r August 23, 2025 14:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
Status: 🔎 Needs review
Development

Successfully merging this pull request may close these issues.

1 participant