Hello!
I'm attempting to use pwru to help debug another issue I'm experiencing in cilium.
Unfortunately, I'm struggling to use --filter-trace-tc. I end up getting an error like this:
root@rk102:/# /tmp/pwru --filter-trace-tc
2025/02/02 01:16:35 Attaching tc-bpf progs...
2025/02/02 01:16:35 failed to trace TC progs: failed to trace bpf progs: failed to find address for function tail_no_service of bpf prog SchedCLS(tail_no_service)#32
root@rk102:/# /tmp/pwru --filter-trace-tc
2025/02/02 01:18:08 Attaching tc-bpf progs...
2025/02/02 01:18:08 failed to trace TC progs: failed to trace bpf progs: failed to find address for function cil_from_overla of bpf prog SchedCLS(cil_from_overla)#31
root@rk102:/#
root@rk102:/# /tmp/pwru --filter-trace-tc
2025/02/02 01:18:11 Attaching tc-bpf progs...
2025/02/02 01:18:11 failed to trace TC progs: failed to trace bpf progs: failed to find address for function __send_drop_not of bpf prog SchedCLS(__send_drop_not)#103
root@rk102:/#
You'll notice the function name changes each time (I'm guessing because the programs are kept in a map in the pwru source, and Go map iteration order is randomized).
Digging into where addr2name comes from: it looks like it is built from /proc/kallsyms when using --backend kprobe (which I am, because I'm missing one of the dependencies for kprobe-multi).
And indeed, there's nothing interesting in my /proc/kallsyms (anything with cil looks XFS-related):
root@rk102:/# grep cil /proc/kallsyms
ffffc03a32381088 T __traceiter_xfs_log_cil_wait
ffffc03a32381108 T __traceiter_xfs_log_cil_return
ffffc03a323813f8 T __traceiter_xfs_cil_whiteout_mark
ffffc03a32381468 T __traceiter_xfs_cil_whiteout_skip
ffffc03a323814d8 T __traceiter_xfs_cil_whiteout_unpin
ffffc03a323a0c28 T __probestub_xfs_log_cil_wait
ffffc03a323a0c58 T __probestub_xfs_log_cil_return
ffffc03a323a2ff8 T __probestub_xfs_cil_whiteout_mark
ffffc03a323a3028 T __probestub_xfs_cil_whiteout_skip
ffffc03a323a3058 T __probestub_xfs_cil_whiteout_unpin
ffffc03a323ea928 T xfs_dir_cilookup_result
ffffc03a324623a8 t xlog_cil_ctx_switch
ffffc03a32462498 t xfs_cil_prepare_item
ffffc03a32462570 t xlog_cil_order_cmp
ffffc03a324625c8 t xlog_cil_ctx_alloc
ffffc03a324626a8 t xlog_cil_cleanup_whiteouts
ffffc03a32462788 t xlog_cil_order_write
ffffc03a32462930 t xlog_cil_write_commit_record
ffffc03a32462a90 t xlog_cil_push_now.isra.0
ffffc03a32462b70 t xlog_cil_ail_insert
ffffc03a324630a8 t xlog_cil_committed
ffffc03a324632e8 t xlog_cil_push_work
ffffc03a32463c70 T xlog_cil_init_post_recovery
ffffc03a32463d30 T xlog_cil_process_committed
ffffc03a32463dd8 T xlog_cil_set_ctx_write_state
ffffc03a32464048 T xlog_cil_empty
ffffc03a324640e8 T xlog_cil_commit
ffffc03a32464da8 T xlog_cil_flush
ffffc03a32464e88 T xlog_cil_force_seq
ffffc03a32465150 T xlog_cil_init
ffffc03a32465370 T xlog_cil_destroy
ffffc03a32f06f58 T spi_new_ancillary_device
ffffc03a336ff2e8 T i2c_new_ancillary_device
ffffc03a33e912a8 T x25_parse_facilities
ffffc03a33e916f0 T x25_create_facilities
ffffc03a33e91930 T x25_negotiate_facilities
ffffc03a33e91ad0 T x25_limit_facilities
ffffc03a34086b70 t qede_ptp_ancillary_feature_enable
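Grepping for `cil` only matches by program name, so a narrower check is whether any BPF program symbols are exported at all. The following is a minimal sketch, not part of pwru. It assumes that JITed BPF programs appear in /proc/kallsyms as `bpf_prog_<tag>_<name>` (the kernel's naming when BPF JIT symbol export is enabled). The helper name `bpf_symbols` is hypothetical.

```python
import re

# Assumed symbol shape: "bpf_prog_<16 hex digit tag>_<name>".
# The name part may be truncated by the kernel or missing entirely.
BPF_SYM = re.compile(r"^bpf_prog_[0-9a-f]{16}(?:_(?P<name>\S+))?$")


def bpf_symbols(kallsyms_text):
    """Return (address, program name) pairs for BPF program symbols.

    kallsyms_text is the raw content of /proc/kallsyms, one symbol per
    line in the form "<address> <type> <symbol> [module]".
    """
    found = []
    for line in kallsyms_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        addr, _sym_type, sym = fields[:3]
        match = BPF_SYM.match(sym)
        if match:
            found.append((addr, match.group("name") or ""))
    return found
```

Calling `bpf_symbols(open("/proc/kallsyms").read())` as root on the affected node and getting an empty list would suggest that BPF symbols are not being exported at all, rather than just the Cilium ones being absent.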
I checked all the dependencies from the README:
root@rk102:/# zgrep -E 'CONFIG_DEBUG_INFO_BTF=|CONFIG_KPROBES=|CONFIG_PERF_EVENTS=|CONFIG_BPF=|CONFIG_BPF_SYSCALL=|CONFIG_FUNCTION_TRACER=|CONFIG_FPROBE=' /proc/config.gz
CONFIG_BPF=y
CONFIG_BPF_SYSCALL=y
CONFIG_PERF_EVENTS=y
CONFIG_KPROBES=y
CONFIG_DEBUG_INFO_BTF=y
CONFIG_FUNCTION_TRACER=y
(except CONFIG_FPROBE, considering I'm on kernel 6.12 and not using kprobe-multi).
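The same dependency check can be scripted so that missing options are named explicitly instead of being spotted by eye. This is a sketch only: the option list is copied from the zgrep command above, and `missing_options` and `read_kernel_config` are hypothetical helper names.

```python
import gzip

# Options taken from the zgrep command in this report.
REQUIRED = [
    "CONFIG_DEBUG_INFO_BTF",
    "CONFIG_KPROBES",
    "CONFIG_PERF_EVENTS",
    "CONFIG_BPF",
    "CONFIG_BPF_SYSCALL",
    "CONFIG_FUNCTION_TRACER",
    "CONFIG_FPROBE",
]


def missing_options(config_text, required=REQUIRED):
    """Return the required options that are not set to 'y' in config_text."""
    enabled = set()
    for line in config_text.splitlines():
        line = line.strip()
        # Disabled options appear as "# CONFIG_X is not set" comments.
        if line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        if value == "y":
            enabled.add(key)
    return [opt for opt in required if opt not in enabled]


def read_kernel_config(path="/proc/config.gz"):
    """Read the running kernel's config (requires CONFIG_IKCONFIG_PROC)."""
    with gzip.open(path, "rt") as f:
        return f.read()
```

With the six lines of output shown above as input, `missing_options` reports only `CONFIG_FPROBE`.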
I think there's a hidden/missing dependency here that's causing my error?
I also noticed that using kubectl exec ds/cilium and trying to run pwru gave an even worse result:
failed to trace TC progs: failed to trace bpf progs: failed to get entry function name: failed to get program instructions: parse func info: offset 0: type ID 0 is a *btf.Void, but expected a Func
But that was fixed by using either kubectl debug --profile=sysadmin or the sample YAML file in the README. I get the same --filter-trace-tc results with either.
I've tried with the latest commit - 7c481ae - to ensure I picked up #484 which sounded suspiciously close to my issue, but no dice!
I assume I'm one of the only folks hitting this particular issue, or otherwise there'd be another report! Is there anything I could be missing, kernel module wise, or some tunable that's making this inaccessible?
Thanks for the help!
[edit]
I found #460 (comment) which sounded a little similar -- I have now set both sysctls and still no luck. I see the same error and note that /proc/kallsyms does not look more populated.
/ # cat /proc/sys/kernel/perf_event_paranoid
-1
/ # cat /proc/sys/kernel/kptr_restrict
0
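For completeness, the relevant sysctls can be dumped in one go. This is a hedged sketch: the first two entries are the ones checked above, while the three `net/core/bpf_jit_*` entries are an assumption added here, not something this report has verified. They are the kernel's BPF JIT knobs, and `bpf_jit_kallsyms` controls whether JITed programs are exported to /proc/kallsyms. The helper name `read_sysctls` is hypothetical.

```python
from pathlib import Path

SYSCTLS = [
    "kernel/perf_event_paranoid",
    "kernel/kptr_restrict",
    # Assumption: BPF JIT knobs, not checked in the original report.
    "net/core/bpf_jit_enable",
    "net/core/bpf_jit_kallsyms",
    "net/core/bpf_jit_harden",
]


def read_sysctls(names=SYSCTLS, root="/proc/sys"):
    """Return {name: value}, with None for unreadable or missing entries."""
    values = {}
    for name in names:
        try:
            values[name] = (Path(root) / name).read_text().strip()
        except OSError:
            # Covers missing files and permission errors alike.
            values[name] = None
    return values
```

Printing `read_sysctls()` from the same privileged debug container used to run pwru shows all five values side by side.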