Profiling
pprof
helps diagnose which functions use most of the CPU resources.
Note that this works better when the library is compiled together with a test file rather than linked against.
- compile with
-g
and link with -lprofiler
- run with
LD_PRELOAD=/usr/lib/libprofiler.so CPUPROFILE=cpu.profile ./bin
- run
google-pprof --text cpu.profile ./bin
(switch to --gv, --pdf, --svg, ... depending on what output you prefer)
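As a rough sketch of the note above, compiling the library together with a test file could look like the following. The file names `lib.cpp` and `test.cpp`, the helper name `build_profiled`, and the default compiler `g++` are assumptions for illustration; only `-g` and `-lprofiler` come from the steps above.

```shell
# Hypothetical helper: compile the given sources into ./bin with debug
# symbols (-g) and link the gperftools CPU profiler (-lprofiler).
# The compiler defaults to g++ (an assumption) and can be overridden via CXX.
build_profiled() {
    "${CXX:-g++}" -g "$@" -o bin -lprofiler
}

# Usage (lib.cpp and test.cpp are placeholder names):
#   build_profiled lib.cpp test.cpp
```

Passing the library sources and the test file to a single compiler invocation keeps the library's debug symbols in the profiled binary.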
cachegrind
helps diagnose data and instruction cache misses.
Note that this works better when the library is compiled together with a test file rather than linked against.
- compile with
-g
- run
valgrind --tool=cachegrind ./bin
- run
cg_annotate cachegrind.out.xxxx
to get a global report by file
- run
cg_annotate cachegrind.out.xxxx /abs/path/to/file.cpp
to get a line-by-line report for a specific file
- run
cg_annotate cachegrind.out.xxxx --auto=yes
to get a full line-by-line report for all files
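Valgrind names each output file after the tool and the process ID of the run, which is what `xxxx` stands for above. A small helper along these lines can pick the most recent file so the ID does not have to be typed by hand; `latest_out` is a hypothetical name, not part of Valgrind.

```shell
# Hypothetical helper: print the most recently modified "<tool>.out.<pid>"
# file in the current directory (tool is "cachegrind" or "callgrind").
# Prints nothing if no such file exists.
latest_out() {
    ls -t "$1".out.* 2>/dev/null | head -n 1
}

# Usage:
#   cg_annotate "$(latest_out cachegrind)"
```

The same helper should work for callgrind output, since those files follow the same naming pattern.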
callgrind
helps diagnose which instructions are executed the most.
Note that this works better when the library is compiled together with a test file rather than linked against.
- compile with
-g
- run
valgrind --tool=callgrind ./bin
- run
callgrind_annotate callgrind.out.xxxx
to get a global report by file
- run
callgrind_annotate callgrind.out.xxxx --auto=yes
to get a full line-by-line report for all files
kcachegrind
provides an alternative to cg_annotate
and callgrind_annotate
to navigate through the reports. It provides a graphic interface that can be much more convenient depending on the context.
- run
kcachegrind callgrind.out.xxx
or kcachegrind cachegrind.out.xxx
A top-down approach helps focus on the relevant areas.
Typically:
- identify slow components with
pprof
- once identified, find root cause using
cachegrind
and callgrind
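The steps above can be sketched as a small script that runs the tools in order. This is only an illustration: the function names `run` and `profile_topdown` and the `DRY_RUN` switch are hypothetical, and the commands are the ones listed in the sections above. Reading the pprof report and deciding where to dig remains a manual step.

```shell
# Run a command, or only print it when DRY_RUN=1 (hypothetical switch).
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Hypothetical wrapper: $1 is a binary built with -g and linked with -lprofiler.
profile_topdown() {
    bin=$1
    # Step 1: sample the CPU profile and print the hottest functions.
    run env LD_PRELOAD=/usr/lib/libprofiler.so CPUPROFILE=cpu.profile "$bin" || return
    run google-pprof --text cpu.profile "$bin" || return
    # Step 2: collect cache-miss and instruction-count data for the root cause.
    run valgrind --tool=cachegrind "$bin" || return
    run valgrind --tool=callgrind "$bin" || return
}

# Usage:
#   DRY_RUN=1 profile_topdown ./bin   # print the commands without running them
#   profile_topdown ./bin             # run them for real
```

The dry-run mode makes it possible to review the exact command lines before launching the slower Valgrind runs.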
Need more information? Open an issue.