Profiling

Simon Ninon edited this page Mar 8, 2018 · 10 revisions

pprof

pprof helps identify which functions consume the most CPU time. Note that this works better when the library sources are compiled together with a test file rather than linked against as a prebuilt library.

  • compile with -g and link with -lprofiler
  • run with LD_PRELOAD=/usr/lib/libprofiler.so CPUPROFILE=cpu.profile ./bin (adjust the library path to your system)
  • run google-pprof --text cpu.profile ./bin (switch to --gv, --pdf, --svg, ... depending on what output you prefer)
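
As an illustration of the kind of test file mentioned above, here is a minimal sketch of a workload worth profiling. All names (`checksum`, `make_payload`, `run_workload`) are hypothetical and not part of the library; the only point is that one function does almost all of the work, so it should dominate the pprof report.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical hot path: an FNV-1a style hash over the whole buffer.
// The seed is mixed in so that each call does distinct work.
std::uint64_t checksum(const std::vector<std::uint8_t>& buf, std::uint64_t seed) {
  std::uint64_t h = 0xcbf29ce484222325ULL ^ seed;  // FNV-1a 64-bit offset basis
  for (std::uint8_t b : buf) {
    h ^= b;
    h *= 0x100000001b3ULL;  // FNV-1a 64-bit prime
  }
  return h;
}

// Hypothetical cheap setup: fills a buffer with a repeating 0..255 pattern.
std::vector<std::uint8_t> make_payload(std::size_t size) {
  std::vector<std::uint8_t> buf(size);
  for (std::size_t i = 0; i < size; ++i) {
    buf[i] = static_cast<std::uint8_t>(i & 0xFF);
  }
  return buf;
}

// Driver: builds the payload once, then hashes it `iterations` times.
// Nearly all CPU time should be attributed to checksum().
std::uint64_t run_workload(std::size_t iterations, std::size_t payload_size) {
  assert(payload_size > 0);
  const std::vector<std::uint8_t> payload = make_payload(payload_size);
  std::uint64_t acc = 0;
  for (std::size_t i = 0; i < iterations; ++i) {
    acc ^= checksum(payload, i);
  }
  return acc;
}
```

Calling `run_workload` with a large iteration count from the test file's `main` and then following the steps above should show `checksum` near the top of the `--text` output, assuming the compiler has not inlined it into its caller.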

cachegrind

cachegrind helps diagnose data and instruction cache misses. Note that this works better when the library sources are compiled together with a test file rather than linked against as a prebuilt library.

  • compile with -g
  • run valgrind --tool=cachegrind ./bin
  • run cg_annotate cachegrind.out.xxxx to get a global report by file
  • run cg_annotate cachegrind.out.xxxx /abs/path/to/file.cpp to get a line-by-line report for a specific file
  • run cg_annotate cachegrind.out.xxxx --auto=yes to get a full line-by-line report for all files
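
To get a feel for what a line-by-line report highlights, the sketch below contrasts two traversals of the same matrix. The function names are hypothetical and the example is unrelated to the library itself. Both functions return the same sum, but the second one strides through memory, so for a matrix larger than the cache it would be expected to show noticeably more data cache misses. The exact numbers depend on the matrix size and on the cache configuration that cachegrind simulates.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Sums a rows x cols matrix stored row-major in a flat vector,
// visiting elements in the order they are laid out in memory.
long long sum_row_major(const std::vector<int>& m, std::size_t rows, std::size_t cols) {
  assert(m.size() == rows * cols);
  long long total = 0;
  for (std::size_t r = 0; r < rows; ++r) {
    for (std::size_t c = 0; c < cols; ++c) {
      total += m[r * cols + c];  // consecutive addresses
    }
  }
  return total;
}

// Same sum, but walks column by column: each access jumps `cols`
// elements ahead, which is unfriendly to the data cache.
long long sum_col_major(const std::vector<int>& m, std::size_t rows, std::size_t cols) {
  assert(m.size() == rows * cols);
  long long total = 0;
  for (std::size_t c = 0; c < cols; ++c) {
    for (std::size_t r = 0; r < rows; ++r) {
      total += m[r * cols + c];  // stride of `cols` elements
    }
  }
  return total;
}
```

Running both functions from a test file on a large matrix (a few thousand rows and columns, for instance) and annotating that file should make the difference visible on the two `total +=` lines.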

callgrind

callgrind helps identify which functions and source lines execute the most instructions. Note that this works better when the library sources are compiled together with a test file rather than linked against as a prebuilt library.

  • compile with -g
  • run valgrind --tool=callgrind ./bin
  • run callgrind_annotate callgrind.out.xxxx to get a global report by file
  • run callgrind_annotate callgrind.out.xxxx --auto=yes to get a full line-by-line report for all files

kcachegrind

kcachegrind is an alternative to cg_annotate and callgrind_annotate for navigating through the reports. It provides a graphical interface that can be much more convenient depending on the context.

  • run kcachegrind callgrind.out.xxx or kcachegrind cachegrind.out.xxx

Approach

A top-down approach helps focus the analysis on the relevant areas.

Typically:

  • identify slow components with pprof
  • once identified, find the root cause using cachegrind and callgrind