{{#x|Flame Graphs}} for {{#i|Online}} Performance Profiling
☺{{#author|agentzh@gmail.com}}☺
{{#author|Yichun Zhang (agentzh)}}
{{img src="images/cloudflare2.gif" width="141" height="71"}}
{{#date|2013.06.01}}
----
{{#x|♡}} {{#x|Flame Graphs}} are a kind of visualization for analyzing how time or some other resource is {{#ci|distributed}} among all the code paths.
{{img src="images/small-flamegraph.png" width="240" height="64"}}
----
{{img src="images/my-day-flame-graph2.png" width="806" height="403"}}
----
{{#x|♡}} {{#c|Colors}} in Flame Graphs do {{#x|not}} matter; they are picked at random.
{{img src="images/colors2.png" width="320" height="160"}}
----
{{img src="images/my-day-samples2.png" width="792" height="490"}}
----
{{img src="images/my-day-merge2.png" width="784" height="368"}}
----
{{#x|♡}} Box {{#ci|widths}} are proportional to the {{#i|number}} of corresponding samples; sample count is proportional to {{#x|time}}.
{{img src="images/box-width2.png" width="276" height="131"}}
----
{{#x|♡}} For Flame Graphs in the {{#i|software}} world, {{#ci|code paths}} are defined as {{#x|backtraces}}.
{{img src="images/backtrace2.png" width="311" height="63"}}
----
{{img src="images/software-stack2.png" width="563" height="410"}}
----
IO::Select::select
IO::Socket::connect
IO::Socket::INET::connect
IO::Socket::INET::configure
IO::Socket::new
IO::Socket::INET::new
Test::Nginx::Socket::send_request
Test::Nginx::Socket::run_test_helper
Test::Nginx::Util::run_test
Test::Nginx::Util::run_tests
----
0x3880ef2877 : socket+0x7/0x30 [/usr/lib64/libc-2.15.so]
0x537445 : Perl_pp_socket+0x233/0x376 [/opt/perl/bin/perl]
0x4d24ab : Perl_runops_standard+0x17/0x40 [/opt/perl/bin/perl]
0x43d8cc : S_run_body+0x1a2/0x1ac [/opt/perl/bin/perl]
0x43d363 : perl_run+0xae/0x475 [/opt/perl/bin/perl]
0x41e34c : main+0xc0/0x146 [/opt/perl/bin/perl]
0x3880e21735 : __libc_start_main+0xf5/0x1c0 [/usr/lib64/libc-2.15.so]
0x41e1a9 : _start+0x29/0x2c [/opt/perl/bin/perl]
----
0xffffffff81632f81 : _raw_spin_unlock_irqrestore+0x11/0x20 [kernel]
0xffffffff8108e98e : __wake_up_sync_key+0x5e/0x80 [kernel]
0xffffffff8119d340 : pipe_write+0x3c0/0x540 [kernel]
0xffffffff81194737 : do_sync_write+0xa7/0xe0 [kernel]
0xffffffff81194dec : vfs_write+0xac/0x180 [kernel]
0xffffffff81195132 : sys_write+0x52/0xa0 [kernel]
0xffffffff8163baa7 : tracesys+0xdd/0xe2 [kernel]
----
{{#x|♡}} We {{#ci|gather}} various kinds of backtraces on Linux via {{#x|systemtap}}.
{{img src="images/systemtap2.png" width="180" height="169"}}
----
{{img src="images/how-systemtap-works2.png" width="806" height="403"}}
----
{{#x|♡}} At every Linux {{#ci|system tick}} (controlled by {{#x|CONFIG_HZ}}, 1000 on my machine), if the current process {{#ci|on CPU}} is the process we are interested in, we sample a backtrace and {{#x|aggregate}} it immediately.
----
{{#x|♡}} The {{#ci|DWARF}} debug information is the {{#x|map}} for the cold {{#c|binary world}}.
{{img src="images/dwarf_logo2.gif" width="134" height="182"}}
----
{{#v|$}} gcc {{#c|-g}} ...
{{#v|$}} sh Configure -Doptimize={{#c|-g}} -des -Dprefix=/opt/perl
{{#v|$}} yum install xxx-{{#c|debuginfo}}
{{#v|$}} apt-get install xxx-{{#c|dbg}}
----
{{#x|♡}} Simple wrapper {{#ci|tools}} based on systemtap are ready for {{#x|everyday use}}.
----
{{#x|♡}} Generating {{#ci|Perl}}-land Flame Graphs with just {{#x|2}} commands.
----
{{img src="images/github-perl-stap-toolkit2.png" width="818" height="381"}}
----
{{#cm|# assuming the perl process has pid 1302.}}
{{#v|$}} pl-sample-bt -p {{#x|1302}} -t 5 > {{#x|a.bt}}
WARNING: Sampling 1302 (/opt/perl/bin/perl) for Perl-land backtraces... Please wait for 5 seconds.
----
Test::Nginx::Socket::send_request
Test::Nginx::Socket::run_test_helper
Test::Nginx::Util::run_test
Test::Nginx::Util::run_tests
{{#x|58}}
Test::Nginx::Util::error_log_data
Test::Nginx::Socket::check_error_log
Test::Nginx::Socket::run_test_helper
Test::Nginx::Util::run_test
Test::Nginx::Util::run_tests
{{#x|54}}
...
----
{{img src="images/github-flamegraph2.png" width="787" height="371"}}
----
{{#v|$}} stackcollapse-stap.pl {{#x|a.bt}} | flamegraph.pl - > {{#x|a.svg}}
----
{{http://agentzh.org/misc/flamegraph/perl-test-nginx-socket.svg}}
{{img src="images/perl-test-nginx-socket2.png" width="960" height="258"}}
----
{{img src="images/interactive-svg.png" width="673" height="517"}}
----
{{img src="images/svg-all-samples.png" width="567" height="532"}}
----
{{#x|♡}} I just ported the implementation of perl 5's {{#ci|pp_caller}} opcode over to the {{#x|systemtap}} scripting language.
----
{{#x|♡}} Generating user-space {{#ci|C}}-land Flame Graphs for the {{#x|same}} perl process with another 2 commands.
----
{{img src="images/github-nginx-stap-toolkit2.png" width="813" height="371"}}
----
{{#cm|# assuming the perl process has pid 1302.}}
{{#v|$}} ngx-sample-bt -p {{#x|1302}} -t 5 {{#x|-u}} > {{#x|a.bt}}
WARNING: Tracing 1302 (/opt/perl/bin/perl) in user-space only...
WARNING: Time's up.
Quitting now...(it may take a while)
----
{{#v|$}} stackcollapse-stap.pl {{#x|a.bt}} | flamegraph.pl - > {{#x|a.svg}}
----
{{http://agentzh.org/misc/flamegraph/perl-vm-test-nginx.svg}}
{{img src="images/perl-vm-test-nginx2.png" width="804" height="280"}}
----
{{img src="images/pp-aassign.png" width="681" height="540"}}
----
{{img src="images/pp-entersub.png" width="571" height="336"}}
----
{{img src="images/pp-method-named.png" width="618" height="325"}}
----
{{#x|♡}} We can {{#x|profile}} at the Perl 5 {{#ci|opcode}} level via the user-space C-land Flame Graphs.
----
{{#x|♡}} We may make clever use of high-level Perl language constructs to {{#ci|eliminate}} specific hot Perl 5 {{#x|opcodes}}.
----
{{#x|♡}} We may help the Perl 5 {{#x|porters}} find hot spots within the perl VM that can be further {{#ci|optimized}}.
----
{{#x|♡}} Actually we are already doing {{#ci|both}} for {{#x|LuaJIT}} at CloudFlare.
----
{{img src="images/luajit-fnew.png" width="563" height="369"}}
----
{{#x|lj_BC_CAT}} --> switch to string arrays + concat
{{#x|lj_BC_FNEW}} --> create fewer anonymous functions
----
{{img src="images/pcre_compile2.png" width="432" height="358"}}
----
{{#x|pcre_compile2}} --> cache the compiled regexes
----
{{img src="images/lua-yield.png" width="259" height="386"}}
----
{{#x|lua_yield}} --> LuaJIT {{#ci|internal}} optimizations by Mike Pall
----
{{img src="images/luajit-newkey.png" width="282" height="202"}}
----
{{#x|lj_tab_newkey}} --> new LuaJIT {{#ci|primitive}} table.new() for pre-allocation
----
{{#x|♡}} Generating {{#ci|kernel}}-space Flame Graphs for the {{#x|same}} perl process with 2 similar commands.
----
{{#cm|# assuming the perl process has pid 1302.}}
{{#v|$}} ngx-sample-bt -p {{#x|1302}} -t 5 {{#x|-k}} > {{#x|a.bt}}
WARNING: Tracing 1302 (/opt/perl/bin/perl) in kernel-space only...
WARNING: Time's up.
Quitting now...(it may take a while)
----
{{#v|$}} stackcollapse-stap.pl {{#x|a.bt}} | flamegraph.pl - > {{#x|a.svg}}
----
{{http://agentzh.org/misc/flamegraph/kernel-test-nginx.svg}}
----
{{img src="images/kernel-test-nginx2.png" width="804" height="516"}}
----
{{#x|♡}} {{#ci|off-CPU}} time Flame Graphs
----
{{#x|♡}} File {{#ci|I/O}} Flame Graphs
----
{{#x|♡}} Special {{#x|thanks}} go to Brendan Gregg for {{#ci|inventing}} Flame Graphs.
----
☺ {{#ci|Any questions}}? ☺
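----
The merge step behind the two-command pipelines in this talk can be sketched in a few lines of Python. This is only an illustration, not the actual logic of stackcollapse-stap.pl or flamegraph.pl. The function names `fold` and `box_widths` are hypothetical, and the input is assumed to be already parsed into (frames, count) pairs with the innermost frame first. The two stacks and their counts (58 and 54) are the ones from the sampled Perl-land output shown earlier.

```python
from collections import Counter


def fold(samples):
    """Merge identical backtraces into 'outer;...;inner count' lines.

    Assumption: each sample is (frames, count), frames innermost-first.
    """
    counts = Counter()
    for frames, count in samples:
        # Reverse so the outermost caller comes first in the folded line.
        counts[";".join(reversed(frames))] += count
    return [f"{stack} {n}" for stack, n in sorted(counts.items())]


def box_widths(folded):
    """Map each code path prefix to its share of all samples (0.0 to 1.0)."""
    total = 0
    boxes = Counter()
    for line in folded:
        stack, n = line.rsplit(" ", 1)
        n = int(n)
        total += n
        frames = stack.split(";")
        # Every prefix of the stack is one box; it collects this stack's samples.
        for depth in range(1, len(frames) + 1):
            boxes[tuple(frames[:depth])] += n
    return {path: n / total for path, n in boxes.items()}


# Sample data copied from the Perl-land backtrace slide (innermost frame first).
samples = [
    (["Test::Nginx::Socket::send_request",
      "Test::Nginx::Socket::run_test_helper",
      "Test::Nginx::Util::run_test",
      "Test::Nginx::Util::run_tests"], 58),
    (["Test::Nginx::Util::error_log_data",
      "Test::Nginx::Socket::check_error_log",
      "Test::Nginx::Socket::run_test_helper",
      "Test::Nginx::Util::run_test",
      "Test::Nginx::Util::run_tests"], 54),
]

for line in fold(samples):
    print(line)
```

Each folded line lists the frames from outermost to innermost, separated by semicolons and followed by the sample count. In this sketch the box for Test::Nginx::Util::run_tests spans the full width because both stacks pass through it, while the send_request path gets 58 of the 112 samples.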