Over the past few weeks I annoyed my environment with praising DTrace whenever possible. Yesterday, during a break at the Barcamp Munich, I gave Wolfram a short introduction on his Mac and decided to put some stuff here:
DTrace is a toolkit available on Solaris (Solaris 10 or OpenSolaris), recent MacOS versions and FreeBSD for mightier than tools like truss or strace but with way less impact. DTrace allows you to "hook" (called "probes") into the system and allows to do some analysis then.
I guess all that works best by showing an example first: PHP uses a wrapper over the system's memory allocation using a function called _emalloc (which is wrapped by a CPP macro called emalloc) so it might be interesting to see how often that function is being called. For doing that we can use a D-script (D being the DTrace scripting language, not DigitalMars's D) like that:
#!/usr/sbin/dtrace -s
#pragma D option quiet
pid$target::_emalloc:entry
{
printf("_emalloc was called!\n");
}
We can now simply call that script and tell DTrace to start a PHP interpreter and run a PHP script. DTrace will then change the running program in memory so that the message is printed whenever the system for the process, with the PID $target, enters the function _emalloc. $target is a special variable referring to a process started by DTrace using -c or a PID provided using -p.
$ ./script1.d -c "php script.php"
_emalloc was called!
_emalloc was called!
_emalloc was called!
...
That's nice but not really useful in any way, yet. As we'd like to at least know the size of the allocated memory area, which is the first parameter to _emalloc. The pid-provider helps us by providing the parameters to the functions as D-variables, so we can simply change our action to print that variable:
printf("_emalloc called, allocating %i bytes\n", arg0);
running the script now gives us the sizes:
./script2.d -c "php script.php"
_emalloc was called, allocating 5 bytes
_emalloc was called, allocating 6 bytes
_emalloc was called, allocating 5 bytes
...
The output is quite long and still rather useless, for making use from this information we at least need some aggregation, but DTrace helps there, too, so let's create an aggregation variable collecting the data in a usable way:
#!/usr/sbin/dtrace -s
#pragma D option quiet
pid$target::_emalloc:entry
{
@mallocsize["emalloc"] = quantize(arg0);
}
mallocsize and
emalloc are there freely chosen identifiers. Depending on your script the output now looks something like the following:
emalloc
value ------------- Distribution ------------- count
0 | 0
1 | 83
2 |@@ 1122
4 |@@@@@@@@ 5141
8 |@@@@@@ 4032
16 |@@@@@@@@@@@@@@@@@@ 11881
32 |@@@@@@ 3694
64 |@ 806
128 | 27
256 | 66
512 | 1
1024 | 1
2048 | 1
4096 | 1
8192 | 4
16384 | 0
32768 | 1
65536 | 0
131072 | 1
262144 | 0
This tells us that the most used allocation size is between 9 and 16 bytes and the largest space allocated is somewhere between 65536 and 131072 bytes.
For a deeper analysis we can now add a predicate to our probe so the action triggers only for that allocation. Such predicates are writing between slashes between the probe name and the action. Additionally I'm adding a ustack() call to the action, this will print the systems userspace backtrace -- which is C level, not PHP space.
#!/usr/sbin/dtrace -s
#pragma D option quiet
pid$target::_emalloc:entry
/ arg0 > 131072 /
{
printf("_emalloc(%i)\n", arg0);
ustack();
}
$ ./script4.d -c "php script.php"
emalloc(261900)
php`_emalloc
php`zend_vm_stack_new_page+0x19
php`zend_vm_stack_init+0xf
php`init_executor+0xf5
php`zend_activate+0x12
php`php_request_startup+0x7a
php`main+0xd86
php`_start+0x7d
So we see we're in the startup of PHP allocating some space on it's stack. One question now might be about the costs of an _emalloc call, one important factor there are syscalls to the operating system. As DTrace is made for utilizing the whole system that can be done quite easy using the syscall provider. Me might now use syscall:::entry as probe to be triggered on every call, but that will be quite a lot. As we're only interested in syscalls from _emalloc we'll use a thread-local variable as a flag and check that flag in the predicate condition:
#!/usr/sbin/dtrace -s
#pragma D option quiet
pid$target::_emalloc:entry
/ arg0 > 131072 /
{
self->inemalloc = 1;
}
pid$target::_emalloc:return
/ arg0 > 131072 /
{
self->inemalloc = 0;
}
syscall:::entry
/ self->inemalloc /
{
printf("%s", probefunc);
}
$ ./script4.d -c "php script.php"
brk
brk
So we're calling brk two times. brk is the syscall to "change the amount of space allocated for the calling process's data segment" which is exactly what we expect, but why is it called two times? Adding a ustack call to the syscall action can tell us where it happens, using the source this can then probably be optimized. That's left as an exercise to the interested reader.
In summary: No need to change the code and lots of information, I plan to write an additional article showing how to get interesting facts system-wide, not only for a specific process but all running ones, which is especially interesting when searching for a problem on production systems (DTrace is made to be used on productive systems!) or problems related to concurrent processes/threads.