Difference between revisions of "Stracing a program"

From OpenVZ Virtuozzo Containers Wiki
Jump to: navigation, search
m (Removed first heading -- it's introduction anyway)
(Fixed grammar and in some cases wording)
Line 1: Line 1:
Suppose some program on your system refuses to work or it works
+
Suppose some program on your system refuses to work or it works,
but musch slower then you've expected. How can you investigate yourself the reason
+
but much slower then you've expected. How can you investigate the reason
of such undesirable behaviour? First of all you can read and analyze logs that  
+
of such undesirable behaviour? First of all, you can read and analyze logs that  
this program does (if does). It's very usefull to run unkindly program with
+
this program does (if it does). It's very useful to run unkindly program with
<tt>--verbose</tt> argument or somehow else increase log level, if the program allows that.
+
<tt>--verbose</tt> argument or somehow else increase the log level, if the program allows that.
 
But what can you do, if logs are poor and it's impossible to infer the reason of badness from them?
 
But what can you do, if logs are poor and it's impossible to infer the reason of badness from them?
 
One way is to use strace program to follow system calls performed by given process.
 
One way is to use strace program to follow system calls performed by given process.
Line 10: Line 10:
 
Commonly to use strace you should give the following command:
 
Commonly to use strace you should give the following command:
  
<tt>strace -o strace.out -ff touch /tmp/file</tt>
+
<pre>
 
+
strace -o strace.out -ff touch /tmp/file
<tt>-o strace.out</tt> option means that strace program will output all investigated information
+
</pre>
to the file named <tt>strace.out</tt>
 
 
 
<tt>touch /tmp/file</tt> is the program with arguments to strace
 
  
If the process under strace (touch in our case) forks child processes, then, thanks to <tt>-ff</tt> option, child processes
+
Here
are straced too, and ouputs are placed in <tt>strace.out.PID</tt> files appropriately.
+
* <tt>-o strace.out</tt> option means that strace program will output all information
 +
to the file named <tt>strace.out</tt>;
 +
* <tt>-ff</tt> means to strace the forked children of the program. Child straces outputs will be placed to <tt>strace.out.PID</tt> files, where PID is a pid of the child.
 +
* <tt>touch /tmp/file</tt> is the program with arguments which is to be straced.
  
 
== Strace results ==
 
== Strace results ==
So what we have in strace.out:
+
So this is what we have in <tt>strace.out</tt>:
 
<pre>
 
<pre>
 
execve("/usr/bin/touch", ["touch", "/tmp/file"], [/* 51 vars */]) = 0
 
execve("/usr/bin/touch", ["touch", "/tmp/file"], [/* 51 vars */]) = 0
Line 61: Line 61:
 
</pre>
 
</pre>
  
Strace.out in conjunction with source code is realy very powerfull thing. So if you can't analyze strace yurself - send it
+
Analysis of <tt>strace.out</tt>, in conjunction with reading the source code, is realy very powerful thing. Even if you can not understand strace output yourself &mdash; send it to the developers, it'll be really helpful.
to developers, it'll be really helpful.
 
  
 
== Additional notes ==
 
== Additional notes ==
Sometimes (for example program works slow) you are also intersted in time, that program spends in syscalls. Than
+
Sometimes (for example if a program works slow) you are also intersted in time that the program spends in syscalls. Than
 
options <tt>-tt</tt> and <tt>-T</tt> are helpful.
 
options <tt>-tt</tt> and <tt>-T</tt> are helpful.
  
As you've seen strace program shows arguments of syscalls. But if arguments are a pointers it shows us only 32 bytes from these pointers.
+
As you've seen strace program shows arguments of syscalls. But if arguments are a pointers it shows us only 32 bytes from these pointers. To increase this number use <tt>-s</tt> option.
To increase this number use <tt>-s</tt> option
 
  
 
[[Category: Troubleshooting]]
 
[[Category: Troubleshooting]]

Revision as of 16:00, 9 June 2006

Suppose some program on your system refuses to work or it works, but much slower then you've expected. How can you investigate the reason of such undesirable behaviour? First of all, you can read and analyze logs that this program does (if it does). It's very useful to run unkindly program with --verbose argument or somehow else increase the log level, if the program allows that. But what can you do, if logs are poor and it's impossible to infer the reason of badness from them? One way is to use strace program to follow system calls performed by given process.

Use of strace

Commonly to use strace you should give the following command:

strace -o strace.out -ff touch /tmp/file

Here

  • -o strace.out option means that strace program will output all information

to the file named strace.out;

  • -ff means to strace the forked children of the program. Child straces outputs will be placed to strace.out.PID files, where PID is a pid of the child.
  • touch /tmp/file is the program with arguments which is to be straced.

Strace results

So this is what we have in strace.out:

execve("/usr/bin/touch", ["touch", "/tmp/file"], [/* 51 vars */]) = 0
uname({sys="Linux", node="dhcp0-138", ...}) = 0
brk(0)                                  = 0x804f000
access("/etc/ld.so.preload", R_OK)      = -1 ENOENT (No such file or directory)
open("/etc/ld.so.cache", O_RDONLY)      = 3
fstat64(3, {st_mode=S_IFREG|0644, st_size=47843, ...}) = 0
mmap2(NULL, 47843, PROT_READ, MAP_PRIVATE, 3, 0) = 0xb7f1a000
close(3)                                = 0
open("/lib/libc.so.6", O_RDONLY)        = 3
read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0\360V\1"..., 512) = 512
fstat64(3, {st_mode=S_IFREG|0755, st_size=1227872, ...}) = 0
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7f19000
mmap2(NULL, 1142148, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0xb7e02000
mmap2(0xb7f13000, 16384, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x110) = 0xb7f13000
mmap2(0xb7f17000, 7556, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0xb7f17000
close(3)                                = 0
mprotect(0xb7f13000, 4096, PROT_READ)   = 0
munmap(0xb7f1a000, 47843)               = 0
open("/dev/urandom", O_RDONLY)          = 3
read(3, "v\0265\313", 4)                = 4
close(3)                                = 0
brk(0)                                  = 0x804f000
brk(0x8070000)                          = 0x8070000
open("/tmp/file", O_WRONLY|O_NONBLOCK|O_CREAT|O_NOCTTY|O_LARGEFILE, 0666) = -1 EACCES (Permission denied)
utime("/tmp/file", NULL)                = -1 EACCES (Permission denied)
write(2, "touch: ", 7)                  = 7
write(2, "cannot touch `/tmp/file\'", 24) = 24
write(2, ": Permission denied", 19)     = 19
write(2, "\n", 1)                       = 1
exit_group(1)                           = ?

In this case we see, that the problem is in access to /tmp/file:

open("/tmp/file", O_WRONLY|O_NONBLOCK|O_CREAT|O_NOCTTY|O_LARGEFILE, 0666) = -1 EACCES (Permission denied)
utime("/tmp/file", NULL)                = -1 EACCES (Permission denied)

Analysis of strace.out, in conjunction with reading the source code, is realy very powerful thing. Even if you can not understand strace output yourself — send it to the developers, it'll be really helpful.

Additional notes

Sometimes (for example if a program works slow) you are also intersted in time that the program spends in syscalls. Than options -tt and -T are helpful.

As you've seen strace program shows arguments of syscalls. But if arguments are a pointers it shows us only 32 bytes from these pointers. To increase this number use -s option.