Gdb调试多进程程序 02月14日

最近需要调试memcached等程序,发现看代码是一件很悲剧的事情,最后还是借助gdb吧,这个算一劳永逸了。

Gdb调试多进程程序

程序经常使用fork/exec创建多进程程序。多进程程序有自己独立的地址空间,这是多进程调试首要注意的地方。Gdb功能强大,对调试多线程提供很多支持。

方法1:调试多进程最土的办法:attach pid

Attach是调试进程的常用办法,只要有可执行程序以及相应PID,即可工作。当然,为方便调试,可以在进程启动后,设定sleep一段时间,如30s,这样即可有充足的时间来attach。

方法2: set follow-fork-mode child + main断点

当设置set follow-fork-mode child,gdb将在fork之后直接执行子进程,知道碰到断点后停止。如何设置子进程的断点呢?在父进程中是无法知道子进程的地址空间的(只有等程序载入后方可知)。Gdb提供一个很方便的机制:main函数的断点将被子进程继承(毕竟main是任何程序的入口)。

注意:程序在main停下后,可尝试设置断点。断点是否有效,取决于gdb是否已经载入目标程序的地址空间。

方法3: set follow-fork-mode child + catch exec

Cache点是一种特殊的breakpoint。Gdb能够catch的事件很多,如throw/catch/exception/syscall/exec/fork/vfork等。其中和多进程关系最大的就是exec/fork事件。

举例:

GNU gdb Fedora (6.8-27.el5)
Copyright (C) 2008 Free Software Foundation, Inc.
(gdb) catch exec
Catchpoint 1 (exec)
(gdb) set follow-fork-mode child
(gdb) r  -d ***
Catchpoint 1 (exec'd /****/binary), 0x0000003c68800a70 in _start ()
from /lib64/ld-linux-x86-64.so.2
(gdb) bt
#0  0x0000003c68800a70 in _start () from /lib64/ld-linux-x86-64.so.2
#1  0x0000000000000003 in ?? ()
#2  0x00007fff65c6e85a in ?? ()
#3  0x00007fff65c6e85d in ?? ()
#4  0x00007fff65c6e860 in ?? ()
(gdb) b lib.cc:8720
No symbol table is loaded.  Use the "file" command.
(gdb) c
Continuing
(gdb) bt
#0  0x0000003c68800a70 in _start () from /lib64/ld-linux-x86-64.so.2
#1  0x0000000000000002 in ?? ()
#2  0x00007fff1af7682a in ?? ()
#3  0x0000000000000000 in ?? ()
(gdb)  b lib.cc:8720
Breakpoint 2 at 0x15f9694: file lib.cc, line 8720.
(gdb) c
Continuing.
[Thread debugging using libthread_db enabled]
[Thread 0x40861940 (LWP 12602) exited]
[Switching to process 12630]
0x0000003c6980d81c in vfork () from /lib64/libpthread.so.0
Warning:
Cannot insert breakpoint 2.
Error accessing memory address 0x15f9694: Input/output error.
(gdb) bt
#0  0x0000003c6980d81c in vfork () from /lib64/libpthread.so.0
#1  0x000000000040c3fb in ?? ()
#2  0x00002adeab604000 in ?? ()
#3  0x01000000004051ef in ?? ()
#4  0x00007fffff4a42f0 in ?? ()
#5  0x686365746e6f6972 in ?? ()
#6  0x0000000d0000000c in ?? ()
#7  0x0000000b0000000a in ?? ()
#8  0x0000000000000000 in ?? ()
(gdb) delete 2  --此处当breakpoint无效时,必须删除,否则程序无法继续
(gdb) c
Continuing.
[New process 12630]
Executing new program: /****/binary
warning: Cannot initialize thread debugging library: generic error
[Switching to process 12630]
Catchpoint 1 (exec'd /****/binary), 0x0000003c68800a70 in _start ()
from /lib64/ld-linux-x86-64.so.2
(gdb) bt
#0  0x0000003c68800a70 in _start () from /lib64/ld-linux-x86-64.so.2
#1  0x0000000000000009 in ?? ()
Backtrace stopped: previous frame inner to this frame (corrupt stack?)
(gdb) b lib.cc:8720
Breakpoint 4 at 0x15f9694: file lib.cc, line 8720.
(gdb) b type.cc:32
Breakpoint 5 at 0x1693050: file type.cc, line 32.
(gdb) c
Continuing.
(gdb)  -- 和正常程序调试一样

说明:catch exec后,程序将在fork/vfork/exec处停下。并非每次停下后,设置断点都是有效的。如提供断点无效,需要删除,否则程序无法继续。要能够在新进程中设置断点,一定要等到新进程的地址空间被载入后,设置断点是才有效(exec将改变原程序的地址空间)。上述例子,主要想展示如何对新进程设置断点!

注意:
1)程序地址非常重要(代码和数据地址一样重要)。使用gdb时,多多注意和利用地址信息。
2)On some systems, when a child process is spawned by vfork, you cannot debug the child or parent until an exec call completes.

方法4info inferiors/inferiors inferiors

设置set detach-on-fork off/set follow-exec-mode new

If you choose to set `detach-on-fork’ mode off, then gdb will retain control of all forked processes (including nested forks). You can list the forked processes under the control of gdb by using the info inferiors command, and switch from one fork to another by using the inferior command.

所使用的gdb不支持set detach-on-fork off/set follow-exec-mode new/info inferiors。不清楚。