Breaking on an application error
You will find many instances where an unknown application (or one you are not very familiar with) fails while starting up, logs some error message on the terminal, and exits. Since it doesn't crash or raise any kind of OS-level signal, it's hard to guess the reason for the failure. Simply running it under a debugger doesn't help much either, because there is nothing for the debugger to stop on.
It would be helpful if we could break in the debugger at the point where the application is about to write to the terminal. From there, we can walk back through the callers to find the actual place that triggered the error.
gdb can do exactly that: stop execution at the exact point where the application tries to write a particular error string to stdout. It requires a little knowledge of parameter passing and calling conventions as defined in the ABI for the architecture/platform. The examples here are for x86/x86_64, but the same approach can be adapted to any other architecture or debugger.
With gdb 7.0 and above, you can set a conditional catchpoint on a syscall like write(), as below:
(gdb) catch syscall write
Catchpoint 1 (syscall 'write' [1])
(gdb) condition 1 $rdi==1 && strcmp((char*)($rsi), "The specified object is attached to another target") == 0
A couple of things to note here:
1. The write syscall has the following declaration:
ssize_t write(int fd, const void *buf, size_t count);
Since we are writing to stdout, the fd parameter will be 1 and the string will be in the second parameter, buf.
2. On the x86_64 architecture, according to the argument passing rules defined in the ABI, the first parameter is passed in RDI, the second in RSI, and the return value comes back in RAX. The return value can also be used in a condition where that helps.
On 32-bit x86 systems it's not that straightforward, since the parameters are passed on the stack, so we need to cast the stack locations holding the arguments. The additional examples show how.
3. gdb lets us call the strcmp() function, which we can use to compare the buffer against the required string. Note that the buffer may contain extra characters (a trailing newline, for instance) and may not match exactly. In that case we can use one of the other string functions, e.g. strstr().
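When gdb is built with Python support, the same condition can also be written as a Python breakpoint. The sketch below is one possible way to do it, not the method used above: it assumes x86_64, breaks on the libc write() function rather than the syscall, and reads exactly count bytes (taken from RDX, the third argument), so it does not rely on the buffer being NUL-terminated. The names NEEDLE, buffer_matches and WriteToStdout are made up for this example.

```python
# Load inside gdb with:  (gdb) source break_on_message.py   (file name is arbitrary)
try:
    import gdb  # this module only exists inside gdb's embedded Python
except ImportError:
    gdb = None  # lets the matching logic be imported and tried outside gdb

# Message to look for; taken from the catchpoint example above.
NEEDLE = b"The specified object is attached to another target"


def buffer_matches(data, needle=NEEDLE):
    """Return True if the bytes passed to write() contain the message."""
    return needle in bytes(data)


if gdb is not None:

    class WriteToStdout(gdb.Breakpoint):
        """Breakpoint on write() that stops only for the message on fd 1."""

        def stop(self):
            # Assumed x86_64 convention: fd in RDI, buf in RSI, count in RDX.
            if int(gdb.parse_and_eval("$rdi")) != 1:
                return False
            addr = int(gdb.parse_and_eval("$rsi"))
            count = int(gdb.parse_and_eval("$rdx"))
            if count <= 0:
                return False
            data = gdb.selected_inferior().read_memory(addr, count)
            return buffer_matches(data)

    WriteToStdout("write")
```

Once the breakpoint stops, the usual bt command shows the callers that led to the message.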
Additional examples:
1. We can also use the break command instead of the catch command :
(gdb) break foo if strcmp(baz,"hello") == 0
(gdb) break foo if ((int)strcmp(baz,"hello")) == 0
On some implementations gdb might not know the return type of strcmp. In that case you have to cast it as in the second form, otherwise the condition would always evaluate to true.
2. Difference between 32-bit and 64-bit
x86 32-bit mode
(gdb) break write if 1 == *(int*)($esp + 4) && strcmp((char*)*(int*)($esp + 8), "your string") == 0
x86 64-bit mode
(gdb) break write if 1 == $rdi && strcmp((char*)($rsi), "your string") == 0
3. Break when opening a certain file:
x86 32-bit mode
(gdb) break open
(gdb) condition 1 strcmp(((char**)$esp)[1], "bar") == 0
x86 64-bit mode
(gdb) break open
(gdb) condition 1 strcmp((char *)$rdi, "bar") == 0
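The only thing that changes between the 32-bit and 64-bit forms is where each argument lives at function entry. As a rough illustration, the hypothetical Python helper below (ARG_EXPR, c_quote and break_if_string are names invented here) generates break commands in the same shape as the examples in this section, which can then be pasted into gdb or saved to a command file. It assumes the argument being compared is a NUL-terminated string, and it always adds the (int) cast on strcmp.

```python
# Where the first three integer/pointer arguments are at function entry.
ARG_EXPR = {
    # x86_64: arguments arrive in registers.
    "x86_64": ["$rdi", "$rsi", "$rdx"],
    # 32-bit x86: arguments sit on the stack just above the return address.
    "i386": ["*(int *)($esp + 4)", "*(int *)($esp + 8)", "*(int *)($esp + 12)"],
}


def c_quote(text):
    """Quote text as a C string literal (escapes backslash, quote, newline)."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
    return '"' + escaped + '"'


def break_if_string(func, arg_index, text, arch="x86_64", extra=""):
    """Build 'break FUNC if ...' comparing string argument arg_index (0-based) with text."""
    arg = "(char *)" + ARG_EXPR[arch][arg_index]
    cond = "(int)strcmp({}, {}) == 0".format(arg, c_quote(text))
    if extra:
        cond = extra + " && " + cond
    return "break {} if {}".format(func, cond)


if __name__ == "__main__":
    for arch in ("i386", "x86_64"):
        fd_is_stdout = ARG_EXPR[arch][0] + " == 1"
        # write(fd, buf, count): buf is argument 1
        print(break_if_string("write", 1, "your string", arch, extra=fd_is_stdout))
        # open(pathname, flags): pathname is argument 0
        print(break_if_string("open", 0, "bar", arch))
```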
Valgrind/GDB Integration
Using GDB with Valgrind
You can do this with Valgrind 3.8.1 and above:
Start your executable activating the gdbserver at startup:
valgrind --vgdb-error=0 ....
Then in another window, connect a gdb to Valgrind (following the indications given by Valgrind). Then put a break-point at a relevant place (e.g. at the end of main) and use the gdb continue command till the breakpoint is reached. Then do a leak search from gdb:
gdb> monitor leak_check full reachable any
Then list the address(es) of the reachable blocks for the relevant loss record number (nr) reported by the leak search:
gdb> monitor block_list <loss_record_nr>
You can then use gdb features to examine the memory at the given address(es). Note also the potentially interesting command "who_points_at", which helps when you are trying to find out who has kept a pointer to this memory.
gdb> monitor leak_check
This gives you a summary, exactly as at the end of any Valgrind run, but at precisely this point in the execution. You can add various options to the command to get more out of it, but since this is sufficient for the purpose of this blog, those are left as an exercise to the reader.
gdb> monitor who_points_at <addr>
This will give you hints about where this particular memory is referenced from. Usually these are the places that get overwritten, and that is when the memory gets lost.
Emulated hardware watchpoints
If you use GDB 7.4 together with Valgrind 3.7.0, then you have unlimited "emulated" hardware watchpoints.
Start your program under Valgrind with the arguments --vgdb=full --vgdb-error=0, then use GDB to connect to it (target remote | vgdb). You can then watch, awatch or rwatch a memory range, for example:
gdb> rwatch (char[100]) *0x5180040
link : http://www.valgrind.org/docs/manual/manual-core-adv.html#manual-core-adv.gdbserve
More GDB hacks
1. Here is how to make breakpoint 14 stop only when rawbuf contains "FLUSH LOGS". At the (gdb) prompt enter the command:
(gdb) cond 14 strstr (rawbuf, "FLUSH LOGS")
Then type the command "c" and watch it run until breakpoint 14 is reached AND the rawbuf variable contains the [sub]string "FLUSH LOGS".
2. Make breakpoint 14 stop only if the variable opt->name matches the string "user":
(gdb) cond 14 strcmp (opt->name, "user") == 0
There might be an issue with the above form: gdb needs to call malloc in the debugged process to allocate space for the "user" string every time the condition is evaluated.
We can try the following form instead:
(gdb) set $str = "hello"
(gdb) cond 1 strcmp (s.whatever, $str) == 0
3. To set a breakpoint in a library when it's loaded, you can set a breakpoint on the symbol _dl_open. This function is called when a new library is loaded. You can set your own breakpoint once you see that your library has been loaded.
4. set step-mode on
The set step-mode on command causes the step command to stop at the first instruction of a function which contains no debug line information rather than stepping over it.
This is useful in cases where you may be interested in inspecting the machine instructions of a function which has no symbolic info and do not want gdb to automatically skip over this function.
5. Attach to a process when it's launched
$ gdb --waitfor=PROCNAME
Where PROCNAME is the process name to continuously poll for until it has been launched. Note that some instructions will have been executed before GDB attaches this way because polling is not exactly instant however close it might seem to be.
6. A very useful technique is redirecting stdout (file descriptor 1) and/or stderr (file descriptor 2) after a program has started. We are going to exploit the fact that print can call functions in the debugged process, dup2 and open in our case. After attaching to the process do the following:
(gdb) p (void) dup2((int) open("/tmp/out.txt", 0x201, 0640), 1)
(gdb) p (void) dup2((int) open("/tmp/err.txt", 0x201, 0640), 2)
(gdb) detach
(gdb) q
In short, this redirects stdout to "/tmp/out.txt" and stderr to "/tmp/err.txt". The open call opens a file for writing. The flags value "0x201" means "write only, and create the file if it does not exist", since O_CREAT | O_WRONLY = 0x200 | 0x1 = 0x201 (see the fcntl.h header file for details). These numeric values are platform-specific: 0x200 is O_CREAT on macOS and the BSDs, while on Linux x86/x86_64 O_CREAT is 0x40, so the equivalent value there is 0x41. "0640" is the file mode used if the file gets created (user has RW and group has R). After opening the file and getting its FD, we need to redirect the descriptor in question (stdout or stderr) to it. This is done with dup2, which closes the target descriptor (1 or 2) if it is open and makes it refer to the newly opened file.
Additionally, it's important to cast the types so that GDB behaves correctly. Otherwise it may complain with the following error message:
Unable to call function "open" at 0x7fff906e3fe4: no return type information available.
To call this function anyway, you can cast the return type explicitly (e.g. 'print (float) fabs (3.0)')
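The flag arithmetic and the dup2 behaviour can be checked outside gdb before typing raw numbers at the prompt. The following Python sketch uses the os module, whose open and dup2 wrap the same calls; open_flags_hex and redirect_fd are hypothetical names chosen for this illustration, and the demo redirects one end of a pipe instead of the real stdout so that nothing on the terminal is disturbed.

```python
import os
import tempfile


def open_flags_hex():
    """Numeric value of O_WRONLY | O_CREAT on the platform running this script."""
    return hex(os.O_WRONLY | os.O_CREAT)


def redirect_fd(target_fd, path, mode=0o640):
    """Point target_fd at path using open() followed by dup2()."""
    new_fd = os.open(path, os.O_WRONLY | os.O_CREAT, mode)
    if new_fd != target_fd:
        # dup2 closes target_fd if it is open, then makes it refer to the file.
        os.dup2(new_fd, target_fd)
        # Unlike the one-line gdb command, close the now redundant descriptor.
        os.close(new_fd)


if __name__ == "__main__":
    # The value to pass to open() from the gdb prompt on this platform.
    print("O_WRONLY | O_CREAT =", open_flags_hex())

    # Stand-in for stdout: the write end of a pipe.
    read_end, write_end = os.pipe()
    path = os.path.join(tempfile.mkdtemp(), "out.txt")
    redirect_fd(write_end, path)
    os.write(write_end, b"now goes to the file\n")
    os.close(write_end)
    os.close(read_end)
    with open(path, "rb") as f:
        print(f.read())
```

Running it on the machine where the target process lives prints the flags value that is valid there, which can then replace 0x201 in the gdb commands if it differs.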
Another scenario would be to completely turn off output to stdout/stderr:
(gdb) p (void) close(1)
(gdb) p (void) close(2)
(gdb) det
(gdb) q