Summary: | Step will skip subsequent statements for malloc functions | ||
---|---|---|---|
Product: | gdb | Reporter: | Anonymous <iamanonymous.cs> |
Component: | breakpoints | Assignee: | Not yet assigned to anyone <unassigned> |
Status: | RESOLVED FIXED | ||
Severity: | normal | CC: | iamanonymous.cs, jiangyy, vries |
Priority: | P2 | ||
Version: | HEAD | ||
Target Milestone: | 11.1 | ||
Host: | Target: | ||
Build: | Last reconfirmed: | 2021-01-06 00:00:00 |
Description
Anonymous
2021-01-05 15:31:42 UTC
I managed to reproduce this on Ubuntu 20. Configurations:
- gcc-10, system gdb,
- gcc-10, gdb built from current trunk.

The problem goes away when small.c is built with -fcf-protection=none.

I tried to reproduce this on my usual setup, openSUSE Leap 15.2, by forcing -fcf-protection=full. Didn't reproduce.

Copied the Leap executable to Ubuntu, and tried using gdb there. Didn't reproduce.

Then copied the Ubuntu executable to Leap. Reproduced.

So far, this seems specific to the Ubuntu executable.

The two executables have similar line info and insns for main.

There is a difference in the plt. For Leap, we have:
...
00000000000005f0 <malloc@plt>:
 5f0:   ff 25 32 0a 20 00       jmpq   *0x200a32(%rip) \
                                       # 201028 <malloc@GLIBC_2.2.5>
 5f6:   68 02 00 00 00          pushq  $0x2
 5fb:   e9 c0 ff ff ff          jmpq   5c0 <.plt>
...

For Ubuntu, we have:
...
0000000000001090 <malloc@plt>:
    1090:       f3 0f 1e fa             endbr64
    1094:       f2 ff 25 35 2f 00 00    bnd jmpq *0x2f35(%rip) \
                                        # 3fd0 <malloc@GLIBC_2.2.5>
    109b:       0f 1f 44 00 00          nopl   0x0(%rax,%rax,1)
...

Using "set debug infrun 1", with Leap we have:
...
[infrun] handle_signal_stop: stop_pc=0x5555555545f0
[infrun] process_event_stop_test: stepped into dynsym resolve code
...
where:
...
(gdb) info sym 0x5555555545f0
malloc@plt in section .plt of /home/vries/gdb_versions/devel/a.leap.out
...

But with Ubuntu we have:
...
[infrun] handle_signal_stop: stop_pc=0x555555555090
[infrun] process_event_stop_test: stepped into subroutine
[infrun] insert_step_resume_breakpoint_at_sal_1: inserting step-resume breakpoint at 0x7ffff7df0710
...
where:
...
(gdb) info sym 0x555555555090
malloc@plt in section .plt.sec of /home/vries/gdb_versions/devel/a.out
...
and:
...
(gdb) info sym 0x7ffff7df0710
malloc in section .text of /lib64/ld-linux-x86-64.so.2
...

Looking for the "stepped into dynsym resolve code" message in the gdb sources, we find in_solib_dynsym_resolve_code, which returns false with the Ubuntu exec, and true with the Leap exec.

This fixes it:
...
diff --git a/gdb/objfiles.h b/gdb/objfiles.h
index b9bb80b7a62..2afd2f80154 100644
--- a/gdb/objfiles.h
+++ b/gdb/objfiles.h
@@ -786,7 +786,8 @@ extern int pc_in_section (CORE_ADDR, const char *);
 static inline int
 in_plt_section (CORE_ADDR pc)
 {
-  return pc_in_section (pc, ".plt");
+  return (pc_in_section (pc, ".plt")
+          || pc_in_section (pc, ".plt.sec"));
 }
 
 /* Keep a registry of per-objfile data-pointers required by other GDB
...

*** Bug 27120 has been marked as a duplicate of this bug. ***

*** Bug 25565 has been marked as a duplicate of this bug. ***

(In reply to Tom de Vries from comment #2)
> This fixes it:
> ...
> diff --git a/gdb/objfiles.h b/gdb/objfiles.h
> index b9bb80b7a62..2afd2f80154 100644
> --- a/gdb/objfiles.h
> +++ b/gdb/objfiles.h
> @@ -786,7 +786,8 @@ extern int pc_in_section (CORE_ADDR, const char *);
>  static inline int
>  in_plt_section (CORE_ADDR pc)
>  {
> -  return pc_in_section (pc, ".plt");
> +  return (pc_in_section (pc, ".plt")
> +          || pc_in_section (pc, ".plt.sec"));
>  }
>
>  /* Keep a registry of per-objfile data-pointers required by other GDB
> ...

Thanks a lot. I have tried this commit and it indeed fixes all similar problems.

This is slightly more convoluted.

I tried to reproduce the problem on openSUSE Factory. Using -fcf-protection=full, I managed to get a .plt.sec section there. But gdb handled it ok. It did not take the "stepped into dynsym resolve code" path, but handled things fine along another path.

So I debugged the Ubuntu exec on Leap once more. I found that at some point we do:
...
      /* If we are in a function call trampoline (a stub between the
         calling routine and the real function), locate the real
         function.  That's what tells us (a) whether we want to step
         into it at all, and (b) what prologue we want to run to the
         end of, if we do step into it.  */
      real_stop_pc = skip_language_trampoline (frame, stop_pc);
...
and end up in objc_language::skip_trampoline, and then in gdbarch_skip_trampoline_code, and then in find_solib_trampoline_target:
...
/* If PC is in a shared library trampoline code stub, return the
   address of the `real' function belonging to the stub.
   Return 0 if PC is not in a trampoline code stub or if the real
   function is not found in the minimal symbol table.

   We may fail to find the right function if a function with the
   same name is defined in more than one shared library, but this
   is considered bad programming style.  We could return 0 if we find
   a duplicate function in case this matters someday.  */

CORE_ADDR
find_solib_trampoline_target (struct frame_info *frame, CORE_ADDR pc)
{
  struct minimal_symbol *tsymbol = lookup_solib_trampoline_symbol_by_pc (pc);

  if (tsymbol != NULL)
    {
      for (objfile *objfile : current_program_space->objfiles ())
        {
          for (minimal_symbol *msymbol : objfile->msymbols ())
            {
...

So, we find that the pc is a trampoline for malloc, and start iterating over the minsyms in the objfiles.

With openSUSE Leap (glibc 2.26), we find this as first match:
...
$ nm /lib64/ld-linux-x86-64.so.2 | grep malloc
0000000000019710 W malloc
...

With openSUSE Factory (glibc 2.32), we have instead rtld_malloc so skip_language_trampoline returns 0.

The master branch has been updated by Tom de Vries <vries@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=5fae2a2c66ca865f54505adb37be6bd51fecb6cd

commit 5fae2a2c66ca865f54505adb37be6bd51fecb6cd
Author: Tom de Vries <tdevries@suse.de>
Date:   Thu Jan 14 10:35:34 2021 +0100

[gdb/breakpoint] Handle .plt.sec in in_plt_section

Consider the following test-case small.c:
...
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main (void)
{
  int *p = (int *)malloc (sizeof(int) * 4);
  memset (p, 0, sizeof(p));
  printf ("p[0] = %d; p[3] = %d\n", p[0], p[3]);
  return 0;
}
...

On Ubuntu 20.04, we get:
...
$ gcc -O0 -g small.c
$ gdb -batch a.out -ex start -ex step
Temporary breakpoint 1, main () at small.c:6
6         int *p = (int *) malloc(sizeof(int) * 4);
p[0] = 0; p[3] = 0
[Inferior 1 (process $dec) exited normally]
...
but after switching off the on-by-default fcf-protection, we get the desired behaviour:
...
$ gcc -O0 -g small.c -fcf-protection=none
$ gdb -batch a.out -ex start -ex step
Temporary breakpoint 1, main () at small.c:6
6         int *p = (int *) malloc(sizeof(int) * 4);
7         memset (p, 0, sizeof(p));
...

Using "set debug infrun 1", the first observable difference between the two debug sessions is that with -fcf-protection=none we get:
...
[infrun] process_event_stop_test: stepped into dynsym resolve code
...
In this case, "in_solib_dynsym_resolve_code (malloc@plt)" returns true because "in_plt_section (malloc@plt)" returns true.

With -fcf-protection=full, "in_solib_dynsym_resolve_code (malloc@plt)" returns false because "in_plt_section (malloc@plt)" returns false, because the section name for malloc@plt is .plt.sec instead of .plt, which is not handled in in_plt_section:
...
static inline int
in_plt_section (CORE_ADDR pc)
{
  return pc_in_section (pc, ".plt");
}
...

Fix this by handling .plt.sec in in_plt_section.

Tested on x86_64-linux.

[ Another requirement to be able to reproduce this is to have a dynamic linker with a "malloc" minimal symbol, which causes find_solib_trampoline_target to find it, such that skip_language_trampoline returns the address for the dynamic linkers malloc. This causes the step machinery to set a breakpoint there, and to continue, expecting to hit it. Obviously, we execute glibc's malloc instead, so the breakpoint is not hit and we continue to program completion. ]

gdb/ChangeLog:

2021-01-14  Tom de Vries  <tdevries@suse.de>

        PR breakpoints/27151
        * objfiles.h (in_plt_section): Handle .plt.sec.

Patch committed, marking resolved-fixed.

No test-case.
Triggering the error condition depends on external factors, so I'm not sure I'll be able to make one.

BTW, my guess is that there are already test-cases that fail because of this on Ubuntu 20.04.
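
For readers following the analysis, here is a minimal C sketch of why a check that matches only ".plt" misses malloc@plt in the Ubuntu executable. This is not gdb code: the toy_section struct, the toy_* function names, and the section address ranges are all hypothetical. Only the two section names and the fact that malloc@plt sits at 0x1090 in .plt.sec come from the report.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for gdb's section bookkeeping: a section name
   and the half-open address range [start, end) it covers.  */
struct toy_section
{
  const char *name;
  uint64_t start;
  uint64_t end;
};

/* Toy analogue of pc_in_section: true if PC lies inside a section
   whose name is exactly NAME.  */
bool
toy_pc_in_section (const struct toy_section *secs, size_t n,
                   uint64_t pc, const char *name)
{
  for (size_t i = 0; i < n; i++)
    if (pc >= secs[i].start && pc < secs[i].end
        && strcmp (secs[i].name, name) == 0)
      return true;
  return false;
}

/* Behaviour before the patch: only ".plt" counts as PLT code.  */
bool
toy_in_plt_section_old (const struct toy_section *secs, size_t n,
                        uint64_t pc)
{
  return toy_pc_in_section (secs, n, pc, ".plt");
}

/* Behaviour after the patch: ".plt.sec" counts as well.  */
bool
toy_in_plt_section_new (const struct toy_section *secs, size_t n,
                        uint64_t pc)
{
  return (toy_pc_in_section (secs, n, pc, ".plt")
          || toy_pc_in_section (secs, n, pc, ".plt.sec"));
}

/* Layout loosely modelled on the Ubuntu executable.  Only the fact that
   0x1090 (malloc@plt) lies in .plt.sec is taken from the report; every
   section boundary below is made up for illustration.  */
const struct toy_section toy_ubuntu_layout[] = {
  { ".plt",     0x1020, 0x1090 },
  { ".plt.sec", 0x1090, 0x10d0 },
  { ".text",    0x10d0, 0x2000 },
};
```

With this made-up layout, toy_in_plt_section_old rejects 0x1090 while toy_in_plt_section_new accepts it, which mirrors the "stepped into subroutine" versus "stepped into dynsym resolve code" split seen in the infrun logs.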