This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



Re: glibc 2.22 -- Final testing for 32-bit x86 failing?


On 08/02/2015 01:23 PM, H.J. Lu wrote:
> On Sun, Aug 2, 2015 at 8:51 AM, Carlos O'Donell <carlos@redhat.com> wrote:
>> On 08/01/2015 04:03 PM, H.J. Lu wrote:
>>> On Sat, Aug 1, 2015 at 12:27 PM, Carlos O'Donell <carlos@redhat.com> wrote:
>>>> Community,
>>>>
>>>> Has anyone else done 32-bit x86 testing and had it work? Top of master
>>>> is showing some kind of problem with memory corruption.
>>>>
>>>> memory clobbered past end of allocated block
>>>>
>>>> Program received signal SIGABRT, Aborted.
>>>> 0xf7ffdc10 in __kernel_vsyscall ()
>>>> (gdb) bt
>>>> #0  0xf7ffdc10 in __kernel_vsyscall ()
>>>> #1  0xf7e7548f in __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:55
>>>> #2  0xf7e76ec0 in __GI_abort () at abort.c:89
>>>> #3  0xf7eb4f99 in __libc_message (do_abort=do_abort@entry=1, fmt=fmt@entry=0xf7fbaf3f "%s") at ../sysdeps/posix/libc_fatal.c:175
>>>> #4  0xf7eb4fd4 in __GI___libc_fatal (message=0xf7fbe128 "memory clobbered past end of allocated block\n")
>>>>     at ../sysdeps/posix/libc_fatal.c:186
>>>> #5  0xf7ec2570 in mabort (status=MCHECK_TAIL) at mcheck.c:362
>>>> #6  0xf7ec264b in checkhdr (hdr=hdr@entry=0x5657a018) at mcheck.c:113
>>>> #7  0xf7ec2c18 in checkhdr (hdr=0x5657a018) at mcheck.c:185
>>>> #8  freehook (ptr=0x5657a030, caller=0xf7e6c86b <_nl_load_locale_from_archive+1419>) at mcheck.c:186
>>>> #9  0xf7ec0395 in __GI___libc_free (mem=0x5657a030) at malloc.c:2936
>>>> #10 0xf7e6c86b in _nl_load_locale_from_archive (category=category@entry=5, namep=namep@entry=0xffffcd9c) at loadarchive.c:190
>>>> #11 0xf7e6b730 in _nl_find_locale (locale_path=0x0, locale_path_len=0, category=category@entry=5, name=name@entry=0xffffcd9c)
>>>>     at findlocale.c:154
>>>> #12 0xf7e6ae51 in __GI_setlocale (category=5, locale=0x80882e4 "") at setlocale.c:417
>>>> #13 0x0804a428 in ?? ()
>>>> #14 0xf7e60480 in __libc_start_main (main=0x804a3d0, argc=2, argv=0xffffcfc0, init=0x8088200, fini=0x8088270,
>>>>     rtld_fini=0x56565ab0 <_dl_fini>, stack_end=0xffffcfbc) at libc-start.c:289
>>>> #15 0x0804abb7 in ?? ()
>>>> (gdb)
>>>>
>>>> Has anyone seen this? This appears to be new. I'll see if I can track
>>>> this down to an environment change in my build box (Fedora 21).
>>>
>>> I only saw
>>>
>>> FAIL: math/test-float
>>>
>>> on Fedora 22.
>>
>> Could you do me a favour and run valgrind on your build of localedef?
>>
>> valgrind --leak-check=full --show-leak-kinds=all  $PWD/elf/ld.so --library-path $PWD:$PWD/elf ./locale/localedef --list-archive
>>
>> The most important thing I see is this:
>>
>> ==20823== Warning: set address range perms: large range [0x809a000, 0x2809a000) (noaccess)
>> ==20823== Invalid read of size 4
>> ==20823==    at 0x496BEF9: _nl_archive_subfreeres (in /home/carlos/scratch/build/glibc-pristine-i686/libc.so)
>> ==20823==    by 0x496BC43: free_mem (in /home/carlos/scratch/build/glibc-pristine-i686/libc.so)
>> ==20823==    by 0x496C3A1: __libc_freeres (in /home/carlos/scratch/build/glibc-pristine-i686/libc.so)
>> ==20823==    by 0x4801528: _vgnU_freeres (in /usr/lib/valgrind/vgpreload_core-x86-linux.so)
>> ==20823==    by 0x48D0CDD: _Exit (_exit.S:29)
>> ==20823==  Address 0x10 is not stack'd, malloc'd or (recently) free'd
>> ==20823==
>> ==20823==
>> ==20823== Process terminating with default action of signal 11 (SIGSEGV)
>> ==20823==  Access not within mapped region at address 0x10
>> ==20823==    at 0x496BEF9: _nl_archive_subfreeres (in /home/carlos/scratch/build/glibc-pristine-i686/libc.so)
>> ==20823==    by 0x496BC43: free_mem (in /home/carlos/scratch/build/glibc-pristine-i686/libc.so)
>> ==20823==    by 0x496C3A1: __libc_freeres (in /home/carlos/scratch/build/glibc-pristine-i686/libc.so)
>> ==20823==    by 0x4801528: _vgnU_freeres (in /usr/lib/valgrind/vgpreload_core-x86-linux.so)
>> ==20823==    by 0x48D0CDD: _Exit (_exit.S:29)
>> ==20823==  If you believe this happened as a result of a stack
>> ==20823==  overflow in your program's main thread (unlikely but
>> ==20823==  possible), you can try to increase the size of the
>> ==20823==  main thread stack using the --main-stacksize= flag.
>> ==20823==  The main thread stack size used in this run was 8388608.
>>
>> This is not actually where it fails in glibc, which detects the failure
>> earlier via malloc checking.
>>
>> c.
> 
> I can reproduce it on both ia32 and x86-64.
> 
> struct __locale_data *
> internal_function
> _nl_load_locale_from_archive (int category, const char **namep)
> {
> 
> has
> 
>  for (cnt = 0; cnt < __LC_LAST; ++cnt)
>     if (cnt != LC_ALL)
>       {
>         lia->data[cnt] = _nl_intern_locale_data (cnt,
>                                                  results[cnt].addr,
>                                                  results[cnt].len);
>         if (__glibc_likely (lia->data[cnt] != NULL))
>           {
>             /* _nl_intern_locale_data leaves us these fields to initialize.  */
>             lia->data[cnt]->alloc = ld_archive;
>             lia->data[cnt]->name = lia->name;
> 
>             /* We do this instead of bumping the count each time we return
>                this data because the mappings stay around forever anyway
>                and we might as well hold on to a little more memory and not
>                have to rebuild it on the next lookup of the same thing.
>                If we were to maintain the usage_count normally and let the
>                structures be freed, we would have to remove the elements
>                from archloaded too.  */
>             lia->data[cnt]->usage_count = UNDELETABLE;
>           }
>       }
> 
> lia->data[cnt] can be NULL, which happens for en_US.UTF-8 with
> LC_COLLATE.  But this won't happen if glibc is configured with
> --enable-hardcoded-path-in-tests, which I have been using.
> 
> This patch fixes it.

Your patch does fix this problem, but it doesn't solve the issue
I'm seeing on F21. I see that issue only on F21, not on RHEL7,
so it likely indicates a compiler<->glibc interaction. Given that
I don't see it on a better-tested platform with more stable tools,
I'm not going to treat it as a blocker.

Your patch looks good though, and should go in for 2.23.
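
For anyone following the thread, here is a minimal sketch of the
failure mode described above: a per-category pointer array in which
some slots may be NULL, and a cleanup walk that must skip them. This
is not the patch and not glibc code. The names demo_locale_data,
demo_archive_entry, demo_release_entry and DEMO_NCATEGORIES are
hypothetical stand-ins, and the struct layout is an assumption made
only for illustration. The valgrind read at address 0x10 is
consistent with reading a field through a NULL slot, but the sketch
does not try to reproduce that offset.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are NOT the glibc definitions.  */
enum { DEMO_NCATEGORIES = 4 };

struct demo_locale_data
{
  const char *name;
  int usage_count;
};

struct demo_archive_entry
{
  const char *name;
  /* One slot per category; a slot stays NULL if loading failed.  */
  struct demo_locale_data *data[DEMO_NCATEGORIES];
};

/* Walk every category slot, reset the usage count of each loaded
   entry and clear the slot.  NULL slots are skipped rather than
   dereferenced.  Returns the number of slots that were released.  */
int
demo_release_entry (struct demo_archive_entry *entry)
{
  int released = 0;

  assert (entry != NULL);
  for (int cnt = 0; cnt < DEMO_NCATEGORIES; ++cnt)
    if (entry->data[cnt] != NULL)
      {
        entry->data[cnt]->usage_count = 0;
        entry->data[cnt] = NULL;
        ++released;
      }
  return released;
}
```

Calling demo_release_entry on an entry with two loaded slots and two
NULL slots returns 2 and touches only the loaded ones. Dropping the
NULL test would turn the same call into a null pointer dereference.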

Cheers,
Carlos.

