13119 – Erroneous "libgcc_s.so.1 must be installed for pthread_cancel to work" message

Status:	NEW

Alias:	None

Product:	glibc
Classification:	Unclassified
Component:	nptl
Version:	2.14

Importance:	P2 minor
Target Milestone:	---
Assignee:	Not yet assigned to anyone

Reported:	2011-08-19 21:35 UTC by Tavian Barnes
Modified:	2021-07-11 16:45 UTC
CC List:	4 users

Flags:	fweimer: security-

Description Tavian Barnes 2011-08-19 21:35:20 UTC

This is a bit awkward to reproduce, but I found it when attempting to run my raytracer with a ridiculous number (50,000) of threads.  Here's a reduced testcase:

$ cat foo.c
#include <pthread.h>

void *
bar(void *ptr)
{
  return NULL;
}

void *
foo(void *ptr)
{
  pthread_t thread;
  while (1) {
    if (pthread_create(&thread, NULL, bar, NULL) != 0) {
      pthread_exit(NULL);
    }
  }
  return NULL;
}

int
main()
{
  pthread_t thread;
  pthread_create(&thread, NULL, foo, NULL);
  pthread_join(thread, NULL);
  return 0;
}
$ gcc foo.c -pthread && ./a.out
libgcc_s.so.1 must be installed for pthread_cancel to work

The error comes from nptl/sysdeps/pthread/unwind-forcedunwind.c, which tries to "__libc_dlopen (LIBGCC_S_SO);".  But in elf/dl-load.c, _dl_map_object_from_fd(), line 1281, the __mmap() fails with ENOMEM, presumably because the thousands of zombie threads have left no available memory.  Thus the dlopen fails, and that message gets printed.

Obviously this isn't a very important issue, but the error message is not exactly informative, since there's no usage of pthread_cancel() anywhere and libgcc_s.so.1 is certainly installed.
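For illustration only, here is a minimal sketch of reporting why a load failed instead of assuming the library is missing. load_with_reason is a hypothetical helper and not glibc code; glibc's internal __libc_dlopen path is different and is not reproduced here. The sketch only uses the public dlopen() and dlerror() calls.

```c
#include <assert.h>
#include <dlfcn.h>
#include <stdio.h>

/* Hypothetical helper (not part of glibc): try to load a shared object and,
   on failure, copy the loader's own explanation into `why`. */
static void *
load_with_reason(const char *soname, char *why, size_t why_len)
{
  assert(why != NULL && why_len > 0);
  why[0] = '\0';

  dlerror();                        /* clear any stale error state */
  void *handle = dlopen(soname, RTLD_NOW);
  if (handle == NULL) {
    /* dlerror() describes the actual failure, which may be a missing file
       or something else entirely, such as a failed mapping. */
    const char *msg = dlerror();
    snprintf(why, why_len, "%s", msg != NULL ? msg : "unknown dlopen failure");
  }
  return handle;
}
```

A caller would print `why` next to its own message, so an out-of-memory failure is not reported as a missing library. On glibc older than 2.34 this needs -ldl at link time.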

Comment 1 Florian Weimer 2014-06-27 12:23:55 UTC

Read barriers have been added to the cancellation initialization code.  Can you check if this fixes this issue?  If it does not, fixing this would have to use the double-checked locking idiom to prevent duplicating work.
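For readers unfamiliar with the idiom mentioned above, a minimal sketch of double-checked locking using C11 atomics and a pthread mutex follows. All names (get_handle, expensive_init, cached_handle, init_calls) are hypothetical; this is not the glibc cancellation initialization code, only the general pattern.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

static pthread_mutex_t init_lock = PTHREAD_MUTEX_INITIALIZER;
static _Atomic(void *) cached_handle = NULL;
static int init_calls = 0;   /* how often the slow path ran; guarded by init_lock */

/* Stand-in for the expensive one-time work (hypothetical). */
static void *
expensive_init(void)
{
  static int token;
  init_calls++;
  return &token;
}

static void *
get_handle(void)
{
  /* First check, lock-free: the acquire load pairs with the release store. */
  void *h = atomic_load_explicit(&cached_handle, memory_order_acquire);
  if (h != NULL)
    return h;

  pthread_mutex_lock(&init_lock);
  /* Second check, under the lock: another thread may have finished first. */
  h = atomic_load_explicit(&cached_handle, memory_order_relaxed);
  if (h == NULL) {
    h = expensive_init();
    atomic_store_explicit(&cached_handle, h, memory_order_release);
  }
  pthread_mutex_unlock(&init_lock);

  assert(h != NULL);
  return h;
}
```

The second check is what prevents duplicated work: a thread that lost the race finds the pointer already set and skips expensive_init().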

Comment 2 Tavian Barnes 2014-06-27 18:09:59 UTC

Still reproduces against git glibc from today.

Comment 3 Vlad Frolov 2016-02-12 14:28:32 UTC

This bug is still reproducible on glibc 2.22.

Comment 4 Adhemerval Zanella 2016-10-14 20:20:30 UTC

This is not really caused by the process running out of virtual memory, nor by synchronization issues when loading libgcc_s.so, but rather by Linux overcommit making the subsequent mprotect on the memory returned by mmap fail with ENOMEM.

With vm.overcommit_memory=0 (default on my system) I get:

[pid 28817] mmap(NULL, 2185488, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7fc0b850b000
[pid 28817] mprotect(0x7fc0b8521000, 2093056, PROT_NONE) = -1 ENOMEM (Cannot allocate memory)
[pid 28817] mmap(0x7fc0b8720000, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x15000) = -1 ENOMEM (Cannot allocate memory)

Overcommit aside, this should not fail: mmap requested a mapping of size 0x215910 (2185488) and the kernel returned 0x7fc0b850b000, so an mprotect of (0x7fc0b8521000, 0x7fc0b8720000) lies within (0x7fc0b850b000, 0x7fc0b8720910).
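The address arithmetic can be checked mechanically. The sketch below copies the values from the vm.overcommit_memory=0 trace; range_within and check_trace are hypothetical helper names introduced only for this check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if [inner, inner + inner_len) lies inside [outer, outer + outer_len). */
static bool
range_within(uint64_t outer, uint64_t outer_len,
             uint64_t inner, uint64_t inner_len)
{
  return inner >= outer
         && inner_len <= outer_len
         && inner - outer <= outer_len - inner_len;
}

/* Values copied from the strace output of the failing run. */
static const uint64_t map_start  = 0x7fc0b850b000ULL;  /* mmap return value */
static const uint64_t map_len    = 2185488;            /* 0x215910 */
static const uint64_t prot_start = 0x7fc0b8521000ULL;  /* mprotect address */
static const uint64_t prot_len   = 2093056;            /* 0x1ff000 */

static void
check_trace(void)
{
  /* End of the mapping and end of the mprotect range. */
  assert(map_start + map_len == 0x7fc0b8720910ULL);
  assert(prot_start + prot_len == 0x7fc0b8720000ULL);
  /* The mprotect range is fully contained in the mapping. */
  assert(range_within(map_start, map_len, prot_start, prot_len));
}
```

Calling check_trace() aborts if any of the three relations does not hold for the traced values.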

The test passes with vm.overcommit_memory=2 (always check, never overcommit):

[pid 29891] mmap(NULL, 2185488, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7f71c0028000
[pid 29891] mprotect(0x7f71c003e000, 2093056, PROT_NONE) = 0
[pid 29891] mmap(0x7f71c023d000, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x15000) = 0x7f71c023d000
[...]
[pid 29891] exit(0)                     = ?
[pid 29889] <... futex resumed> )       = 0
[pid 29891] +++ exited with 0 +++
exit_group(0) 

So I do not think this is an issue with glibc itself, and afaik there is no non-portable mmap flag that would avoid overcommit in such an mmap call.
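To see which overcommit mode a given machine is running with when trying to reproduce this, the sysctl can be read through procfs. This is a minimal sketch; read_overcommit_mode and describe_overcommit_mode are hypothetical helper names, and the path is the standard procfs location of vm.overcommit_memory on Linux.

```c
#include <assert.h>
#include <stdio.h>

#define OVERCOMMIT_PATH "/proc/sys/vm/overcommit_memory"

/* Returns the integer stored in `path` (0, 1 or 2 for vm.overcommit_memory),
   or -1 if the file cannot be opened or parsed, e.g. on a non-Linux system. */
static int
read_overcommit_mode(const char *path)
{
  assert(path != NULL);

  FILE *f = fopen(path, "r");
  if (f == NULL)
    return -1;

  int mode = -1;
  if (fscanf(f, "%d", &mode) != 1)
    mode = -1;
  fclose(f);
  return mode;
}

/* Short description of each mode as documented by the kernel. */
static const char *
describe_overcommit_mode(int mode)
{
  switch (mode) {
  case 0:  return "heuristic overcommit (kernel default)";
  case 1:  return "always overcommit";
  case 2:  return "strict accounting, never overcommit";
  default: return "unknown";
  }
}
```

For example, describe_overcommit_mode(read_overcommit_mode(OVERCOMMIT_PATH)) gives a string that can be logged alongside the strace output.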

Comment 5 Cliff Wheatley 2021-07-11 16:38:59 UTC

This error "libgcc_s.so.1 must be installed for pthread_cancel to work" is not repeatable.  It has happened in iconv/tst-gconv-init-failure and nptl/tst-default-attr.  libgcc_s.so.1 is installed.  Configure invocation command line was
  $ ../../Source/glibc/configure --prefix=/usr/local/share/glibc-2.33 --enable-bind-now --enable-stack-protector=all --enable-add-ons=crypt,libidn,linuxthreads --enable-kernel=5.5

It happened both when running "make -j 6 check" and "make check".

Comment 6 Florian Weimer 2021-07-11 16:45:10 UTC

(In reply to Cliff Wheatley from comment #5)
> This error "libgcc_s.so.1 must be installed for pthread_cancel to work" is
> not repeatable.  It has happened in iconv/tst-gconv-init-failure and
> nptl/tst-default-attr.  libgcc_s.so.1 is installed.  Configure invocation
> command line was
>   $ ../../Source/glibc/configure --prefix=/usr/local/share/glibc-2.33
> --enable-bind-now --enable-stack-protector=all
> --enable-add-ons=crypt,libidn,linuxthreads --enable-kernel=5.5

Have you installed libgcc_s.so.1 under /usr/local/share/glibc-2.33?