This is the mail archive of the glibc-bugs@sourceware.org mailing list for the glibc project.



[Bug libc/13845] New: Infrequent random stop in futex_wait using printf inside alarm signal handler


http://sourceware.org/bugzilla/show_bug.cgi?id=13845

             Bug #: 13845
           Summary: Infrequent random stop in futex_wait using printf
                    inside alarm signal handler
           Product: glibc
           Version: unspecified
            Status: NEW
          Severity: normal
          Priority: P2
         Component: libc
        AssignedTo: unassigned@sourceware.org
        ReportedBy: jon@jonshouse.co.uk
                CC: drepper.fsp@gmail.com
    Classification: Unclassified


Excuse the background, but I feel it may be useful.

Recently I've been developing an application. A single (non-threaded) process
receives UDP broadcast packets from multiple sources, mixes them into a single
audio stream and plays it via ALSA or PulseAudio.

I build three versions of the rx process: one driving ALSA, one driving
PulseAudio and one "dummy" version that discards the audio but is in every
other respect identical (conditional compiles from the same source).  I build
these three versions for two architectures, Intel and ARM.

The "dummy" version runs forever on both Intel and ARM machines.
The PulseAudio version also runs forever (as best I can tell).
The ALSA version randomly freezes.

Not unreasonably, I assumed it to be an ALSA bug.

Once I found the time to track the lockups down, they turned out to be caused
by printf stalling in the handler for a once-a-second alarm signal.

I checked the documentation for "alarm" and several other places but can't see
any warning suggesting that printf cannot be used in that context.

I fixed my application by adding a simple flag.
I have seen the freezing behaviour on all Linux/glibc versions I've tried so
far.

The faster the machine, the less frequently the stop happens.  On my 3.8GHz AMD
64 it would typically occur only every few days of runtime; on a 200MHz ARM the
event would typically occur within 3 hours.

Attached is a stack backtrace of my process running on a 200MHz ARM board.

The process is making heavy use of recvfrom (UDP) and lighter but frequent use
of ALSA snd_pcm_writen.

Process stalled after 2 hours, running under gdb:
ARM udp-many-way-audio-rx:  My IP Address [10.10.10.111]  Runtime:[   0D  2H  5M 57S]
Decoding my own audio packets
Playback ALSA lib:1.0.25 writen underruns: 0  overruns (handled):0
Main loops per second = 51      EAGAIN per second = 52
Total packets received 2546534 , errors 0, Per Second 341
Last error ()

 1  [      10.10.10.6  ]  PLS 00000  PC 0000064638 PPS 43  BWD  2  OLP 0
 2  [    10.10.10.111  ]  PLS 00000  PC 0000325200 PPS 42  BWD  2  OLP 2
 3  [     10.10.10.66  ]  PLS 00000  PC 0000318855 PPS 43  BWD  2  OLP 1
 4  [     10.10.10.65  ]  PLS 00000  PC 0000318855 PPS 43  BWD  2  OLP 1
 5  [     10.10.10.64  ]  PLS 00000  PC 0000318855 PPS 43  BWD  2  OLP 1
 6  [     10.10.10.63  ]  PLS 00000  PC 0000318855 PPS 43  BWD  2  OLP 1
 7  [     10.10.10.62  ]  PLS 00000  PC 0000318855 PPS 42  BWD  2  OLP 1
 8  [     10.10.10.61  ]  PLS 00000  PC 0000318855 PPS 42  BWD  2  OLP 0

PLS=Packet Last Seen    PC=Packet Counter       PPS=Packets per second
BWD=Buffers with data  OLP=oldest packet buffer
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA^C
Program received signal SIGINT, Interrupt.
0x000ddbc0 in __lll_lock_wait_private ()
(gdb) backtrace
#0  0x000ddbc0 in __lll_lock_wait_private ()
#1  0x000b7a4c in vfprintf ()
#2  0x000bb9f8 in printf ()
#3  0x0000a930 in once_a_second_alarm_handler () at udp-many-way-audio-rx.c:359
#4  <signal handler called>
#5  0x000c24a0 in fflush ()
#6  0x0000ac20 in slot_buffers_firstfree (slot=3) at udp-many-way-audio-rx.c:433
#7  0x0000c0e4 in main (argc=1, argv=0xbefffe14) at udp-many-way-audio-rx.c:692
(gdb)
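
The backtrace shows the handler calling printf while the interrupted code was
inside fflush, which is consistent with printf waiting on a stdio lock that the
interrupted call already holds. printf is not on the POSIX list of
async-signal-safe functions, whereas write is. A hedged sketch, assuming the
handler really has to emit output itself; emit_tick and tick_msg are
hypothetical names and the message text is a placeholder:

```c
#include <errno.h>
#include <signal.h>
#include <unistd.h>

/* Placeholder message; a fixed buffer avoids any formatting in the handler. */
static const char tick_msg[] = "tick\n";

/* Write the fixed message to fd using only write(2).
 * Returns the number of bytes written, or -1 on error. */
static ssize_t emit_tick(int fd)
{
    return write(fd, tick_msg, sizeof tick_msg - 1);
}

/* Handler that bypasses stdio entirely, so it never touches stdio locks. */
static void alarm_handler_using_write(int signo)
{
    int saved_errno = errno;    /* write() may modify errno */

    (void)signo;
    if (emit_tick(STDOUT_FILENO) < 0) {
        /* Nothing safe to do here; the tick message is simply dropped. */
    }
    errno = saved_errno;
}
```

Output written this way is not ordered with respect to data still buffered in
stdout, so it suits short status lines rather than formatted reports.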

An strace of the process stalling on intel is here:
http://www.jonshouse.co.uk/download/a_stop.txt.gz

Thanks,
Jon


