This is the mail archive of the ecos-discuss@sources.redhat.com mailing list for the eCos project.
Re: DEBUG: Circular MBUF
- From: "Kevin S. Martin" <ksmartin1 at sbcglobal dot net>
- To: ecos-discuss at sources dot redhat dot com
- Date: Thu, 15 Jul 2004 08:47:59 -0700 (PDT)
- Subject: [ECOS] Re: DEBUG: Circular MBUF
>
> Hi,
>
> Could you try another experiment?
>
> Under unix, logged in as root, type this command:
>
> ping -s 1000 -f YOUR_ECOS_BOARDS_IP
>
> This will send 1000 byte packets to your eCOS board as fast as
> your unix box will send them. For me, that would *always* kill
> the board and cause it to lock up. We gave up entirely on eCos
> because of its IP stack unreliability.
>
> My personal belief is that, under heavy loads, eCos' TCP/IP stack
> experiences a memory leak but we never traced it down, which is why
> we gave up on it.
>
> -Rich
I tried doing a flood of pings as you suggested:
ping -s 1000 -f MY_ECOS_BOARDS_IP
This works fine. I let it run for 5 minutes with no problems. Then I
tried:
ping -s 3000 -f MY_ECOS_BOARDS_IP
With this size I get the same symptom as when my application tries to
send that amount of data, i.e.:
DEBUG: Circular MBUF 0x004d1280!
DEBUG: Circular MBUF 0x004d1480!
DEBUG: Circular MBUF 0x004d1280!
DEBUG: Circular MBUF 0x004d1480!
DEBUG: Circular MBUF 0x004d1280!
DEBUG: Circular MBUF 0x004d1480!
DEBUG: Circular MBUF 0x004d1280!
DEBUG: Circular MBUF 0x004d1480!
DEBUG: Circular MBUF 0x004d1280!
DEBUG: Circular MBUF 0x004d1480!
DEBUG: Circular MBUF 0x004d1280!
DEBUG: Circular MBUF 0x004d1480!
DEBUG: Circular MBUF 0x004d1280!
DEBUG: Circular MBUF 0x004d1480!
DEBUG: Circular MBUF 0x004d1280!
DEBUG: Circular MBUF 0x004d1480!
DEBUG: Circular MBUF 0x004d1280!
I get one of these messages for every ping. After getting some number
of these messages (not after just one) I find that the Network thread
is hung.
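For readers unfamiliar with the message: the log above alternates between the same two addresses, which is what one would expect if two buffers in a chain ended up pointing at each other. The sketch below is only an illustration of what a circular chain is and how it can be detected (Floyd's two-pointer method). It is not the eCos or FreeBSD code; the `Mbuf` class and `has_cycle` function are hypothetical names, and the addresses are copied from the log purely as labels.

```python
from typing import Optional


class Mbuf:
    """Hypothetical stand-in for a network buffer with a 'next' link."""

    def __init__(self, addr: int) -> None:
        self.addr = addr
        self.next: Optional["Mbuf"] = None


def has_cycle(head: Optional[Mbuf]) -> bool:
    """Return True if following 'next' links from head never terminates."""
    slow = fast = head
    while fast is not None and fast.next is not None:
        slow = slow.next          # advances one link per step
        fast = fast.next.next     # advances two links per step
        if slow is fast:          # the pointers can only meet inside a loop
            return True
    return False


# Two buffers labelled with the addresses seen in the log (labels only).
a = Mbuf(0x004D1280)
b = Mbuf(0x004D1480)

a.next = b                # a -> b -> end: a normal chain
print(has_cycle(a))

b.next = a                # a -> b -> a -> ...: a circular chain
print(has_cycle(a))
```

Any code that walks such a chain until it reaches the end would never return, which would be consistent with the Network thread appearing hung, though the post does not establish that this is what happens.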
I did some experimenting and found the exact size at which the ping
starts to fail:
ping -s 1472 MY_ECOS_BOARDS_IP -> this works fine
ping -s 1473 MY_ECOS_BOARDS_IP -> this fails immediately
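A note on that boundary: 1472 bytes of ping payload plus an 8-byte ICMP header and a 20-byte IPv4 header is exactly 1500 bytes, the usual Ethernet MTU. So 1473 would be the first size that no longer fits in a single frame and has to be sent as IP fragments. The sketch below just does that arithmetic. The MTU and header sizes are assumptions (standard Ethernet, IPv4 without options), not values reported in this thread, and the function names are made up for illustration.

```python
# Assumed values, not taken from the post: standard Ethernet MTU and
# IPv4/ICMP header sizes with no IP options.
ETHERNET_MTU = 1500
IPV4_HEADER = 20
ICMP_HEADER = 8


def ip_datagram_size(ping_payload: int) -> int:
    """Total IPv4 datagram size for an ICMP echo with this payload."""
    return IPV4_HEADER + ICMP_HEADER + ping_payload


def needs_fragmentation(ping_payload: int, mtu: int = ETHERNET_MTU) -> bool:
    """True if the echo request cannot fit in a single frame of this MTU."""
    return ip_datagram_size(ping_payload) > mtu


# The payload sizes tried in this message.
for size in (1000, 1472, 1473, 3000):
    print(size, ip_datagram_size(size), needs_fragmentation(size))
```

If those assumptions hold for this network, the failing sizes (1473 and 3000) are exactly the ones that arrive as fragmented datagrams, which may point at the reassembly path or at how the driver handles the larger reply, rather than at load alone. That is a guess from the numbers, not something the logs confirm.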
Once in a while I get a slightly different symptom:
DEBUG: Circular MBUF 0x004d1280!
DEBUG: Circular MBUF 0x004d1480!
DEBUG: Circular MBUF 0x004d1280!
DEBUG: Circular MBUF 0x004d1480!
DEBUG: Circular MBUF 0x004d1280!
DEBUG: Circular MBUF 0x004d1480!
DEBUG: Circular MBUF 0x004d1280!
DEBUG: Circular MBUF 0x004d1480!
DEBUG: Circular MBUF 0x004d1280!
DEBUG: Circular MBUF 0x004d1480!
DEBUG: Circular MBUF 0x004d1280!
DEBUG: Circular MBUF 0x004d1480!
DEBUG: Circular MBUF 0x004d1280!
DEBUG: Circular MBUF 0x004d1480!
DEBUG: Circular MBUF 0x004d1280!
DEBUG: Circular MBUF 0x004d1480!
DEBUG: Circular MBUF 0x004d1280!
DEBUG: Circular MBUF 0x004d1480!
DEBUG: Circular MBUF 0x004d1280!
DEBUG: Circular MBUF 0x004d1480!
DEBUG: Circular MBUF 0x004d1280!
DEBUG: Circular MBUF 0x004d1480!
DEBUG: Circular MBUF 0x004d1280!
DEBUG: Circular MBUF 0x004d1480!
DEBUG: Circular MBUF 0x004d1280!
DEBUG: Circular MBUF 0x004d1480!
DEBUG: Circular MBUF 0x004d1280!
ASSERT FAIL: <3>support.c[124]cyg_panic() eth_drv_send: no header mbuf
ASSERT FAIL: <3>support.c [ 124] cyg_panic() eth_drv_send: no header mbuf
TRACE: <1>prestart.cxx [ 78] void cyg_prestart() 'This is the system default cyg_prestart()'
TRACE: <1>pkgstart.cxx [ 88] void cyg_package_start() 'This is the system default cyg_package_start()'
TRACE: <4>main.cxx [ 97] int main() 'This is the system-supplied default main()'
TRACE: <4>invokemain.cxx [ 116] void cyg_libc_invoke_main() 'main() has returned with code 0. Calling exit()'
TRACE: <4>exit.cxx [ 93] void exit() 'Calling fflush( NULL )'
Scheduler:
Lock: 0
Current Thread: Network support
Threads:
Idle Thread pri = 31 state = R id = 1
stack base = 004a1800 ptr = 00000000 size = 00004000
sleep reason NONE wake reason NONE
queue = 00000000 wait info = 00000000
Network alarm support pri = 6 state = S id = 2
stack base = 00536f80 ptr = 00000000 size = 000036c0
sleep reason WAIT wake reason NONE
queue = 00536de8 wait info = 0053a3d4
Network support pri = 7 state = R id = 3
stack base = 004ab440 ptr = 00000000 size = 000036c0
sleep reason NONE wake reason DONE
queue = 00000000 wait info = 004ae894
pthread.00000800 pri = 15 state = X id = 4
stack base = 004a6ce8 ptr = 00000000 size = 00003268
sleep reason NONE wake reason DONE
queue = 00000000 wait info = 00000000
This seems to indicate that the problem isn't with my application but
with the network stack, since I get the same problem just by pinging my
eCos node.
Thanks,
Kevin
Kevin S. Martin
Richard Wicks wrote:
>Hi,
>
>Could you try another experiment?
>
>Under unix, logged in as root, type this command:
>
>ping -s 1000 -f YOUR_ECOS_BOARDS_IP
>
>This will send 1000 byte packets to your eCOS board as fast as your unix box
>will send them. For me, that would *always* kill the board and cause it to
>lock up. We gave up entirely on eCos because of its IP stack unreliability.
>
>My personal belief is that, under heavy loads, eCos' TCP/IP stack
>experiences a memory leak but we never traced it down, which is why we gave
>up on it.
>
>-Rich
>
>----- Original Message -----
>From: "Kevin S. Martin" <ksmartin@fnal.gov>
>To: <ecos-discuss@sources.redhat.com>
>Sent: Wednesday, July 14, 2004 12:15 PM
>Subject: [ECOS] Re: DEBUG: Circular MBUF
>
>>>>On Fri, 2004-06-25 at 10:38, Kevin S. Martin wrote:
>>>>
>>>>I have an application that opens a TCP/IP socket connection and then at
>>>>1Hz writes a bunch of data out on that connection. If "bunch" is < 1000
>>>>bytes (approx) then everything works fine; however, when "bunch" > 1000
>>>>(i.e. 2000 or 8000+ bytes) then very quickly I get a series of messages
>>>>on the console like:
>>>>
>>>>DEBUG: Circular MBUF 0x004c7e80!
>>>>DEBUG: Circular MBUF 0x004c8500!
>>>>DEBUG: Circular MBUF 0x004c7e00!
>>>>DEBUG: Circular MBUF 0x004c7c80!
>>>>
>>>>After I get these messages I assume that the network thread is in an
>>>>infinite loop because all threads with lower priority never run again
>>>>and all networking to/from the target stops.
>>>>
>>>>I'm using an i386 PCMB target with a fairly recent version of eCos
>>>>(April 2004) from the CVS repository. Also, I'm using the FreeBSD
>>>>networking stack. I've tried increasing the amount of memory designated
>>>>for networking buffers but this didn't help.
>>>>
>>>My guess is that you have stack overflow. Where is the buffer (that
>>>contains 'bunch' of data)?
>>>
>>>If this isn't [obviously] the problem try running with asserts enabled
>>>and see if it helps debug the failure.
>>>
>>>--
>>>Gary Thomas <gary@mlbassoc.com>
>>>MLB Associates
>>>
>>I've finally been able to get back to working on this problem. I looked
>>into Gary Thomas's suggestion that it might be a stack problem. My send
>>buffer isn't a stack variable; rather, it is a statically defined global. I
>>increased the size of this buffer as well as the size of my thread's stack
>>quite a bit (to 65K) to rule out buffer/stack overflows. This didn't help. I
>>have also turned on asserts and tracing but this did not help either. No
>>asserts get hit.
>>
>>Is it possible that the Network thread's stack is overflowing? If so, how
>>does one increase its size?
>>
>>Any other ideas?
>>
>>Thanks,
>>Kevin
>>
>>--
>>Kevin S. Martin