This is the mail archive of the cygwin-developers@cygwin.com mailing list for the Cygwin project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]

Re: I know CVS is hosed


On Thu, Sep 06, 2001 at 11:59:13AM -0400, Christopher Faylor wrote:
>On Thu, Sep 06, 2001 at 07:49:03PM +0400, egor duda wrote:
>>Hi!
>>
>>Thursday, 06 September, 2001 Christopher Faylor cgf@redhat.com wrote:
>>
>>CF> While attempting to cut down on the size of the 1.3.3 DLL, I uncovered a few
>>CF> problems with cygheap.  I couldn't track them down before I went to bed.
>>
>>CF> The symptom is that applications die in cfree.  The problem manifests quickly
>>CF> in a process which execs a process which execs a process.
>>
>>CF> I haven't seen the problem that Egor reported with free but it probably is not
>>CF> related to the cygheap problem.  It probably is somehow related to the new
>>CF> code in sigproc which allocates the zombie array dynamically.
>>
>>yes, it crashes at 'free (zombies)'. i've built malloc-debuggibg version
>>but somehow it doesn't get to free(zombies), but crash earlier at
>>_cmalloc () when one of buckets contains invalid nonzero pointer.
>
>Yep.  The _cmalloc stuff is what I'm going to track down.

Just to summarize what is going wrong in cygwin:

Cygwin has recently adopted something called the "cygwin heap".  This is
an internal heap that is inherited by forked/execed children.  It
consists of process specific information that should be inherited.  So
things like the file descriptor table, the current working directory,
and the chroot value live there.

It is also used to pass argv information to a child process.  That's the
problem.  If you allocate space for argv on the heap and then exec a
process the child process (1) will happily use the space in the heap.
But what happens when that process execs another process (2)?  The space
used by child process (1) still is being used in child process (2) but
it is basically just a memory leak.

What is supposed to happen is that memory used by child process 1 is
tagged in such a way that child process 2 will know to delete it.  This
is in cygheap_fixup_in_child.

AFAICT, this really wasn't actually working for a while.  One problem is
that the mechanism which kept track of the freed space (buckets) wasn't
being propagated to exec'ed children.  This resulted in a potentially
huge memory leak.  So, I moved the bucket array from cygwin's data area
into the cygwin heap where it will be propagated automatically.

That's where everything broke down.  For some reason it looks like
things were being doubly freed in the cygheap_fixup_in_child cleanups,
resulting in the SEGVs that we're seeing.

I think I have a "fix" for this (amazing what comes to you while you are
sleeping) but I don't really understand why the previous code wasn't
working.  So, I'll be doing some studying in between meetings today to
try to convince myself that I understand what was going wrong before.
I don't like fixing things by merely changing an algorithm unless I
understand what was wrong with the old algorithm.

In the meantime, if you could check out the enclosed patch and verify
or disappoint, I'd be grateful.

For the curious, most of the patch is actually just replacing
cygheap->buckets with a macro so that I could switch back and forth
between the static array and the cygheap->buckets.

cgf

Index: cygheap.cc
===================================================================
RCS file: /cvs/uberbaum/winsup/cygwin/cygheap.cc,v
retrieving revision 1.37
diff -p -r1.37 cygheap.cc
*** cygheap.cc	2001/09/06 03:39:18	1.37
--- cygheap.cc	2001/09/06 16:47:38
*************** struct cygheap_entry
*** 36,41 ****
--- 36,43 ----
    };
  
  #define NBUCKETS (sizeof (cygheap->buckets) / sizeof (cygheap->buckets[0]))
+ #define cygbuckets cygheap->buckets
+ 
  #define N0 ((_cmalloc_entry *) NULL)
  #define to_cmalloc(s) ((_cmalloc_entry *) (((char *) (s)) - (int) (N0->data)))
  
*************** cygheap_fixup_in_child (child_info *ci, 
*** 136,142 ****
        for (_cmalloc_entry *rvc = cygheap->chain; rvc; rvc = rvc->prev)
  	{
  	  cygheap_entry *ce = (cygheap_entry *) rvc->data;
! 	  if (rvc->b >= NBUCKETS || ce->type <= HEAP_1_START)
  	    continue;
  	  else if (ce->type < HEAP_1_MAX)
  	    ce->type += HEAP_1_MAX;	/* Mark for freeing after next exec */
--- 138,144 ----
        for (_cmalloc_entry *rvc = cygheap->chain; rvc; rvc = rvc->prev)
  	{
  	  cygheap_entry *ce = (cygheap_entry *) rvc->data;
! 	  if (ce->type < HEAP_1_START)
  	    continue;
  	  else if (ce->type < HEAP_1_MAX)
  	    ce->type += HEAP_1_MAX;	/* Mark for freeing after next exec */
*************** _cmalloc (int size)
*** 200,209 ****
      continue;
  
    cygheap_protect->acquire ();
!   if (cygheap->buckets[b])
      {
!       rvc = (_cmalloc_entry *) cygheap->buckets[b];
!       cygheap->buckets[b] = rvc->ptr;
        rvc->b = b;
      }
    else
--- 202,211 ----
      continue;
  
    cygheap_protect->acquire ();
!   if (cygbuckets[b])
      {
!       rvc = (_cmalloc_entry *) cygbuckets[b];
!       cygbuckets[b] = rvc->ptr;
        rvc->b = b;
      }
    else
*************** static void __stdcall
*** 223,232 ****
  _cfree (void *ptr)
  {
    cygheap_protect->acquire ();
    _cmalloc_entry *rvc = to_cmalloc (ptr);
    DWORD b = rvc->b;
!   rvc->ptr = cygheap->buckets[b];
!   cygheap->buckets[b] = (char *) rvc;
    cygheap_protect->release ();
  }
  
--- 225,235 ----
  _cfree (void *ptr)
  {
    cygheap_protect->acquire ();
+   ((cygheap_entry *) ptr)->type = HEAP_FREE;
    _cmalloc_entry *rvc = to_cmalloc (ptr);
    DWORD b = rvc->b;
!   rvc->ptr = cygbuckets[b];
!   cygbuckets[b] = (char *) rvc;
    cygheap_protect->release ();
  }
  


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]