This is the mail archive of the gdb@sources.redhat.com mailing list for the GDB project.



Re: Checking function calls


On Fri, 2002-12-06 at 18:08, Michael Elizabeth Chastain wrote:
> > I know, I didn't plan ahead well enough when I started writing it, and
> > now I'm stuck with either this, or a large rewrite.
> 
> When I run into this kind of problem, I like to step back -- way back --
> get away from computers for a day or two and think about it.
> 
> I think there is no easy way out, that you actually are stuck with a
> large rewrite.  There are just too many pthread_mutex_lock's flying
> around.

I'm beginning to believe that, too. Maybe I have just been too
optimistic.

> 
> For instance:
> 
>   client.c:findtransfer() does not have any locks.
> 
>   in client.c:freesharecache(), there is code:
> 
>     if (cache->parent != NULL)
>     {
>       pthread_mutex_lock(&cache->parent->mutex);
>       ...
>     }
> 
>   in general, it's unsafe to test a member and then acquire the lock,
>   because someone else can delete cache->parent between the "if" statement
>   and the acquisition of the lock.
> 

Here, however, that isn't possible. All deletions from that list go
through the freesharecache function, and deleting a parent first loops
through, locks, and deletes all of its children. Since one of the
children is evidently locked at that point, the deletion of the parent
won't get any further. I suspect it might deadlock, though.
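
For reference, one common way to close the test-then-lock window described in the quoted text is to read the pointer and take the node's mutex while holding a second lock that every deleter must also hold. The following is only a minimal sketch under stated assumptions: struct sharecache, tree_lock and lock_parent() are hypothetical names, since the real definitions in client.c are not shown in this thread.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical node type; the real struct in client.c is not shown here. */
struct sharecache {
    struct sharecache *parent;
    pthread_mutex_t mutex;
};

/*
 * Hypothetical lock guarding every parent/child link. The sketch assumes
 * that any code which unlinks or frees a node takes tree_lock first, and
 * that nobody tries to take tree_lock while already holding a node mutex
 * (otherwise the two lock orders could deadlock).
 */
static pthread_mutex_t tree_lock = PTHREAD_MUTEX_INITIALIZER;

/*
 * Returns cache->parent with its mutex held, or NULL if there is no
 * parent. The NULL test and the lock happen under tree_lock, so a
 * deleter cannot free the parent between the two steps.
 */
struct sharecache *lock_parent(struct sharecache *cache)
{
    struct sharecache *parent;

    pthread_mutex_lock(&tree_lock);
    parent = cache->parent;
    if (parent != NULL)
        pthread_mutex_lock(&parent->mutex);
    pthread_mutex_unlock(&tree_lock);

    return parent;
}
```

The caller is expected to unlock the returned parent's mutex when it is done. Whether this fits the program depends on how freesharecache orders its own locks, which is not visible from this thread.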

> I recommend finding a textbook on multi-threaded programming that covers
> "how to write thread-safe lists".  From your package, it looks like
> you are in it to learn, so you could step way back from the code and
> learn some theory at this point.

Yeah, when I began writing this program, I did not have much experience
in multithreading. That's the reason that there are far too few mutexes
in the program.
Still, I don't think that's the reason for this bug. The loop in which
it crashes is quite thread-safe.

> Another alternative is to use one big mutex for the whole list.

That is precisely what I have been wanting to implement for a long time.
The only problem is that it would require an enormous rewrite to use it
everywhere that it should be used.

> The drawback is that walking the list locks the whole list against
> addition and deletion.  If your list walker is just "print status
> information" then that is fine.  If your list walker does some
> long-lived network operation at each node then it is not fine.

I have, however, made sure that doesn't happen by only using nonblocking
I/O.
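
For reference, the "one big mutex" approach discussed above might look roughly like the sketch below. All names here (struct node, struct locked_list, list_add, list_remove, list_walk, sum_ids) are hypothetical and only illustrate the idea; they are not taken from the program. As noted in the quoted text, the walker holds the lock for the whole traversal, so the callback must not block.

```c
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical list types, for illustration only. */
struct node {
    struct node *next;
    int id;
};

struct locked_list {
    pthread_mutex_t lock;   /* guards head and every next pointer */
    struct node *head;
};

void list_init(struct locked_list *l)
{
    pthread_mutex_init(&l->lock, NULL);
    l->head = NULL;
}

/* Adds a node at the front; returns 0 on success, -1 if malloc fails. */
int list_add(struct locked_list *l, int id)
{
    struct node *n = malloc(sizeof *n);

    if (n == NULL)
        return -1;
    n->id = id;
    pthread_mutex_lock(&l->lock);
    n->next = l->head;
    l->head = n;
    pthread_mutex_unlock(&l->lock);
    return 0;
}

/* Unlinks and frees the first node with this id; returns 1 if found. */
int list_remove(struct locked_list *l, int id)
{
    struct node **pp, *victim = NULL;
    int found = 0;

    pthread_mutex_lock(&l->lock);
    for (pp = &l->head; *pp != NULL; pp = &(*pp)->next) {
        if ((*pp)->id == id) {
            victim = *pp;
            *pp = victim->next;
            found = 1;
            break;
        }
    }
    pthread_mutex_unlock(&l->lock);
    free(victim);               /* free(NULL) is a no-op */
    return found;
}

/* Calls visit on every node with the list locked; returns the node count. */
int list_walk(struct locked_list *l,
              void (*visit)(struct node *, void *), void *arg)
{
    struct node *n;
    int count = 0;

    pthread_mutex_lock(&l->lock);
    for (n = l->head; n != NULL; n = n->next) {
        visit(n, arg);
        count++;
    }
    pthread_mutex_unlock(&l->lock);
    return count;
}

/* Example visitor: adds each node's id to the int that arg points to. */
void sum_ids(struct node *n, void *arg)
{
    *(int *)arg += n->id;
}
```

A visitor must not call list_add or list_remove on the same list, because the default mutex is not recursive and the call would try to lock it a second time.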

Once again, though, I don't think that a lack of thread safety is the
cause of this bug. But I've added checks to that loop now, so I
should discover it sooner or later. Thank you very much for all your
help.


