= Thread Safety Branch =

This is related to com.redhat.elfutils.pmachata.threads. The goal here is to make elfutils thread-safe.

== libelf ==

libelf is about done, and merged to trunk.

 - The approach taken was to have a rwlock per Elf. That lock is rdlocked or wrlocked according to the use. Write locking is relatively scarce, and is often necessary only to initialize write-once caches.
 - All externally visible functions lock on entry, unlock on exit, and are thin wrappers around functions that do the actual work. The pattern is that for an externally visible `elf_X`, the workers are called `__libelf_X_rdlock` and `__libelf_X_wrlock`. These are called internally: the wrlock worker assumes that the caller holds a write lock, the rdlock worker assumes a read lock.
 - The worker may need to relock, e.g. to update a cache. Because pthreads has no operation to upgrade a lock, the worker does that by first releasing the read lock and then immediately taking the write lock. That means it loses the lock for a while, and the Elf can meanwhile be transformed in all kinds of ways. The caller needs to be aware of that, and take care not to keep using any cached data that might be invalid after the lock is lost.

== libdw ==

This is work in progress. It was dropped for now in favour of other elfutils work.

 - The approach taken is similar to that taken for libelf. We have a lock per Dwarf, and rdlock/wrlock it as appropriate whenever work is done with a data structure "descended" from that Dwarf. E.g. the lock can be taken like this: `rwlock_rdlock (attr->cu->dbg->lock);`
 - `__libdw_visit_scopes` doesn't do any lock handling itself. It needs at least a read lock, but the caller needs to use the right locking level with respect to the visitor that is called. It's assumed that previsit and postvisit may relock.
 - When handing control over to an external callback (for example in `dwarf_func_inline_instances`), the visitor may need to unlock, so that the callback can use the official (locking) elfutils API, and later lock again.
   The plan is to use this unlock-callback-relock approach in all places where callbacks are used.
 - The lock that is taken after the callback returns is a read lock. That makes sense, because a write lock is typically needed only to initialize caches and similar, which is a relatively infrequent operation. However, it also means that the lock level can actually be downgraded. When taking a write lock in advance ("we are going to need wrlock anyway, so take it right away"), care has to be taken not to call functions that can downgrade the lock this way.
 - I've been generous with comments that explain why we don't mind that this or that function may relock. They mark the calls where the relocking analysis has been done. These marks, however, may become invalid as the code evolves. I don't really know what to do about that, apart from stripping these marks when the branch stabilizes, and insisting that any future developers analyse the code again to see whether relocking takes place and whether it is a problem with respect to their patch.
 - If a function takes a `Dwarf_Die *` as one of its arguments and gives up the lock (perhaps indirectly via another function), a function in another thread can step in and modify that Die either directly, or via `dwarf_child`, `dwarf_siblingof`, etc. So all functions that (indirectly) call functions that lose the lock need to be checked for using `die->` references after the lock may have been lost. This analysis was not done.