This is the mail archive of the
libc-alpha@sources.redhat.com
mailing list for the glibc project.
Re: [Dri-devel] Re: OpenGL and the LinuxThreads pthread_descr structure
Jakub Jelinek wrote:
> On Fri, May 17, 2002 at 10:48:07AM +0100, Keith Whitwell wrote:
>
>>Yes, we've been aware of this for a little while. One thing that we've got a
>>bit of in there is assembly for the non-threaded dispatch case (the opensource
>>libGL.so doesn't really handle the threaded case in a performant way, but
>>we've made some effort on the non-threaded case), that looks a bit like this:
>>
>>ALIGNTEXT16
>>GLOBL_FN(GL_PREFIX(NewList))
>>GL_PREFIX(NewList):
>> MOV_L(GLNAME(_glapi_Dispatch), EAX)
>> JMP(GL_OFFSET(_gloffset_NewList))
>>
>>This generates the library entrypoint 'glNewList', which just grabs the active
>>dispatch table and jumps to the real function. I had some emails with HJ Lu
>>about this, but didn't really get what he was saying. Are these a problem for
>>building with -fPIC? I'm not really interested in giving this up as I believe
>>any benefits from -fPIC will be quickly outweighed by any loss at the dispatch
>>layer.
>>
>
> Well, if you do this, you should at least put this into an
> .section Gltext, "awx"
> so that it is not DT_TEXTREL.
> But I still wonder, how often will the target this jumps to change
> during lifetime of typical GL application if using GLX extensions.
> Won't it be most of the time the __indirect_* variant, even for threaded
> apps? Or are they being changed between __indirect_*, noop and
> software rendering all the time in typical application?
The __indirect stuff is rarely used - it packages stuff up and sends it over a
pipe (or network connection) for the X server to work on. The real meat of the
driver is in a separately dlopened 'driver.so' backend which performs 'direct
rendering' -- direct access to the hardware from the application (typically
via a kernel dma engine and various mediation/locking schemes supporting
multiple clients plus the X server all banging away at the same piece of
hardware at once).
The driver.so is kept separate and dlopened to cope with people changing video
cards or indeed having more than one installed.
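As a rough illustration of the dlopen arrangement described above, here is a minimal C sketch of how a library might load a backend at run time. The entry point name "example_driver_init" and the helper load_driver() are hypothetical, made up for this sketch; they are not the real libGL/driver interface.

```c
#include <dlfcn.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical entry point type: a real backend would hand back its
 * dispatch table or screen object through something like this. */
typedef void *(*driver_init_fn)(void);

/* Try to load a backend from 'path'.  On success, returns the dlopen
 * handle and stores the (made-up) init entry point in *init_out.
 * On any failure, returns NULL and leaves *init_out untouched. */
static void *load_driver(const char *path, driver_init_fn *init_out)
{
    void *handle = dlopen(path, RTLD_NOW | RTLD_GLOBAL);
    if (handle == NULL) {
        fprintf(stderr, "dlopen failed: %s\n", dlerror());
        return NULL;
    }

    /* "example_driver_init" is an assumed symbol name for illustration. */
    void *sym = dlsym(handle, "example_driver_init");
    if (sym == NULL) {
        fprintf(stderr, "dlsym failed: %s\n", dlerror());
        dlclose(handle);
        return NULL;
    }

    /* POSIX idiom for converting the void* from dlsym to a function pointer. */
    *(void **)init_out = sym;
    return handle;
}
```

Because the backend is only looked up by path at run time, swapping the video card (or having several) only changes which file gets opened; a caller would fall back to another path when load_driver() returns NULL. Older glibc versions need -ldl at link time.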
> If changing it is rare, I think my proposal with jmp something; nop; nop; nop
> and changing it at setdispatch time if something changed should be faster.
In normal use it changes very frequently. The GL API is specified as a
state machine, which lends itself well to this type of implementation.
Historically Mesa didn't have a proper dispatch layer, so we under-used this
facility - that is changing however...
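To make the state-machine point concrete, here is a small C sketch of a dispatch layer whose active table is swapped when the GL state changes, in this case when a display list is opened and closed. It is a reduced model under stated assumptions, not Mesa's code: the three-entry table, the ex_ prefix, and the counters are all invented for the example, and current_dispatch merely plays the role that _glapi_Dispatch plays in the assembly stub quoted earlier.

```c
#include <stddef.h>

/* Hypothetical, heavily reduced dispatch table. */
struct dispatch_table {
    void (*NewList)(unsigned list);
    void (*EndList)(void);
    void (*Flush)(void);
};

/* Counters so the effect of each table can be observed. */
static int flushes_executed;
static int flushes_recorded;

static void exec_NewList(unsigned list);
static void exec_EndList(void);
static void exec_Flush(void);
static void save_NewList(unsigned list);
static void save_EndList(void);
static void save_Flush(void);

/* One table per state: immediate execution vs. display-list compilation. */
static const struct dispatch_table exec_table = {
    exec_NewList, exec_EndList, exec_Flush
};
static const struct dispatch_table save_table = {
    save_NewList, save_EndList, save_Flush
};

/* The active table; every entry point goes through this pointer. */
static const struct dispatch_table *current_dispatch = &exec_table;

/* Opening a list switches the whole API over to the "save" functions. */
static void exec_NewList(unsigned list) { (void)list; current_dispatch = &save_table; }
static void exec_EndList(void) { /* no list open: an error in real GL, ignored here */ }
static void exec_Flush(void) { flushes_executed++; }

static void save_NewList(unsigned list) { (void)list; /* nested list: ignored here */ }
/* Closing the list switches back to immediate execution. */
static void save_EndList(void) { current_dispatch = &exec_table; }
static void save_Flush(void) { flushes_recorded++; }

/* Library entry points: load the active table, jump through it. */
static void ex_NewList(unsigned list) { current_dispatch->NewList(list); }
static void ex_EndList(void) { current_dispatch->EndList(); }
static void ex_Flush(void) { current_dispatch->Flush(); }
```

Calling ex_Flush() outside a list bumps flushes_executed, while the same call between ex_NewList() and ex_EndList() bumps flushes_recorded instead. The entry points never test the state themselves, which is why the table pointer can change often without slowing the per-call path.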
> Other things: concerning compsize.c routines, I think they should be
> at least inlined if not killed and replaced by switch () statements
> doing copy by hand (with fallthrough's).
Maybe, but that's on the indirect path and nobody really cares about the
performance there. There are better alternatives for efficient remote GL (see
www.chromium.org).
Keith