This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



Re: [PATCH 2/3] network: recvmsg and sendmsg standard compliance (BZ#16919)



On 08/06/2016 17:15, Zack Weinberg wrote:
> On Tue, Jun 7, 2016 at 10:21 AM, Adhemerval Zanella
> <adhemerval.zanella@linaro.org> wrote:
>> On 07/06/2016 10:31, Zack Weinberg wrote:
>>>
>>> send/recv(m)msg are kernel primitives, and the fundamental basis of my
>>> objection is that I think the C library should always faithfully expose
>>> all the true kernel interfaces, *even if they are in some way wrong*.
>>> This conformance violation should be addressed by the kernel first, and
>>> only then should the C library follow suit.  That means that neither
>>> this patch, nor the follow-up patch tackling cmsgbuf, should be applied
>>> at all.  If either has already been applied, they should be backed out.
>>
>> I strongly disagree with this definition: the C library is still an
>> abstraction over the underlying kernel, and GLIBC should follow (and
>> does follow) POSIX standards even when that deviates from the kernel
>> primitives.  The same idea of standards conformance is what drove the
>> various fixes to both math library conformance and various primitives
>> (quick_exit is an example).
> 
> You are going to have a very hard time persuading me to change my
> position, and this ain't gonna do it. This is circular logic.  "We
> should follow POSIX because we should follow POSIX."
> 
> I would consider a real (not made up for the purpose, and ideally,
> already existing) program that is broken by not having these types be
> as POSIX specifies to be a *valid argument* for changing the types,
> but even that might not be a *persuasive* argument for changing the
> types, especially since Florian has pointed out actual breakage from
> changing them.  (Frankly, I think Florian's report of actual breakage
> should be the last word on the subject - back the patch out already,
> and let us never speak of this again.)
> 
> What would persuade you to accept *my* position on this issue?

I am saying we follow POSIX for the very same reason we follow other
technical standards: to provide libc compatibility.

And the breakage Florian has pointed out (and to which I replied) is a
very specific one that also requires the interposing library to know a
great deal about the interposed library.  This kind of tooling is highly
coupled to the implementation, and minor GLIBC changes have already
caused various hacks and slight breakages (for instance, in
libsanitizer every TCB size change needs to be explicitly taken into
account).

And I do not see the tooling breakage as a compelling reason to block
interface changes and fixes.

> 
> (I'm cc:ing some of the usual standards-compliance gurus.  I'm
> slightly more likely to be convinced by someone who is not advocating
> for their own patch.)
> 
>> And it is also why some in the community view explicitly exposing some
>> Linux primitives (such as gettid) as a controversial subject.
> 
> As soon as I get some spare time (probably not in the 2.24 time frame)
> I am going to post a patch that makes glibc expose every single one of
> the Linux primitives that we don't already expose, because that's what
> I think we should do.  But that's a tangent from this discussion.

This has been discussed before, so I would suggest you first read
Joseph's suggested list [1].  The original thread [2] also shows more
discussion of each syscall.

[1] https://sourceware.org/ml/libc-alpha/2015-11/msg00373.html
[2] https://sourceware.org/ml/libc-alpha/2013-02/msg00030.html

> 
> ...
>>> Earlier, I said that I didn't like copying cmsgbuf because it wasn't
>>> possible to be sure that no cmsg opcodes cared (now or in the future)
>>> about the address of the buffer, and Adhemerval said (effectively) that
>>> such an opcode would not make sense.  That's not true.  Imagine, if you
>>> will, a cmsg that expects the ancillary buffer to be overlaid on a
>>> shared memory area, and rewrites the *non*-ancillary buffer to reflect
>>> the location of that memory area in the receiver.
>>
>> Again, I see no problem in this scenario: the function prototype takes
>> a constant cmsghdr and will not change its state.  Even if the ancillary
>> buffer might change, it is up to the application to synchronize its
>> access and call sendmsg at a point where the data is in a consistent
>> state.  I personally think that calling a syscall with a buffer in a
>> racy condition does not make sense.
> 
> You clearly still don't get it.  It's not about the buffer being in a
> racy condition.  It's that the *address might be part of the message.*
>  "Nobody should do that" is NOT a valid objection, because this is an
> arbitrarily extensible interface.
> 
> Let me try again with another example.  Imagine that there exists a
> SCM_CREATE_SHMEM ancillary message whose effect is to *convert that
> chunk of the ancillary buffer into a shared memory area*.  The kernel
> will remap the data portion of the cmsg into the receiver, and supply
> the receiver with the address at which it was mapped.  (You might be
> about to object that it doesn't make sense to embed the desired shared
> memory area in the cmsg, but, again, that is not a valid objection.
> This is an arbitrarily extensible interface.  People can, will, and
> *have* done arbitrarily bizarre things with it.)  Copying the
> ancillary buffer, *in and of itself*, would break this message.  So
> would applying any small size limit to the length of an ancillary
> buffer.  And come to think of it, this hypothetical cmsg would also
> justify the kernel's insisting to continue to use size_t for both
> cmsg_len and msg_controllen.

This very interface does not make sense: the ancillary message would
have to be remapped anyway in the syscall transition to kernel
space.  So if you try to remap a 1GB buffer in this hypothetical
syscall, the kernel will first need to copy the 1GB message to kernel
space and then remap the original pointer.  I highly doubt the kernel
will ever support such a syscall, or any syscall to which you might
pass a buffer that is supposed to be volatile.
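
For illustration only (this is not code from the patch or from the
thread): a minimal sketch in C of how ancillary data is conventionally
passed, using SCM_RIGHTS over an AF_UNIX socket.  The helper names
send_fd and recv_fd are hypothetical, and a Linux/POSIX system is
assumed.  The descriptor number placed in the sender's control buffer is
a value that the kernel copies and translates; the receiver reads a new
descriptor out of its own buffer, so neither side relies on the address
of the other side's buffer.  Using the CMSG_* macros should also keep the
source independent of whether cmsg_len and msg_controllen are size_t or
socklen_t.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Hypothetical helper: send one descriptor over a connected AF_UNIX
   socket.  Returns 0 on success, -1 on failure.  */
int send_fd(int sock, int fd)
{
    char byte = 'x';  /* carry one byte of ordinary data with the cmsg */
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };

    /* The union gives the control buffer cmsghdr alignment.  */
    union {
        char buf[CMSG_SPACE(sizeof(int))];
        struct cmsghdr align;
    } ctl;
    memset(&ctl, 0, sizeof ctl);

    struct msghdr msg;
    memset(&msg, 0, sizeof msg);
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctl.buf;
    msg.msg_controllen = sizeof ctl.buf;

    /* Fill in the single control message through the macros only.  */
    struct cmsghdr *cmsg = CMSG_FIRSTHDR(&msg);
    assert(cmsg != NULL);
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;
    cmsg->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

    return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Hypothetical helper: receive one descriptor.  Returns the new
   descriptor, or -1 on failure.  */
int recv_fd(int sock)
{
    char byte;
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };

    union {
        char buf[CMSG_SPACE(sizeof(int))];
        struct cmsghdr align;
    } ctl;
    memset(&ctl, 0, sizeof ctl);

    struct msghdr msg;
    memset(&msg, 0, sizeof msg);
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctl.buf;
    msg.msg_controllen = sizeof ctl.buf;

    if (recvmsg(sock, &msg, 0) != 1)
        return -1;

    /* Accept only the exact control message we expect.  */
    struct cmsghdr *cmsg = CMSG_FIRSTHDR(&msg);
    if (cmsg == NULL
        || cmsg->cmsg_level != SOL_SOCKET
        || cmsg->cmsg_type != SCM_RIGHTS
        || cmsg->cmsg_len != CMSG_LEN(sizeof(int)))
        return -1;

    int fd;
    memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
    return fd;
}
```

A caller would typically create the socket pair with
socketpair(AF_UNIX, SOCK_STREAM, 0, sv), call send_fd(sv[0], fd) on one
end and recv_fd(sv[1]) on the other.  The sketch says nothing about what
a future cmsg type might do; it only shows the common pattern in use
today.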

