This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: [PATCH] Count number of logical processors sharing L2 cache
- From: "H.J. Lu" <hjl dot tools at gmail dot com>
- To: "Carlos O'Donell" <carlos at redhat dot com>
- Cc: Florian Weimer <fweimer at redhat dot com>, GNU C Library <libc-alpha at sourceware dot org>
- Date: Tue, 24 May 2016 14:35:02 -0700
- Subject: Re: [PATCH] Count number of logical processors sharing L2 cache
- References: <CAMe9rOoy2YaQTdyqZpQ3=ytDc5dywNshzHAN2ymN60=L5KwbiA at mail dot gmail dot com> <CAMe9rOoq8MNkX0GvoePQ-C51mfUr2ikrRJgqCZE0CoGoJEmOOw at mail dot gmail dot com> <d4cf36ee-f402-41fe-5108-e072b47f2399 at redhat dot com> <CAMe9rOpUuYboLH9WgyHH4HiBaSBXJ+uB=MPUft2S26N+wYJ9-A at mail dot gmail dot com> <76801b5c-7770-23a9-9b7c-4e44722247e1 at redhat dot com> <CAMe9rOqAuxZZ=gpd1zXvbRrsqjhT8G6C9WBbpwaqa65s=-ZTnQ at mail dot gmail dot com> <57449424 dot 1000009 at redhat dot com>
On Tue, May 24, 2016 at 10:49 AM, Carlos O'Donell <carlos@redhat.com> wrote:
> On 05/24/2016 11:02 AM, H.J. Lu wrote:
>> CAT applies to a specific thread/process. Cache sizes in glibc are applied
>> to string/memory functions for all threads/processes. They both try to avoid
>> over-using shared cache by a single thread/process. But they work at
>> different levels and have different behaviors. Glibc also uses the cache size
>> to decide when to use non-temporal stores to avoid cache pollution and speed
>> up writing a large amount of data.
>
> Don't you mean that CAT applies to a core (and all of its logical cores)?
>
> Might it be the case that a thread or process could be migrated by the Linux kernel
> between various cores configured with different CAT values and the glibc heuristics
> could be poorly tuned for some of those cores?
>
> As I see it the values computed by init_cacheinfo() are only average heuristics for
> the core.
>
> I agree that Florian has a point, that these values may become less useful in the
> presence of the dynamically changing L3<->core partitioning enabled by CAT.
>
> It is silly though to think that you would allow a thread or process to migrate
> away from the CAT-tuned core. The design of CAT is such that you want to isolate
> the tuned application to one or more cores and use CAT to control the L3 allocation
> for those cores.
I checked with our kernel CAT implementer. CAT supports both
per-processor and per-process allocation.
> In the case where you have a process pinned to a core, and CAT is used to limit the
> L3 of that core, do the glibc heuristics computed in init_cacheinfo() match the
> reality of the L3<->core allocation? Or would a lower L3 CAT-tuned value mean that
> glibc would be mis-tuned for that core?
CAT dedicates part of the L3 cache to certain processors or processes so
that it is always available to them. Glibc tries not to take all of the
L3 cache in memcpy/memset so that L3 cache remains available for other
operations within the same process as well as to other processors/processes.
CAT and glibc work at different angles. There is no direct conflict between
CAT and glibc. At the moment, I am not sure whether a CAT-aware glibc would
improve performance.
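
To make the glibc side of this concrete, here is a minimal sketch in C of
the general idea: divide a shared cache among the logical processors that
share it, and switch large copies to non-temporal stores once they exceed
that share. This is not the actual glibc code. The function names
per_thread_share and use_non_temporal, and the simple ">= threshold"
policy, are assumptions made for illustration only.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: the portion of a shared cache that a single
   thread should treat as "its own", given how many logical processors
   share that cache.  */
static size_t
per_thread_share (size_t shared_cache_bytes, unsigned int threads_sharing)
{
  assert (threads_sharing > 0);
  return shared_cache_bytes / threads_sharing;
}

/* Hypothetical policy: a copy at or above the threshold would evict
   most of the thread's share, so it should bypass the cache with
   non-temporal stores.  Returns 1 to use non-temporal stores, else 0.  */
static int
use_non_temporal (size_t copy_bytes, size_t threshold)
{
  return copy_bytes >= threshold;
}
```

With an 8 MiB shared cache and 8 sharing threads, the per-thread share in
this sketch is 1 MiB, so a 2 MiB copy would take the non-temporal path
while a 4 KiB copy would not. A CAT-imposed limit would change the
effective shared_cache_bytes, which is the open question discussed above.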
--
H.J.