This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.

Re: [PATCH v2] Single threaded stdio optimization

From: Wilco Dijkstra <Wilco dot Dijkstra at arm dot com>
To: "triegel at redhat dot com" <triegel at redhat dot com>, Szabolcs Nagy <Szabolcs dot Nagy at arm dot com>
Cc: "libc-alpha at sourceware dot org" <libc-alpha at sourceware dot org>, nd <nd at arm dot com>
Date: Fri, 30 Jun 2017 15:34:05 +0000
Subject: Re: [PATCH v2] Single threaded stdio optimization

Torvald wrote:
>
> What's interesting here is that your high-level optimization is faster
> than doing the single-thread check in the low-level lock (x86 has it
> already in the low-level lock).

Have you ever looked at the generated code for e.g. getc?
Each lock does a lot of work even with the low-level lock bypass
optimization. It still executes several branches, reads and writes, and
this happens twice: once for the lock and once for the unlock. A single
branch bypassing all of that is obviously going to be much faster...
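
To make the comparison concrete, here is a minimal sketch in C of the two
shapes being discussed. It is not the glibc code or the patch itself: the
stream type, the need_lock flag and all sketch_* names are hypothetical, and
a plain pthread mutex stands in for the real stdio lock.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>   /* EOF */

/* Hypothetical stream type for illustration only; not glibc's FILE. */
struct sketch_stream {
    const unsigned char *buf;  /* data to read from */
    size_t pos;                /* next byte to return */
    size_t len;                /* number of valid bytes in buf */
    pthread_mutex_t lock;      /* stand-in for the stdio stream lock */
    bool need_lock;            /* assumed flag: true once locking is required */
};

/* The actual work: return the next byte, or EOF when the buffer is drained. */
int sketch_getc_unlocked(struct sketch_stream *s)
{
    return s->pos < s->len ? s->buf[s->pos++] : EOF;
}

/* Always-locking shape: pays for a lock and an unlock on every call,
   whatever shortcuts the lock implementation takes internally. */
int sketch_getc_locked(struct sketch_stream *s)
{
    pthread_mutex_lock(&s->lock);
    int c = sketch_getc_unlocked(s);
    pthread_mutex_unlock(&s->lock);
    return c;
}

/* High-level shape: one branch skips both the lock and the unlock. */
int sketch_getc(struct sketch_stream *s)
{
    if (!s->need_lock)
        return sketch_getc_unlocked(s);
    return sketch_getc_locked(s);
}
```

In this sketch the single-threaded path through sketch_getc costs one load
and one branch before the real work, while sketch_getc_locked always makes
two calls into the lock code. How need_lock gets set safely when a second
thread appears is deliberately left out here.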

And interestingly, when you remove the low-level lock optimization,
multithreaded code will run faster too, as it no longer needs to do the
extra checks for the single-threaded case.

Wilco

Follow-Ups:
- Re: [PATCH v2] Single threaded stdio optimization
  - From: Torvald Riegel
