This is the mail archive of the systemtap@sourceware.org mailing list for the systemtap project.



[Fwd: Optimize global stap variable for further performance improvement(~8%)]


Hi,

  Below is a mail discussing how to improve LKET's performance. I
tested with a multi-threaded app (8 threads) that calls getsid() in a
loop, running on a 4-way ppc64 box (8 logical CPUs).

  The test data shows that we need some additional optimization for
read-only global variables (or those written only in probe
begin/end). I searched the mailing list and found a topic about "global
constant":

http://sources.redhat.com/ml/systemtap/2006-q1/msg00487.html

  So it seems to me there are two options:
<1> introduce a "const" type, as suggested by Mark McLoughlin
<2> if the translator finds that a global variable is written only in
probe begin/end, it elides the rw_lock of that variable.
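
Option <2> could be sketched roughly as follows. This is only a minimal illustration in plain C, not the translator's actual code: the types `probe_kind` and `write_site` and the function `can_elide_lock` are hypothetical names made up for this example. The idea is that a global needs no lock in ordinary handlers if every write to it comes from a begin or end probe. The sketch also assumes that begin/end probes never run concurrently with other handlers, which would have to be verified.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical classification of the probe a statement belongs to. */
enum probe_kind { PROBE_BEGIN, PROBE_END, PROBE_OTHER };

/* Hypothetical record of one write to a global variable. */
struct write_site {
    const char *global_name;   /* name of the global being written */
    enum probe_kind kind;      /* kind of probe containing the write */
};

/*
 * Return true if no recorded write to `name` happens outside a
 * begin/end probe, i.e. the variable is effectively read-only for
 * all other handlers and its lock could be skipped there.
 */
static bool can_elide_lock(const char *name,
                           const struct write_site *sites, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (strcmp(sites[i].global_name, name) != 0)
            continue;               /* write to a different global */
        if (sites[i].kind == PROBE_OTHER)
            return false;           /* written in an ordinary handler */
    }
    return true;
}
```

With such a check, a global like HOOKID_SYSCALL_ENTRY that is assigned only in "probe begin" would qualify, while a counter updated in a syscall probe would keep its lock.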

  Any comments?

- Guanglei


-------- Original Message --------
Subject: Optimize global stap variable for further performance
improvement(~8%)
Date: Thu, 03 Aug 2006 17:14:08 +0800
From: Li Guanglei <guanglei@cn.ibm.com>
Organization: IBM CSTL
To: Jose Santos <jrs@us.ibm.com>
Cc: Jian Gui <guijian@cn.ibm.com>, Xue Peng Li <xuepengl@cn.ibm.com>

Hi,

  Currently each HookID/GroupID is defined as a stap variable, and a
variable with the same name prefixed with "_" is defined with the same
value for use by embedded C code, e.g.:

global
        GROUP_SYSCALL,
        HOOKID_SYSCALL_ENTRY, HOOKID_SYSCALL_RETURN,
...

%{
/* used in embedded C code */

/* Group ID Definitions */
int _GROUP_SYSCALL = 2;
int _HOOKID_SYSCALL_ENTRY = 1;
int _HOOKID_SYSCALL_RETURN = 2;
...
%}

  The translator assigns each global variable an rw_lock.
Although these IDs are written only in "probe begin", every
probe handler still has to call "read_trylock":

    while (! read_trylock (& global_HOOKID_SCSI_IOENTRY_lock)
           && (++numtrylock < MAXTRYLOCK))
      ndelay (TRYLOCKDELAY);

  Although readers do not block one another, my test
shows that removing this read lock gives an improvement of ~8%.
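
One plausible reason for the cost even without writers is that taking a read lock is itself a write to shared memory, since the lock has to count its readers. The sketch below is a toy user-space reader lock written with C11 atomics, not the kernel's rw_lock implementation; `toy_rwlock`, `toy_read_trylock`, `read_hookid_locked` and the other names are hypothetical. It only illustrates that the locked path performs two atomic read-modify-write operations on a word shared by all CPUs, while the lock-free path is a plain load.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Toy lock state: >= 0 is the number of active readers, -1 means a writer. */
typedef struct { atomic_int state; } toy_rwlock;

/* Try to take the lock for reading; fails only if a writer holds it. */
static bool toy_read_trylock(toy_rwlock *l)
{
    int s = atomic_load(&l->state);
    while (s >= 0) {
        /* Atomic update of the shared word: this is the per-read cost. */
        if (atomic_compare_exchange_weak(&l->state, &s, s + 1))
            return true;
        /* On failure s now holds the current value; retry. */
    }
    return false;
}

/* Drop one reader from the count. */
static void toy_read_unlock(toy_rwlock *l)
{
    atomic_fetch_sub(&l->state, 1);
}

static toy_rwlock hookid_lock;          /* zero-initialized: no readers */
static int hookid_syscall_entry = 1;    /* value set once at startup */

/* Locked read, analogous to what each probe handler does today. */
static int read_hookid_locked(void)
{
    int v = -1;                         /* returned if the lock is busy */
    if (toy_read_trylock(&hookid_lock)) {
        v = hookid_syscall_entry;
        toy_read_unlock(&hookid_lock);
    }
    return v;
}

/* Lock-free read, valid only if nothing writes the value after startup. */
static int read_hookid_unlocked(void)
{
    return hookid_syscall_entry;
}
```

Both read functions return the same value here; the difference is only in how much shared state each call touches.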

  Here is the testing data:

===== With data transfer =====

Original LKET:
  4254

Modified LKET without using global HookID/GroupID stap variable:
  3930

~7.62% improvement

===== Without data transfer =====

Original LKET:
  3699

Modified LKET without using global HookID/GroupID stap variable:
  3332

~9.9% improvement
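
The percentages above follow from (original - modified) / original. A small C helper makes the arithmetic explicit; the name `improvement_pct` is made up here, and the sketch assumes the figures are measurements where smaller is better.

```c
/*
 * Relative improvement, in percent, when a measurement drops
 * from `original` to `modified`.
 */
static double improvement_pct(double original, double modified)
{
    return (original - modified) / original * 100.0;
}
```

With the figures above, improvement_pct(4254, 3930) is about 7.62 and improvement_pct(3699, 3332) is about 9.92.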

  So we should start eliminating all the global HookID/GroupID stap
variables.

- Guanglei





