Bug 2060 - improve translated C code to reduce compile & run time
Status: RESOLVED FIXED
Alias: None
Product: systemtap
Classification: Unclassified
Component: translator
Version: unspecified
Importance: P1 normal
Target Milestone: ---
Assignee: Frank Ch. Eigler
Duplicates: 1159 1330
Blocks: 2111
Reported: 2005-12-15 18:09 UTC by Martin Hunt
Modified: 2006-01-26 23:01 UTC

Last reconfirmed: 2006-01-23 18:13:16


Attachments
my test case (184 bytes, text/plain)
2005-12-15 18:10 UTC, Martin Hunt

Description Martin Hunt 2005-12-15 18:09:15 UTC
Typical compile times for simple scripts probing kernel.syscall.* are 1 to 2 minutes.

~> time stap -p2 sys.stp > foo

real    0m1.570s
user    0m1.518s
sys     0m0.052s
~> time stap -p3 sys.stp > foo

real    0m3.282s
user    0m2.136s
sys     0m1.148s
~> wc -l foo
183458 foo
~> time stap -p4 sys.stp

real    1m27.217s
user    1m23.365s
sys     0m4.691s

So we have a 183458 line C file to compile. The context struct itself is over
14000 lines long and includes stuff like:
struct function__module_flags_str_locals {
      int64_t f;
      union {
        struct {
        };
        struct {
        };
        struct {
        };
        struct {
        };
        struct {
        };
        struct {
        };
        struct {
        };
        struct {
        };
        struct {
        };
        struct {
        };
        struct {
        };
        struct {
        };
      };
      string_t __retvalue;
    } function__module_flags_str;

Everything else in the C file looks normal at first glance. Very repetitive,
obviously.
Comment 1 Martin Hunt 2005-12-15 18:10:00 UTC
Created attachment 804
my test case
Comment 2 Frank Ch. Eigler 2005-12-15 18:16:07 UTC
Are you sure you're running cvs systemtap?
Graydon made a big improvement in just this area of code a few days ago: bug #1931
Comment 3 Frank Ch. Eigler 2005-12-15 18:17:27 UTC
Never mind, misunderstood your timings.
Needs further study.
Comment 4 Graydon Hoare 2005-12-21 01:36:03 UTC
This looks decidedly wrong. Offhand, I can't tell why. It's possible that we're
simply generating too much code -- maybe 200 syscalls times a handful of
parameter-accessor functions makes "too much code" -- but it also looks like
we're generating junk.
Comment 5 Frank Ch. Eigler 2006-01-04 21:45:03 UTC
Experiments ongoing.

Counterintuitively, it seems like the probe handler bodies are *not* the
dominant factor.  With all ~500 of them commented out, the compile time is still
just as long.  Judging by the resulting function/symbol sizes, I infer that the
module_init/module_exit functions are stressing the C compiler most, and
therefore will look there first.
Comment 6 Frank Ch. Eigler 2006-01-10 18:52:36 UTC
Patches just committed appear to improve this significantly.
Comment 7 Martin Hunt 2006-01-10 20:34:24 UTC
BEFORE
~> time stap -p4 sys.stp
real    1m40.500s
user    1m35.334s
sys     0m5.947s

AFTER
~> time stap -p4 sys.stp
real    0m47.287s
user    0m46.979s
sys     0m1.393s

That was a big improvement. Still, I hope we can eventually improve upon this. 
I suggest keeping this open at a lowered priority.
Comment 8 Frank Ch. Eigler 2006-01-10 20:44:42 UTC
Right.  I anticipate further improvements are possible along these lines:

- reducing the amount of code generated (duh), particularly:
  - collecting the activity-count additions & especially checks
  - reducing the frequency of last_stmt assignments, and last_error checks
  - raising some global variable locking/unlocking code up to the outermost
    nesting level of probe/function bodies; beyond simplifying the emitted C
    code, this could reduce potential concurrency, but it would also eliminate
    a bunch of race conditions
- adjusting the kbuild CFLAGS to lessen optimization
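
For illustration only, here is a rough sketch of the lock-lifting idea from the list above. It is not the translator's actual output: toy_lock_t, toy_lock, toy_unlock, global_count and both probe_body_* functions are hypothetical names invented for this example, and the "lock" is a plain counter rather than a real kernel lock. The first function takes the lock around each statement that touches the global; the second takes it once at the outermost level of the body, so less locking code is emitted and the whole body sees a consistent value.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a global-variable lock. It only records
 * how often it is taken; it is not a real kernel locking primitive. */
typedef struct {
    int held;
    unsigned acquisitions;
} toy_lock_t;

static void toy_lock(toy_lock_t *l)
{
    assert(!l->held);       /* no recursive locking in this sketch */
    l->held = 1;
    l->acquisitions++;
}

static void toy_unlock(toy_lock_t *l)
{
    assert(l->held);
    l->held = 0;
}

/* Hypothetical script global and its lock. */
static toy_lock_t global_count_lock;
static int64_t global_count;

/* Without lifting: every statement that touches the global is wrapped
 * in its own lock/unlock pair, so the emitted code repeats the locking
 * boilerplate and another CPU could interleave between statements. */
static void probe_body_per_statement(void)
{
    toy_lock(&global_count_lock);
    global_count += 1;
    toy_unlock(&global_count_lock);

    toy_lock(&global_count_lock);
    global_count *= 2;
    toy_unlock(&global_count_lock);
}

/* With lifting: the lock is taken once at the outermost nesting level
 * of the body and released at the end. */
static void probe_body_lifted(void)
{
    toy_lock(&global_count_lock);
    global_count += 1;
    global_count *= 2;
    toy_unlock(&global_count_lock);
}
```

Starting from zero, both variants leave global_count with the same value, but the lifted one acquires the lock once instead of twice. The cost is that the lock is held across the whole body, which is the reduced concurrency mentioned above.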
Comment 9 Frank Ch. Eigler 2006-01-10 20:52:37 UTC
*** Bug 1159 has been marked as a duplicate of this bug. ***
Comment 10 Frank Ch. Eigler 2006-01-10 23:18:42 UTC
*** Bug 1330 has been marked as a duplicate of this bug. ***
Comment 11 Frank Ch. Eigler 2006-01-23 18:13:16 UTC
- will include lock lifting, unused $target elimination, and one or two other
optimizations
Comment 12 Frank Ch. Eigler 2006-01-24 17:58:40 UTC
mostly done; need just lock lifting now
Comment 13 Frank Ch. Eigler 2006-01-26 23:01:30 UTC
lock lifting done.
other future improvements are possible; will be tracked separately.