This is the mail archive of the systemtap@sourceware.org mailing list for the systemtap project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [RFC/PATCH] TCP and Unix socket accept queue time examples


On 12/07/2012 08:13 PM, Eric Wong wrote:
> These examples can be useful for measuring how long it
> took an application to accept a newly-connected client.
> ---
>  This is my first submission to SystemTap, comments greatly appreciated!
> 
>  .../network/tcp_acceptq_time.meta                  | 13 ++++++++
>  .../network/tcp_acceptq_time.stp                   | 17 +++++++++++
>  .../network/unix_acceptq_time.meta                 | 13 ++++++++
>  .../network/unix_acceptq_time.stp                  | 35 ++++++++++++++++++++++
>  4 files changed, 78 insertions(+)
>  create mode 100644 testsuite/systemtap.examples/network/tcp_acceptq_time.meta
>  create mode 100755 testsuite/systemtap.examples/network/tcp_acceptq_time.stp
>  create mode 100644 testsuite/systemtap.examples/network/unix_acceptq_time.meta
>  create mode 100755 testsuite/systemtap.examples/network/unix_acceptq_time.stp
> 
> diff --git a/testsuite/systemtap.examples/network/tcp_acceptq_time.meta b/testsuite/systemtap.examples/network/tcp_acceptq_time.meta
> new file mode 100644
> index 0000000..a2f5f81
> --- /dev/null
> +++ b/testsuite/systemtap.examples/network/tcp_acceptq_time.meta
> @@ -0,0 +1,13 @@
> +title: Accept Queue time for TCP sockets
> +name: tcp_acceptq_time.stp
> +version: 1.0
> +author: Eric Wong
> +keywords: network queue accept tcp
> +subsystem: network
> +status: experimental
> +exit: user-controlled
> +output: trace
> +scope: system-wide
> +description: Print out executable name, PID and time an accepted TCP socket spent in the listen queue
> +test_check: stap -p4 tcp_acceptq_time.stp
> +test_installcheck: stap tcp_acceptq_time.stp -c "sleep 0.2"
> diff --git a/testsuite/systemtap.examples/network/tcp_acceptq_time.stp b/testsuite/systemtap.examples/network/tcp_acceptq_time.stp
> new file mode 100755
> index 0000000..b979804
> --- /dev/null
> +++ b/testsuite/systemtap.examples/network/tcp_acceptq_time.stp
> @@ -0,0 +1,17 @@
> +#! /usr/bin/env stap
> +
> +/* wrap-around just in case */

what does the comment above mean?

> +global tcp_acceptq_start%

What is the '%' doing at the end of tcp_acceptq_start?

> +
> +probe kernel.function("sk_acceptq_added") {
> +	tcp_acceptq_start[$sk] = cpu_clock_us(0)
> +}

Why use cpu_clock_us(0) from processor 0? On multiprocessor systems this can cause an interprocessor interrupt.
Why not use gettimeofday_us() instead?

> +
> +probe kernel.function("sk_acceptq_removed") {
> +	started = tcp_acceptq_start[$sk]
> +	if (started) {
> +		delete tcp_acceptq_start[$sk]
> +		diff = cpu_clock_us(0) - started
> +		printf("%s[%d] %ld\n", execname(), pid(), diff)
> +	}
> +}

The sk_acceptq_added increments backlog and sk_accepts_removed decrements, so there could be more than thing listening.  So this is only going to count the time from the last sk_acceptq_added to the first sk_acceptq_removed.  Also it sk_acceptq_remove isn't clearing out tcp_acceptq_start global array so eventually that will fill up. Which tiome is most useful:

-from earliest sk_acceptq_added to earliest sk_acceptq_removed
-from latest sk_acceptq_added to earliest sk_acceptq_removed
-from earliest sk_acceptq_added to latest sk_acceptq_removed
-from latest sk_acceptq_added to latest sk_acceptq_removed

> diff --git a/testsuite/systemtap.examples/network/unix_acceptq_time.meta b/testsuite/systemtap.examples/network/unix_acceptq_time.meta
> new file mode 100644
> index 0000000..7ed5c88
> --- /dev/null
> +++ b/testsuite/systemtap.examples/network/unix_acceptq_time.meta
> @@ -0,0 +1,13 @@
> +title: Accept Queue time for Unix stream sockets
> +name: unix_acceptq_time.stp
> +version: 1.0
> +author: Eric Wong
> +keywords: network queue accept unix listen
> +subsystem: network
> +status: experimental
> +exit: user-controlled
> +output: trace
> +scope: system-wide
> +description: Print out executable name, PID and time an accepted Unix socket spent in the listen queue
> +test_check: stap -p4 unix_acceptq_time.stp
> +test_installcheck: stap unix_acceptq_time.stp -c "sleep 0.2"
> diff --git a/testsuite/systemtap.examples/network/unix_acceptq_time.stp b/testsuite/systemtap.examples/network/unix_acceptq_time.stp
> new file mode 100755
> index 0000000..e514d6d
> --- /dev/null
> +++ b/testsuite/systemtap.examples/network/unix_acceptq_time.stp
> @@ -0,0 +1,35 @@
> +#! /usr/bin/env stap
> +
> +/* wrap-around just in case there are more unix_create1() callers */
> +global unix_acceptq_start%

What is the '%' at the end of the line for?

> +
> +/*
> + * Linux 3.6.x only has two unix_create1() callers:
> + * - unix_stream_connect() - we care about this
> + * - unix_create()         - we do not care about this
> + */
> +probe kernel.function("unix_create1").return {
> +	unix_acceptq_start[$return] = cpu_clock_us(0)
> +}

unix_create1() is a static function in the kernel that appears to only be called by other functions in the file.  This currently works, but this function could be inlined by the compiler and then the function().return probe point won't be available.

Probably should use gettimeofday_us() rather than the cpu_clock_us(0) which could cause an interprocessor interrupt on smp machines.
> +
> +/*
> + * we only care about unix_create1() called from unix_stream_connect(),
> + * so avoid wasting space here.
> + * Maybe future (or past) kernels had more unix_create1() callers,
> + * wraparound for the unix_acceptq_start array will prevent us from hitting limits
> + */
> +probe kernel.function("unix_create").return {
> +	delete unix_acceptq_start[$return]
> +}

Would it be possible to watch the unix_stream_connect() directly and avoid probing unix_create() and unix_create1()?


> +
> +probe kernel.function("unix_accept").return {
> +	if ($return != 0)
> +		next
> +	sk = @cast($newsock, "struct socket")->sk
> +	started = unix_acceptq_start[sk]
> +	if (started) {
> +		delete unix_acceptq_start[sk]
> +		diff = cpu_clock_us(0) - started
> +		printf("%s[%d] %ld\n", execname(), pid(), diff)
> +	}
> +}
> 

-Will


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]