This is the mail archive of the systemtap@sourceware.org mailing list for the systemtap project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [ltt-dev] [PATCH] Fix tracepoints for network socket


* Atsushi Tsuji (a-tsuji@bk.jp.nec.com) wrote:
> Hi,
>
> Currently, the tracepoints for network socket could not trace all the
> network activity due to its location, sock_{send/recv}msg, because there
> is the path without through sock_{send/recv}msg (like below).
>
> Kernel path for sendmsg:
>   sys_write      sys_{send/sendto/sendmsg}
>      |                     |
>   sock_aio_write   sock_sendmsg
>        \                  /
>         \                /
>           __sock_sendmsg
>
> So I think __sock_{send/recv}msg is better tracepoints to track
> network socket activity.
>

Hi Atsushi,

That's a great patch ! Thanks !

Benjamin noticed this problematic behavior recently and was starting to
investigate on this issue. Your patch comes timely. :)

> And I'd like to request to get the return value on those tracepoints
> to track the real size of sending/recieving by user and the error status
> of __sock_{send/recv}msg.
>

Can we use the system call exit return value instead ? This should
already be recording exactly this information.

One more comment below.

> The below patch is for lttng tree to change those tracepoints.
>
> Signed-off-by: Atsushi Tsuji <a-tsuji@bk.jp.nec.com>
> ---
> diff --git a/ltt/probes/net-trace.c b/ltt/probes/net-trace.c
> index 2c88ec6..34cef1a 100644
> --- a/ltt/probes/net-trace.c
> +++ b/ltt/probes/net-trace.c
> @@ -79,18 +79,18 @@ void probe_socket_sendmsg(struct socket *sock, struct 
> msghdr
> *msg,
>  		size_t size, int ret)
>  {
>  	trace_mark_tp(net, socket_sendmsg, socket_sendmsg, probe_socket_sendmsg,
> -		"sock %p family %d type %d protocol %d size %zu",
> +		"sock %p family %d type %d protocol %d size %zu ret %d",
>  		sock, sock->sk->sk_family, sock->sk->sk_type,
> -		sock->sk->sk_protocol, size);
> +		sock->sk->sk_protocol, size, ret);
>  }
>
>  void probe_socket_recvmsg(struct socket *sock, struct msghdr *msg,
>  		size_t size, int flags, int ret)
>  {
>  	trace_mark_tp(net, socket_recvmsg, socket_recvmsg, probe_socket_recvmsg,
> -		"sock %p family %d type %d protocol %d size %zu",
> +		"sock %p family %d type %d protocol %d size %zu ret %d",
>  		sock, sock->sk->sk_family, sock->sk->sk_type,
> -		sock->sk->sk_protocol, size);
> +		sock->sk->sk_protocol, size, ret);
>  }
>
>  void probe_socket_create(struct socket *sock, int fd)
> diff --git a/net/socket.c b/net/socket.c
> index 2b63bab..e8b2b5d 100644
> --- a/net/socket.c
> +++ b/net/socket.c
> @@ -567,7 +567,11 @@ static inline int __sock_sendmsg(struct kiocb *iocb, 
> struct
> socket *sock,
>  	if (err)
>  		return err;
>
> -	return sock->ops->sendmsg(iocb, sock, msg, size);
> +	err = sock->ops->sendmsg(iocb, sock, msg, size);
> +	trace_socket_sendmsg(sock, msg, size, err);
> +
> +	return err;
> +
>  }
>
>  int sock_sendmsg(struct socket *sock, struct msghdr *msg, size_t size)
> @@ -581,7 +585,6 @@ int sock_sendmsg(struct socket *sock, struct msghdr 
> *msg,
> size_t size)
>  	ret = __sock_sendmsg(&iocb, sock, msg, size);
>  	if (-EIOCBQUEUED == ret)
>  		ret = wait_on_sync_kiocb(&iocb);
> -	trace_socket_sendmsg(sock, msg, size, ret);


I am just wondering about the implication of moving this trace event
across the wait_on_sync_kiocb() call. This means that the sendmsg event
will now occur before we block on kiocb. I don't think it changes
anything given we have the full syscall and scheduler instrumentation
which tells us in which state we are (which syscall) and when we are
waiting. This sendmsg event is really there to add more syscall-specific
information. The problem is that we will get the return value of
__sock_sendmsg rather than the return value of wait_on_sync_kiocb, so in
the case __sock_sendmsg returns -EIOCBQUEUED, we end up not knowing the
"real" return value of the system call.

So given we already have the syscall return value in the syscall_exit
event, I wonder if it's really useful to add this information to the
sendmsg and recvmsg tracepoints ?

Best regards,

Mathieu


>  	return ret;
>  }
>
> @@ -650,7 +653,11 @@ static inline int __sock_recvmsg(struct kiocb *iocb, 
> struct
> socket *sock,
>  	if (err)
>  		return err;
>
> -	return sock->ops->recvmsg(iocb, sock, msg, size, flags);
> +	err = sock->ops->recvmsg(iocb, sock, msg, size, flags);
> +	trace_socket_recvmsg(sock, msg, size, flags, err);
> +
> +	return err;
> +
>  }
>
>  int sock_recvmsg(struct socket *sock, struct msghdr *msg,
> @@ -666,7 +673,6 @@ int sock_recvmsg(struct socket *sock, struct msghdr 
> *msg,
>  	ret = __sock_recvmsg(&iocb, sock, msg, size, flags);
>  	if (-EIOCBQUEUED == ret)
>  		ret = wait_on_sync_kiocb(&iocb);
> -	trace_socket_recvmsg(sock, msg, size, flags, ret);
>  	return ret;
>  }
>
>
>
>
>
> _______________________________________________
> ltt-dev mailing list
> ltt-dev@lists.casi.polymtl.ca
> http://lists.casi.polymtl.ca/cgi-bin/mailman/listinfo/ltt-dev
>

-- 
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]