This is the mail archive of the systemtap@sourceware.org mailing list for the systemtap project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH -tip v3&10 07/18] x86: Add text_poke_smp for SMP cross modifying code


* Masami Hiramatsu (mhiramat@redhat.com) wrote:
> Mathieu Desnoyers wrote:
> > * Masami Hiramatsu (mhiramat@redhat.com) wrote:
> [...]
> >> +
> >> +/*
> >> + * Cross-modifying kernel text with stop_machine().
> >> + * This code originally comes from immediate value.
> >> + */
> >> +static atomic_t stop_machine_first;
> >> +static int wrote_text;
> >> +
> >> +struct text_poke_params {
> >> +	void *addr;
> >> +	const void *opcode;
> >> +	size_t len;
> >> +};
> >> +
> >> +static int __kprobes stop_machine_text_poke(void *data)
> >> +{
> >> +	struct text_poke_params *tpp = data;
> >> +
> >> +	if (atomic_dec_and_test(&stop_machine_first)) {
> >> +		text_poke(tpp->addr, tpp->opcode, tpp->len);
> >> +		smp_wmb();	/* Make sure other cpus see that this has run */
> >> +		wrote_text = 1;
> >> +	} else {
> >> +		while (!wrote_text)
> >> +			smp_rmb();
> >> +		sync_core();
> >
> > Hrm, there is a problem in there. The last loop, when wrote_text becomes
> > true, does not perform any smp_mb(), so you end up in a situation where
> > cpus in the "else" branch may never issue any memory barrier. I'd rather
> > do:
> 
> Hmm, so how about this? :)
> ---
> } else {
> 	do {
> 		smp_rmb();
> 	} while (!wrote_text);
> 	sync_core();
> }
> ---
> 

The orderings we are looking for here are:

Write-side: smp_wmb() orders text_poke stores before store to wrote_text.

Read-side: order wrote_text load before subsequent execution of modified
           instructions.

Here again, strictly speaking, the wrote_text load is not ordered with respect to
the following instructions. So maybe it's fine on x86-TSO specifically, but I
would not count on this kind of synchronization to work in the general case.

Given the very small expected performance impact of this code path, I would
recommend using the more solid/generic alternative below. If there is really a
gain to be had by creating this weird wait loop with strange memory barrier
semantics, fine; otherwise I'd be reluctant to accept your proposals as
obviously correct.

If you really, really want to go down the route of proving the correctness of
your memory barrier usage, I can recommend looking at the memory barrier formal
verification framework I did as part of my thesis. But, really, in this case,
the performance gain is just not there, so there is no point in spending time
trying to prove this.

Thanks,

Mathieu

> >
> > +static volatile int wrote_text;
> >
> > ...
> >
> > +static int __kprobes stop_machine_text_poke(void *data)
> > +{
> > +	struct text_poke_params *tpp = data;
> > +
> > +	if (atomic_dec_and_test(&stop_machine_first)) {
> > +		text_poke(tpp->addr, tpp->opcode, tpp->len);
> > +		smp_wmb();	/* order text_poke stores before store to wrote_text */
> > +		wrote_text = 1;
> > +	} else {
> > +		while (!wrote_text)
> > +			cpu_relax();
> > +		smp_mb();	/* order wrote_text load before following execution */
> > +	}
> >
> > If you don't like the "volatile int" definition of wrote_text, then we
> > should probably use the ACCESS_ONCE() macro instead.
> 
> hm, yeah, volatile will be required.
> 
> Thank you,
> 
> 
> -- 
> Masami Hiramatsu
> e-mail: mhiramat@redhat.com
> 
> 
> 

-- 
Mathieu Desnoyers
Operating System Efficiency Consultant
EfficiOS Inc.
http://www.efficios.com

