This is the mail archive of the systemtap@sourceware.org mailing list for the systemtap project.
Re: notify_page_fault() problem
On Mon, Apr 30, 2007 at 11:23:22PM +0200, Andi Kleen wrote:
Quentin Barnes <qbarnes@urbana.css.mot.com> writes:
Now on i386's do_page_fault(), it avoids the above infinite
recursion by checking to see if the fault happened in kernel space.
Actually the real avoidance is by calling vmalloc_sync_all() when
the notifier is registered. You probably need to implement an equivalent
for ARM.
Ah, okay. I see a call to vmalloc_sync_all() in
register_page_fault_notifier() which would also resolve this
problem and it was put in around 2.6.18.
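The effect of syncing at registration time can be illustrated with a toy userspace model in C. It is a sketch, not kernel code: every name in it (toy_fault, toy_access, toy_register, toy_run, the slot numbers, MAX_DEPTH) is hypothetical, and it assumes only what the thread describes, namely that vmalloc-area mappings are copied lazily into each process's page table and that the notifier's code lives in such a mapping.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define NPROC     3   /* modelled processes, each with its own page table */
#define NSLOT     4   /* modelled vmalloc-area mappings */
#define MAX_DEPTH 8   /* cut-off standing in for unbounded recursion */

static bool master[NSLOT];           /* reference kernel page table */
static bool per_proc[NPROC][NSLOT];  /* per-process copies, filled lazily */
static int  current_proc;
static int  handler_slot = -1;       /* slot holding the notifier code, -1 if none */
static int  max_depth_seen;

static void toy_fault(int slot, int depth);

/* Touch a vmalloc slot; fault if the current process lacks the mapping. */
static void toy_access(int slot, int depth)
{
    assert(slot >= 0 && slot < NSLOT);
    if (!per_proc[current_proc][slot])
        toy_fault(slot, depth + 1);
}

/* Fault path that runs the notifier before any lazy fix-up.
 * slot < 0 models a fault unrelated to the vmalloc area. */
static void toy_fault(int slot, int depth)
{
    if (depth > max_depth_seen)
        max_depth_seen = depth;
    if (depth >= MAX_DEPTH)
        return;                      /* a real system would keep recursing */
    if (handler_slot >= 0)
        toy_access(handler_slot, depth);  /* executing the notifier touches its code */
    if (slot >= 0)
        per_proc[current_proc][slot] = master[slot];  /* lazy fix-up */
}

/* Register a notifier whose code sits in the given vmalloc slot. */
static void toy_register(int slot, bool sync_all)
{
    assert(slot >= 0 && slot < NSLOT);
    master[slot] = true;             /* mapping exists in the reference table */
    if (sync_all)                    /* copy every mapping into every process */
        for (int p = 0; p < NPROC; p++)
            for (int s = 0; s < NSLOT; s++)
                per_proc[p][s] = master[s];
    handler_slot = slot;
}

/* Reset the model, register, take one ordinary fault, report nesting depth. */
static int toy_run(bool sync_all)
{
    memset(master, 0, sizeof master);
    memset(per_proc, 0, sizeof per_proc);
    handler_slot = -1;
    max_depth_seen = 0;
    toy_register(2, sync_all);
    current_proc = 1;
    toy_fault(-1, 1);
    return max_depth_seen;
}
```

Called from a main(), toy_run(false) nests until it hits the MAX_DEPTH cut-off, while toy_run(true) handles the fault without nesting at all. The model says nothing about how an ARM equivalent would be written.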
However, vmalloc_sync_all() is i386 and x86_64 specific, as is
their change to register_page_fault_notifier(). I don't see
other platforms doing anything special in their
register_page_fault_notifier(). I have trouble believing that x86
and ARM are somehow unique in needing to address this problem.
Why doesn't anyone else hit this? Is it a lurking problem or are
there other fixes in other forms out there?
One thing I don't understand is why notify_page_fault() is called
so early in everyone's page fault handling code.
So that users like kprobes don't deadlock on a fault inside
a region that takes the mmap_sem (or some other lock taken by the
page fault handler).
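The ordering argument can be sketched with another toy userspace model in C. Everything here is hypothetical (the toy_* names, the single boolean standing in for a non-recursive lock held exclusively), and it assumes a notifier that claims any fault raised while a probe handler is running. It only shows why the position of the notifier call matters. It is not how any architecture's fault handler is written.

```c
#include <assert.h>
#include <stdbool.h>

enum toy_result {
    TOY_HANDLED_BY_NOTIFIER,
    TOY_HANDLED_NORMALLY,
    TOY_WOULD_DEADLOCK
};

/* Stands in for a non-recursive lock that the fault path also needs. */
static bool toy_lock_held;

/* Hypothetical notifier: claims faults raised while a probe handler runs. */
static bool toy_notifier(bool in_probe)
{
    return in_probe;
}

/* Fault path; notify_early selects where the notifier is called. */
static enum toy_result toy_page_fault(bool in_probe, bool notify_early)
{
    bool claimed;

    /* Early placement: the notifier runs before any lock is taken. */
    if (notify_early && toy_notifier(in_probe))
        return TOY_HANDLED_BY_NOTIFIER;

    /* The fault path needs the lock; if it is already held, real code
     * would block forever here, so the model reports that instead. */
    if (toy_lock_held)
        return TOY_WOULD_DEADLOCK;

    toy_lock_held = true;
    /* Late placement: the notifier runs only after the lock is acquired. */
    claimed = !notify_early && toy_notifier(in_probe);
    toy_lock_held = false;

    return claimed ? TOY_HANDLED_BY_NOTIFIER : TOY_HANDLED_NORMALLY;
}

/* A probe handler faults while the probed code already holds the lock. */
static enum toy_result toy_probe_in_locked_region(bool notify_early)
{
    enum toy_result r;

    assert(!toy_lock_held);
    toy_lock_held = true;            /* probed region took the lock */
    r = toy_page_fault(true, notify_early);
    toy_lock_held = false;           /* probed region releases it */
    return r;
}
```

With the early call the notifier claims the fault before the lock is needed. With the late call the model reports the would-be deadlock.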
I think I follow, but I'm not positive.
I would expect that by the time the page fault control flow had
worked through and checked all the expected cases, any locks it
had obtained would have been released.
I guess part of the answer has to do with what people's expectations
are for intercepting faults with their kprobes fault handler though.
-Andi
Quentin