This is the mail archive of the systemtap@sourceware.org mailing list for the systemtap project.
Re: [PATCH v3 2.6.39-rc1-tip 12/26] 12: uprobes: slot allocation for uprobes
- From: Peter Zijlstra <peterz at infradead dot org>
- To: Srikar Dronamraju <srikar at linux dot vnet dot ibm dot com>
- Cc: Ingo Molnar <mingo at elte dot hu>, Steven Rostedt <rostedt at goodmis dot org>, Linux-mm <linux-mm at kvack dot org>, Arnaldo Carvalho de Melo <acme at infradead dot org>, Linus Torvalds <torvalds at linux-foundation dot org>, Jonathan Corbet <corbet at lwn dot net>, Christoph Hellwig <hch at infradead dot org>, Masami Hiramatsu <masami dot hiramatsu dot pt at hitachi dot com>, Thomas Gleixner <tglx at linutronix dot de>, Ananth N Mavinakayanahalli <ananth at in dot ibm dot com>, Oleg Nesterov <oleg at redhat dot com>, Andrew Morton <akpm at linux-foundation dot org>, SystemTap <systemtap at sources dot redhat dot com>, Jim Keniston <jkenisto at linux dot vnet dot ibm dot com>, Roland McGrath <roland at hack dot frob dot com>, Andi Kleen <andi at firstfloor dot org>, LKML <linux-kernel at vger dot kernel dot org>
- Date: Mon, 18 Apr 2011 18:46:11 +0200
- Subject: Re: [PATCH v3 2.6.39-rc1-tip 12/26] 12: uprobes: slot allocation for uprobes
- References: <20110401143223.15455.19844.sendpatchset@localhost6.localdomain6> <20110401143457.15455.64839.sendpatchset@localhost6.localdomain6>
On Fri, 2011-04-01 at 20:04 +0530, Srikar Dronamraju wrote:
> Every task is allocated a fixed slot. When a probe is hit, the original
> instruction corresponding to the probe hit is copied to per-task fixed
> slot. Currently we allocate one page of slots for each mm. Bitmaps are
> used to know which slots are free. Each slot is made of 128 bytes so
> that it is cache aligned.
>
> TODO: On massively threaded processes (or if a huge number of processes
> share the same mm), there is a possibility of running out of slots.
> One alternative could be to extend the slots as and when slots are required.
As long as you're single stepping things and not using boosted probes
you can fully serialize the slot usage. Claim a slot on trap and release
the slot on finish. Claiming can wait on a free slot since you already
have the whole SLEEPY thing.
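The serialized scheme described above can be sketched in user-space C. This is only an illustration under stated assumptions, not code from the patch: the names xol_area, xol_claim_slot, xol_release_slot and xol_slot_addr are hypothetical, a 4096-byte page is assumed, and a pthread mutex and condition variable stand in for whatever sleeping primitive the kernel side would use. Only the 128-byte slot size and the bitmap come from the patch description.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

#define XOL_PAGE_SIZE	4096u	/* assumption: 4 KiB page */
#define XOL_SLOT_BYTES	128u	/* slot size from the patch description */
#define XOL_NSLOTS	(XOL_PAGE_SIZE / XOL_SLOT_BYTES)

/* The 32-bit bitmap below only works if the page holds exactly 32 slots. */
_Static_assert(XOL_NSLOTS == 32, "bitmap width must match slot count");

/* Hypothetical user-space stand-in for the per-mm slot area. */
struct xol_area {
	pthread_mutex_t lock;
	pthread_cond_t slot_freed;
	uint32_t bitmap;	/* bit n set means slot n is in use */
	uintptr_t vaddr;	/* base address of the slot page */
};

static void xol_area_init(struct xol_area *area, uintptr_t vaddr)
{
	pthread_mutex_init(&area->lock, NULL);
	pthread_cond_init(&area->slot_freed, NULL);
	area->bitmap = 0;
	area->vaddr = vaddr;
}

/* Claim on trap: sleep until a slot is free, then take the lowest one. */
static unsigned int xol_claim_slot(struct xol_area *area)
{
	unsigned int slot;

	pthread_mutex_lock(&area->lock);
	while (area->bitmap == UINT32_MAX)
		pthread_cond_wait(&area->slot_freed, &area->lock);
	/* At least one bit is clear here, so the scan stops below 32. */
	for (slot = 0; area->bitmap & (UINT32_C(1) << slot); slot++)
		;
	area->bitmap |= UINT32_C(1) << slot;
	pthread_mutex_unlock(&area->lock);
	return slot;
}

/* Release on finish: clear the bit and wake one waiter, if any. */
static void xol_release_slot(struct xol_area *area, unsigned int slot)
{
	pthread_mutex_lock(&area->lock);
	assert(slot < XOL_NSLOTS && (area->bitmap & (UINT32_C(1) << slot)));
	area->bitmap &= ~(UINT32_C(1) << slot);
	pthread_cond_signal(&area->slot_freed);
	pthread_mutex_unlock(&area->lock);
}

/* Address at which the original instruction would be copied. */
static uintptr_t xol_slot_addr(const struct xol_area *area, unsigned int slot)
{
	return area->vaddr + (uintptr_t)slot * XOL_SLOT_BYTES;
}
```

With this shape a thread holds a slot only for the duration of one single-step, so the number of threads sharing the mm no longer bounds the number of slots needed; excess threads simply wait in xol_claim_slot().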
> +static int xol_add_vma(struct uprobes_xol_area *area)
> +{
> +	struct vm_area_struct *vma;
> +	struct mm_struct *mm;
> +	struct file *file;
> +	unsigned long addr;
> +	int ret = -ENOMEM;
> +
> +	mm = get_task_mm(current);
> +	if (!mm)
> +		return -ESRCH;
> +
> +	down_write(&mm->mmap_sem);
> +	if (mm->uprobes_xol_area) {
> +		ret = -EALREADY;
> +		goto fail;
> +	}
> +
> +	/*
> +	 * Find the end of the top mapping and skip a page.
> +	 * If there is no space for PAGE_SIZE above
> +	 * that, mmap will ignore our address hint.
> +	 *
> +	 * We allocate a "fake" unlinked shmem file because
> +	 * anonymous memory might not be granted execute
> +	 * permission when the selinux security hooks have
> +	 * their way.
> +	 */
That just annoys me: we're working around some stupid sekurity crap.
Executable anonymous maps are perfectly fine; also, what do JITs do?
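For reference, a JIT-style executable anonymous mapping in user space usually looks something like the following. This is a minimal sketch, not taken from any particular JIT: the helper name jit_map_code is hypothetical, it assumes a Linux-like mmap() with MAP_ANONYMOUS, and it handles at most one page of code. The page is written while it is read/write and only then switched to read/execute; a restrictive security policy may still refuse the mprotect() call, in which case the helper returns NULL.

```c
#define _DEFAULT_SOURCE		/* for MAP_ANONYMOUS under strict -std= modes */
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper: copy len bytes of machine code into a fresh
 * anonymous page and make that page executable.
 * Returns the mapping, or NULL on failure.
 */
static void *jit_map_code(const void *code, size_t len)
{
	long page = sysconf(_SC_PAGESIZE);
	void *buf;

	assert(code != NULL);
	if (page <= 0 || len > (size_t)page)
		return NULL;

	/* Anonymous private mapping: no backing file involved. */
	buf = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE,
		   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (buf == MAP_FAILED)
		return NULL;

	memcpy(buf, code, len);

	/* Drop write permission and add execute permission. */
	if (mprotect(buf, (size_t)page, PROT_READ | PROT_EXEC) != 0) {
		munmap(buf, (size_t)page);
		return NULL;
	}
	return buf;
}
```

The sketch only sets up the mapping; calling into the returned buffer would additionally require valid code for the host architecture.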
> +	vma = rb_entry(rb_last(&mm->mm_rb), struct vm_area_struct, vm_rb);
> +	addr = vma->vm_end + PAGE_SIZE;
> +	file = shmem_file_setup("uprobes/xol", PAGE_SIZE, VM_NORESERVE);
> +	if (!file) {
> +		printk(KERN_ERR "uprobes_xol failed to setup shmem_file "
> +			"while allocating vma for pid/tgid %d/%d for "
> +			"single-stepping out of line.\n",
> +			current->pid, current->tgid);
> +		goto fail;
> +	}
> +	addr = do_mmap_pgoff(file, addr, PAGE_SIZE, PROT_EXEC, MAP_PRIVATE, 0);
> +	fput(file);
> +
> +	if (addr & ~PAGE_MASK) {
> +		printk(KERN_ERR "uprobes_xol failed to allocate a vma for "
> +			"pid/tgid %d/%d for single-stepping out of "
> +			"line.\n", current->pid, current->tgid);
> +		goto fail;
> +	}
> +	vma = find_vma(mm, addr);
> +
> +	/* Don't expand vma on mremap(). */
> +	vma->vm_flags |= VM_DONTEXPAND | VM_DONTCOPY;
> +	area->vaddr = vma->vm_start;
> +	if (get_user_pages(current, mm, area->vaddr, 1, 1, 1, &area->page,
> +				&vma) > 0)
> +		ret = 0;
> +
> +fail:
> +	up_write(&mm->mmap_sem);
> +	mmput(mm);
> +	return ret;
> +}