This is the mail archive of the
systemtap@sourceware.org
mailing list for the systemtap project.
RE: Review patches of user space kprobe
- From: "Zhang, Yanmin" <yanmin dot zhang at intel dot com>
- To: <prasanna at in dot ibm dot com>
- Cc: <systemtap at sources dot redhat dot com>, "Keshavamurthy, Anil S" <anil dot s dot keshavamurthy at intel dot com>, "Mao, Bibo" <bibo dot mao at intel dot com>
- Date: Thu, 22 Dec 2005 13:09:03 +0800
- Subject: RE: Review patches of user space kprobe
One more comment for patch 1/3.
register_aggr_kprobe needs to be aware of user space probe.
Yanmin
>>-----Original Message-----
>>From: systemtap-owner@sourceware.org [mailto:systemtap-owner@sourceware.org] On Behalf Of Zhang, Yanmin
>>Sent: 2005年12月22日 11:37
>>To: prasanna@in.ibm.com
>>Cc: systemtap@sources.redhat.com; Keshavamurthy, Anil S; Mao, Bibo
>>Subject: RE: Review patches of user space kprobe
>>
>>General question:
>>1) How to insert an uprobe at anonymous page (VMA)? I think there are 2 cases related to the question.
>> a) Many applications execute codes produced themselves, such like JIT (Just-In-Time) of JVM.
>> b) Some executables include TEXTREL section. When they are loaded into memory and linked dynamically, the text section might be
>>changed, and kernel will do a Copy-On-Write to create a new anonymous page and map the new page to the process address space. So after
>>the process starts, we couldn't insert uprobe on its copied pages.
>>Should a new interface be added to support it? The parameters could be process id and offset in the process address space. Of course,
>>it could be an enhancement and implemented later.
>>
>>3) Can function register_userspace_probe do not call register_kprobe? I think it's not necessary. It's just my feeling. It's up to you
>>to make decision. :)
>>
>>2) Function get_inode_ops should take care of errors and its caller, register_userspace_probe should check if the return value of
>>get_inode_ops is IS_ERR. If so, the error code should be returned instead of a hard-coded -ENOSYS.
>>
>>Other more comments against patch 1/3 are inline below.
>>
>>Yanmin
>>
>>>>-----Original Message-----
>>>>From: systemtap-owner@sourceware.org [mailto:systemtap-owner@sourceware.org] On Behalf Of Zhang, Yanmin
>>>>Sent: 2005年12月21日 16:30
>>>>To: prasanna@in.ibm.com
>>>>>>+static struct vm_area_struct *find_get_vma(unsigned long offset,
>>>>>>+ struct address_space
>>>>*mapping)
>>>>>>+{
>>>>>>+ struct vm_area_struct *vma = NULL;
>>>>>>+ struct prio_tree_iter iter;
>>>>>>+ struct prio_tree_root *head = &mapping->i_mmap;
>>>>>>+ struct mm_struct *mm;
>>>>>>+ unsigned long start, end;
>>>>>>+
>>>>>>+ spin_lock(&mapping->i_mmap_lock);
>>>>>>+ vma_prio_tree_foreach(vma, &iter, head, offset, offset) {
>>>>>>+ mm = vma->vm_mm;
>>>>>>+ spin_lock(&mm->page_table_lock);
>>>>>>+ start = vma->vm_start - (vma->vm_pgoff << PAGE_SHIFT);
>>>>>>+ end = vma->vm_end - (vma->vm_pgoff << PAGE_SHIFT);
>>>>>>+ spin_unlock(&mm->page_table_lock);
>>>>>>+ if ((start + offset) < end) {
>>>>>>+ spin_unlock(&mapping->i_mmap_lock);
>>>>>>+ return vma;
>>It's not safe to return vma without lock. There is a race condition. If vma is released by another thread, kernel might be crazy when
>>this thread tries to access it later.
>>
>>If the page is mapped to many vma in different processes, function find_get_vma just returns one vma. It's not enough.
>>
>>I'd like to suggest to do the flush_icache in the vma_prio_tree_foreach loop.
>>
>>
>>>>>>+ }
>>>>>>+ }
>>>>>>+ spin_unlock(&mapping->i_mmap_lock);
>>>>>>+ return NULL;
>>>>>>+}