summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* powerpc: Uprobes port to powerpcf18_uprobes_ppcAnanth N Mavinakayanahalli2012-10-296-1/+251
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is the port of uprobes to powerpc. Usage is similar to x86. [root@xxxx ~]# ./bin/perf probe -x /lib64/libc.so.6 malloc Added new event: probe_libc:malloc (on 0xb4860) You can now use it in all perf tools, such as: perf record -e probe_libc:malloc -aR sleep 1 [root@xxxx ~]# ./bin/perf record -e probe_libc:malloc -aR sleep 20 [ perf record: Woken up 22 times to write data ] [ perf record: Captured and wrote 5.843 MB perf.data (~255302 samples) ] [root@xxxx ~]# ./bin/perf report --stdio ... 69.05% tar libc-2.12.so [.] malloc 28.57% rm libc-2.12.so [.] malloc 1.32% avahi-daemon libc-2.12.so [.] malloc 0.58% bash libc-2.12.so [.] malloc 0.28% sshd libc-2.12.so [.] malloc 0.08% irqbalance libc-2.12.so [.] malloc 0.05% bzip2 libc-2.12.so [.] malloc 0.04% sleep libc-2.12.so [.] malloc 0.03% multipathd libc-2.12.so [.] malloc 0.01% sendmail libc-2.12.so [.] malloc 0.01% automount libc-2.12.so [.] malloc The trap_nr addition patch is a prereq. Signed-off-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* powerpc: Add trap_nr to thread_structAnanth N Mavinakayanahalli2012-10-294-0/+5
| | | | | | | | | | | | Add thread_struct.trap_nr and use it to store the last exception the thread experienced. In this patch, we populate the field at various places where we force_sig_info() to the process. This is also used in uprobes to determine if the probed instruction caused an exception. Signed-off-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* powerpc: Consolidate {k,u}probe definitionsAnanth N Mavinakayanahalli2012-10-292-13/+44
| | | | | | | | | | | | Move is_trap() and relatives to a common file to be shared between kprobes and uprobes. Code movement only; no change in functionality. Suggested by Michael Ellerman. Signed-off-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* uprobes: add exports necessary for uprobes use by modulesf18_uprobes_outJosh Stone2012-10-292-0/+9
| | | | | | ssia. Signed-off-by: Anton Arapov <anton@redhat.com>
* uprobes: Fix the racy uprobe->flags manipulationf18_uprobes_upstreamf18_uprobes_tipf18_uprobes_olegOleg Nesterov2012-10-291-14/+14
| | | | | | | | | | | | Multiple threads can manipulate uprobe->flags, this is obviously unsafe. For example mmap can set UPROBE_COPY_INSN while register tries to set UPROBE_RUN_HANDLER, the latter can also race with can_skip_sstep() which clears UPROBE_SKIP_SSTEP. Change this code to use bitops. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
* uprobes: Fix prepare_uprobe() race with itselfOleg Nesterov2012-10-291-0/+8
| | | | | | | | | | | | | | | | | | | | install_breakpoint() is called under mm->mmap_sem, this protects set_swbp() but not prepare_uprobe(). Two or more different tasks can call install_breakpoint()->prepare_uprobe() at the same time, this leads to numerous problems if UPROBE_COPY_INSN is not set. Just for example, the second copy_insn() can corrupt the already analyzed/fixuped uprobe->arch.insn and race with handle_swbp(). This patch simply adds uprobe->copy_mutex to serialize this code. We could probably reuse ->consumer_rwsem, but this would mean that consumer->handler() can not use mm->mmap_sem, not good. Note: this is another temporary ugly hack until we move this logic into uprobe_register(). Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
* uprobes: Introduce prepare_uprobe()Oleg Nesterov2012-10-292-29/+41
| | | | | | | | | | | Preparation. Extract the copy_insn/arch_uprobe_analyze_insn code from install_breakpoint() into the new helper, prepare_uprobe(). And move uprobe->flags defines from uprobes.h to uprobes.c, nobody else can use them anyway. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
* uprobes: Fix handle_swbp() vs unregister() + register() raceOleg Nesterov2012-10-291-0/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | Strictly speaking this race was added by me in 56bb4cf6. However I think that this bug is just another indication that we should move copy_insn/uprobe_analyze_insn code from install_breakpoint() to uprobe_register(), there are a lot of other reasons for that. Until then, add a hack to close the race. A task can hit uprobe U1, but before it calls find_uprobe() this uprobe can be unregistered *AND* another uprobe U2 can be added to uprobes_tree at the same inode/offset. In this case handle_swbp() will use the not-fully-initialized U2, in particular its arch.insn for xol. Add the additional !UPROBE_COPY_INSN check into handle_swbp(), if this flag is not set we simply restart as if the new uprobe was not inserted yet. This is not very nice, we need barriers, but we will remove this hack when we change uprobe_register(). Note: with or without this patch install_breakpoint() can race with itself, yet another reson to kill UPROBE_COPY_INSN altogether. And even the usage of uprobe->flags is not safe. See the next patches. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
* uprobes: Do not delete uprobe if uprobe_unregister() failsOleg Nesterov2012-10-291-6/+6
| | | | | | | | | | | | | | | | | | | | | | delete_uprobe() must not be called if register_for_each_vma(false) fails to remove all breakpoints, __uprobe_unregister() is correct. The problem is that register_for_each_vma(false) always returns 0 and thus this logic does not work. 1. Change verify_opcode() to return 0 rather than -EINVAL when unregister detects the !is_swbp insn, we can treat this case as success and currently unregister paths ignore the error code anyway. 2. Change remove_breakpoint() to propagate the error code from write_opcode(). 3. Change register_for_each_vma(is_register => false) to remove as much breakpoints as possible but return non-zero if remove_breakpoint() fails at least once. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
* uprobes: Don't return success if alloc_uprobe() failsOleg Nesterov2012-10-291-1/+3
| | | | | | | If alloc_uprobe() fails uprobe_register() should return ENOMEM, not 0. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
* uprobes/x86: Only rep+nop can be emulated correctlyOleg Nesterov2012-10-291-14/+2
| | | | | | | | | | | | | | | __skip_sstep() correctly detects the "nontrivial" nop insns, but since it doesn't update regs->ip we can not really skip "0x0f 0x1f | 0x0f 0x19 | 0x87 0xc0", the probed application is killed by SIGILL'ed handle_swbp(). Remove these additional checks. If we want to implement this correctly we need to know the full insn length to update ->ip. rep* + nop is fine even without updating ->ip. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
* uprobes: Simplify is_swbp_at_addr(), remove stale commentsOleg Nesterov2012-10-291-49/+24
| | | | | | | | | | | After the previous change is_swbp_at_addr() is always called with current->mm. Remove this check and move it close to its single caller. Also, remove the obsolete comment about is_swbp_at_addr() and uprobe_state.count. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
* uprobes: Kill set_orig_insn()->is_swbp_at_addr()Oleg Nesterov2012-10-291-9/+23
| | | | | | | | | | | | | | | | | | | | Unlike set_swbp(), set_orig_insn()->is_swbp_at_addr() makes sense, although it can't prevent all confusions. But the usage of is_swbp_at_addr() is equally confusing, and it adds the extra get_user_pages() we can avoid. This patch removes set_orig_insn()->is_swbp_at_addr() but changes write_opcode() to do the necessary checks before replace_page(). Perhaps it also makes sense to ensure PAGE_MAPPING_ANON in unregister case. find_active_uprobe() becomes the only user of is_swbp_at_addr(), we can change its semantics. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
* uprobes: Introduce copy_opcode(), kill read_opcode()Oleg Nesterov2012-10-291-44/+19
| | | | | | | | | | | | | | | | | No functional changes, preparations. 1. Extract the kmap-and-memcpy code from read_opcode() into the new trivial helper, copy_opcode(). The next patch will add another user. 2. read_opcode() becomes really trivial, fold it into its single caller, is_swbp_at_addr(). 3. Remove "auprobe" argument from write_opcode(), it is not used since f403072c6. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
* uprobes: Kill set_swbp()->is_swbp_at_addr()Oleg Nesterov2012-10-291-11/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | A separate patch for better documentation. set_swbp()->is_swbp_at_addr() is not needed for correctness, it is harmless to do the unnecessary __replace_page(old_page, new_page) when these 2 pages are identical. And it can not be counted as optimization. mmap/register races are very unlikely, while in the likely case is_swbp_at_addr() adds the extra get_user_pages() even if the caller is uprobe_mmap(current->mm) and returns false. Note also that the semantics/usage of is_swbp_at_addr() in uprobe.c is confusing. set_swbp() uses it to detect the case when this insn was already modified by uprobes, that is why it should always compare the opcode with UPROBE_SWBP_INSN even if the hardware (like powerpc) has other trap insns. It doesn't matter if this breakpoint was in fact installed by gdb or application itself, we are going to "steal" this breakpoint anyway and execute the original insn from vm_file even if it no longer matches the memory. OTOH, handle_swbp()->find_active_uprobe() uses is_swbp_at_addr() to figure out whether we need to send SIGTRAP or not if we can not find uprobe, so in this case it should return true for all trap variants, not only for UPROBE_SWBP_INSN. This patch removes set_swbp()->is_swbp_at_addr(), the next patches will remove it from set_orig_insn() which is similar to set_swbp() in this respect. So the only caller will be handle_swbp() and we can make its semantics clear. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
* uprobes: Restrict valid_vma(false) to skip VM_SHARED vmasOleg Nesterov2012-10-291-9/+4
| | | | | | | | | | | | | | | | | | valid_vma(false) ignores ->vm_flags, this is not actually right. We should never try to write into MAP_SHARED mapping, this can confuse an apllication which actually writes to ->vm_file. With this patch valid_vma(false) ignores VM_WRITE only but checks other (immutable) bits checked by valid_vma(true). This can also speedup uprobe_munmap() and uprobe_unregister(). Note: even after this patch _unregister can confuse the probed application if it does mprotect(PROT_WRITE) after _register and installs "int3", but this is hardly possible to avoid and this doesn't differ from gdb case. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
* uprobes: Change valid_vma() to demand VM_MAYEXEC rather than VM_EXECOleg Nesterov2012-10-291-2/+2
| | | | | | | | | | | | uprobe_register() or uprobe_mmap() requires VM_READ | VM_EXEC, this is not right. An apllication can do mprotect(PROT_EXEC) later and execute this code. Change valid_vma(is_register => true) to check VM_MAYEXEC instead. No need to check VM_MAYREAD, it is always set. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
* uprobes: Change write_opcode() to use FOLL_FORCEOleg Nesterov2012-10-291-1/+1
| | | | | | | | | | | | | | | | write_opcode()->get_user_pages() needs FOLL_FORCE to ensure we can read the page even if the probed task did mprotect(PROT_NONE) after uprobe_register(). Without FOLL_WRITE, FOLL_FORCE doesn't have any side effect but allows to read the !VM_READ memory. Otherwiese the subsequent uprobe_unregister()->set_orig_insn() fails and we leak "int3". If that task does mprotect(PROT_READ | EXEC) and execute the probed insn later it will be killed. Note: in fact this is also needed for _register, see the next patch. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
* uprobes: Move clear_thread_flag(TIF_UPROBE) to uprobe_notify_resume()Oleg Nesterov2012-10-292-3/+3
| | | | | | | | Move clear_thread_flag(TIF_UPROBE) from do_notify_resume() to uprobe_notify_resume() for !CONFIG_UPROBES case. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
* uprobes: Kill UTASK_BP_HIT stateOleg Nesterov2012-10-292-21/+9
| | | | | | | | | | | | | | | Kill UTASK_BP_HIT state, it buys nothing but complicates the code. It is only used in uprobe_notify_resume() to decide who should be called, we can check utask->active_uprobe != NULL instead. And this allows us to simplify handle_swbp(), no need to clear utask->state. Likewise we could kill UTASK_SSTEP, but UTASK_BP_HIT is worse and imho should die. The problem is, it creates the special case when task->utask is NULL, we can't distinguish RUNNING and BP_HIT. With this patch utask == NULL always means RUNNING. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
* uprobes: Fix UPROBE_SKIP_SSTEP checks in handle_swbp()Oleg Nesterov2012-10-291-16/+15
| | | | | | | | | | | | | | | | | | | If handle_swbp()->add_utask() fails but UPROBE_SKIP_SSTEP is set, cleanup_ret: path do not restart the insn, this is wrong. Remove this check and add the additional label for can_skip_sstep() = T case. Note also that UPROBE_SKIP_SSTEP can be false positive, we simply can not trust it unless arch_uprobe_skip_sstep() was already called. Also, move another UPROBE_SKIP_SSTEP check before can_skip_sstep() into this helper, this looks more clean and understandable. Note: probably we should rename "skip" to "emulate" and I think that "clear UPROBE_SKIP_SSTEP" should be moved to arch_can_skip. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
* uprobes: Do not setup ->active_uprobe/state prematurelyOleg Nesterov2012-10-291-6/+3
| | | | | | | | | | | | | handle_swbp() sets utask->active_uprobe before handler_chain(), and UTASK_SSTEP before pre_ssout(). This complicates the code for no reason, arch_ hooks or consumer->handler() should not (and can't) use this info. Change handle_swbp() to initialize them after pre_ssout(), and remove the no longer needed cleanup-utask code. Signed-off-by: Oleg Nesterov <oleg@redhat.com> cked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
* uprobes: Do not leak UTASK_BP_HIT if find_active_uprobe() failsOleg Nesterov2012-10-291-1/+4
| | | | | | | | | | | If handle_swbp()->find_active_uprobe() fails we return with utask->state = UTASK_BP_HIT. Change handle_swbp() to reset utask->state at the start. Note that we do this unconditionally, see the next patch(es). Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
* uprobes: Make arch_uprobe_task->saved_trap_nr "unsigned int"Oleg Nesterov2012-10-291-1/+1
| | | | | | | | Make arch_uprobe_task->saved_trap_nr "unsigned int" and move it down after ->saved_scratch_register, this changes sizeof() from 24 to 16. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
* uprobes/x86: Fix arch_uprobe_disable_step() && UTASK_SSTEP_TRAPPED interactionOleg Nesterov2012-10-291-4/+10
| | | | | | | | | | | | arch_uprobe_disable_step() should also take UTASK_SSTEP_TRAPPED into account. In this case the probed insn was not executed, we need to clear X86_EFLAGS_TF if it was set by us and that is all. Again, this code will look more clean when we move it into arch_uprobe_post_xol() and arch_uprobe_abort_xol(). Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
* uprobes/x86: Xol should send SIGTRAP if X86_EFLAGS_TF was setOleg Nesterov2012-10-292-8/+14
| | | | | | | | | | | | | | | | | | | | | | | | | arch_uprobe_disable_step() correctly preserves X86_EFLAGS_TF and returns to user-mode. But this means the application gets SIGTRAP only after the next insn. This means that UPROBE_CLEAR_TF logic is not really right. _enable should only record the state of X86_EFLAGS_TF, and _disable should check it separately from UPROBE_FIX_SETF. Remove arch_uprobe_task->restore_flags, add ->saved_tf instead, and change enable/disable accordingly. This assumes that the probed insn was not trapped, see the next patch. arch_uprobe_skip_sstep() logic has the same problem, change it to check X86_EFLAGS_TF and send SIGTRAP as well. We will cleanup this all after we fold enable/disable_step into pre/post_hol hooks. Note: send_sig(SIGTRAP) is not actually right, we need send_sigtrap(). But this needs more changes, handle_swbp() does the same and this is equally wrong. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
* uprobes/x86: Do not (ab)use TIF_SINGLESTEP/user_*_single_step() for ↵Oleg Nesterov2012-10-293-15/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | single-stepping user_enable/disable_single_step() was designed for ptrace, it assumes a single user and does unnecessary and wrong things for uprobes. For example: - arch_uprobe_enable_step() can't trust TIF_SINGLESTEP, an application itself can set X86_EFLAGS_TF which must be preserved after arch_uprobe_disable_step(). - we do not want to set TIF_SINGLESTEP/TIF_FORCED_TF in arch_uprobe_enable_step(), this only makes sense for ptrace. - otoh we leak TIF_SINGLESTEP if arch_uprobe_disable_step() doesn't do user_disable_single_step(), the application will be killed after the next syscall. - arch_uprobe_enable_step() does access_process_vm() we do not need/want. Change arch_uprobe_enable/disable_step() to set/clear X86_EFLAGS_TF directly, this is much simpler and more correct. However, we need to clear TIF_BLOCKSTEP/DEBUGCTLMSR_BTF before executing the probed insn, add set_task_blockstep(false). Note: with or without this patch, there is another (hopefully minor) problem. A probed "pushf" insn can see the wrong X86_EFLAGS_TF set by uprobes. Perhaps we should change _disable to update the stack, or teach arch_uprobe_skip_sstep() to emulate this insn. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
* ptrace/x86: Partly fix set_task_blockstep()->update_debugctlmsr() logicOleg Nesterov2012-10-291-1/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Afaics the usage of update_debugctlmsr() and TIF_BLOCKSTEP in step.c was always very wrong. 1. update_debugctlmsr() was simply unneeded. The child sleeps TASK_TRACED, __switch_to_xtra(next_p => child) should notice TIF_BLOCKSTEP and set/clear DEBUGCTLMSR_BTF after resume if needed. 2. It is wrong. The state of DEBUGCTLMSR_BTF bit in CPU register should always match the state of current's TIF_BLOCKSTEP bit. 3. Even get_debugctlmsr() + update_debugctlmsr() itself does not look right. Irq can change other bits in MSR_IA32_DEBUGCTLMSR register or the caller can be preempted in between. 4. It is not safe to play with TIF_BLOCKSTEP if task != current. DEBUGCTLMSR_BTF and TIF_BLOCKSTEP should always match each other if the task is running. The tracee is stopped but it can be SIGKILL'ed right before set/clear_tsk_thread_flag(). However, now that uprobes uses user_enable_single_step(current) we can't simply remove update_debugctlmsr(). So this patch adds the additional "task == current" check and disables irqs to avoid the race with interrupts/preemption. Unfortunately this patch doesn't solve the last problem, we need another fix. Probably we should teach ptrace_stop() to set/clear single/block stepping after resume. And afaics there is yet another problem: perf can play with MSR_IA32_DEBUGCTLMSR from nmi, this obviously means that even __switch_to_xtra() has problems. Signed-off-by: Oleg Nesterov <oleg@redhat.com>
* ptrace/x86: Introduce set_task_blockstep() helperOleg Nesterov2012-10-291-20/+21
| | | | | | | | | | | No functional changes, preparation for the next fix and for uprobes single-step fixes. Move the code playing with TIF_BLOCKSTEP/DEBUGCTLMSR_BTF into the new helper, set_task_blockstep(). Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
* uprobes/x86: Implement x86 specific arch_uprobe_*_stepSebastian Andrzej Siewior2012-10-292-0/+35
| | | | | | | | | | | | | | | | | | The arch specific implementation behaves like user_enable_single_step() except that it does not disable single stepping if it was already enabled by ptrace. This allows the debugger to single step over an uprobe. The state of block stepping is not restored. It makes only sense together with TF and if that was enabled then the debugger is notified. Note: this is still not correct. For example, TIF_SINGLESTEP check is not right, the application itself can set X86_EFLAGS_TF. And otoh we leak TIF_SINGLESTEP (set by enable) if the probed insn is "popf". See the next patches, we need the changes in arch/x86/kernel/step.c first. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
* uprobes: Introduce arch_uprobe_enable/disable_step()Sebastian Andrzej Siewior2012-10-292-2/+14
| | | | | | | | | | | | | | | | | | As Oleg pointed out in [0] uprobe should not use the ptrace interface for enabling/disabling single stepping. [0] http://lkml.kernel.org/r/20120730141638.GA5306@redhat.com Add the new "__weak arch" helpers which simply call user_*_single_step() as a preparation. This is only needed to not break the powerpc port, we will fold this logic into arch_uprobe_pre/post_xol() hooks later. We should also change handle_singlestep(), _disable_step(&uprobe->arch) should be called before put_uprobe(). Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
* uprobes: Teach find_active_uprobe() to clear MMF_HAS_UPROBESOleg Nesterov2012-10-291-0/+22
| | | | | | | | | | | | | | | | | | | The wrong MMF_HAS_UPROBES doesn't really hurt, just it triggers the "slow" and unnecessary handle_swbp() path if the task hits the non-uprobe breakpoint. So this patch changes find_active_uprobe() to check every valid vma and clear MMF_HAS_UPROBES if no uprobes were found. This is adds the slow O(n) path, but it is only called in unlikely case when the task hits the normal breakpoint first time after uprobe_unregister(). Note the "not strictly accurate" comment in mmf_recalc_uprobes(). We can fix this, we only need to teach vma_has_uprobes() to return a bit more more info, but I am not sure this worth the trouble. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
* uprobes: Introduce MMF_RECALC_UPROBESOleg Nesterov2012-10-292-5/+37
| | | | | | | | | | | | | | Add the new MMF_RECALC_UPROBES flag, it means that MMF_HAS_UPROBES can be false positive after remove_breakpoint() or uprobe_munmap(). It is also set by uprobe_dup_mmap(), this is not optimal but simple. We could add the new hook, uprobe_dup_vma(), to set MMF_HAS_UPROBES only if the new mm actually has uprobes, but I don't think this makes sense. The next patch will use this flag to clear MMF_HAS_UPROBES. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
* uprobes: uprobes_treelock should not disable irqsOleg Nesterov2012-10-291-13/+8
| | | | | | | | Nobody plays with uprobes_tree/uprobes_treelock in interrupt context, no need to disable irqs. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
* uprobes: Don't put NULL pointer in uprobe_register()Sebastian Andrzej Siewior2012-10-291-1/+2
| | | | | | | | | alloc_uprobe() might return a NULL pointer, put_uprobe() can't deal with this. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
* uprobes: Remove "verify" argument from set_orig_insn()Oleg Nesterov2012-10-292-12/+10
| | | | | | | | | Nobody does set_orig_insn(verify => false), and I think nobody will. Remove this argument. IIUC set_orig_insn(verify => false) was needed to single-step without xol area. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
* uprobes: Fold uprobe_reset_state() into uprobe_dup_mmap()Oleg Nesterov2012-10-293-14/+2
| | | | | | | | | Now that we have uprobe_dup_mmap() we can fold uprobe_reset_state() into the new hook and remove it. mmput()->uprobe_clear_state() can't be called before dup_mmap(). Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
* uprobes: Introduce MMF_HAS_UPROBESOleg Nesterov2012-10-294-1/+29
| | | | | | | | | | | | | | | | | | | | Add the new MMF_HAS_UPROBES flag. It is set by install_breakpoint() and it is copied by dup_mmap(), uprobe_pre_sstep_notifier() checks it to avoid the slow path if the task was never probed. Perhaps it makes sense to check it in valid_vma(is_register => false) as well. This needs the new dup_mmap()->uprobe_dup_mmap() hook. We can't use uprobe_reset_state() or put MMF_HAS_UPROBES into MMF_INIT_MASK, we need oldmm->mmap_sem to avoid the race with uprobe_register() or mmap() from another thread. Currently we never clear this bit, it can be false-positive after uprobe_unregister() or uprobe_munmap() or if dup_mmap() hits the probed VM_DONTCOPY vma. But this is fine correctness-wise and has no effect unless the task hits the non-uprobe breakpoint. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
* uprobes: Do not use -EEXIST in install_breakpoint() pathsOleg Nesterov2012-10-291-11/+5
| | | | | | | | | -EEXIST from install_breakpoint() no longer makes sense, all callers should simply treat it as "success". Change the code to return zero and simplify register_for_each_vma(). Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
* uprobes: Change uprobe_mmap() to ignore the errors but check ↵Oleg Nesterov2012-10-291-21/+6
| | | | | | | | | | | | | | | | | | | | | | | | fatal_signal_pending() Once install_breakpoint() fails uprobe_mmap() "ignores" all other uprobes and returns the error. It was never really needed to to stop after the first error, and in fact it was always wrong at least in -ENOTSUPP case. Change uprobe_mmap() to ignore the errors and always return 0. This is not what we want in the long term, but until we teach the callers to handle the failure it would be better to remove the pointless complications. And this doesn't look too bad, the only "reasonable" error is ENOMEM but in this case the caller should be oom-killed in the likely case or the system has more serious problems. However it makes sense to stop if fatal_signal_pending() == T. In particular this helps if the task was oom-killed. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
* uprobes: Kill dup_mmap()->uprobe_mmap(), simplify uprobe_mmap/munmapOleg Nesterov2012-10-292-30/+3
| | | | | | | | | | | | | | | | | | | | | | 1. Kill dup_mmap()->uprobe_mmap(), it was only needed to calculate new_mm->uprobes_state.count removed by the previous patch. If the forking process has a pending uprobe (int3) in vma, it will be copied by copy_page_range(), note that it checks vma->anon_vma so "Don't copy ptes" is not possible after install_breakpoint() which does anon_vma_prepare(). 2. Remove is_swbp_at_addr() and "int count" in uprobe_mmap(). Again, this was needed for uprobes_state.count. As a side effect this fixes the bug pointed out by Srikar, this code lacked the necessary put_uprobe(). 3. uprobe_munmap() becomes a nop after the previous patch. Remove the meaningless code but do not remove the helper, we will need it. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
* uprobes: Kill uprobes_state->countOleg Nesterov2012-10-292-37/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | uprobes_state->count is only needed to avoid the slow path in uprobe_pre_sstep_notifier(). It is also checked in uprobe_munmap() but ironically its only goal to decrement this counter. However, it is very broken. Just some examples: - uprobe_mmap() can race with uprobe_unregister() and wrongly increment the counter if it hits the non-uprobe "int3". Note that install_breakpoint() checks ->consumers first and returns -EEXIST if it is NULL. "atomic_sub() if error" in uprobe_mmap() looks obviously wrong too. - uprobe_munmap() can race with uprobe_register() and wrongly decrement the counter by the same reason. - Suppose an appication tries to increase the mmapped area via sys_mremap(). vma_adjust() does uprobe_munmap(whole_vma) first, this can nullify the counter temporarily and race with another thread which can hit the bp, the application will be killed by SIGTRAP. - Suppose an application mmaps 2 consecutive areas in the same file and one (or both) of these areas has uprobes. In the likely case mmap_region()->vma_merge() suceeds. Like above, this leads to uprobe_munmap/uprobe_mmap from vma_merge()->vma_adjust() but then mmap_region() does another uprobe_mmap(resulting_vma) and doubles the counter. This patch only removes this counter and fixes the compile errors, then we will try to cleanup the changed code and add something else instead. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
* uprobes: Remove check for uprobe variable in handle_swbp()Sebastian Andrzej Siewior2012-10-291-9/+7
| | | | | | | | by the time we get here (after we pass cleanup_ret) uprobe is always is set. If it is NULL we leave very early in the code. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Oleg Nesterov <oleg@redhat.com>
* uprobes: Remove redundant lock_page/unlock_pageSrikar Dronamraju2012-10-291-2/+0
| | | | | | | | | Since read_opcode() reads from the referenced page and doesnt modify the page contents nor the page attributes, there is no need to lock the page. Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Signed-off-by: Oleg Nesterov <oleg@redhat.com>
* fedora kernel: 1c1fad82422e58b20bf06bb371c38154b4b965dbv3.6.3-4f18Anton Arapov2012-10-29808-9892/+19678
| | | | Signed-off-by: Anton Arapov <anton@redhat.com>
* fedora kernel: 5006e294cad99968b73c89679431ab73ec6bd0bav3.6.0-0.rc5.git3.1Anton Arapov2012-09-17263-1175/+2607
| | | | Signed-off-by: Anton Arapov <anton@redhat.com>
* fedora kernel: 235ab5f360023f79ce24102cee8a9678a5a5e987Anton Arapov2012-09-1739713-0/+15928063
| | | | Signed-off-by: Anton Arapov <anton@redhat.com>
* initAnton Arapov2012-04-161-0/+0
Signed-off-by: Anton Arapov <anton@redhat.com>