    perf_counter: Dynamically allocate tasks' perf_counter_context struct · a63eaf34
    Paul Mackerras authored
    This replaces the struct perf_counter_context in the task_struct with
    a pointer to a dynamically allocated perf_counter_context struct.  The
    main reason for doing this is to allow us to transfer a
    perf_counter_context from one task to another when we do lazy PMU
    switching in a later patch.
    
    This has a few side-benefits: the task_struct becomes a little smaller,
    we save some memory because only tasks that have perf_counters attached
    get a perf_counter_context allocated for them, and we can remove the
    inclusion of <linux/perf_counter.h> in sched.h, meaning that we don't
    end up recompiling nearly everything whenever perf_counter.h changes.
    
    The perf_counter_context structures are reference-counted and freed
    when the last reference is dropped.  A context can have references
    from its task and the counters on its task.  Counters can outlive the
    task so it is possible that a context will be freed well after its
    task has exited.
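    
    The lifetime rule above can be illustrated with a small userspace
    sketch.  It is not the kernel code: struct ctx_sketch and the
    ctx_sketch_* helpers are hypothetical names, and C11 atomics stand in
    for the kernel's atomic_t.  The task holds the initial reference, each
    counter takes another, and the context is freed only when the last
    holder drops its reference.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical stand-in for perf_counter_context; only the refcount is modelled. */
struct ctx_sketch {
    atomic_int refcount;
};

/* Counts frees so that the lifetime can be observed from outside. */
int ctx_sketch_frees;

struct ctx_sketch *ctx_sketch_alloc(void)
{
    struct ctx_sketch *ctx = malloc(sizeof(*ctx));

    if (ctx)
        atomic_init(&ctx->refcount, 1);   /* reference held by the task */
    return ctx;
}

/* Called when a counter is attached to the context. */
void ctx_sketch_get(struct ctx_sketch *ctx)
{
    atomic_fetch_add(&ctx->refcount, 1);
}

/* Called when the task exits or a counter is released. */
void ctx_sketch_put(struct ctx_sketch *ctx)
{
    /* fetch_sub returns the previous value; 1 means this was the last reference. */
    int prev = atomic_fetch_sub(&ctx->refcount, 1);

    assert(prev >= 1);
    if (prev == 1) {
        free(ctx);
        ctx_sketch_frees++;
    }
}
```

    In this sketch, a task that exits while a counter is still open drops
    its reference without freeing anything; the free happens only when the
    counter later drops its own reference.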
    
    Contexts are allocated on fork if the parent had a context, or
    otherwise the first time that a per-task counter is created on a task.
    In the latter case, we set the context pointer in the task struct
    locklessly using an atomic compare-and-exchange operation in case we
    raced with some other task in creating a context for the subject task.
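    
    A minimal userspace sketch of the lockless installation, assuming C11
    atomics in place of the kernel's cmpxchg().  The names struct lazy_ctx,
    struct lazy_task and lazy_task_find_ctx are hypothetical and the
    refcounting is left out.  A creator allocates a context
    speculatively, tries to swap it in for NULL, and discards its copy if
    another creator got there first.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the context and the task that points to it. */
struct lazy_ctx {
    int nr_counters;
};

struct lazy_task {
    _Atomic(struct lazy_ctx *) ctxp;
};

/* Return the task's context, creating and installing one if none exists yet. */
struct lazy_ctx *lazy_task_find_ctx(struct lazy_task *task)
{
    struct lazy_ctx *ctx = atomic_load(&task->ctxp);
    struct lazy_ctx *newctx;
    struct lazy_ctx *expected = NULL;

    if (ctx)
        return ctx;

    newctx = calloc(1, sizeof(*newctx));
    if (!newctx)
        return NULL;

    /* Install newctx only if the pointer is still NULL. */
    if (atomic_compare_exchange_strong(&task->ctxp, &expected, newctx))
        return newctx;

    /* Lost the race: 'expected' now holds the winner's context. */
    free(newctx);
    return expected;
}
```

    Because the loser frees its own allocation and adopts the winner's
    pointer, all callers end up with the same context without taking a
    lock.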
    
    This also removes the task pointer from the perf_counter struct.  The
    task pointer was not used anywhere and would make it harder to move a
    context from one task to another.  Anything that needed to know which
    task a counter was attached to was already using counter->ctx->task.
    
    The __perf_counter_init_context function moves up in perf_counter.c
    so that it can be called from find_get_context, and now initializes
    the refcount, but is otherwise unchanged.
    
    We were potentially calling list_del_counter twice: once from
    __perf_counter_exit_task when the task exits and once from
    __perf_counter_remove_from_context when the counter's fd gets closed.
    This adds a check in list_del_counter so it doesn't do anything if
    the counter has already been removed from the lists.
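    
    The guard can be sketched in userspace as follows.  This assumes a
    self-linked node means "not on any list", in the style of the
    kernel's list_empty()/list_del_init(); struct node_sketch, struct
    counter_list_sketch and the functions below are hypothetical names,
    not the patch itself.

```c
#include <assert.h>
#include <stdbool.h>

/* Minimal circular doubly linked list node. */
struct node_sketch {
    struct node_sketch *prev, *next;
};

/* Hypothetical stand-in for the context's counter list and its count. */
struct counter_list_sketch {
    struct node_sketch head;
    int nr_counters;
};

/* A node that points to itself is not on any list. */
void node_sketch_init(struct node_sketch *n)
{
    n->prev = n->next = n;
}

bool node_sketch_unlinked(const struct node_sketch *n)
{
    return n->next == n;
}

void sketch_list_add(struct node_sketch *n, struct counter_list_sketch *l)
{
    n->next = l->head.next;
    n->prev = &l->head;
    l->head.next->prev = n;
    l->head.next = n;
    l->nr_counters++;
}

/* Safe to call twice: the second call sees a self-linked node and returns. */
void sketch_list_del(struct node_sketch *n, struct counter_list_sketch *l)
{
    if (node_sketch_unlinked(n))
        return;

    l->nr_counters--;
    n->prev->next = n->next;
    n->next->prev = n->prev;
    node_sketch_init(n);   /* re-initialize so a later call is a no-op */
}
```

    Without the early return, the second removal would decrement
    nr_counters again and leave the count out of step with the list.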
    
    Since perf_counter_task_sched_in doesn't do anything if the task doesn't
    have a context, and leaves cpuctx->task_ctx = NULL, this adds code to
    __perf_install_in_context to set cpuctx->task_ctx if necessary, i.e. in
    the case where the current task adds the first counter to itself and
    thus creates a context for itself.
    
    This also adds similar code to __perf_counter_enable to handle a
    similar situation which can arise when the counters have been disabled
    using prctl; that also leaves cpuctx->task_ctx = NULL.
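    
    The check described in the two paragraphs above might look roughly
    like the following userspace sketch.  The sk_* types and the
    current_task parameter are hypothetical stand-ins for the kernel's
    structures and its 'current' pointer, and the sketch only models the
    decision about cpuctx->task_ctx, not the actual counter scheduling.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct sk_task {
    int pid;
};

/* task is NULL for a per-CPU context. */
struct sk_ctx {
    struct sk_task *task;
    int nr_installed;
};

struct sk_cpuctx {
    struct sk_ctx *task_ctx;   /* task context currently active on this CPU */
};

/*
 * Install into ctx on this CPU.  If ctx is a task context that is not yet
 * the CPU's active one, adopt it only when no other task context is active
 * and ctx belongs to the task running right now.
 */
bool sk_install(struct sk_cpuctx *cpuctx, struct sk_ctx *ctx,
                struct sk_task *current_task)
{
    if (ctx->task && cpuctx->task_ctx != ctx) {
        if (cpuctx->task_ctx || ctx->task != current_task)
            return false;
        cpuctx->task_ctx = ctx;
    }

    ctx->nr_installed++;
    return true;
}
```

    This covers the case where the current task creates its first
    counter on itself: sched-in left task_ctx as NULL because no context
    existed at that point, so the install path has to set it.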
    
    [ Impact: refactor counter context management to prepare for new feature ]
    Signed-off-by: Paul Mackerras <paulus@samba.org>
    Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
    Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
    Cc: Marcelo Tosatti <mtosatti@redhat.com>
    Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
    LKML-Reference: <18966.10075.781053.231153@cargo.ozlabs.ibm.com>
    Signed-off-by: Ingo Molnar <mingo@elte.hu>