This is the mail archive of the systemtap@sourceware.org mailing list for the systemtap project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: Poor scaling




On 08/08/2016 06:00 PM, David Smith wrote:
On 08/07/2016 07:47 AM, Avi Kivity wrote:
Running a process with 12 active threads, each bound to its own core,
and touching a systemtap array fairly frequently, I see:


   72.91%  scylla           [kernel.kallsyms]     [k] _raw_spin_lock
    3.49%  scylla           [kernel.kallsyms]     [k] start_callback.isra.62
    1.77%  scylla           scylla                [.]
_ZSt13__adjust_heapIN9__gnu_cxx17__normal_iteratorIPPN8logalloc11region_implESt6vectorIS4_SaIS4_EEEElS4_NS0_5__ops15_Iter_comp_iterIZNS2_7tracker4impl15compact_on_idleERKSt8functionIFbvEEEUlS4_S4_E_EEEvT_T0_SM_T1_T2_.isra.437.constprop.582

    1.54%  scylla           [kernel.kallsyms]     [k] _raw_spin_unlock
    1.39%  scylla           scylla                [.]
_ZN7reactor10smp_pollfn4pollEv
    1.29%  scylla           libc-2.19.so          [.] epoll_pwait
    1.21%  swapper          [kernel.kallsyms]     [k] poll_idle
    1.06%  scylla           [kernel.kallsyms]     [k] _raw_read_lock_irqsave
    1.01%  scylla           scylla                [.]
_ZSt11__push_heapIN9__gnu_cxx17__normal_iteratorIPPN8logalloc11region_implESt6vectorIS4_SaIS4_EEEElS4_NS0_5__ops14_Iter_comp_valIZNS2_7tracker4impl15compact_on_idleERKSt8functionIFbvEEEUlS4_S4_E_EEEvT_T0_SM_T1_T2_.isra.436.constprop.583

    0.80%  scylla           [kernel.kallsyms]     [k]
_raw_read_unlock_irqrestore
    0.78%  scylla           scylla                [.]
_ZN17smp_message_queue20flush_response_batchEv
    0.75%  scylla           [kernel.kallsyms]     [k]
utrace_report_syscall_entry
    0.63%  scylla           [kernel.kallsyms]     [k] task_utrace_struct
    0.50%  scylla           [kernel.kallsyms]     [k] native_read_tsc
    0.37%  scylla           [kernel.kallsyms]     [k]
utrace_report_syscall_exit
    0.32%  scylla           [kernel.kallsyms]     [k] delay_tsc
    0.28%  scylla           scylla                [.]
_ZN17smp_message_queue13process_queueILm2EZNS_16process_incomingEvEUlPNS_9work_itemEE_EEmRNS_8lf_queueET0_.isra.2371

    0.26%  scylla           scylla                [.]
_ZN17smp_message_queue13process_queueILm4EZNS_19process_completionsEvEUlPNS_9work_itemEE_EEmRNS_8lf_queueET0_.isra.2278

    0.20%  scylla           [kernel.kallsyms]     [k] _raw_spin_lock_irq
    0.18%  scylla           scylla                [.]
_ZN6memory9cpu_pages4freeEPv
    0.18%  scylla           liblz4.so.1.0.0       [.] LZ4_compress
    0.17%  scylla           scylla                [.]
_ZN7reactor9run_tasksER15circular_bufferISt10unique_ptrI4taskSt14default_deleteIS2_EESaIS5_EE



Now this is not something I can run to monitor a production load.


Is there any way I can improve scaling?
It depends, we'd have to see your script to make suggestions.



Here is the latest version of my script (I'm not sure whether tid() is a process-local or global identifier, so I used a pid(), tid() pair).

#!/usr/bin/stap

# usage: task_latency.stap process_name latency_threshold_ms

global start_time

probe process(@1).mark("reactor_run_tasks_single_start") {
    start_time[pid(), tid()] = gettimeofday_us()
}

probe process(@1).mark("reactor_run_tasks_single_end") {
    delete start_time[pid(), tid()]
}

probe timer.profile {
    if ([pid(), tid()] in start_time) {
        now = gettimeofday_us()
        start = start_time[pid(), tid()]
        if ((now - start) > $2 * 1000) {
            printf("detected tasks running for >%sms\n", @2)
            print_usyms(ubacktrace())
        }
    }
}


Are entire probes serialized, or just individual variables? If the latter, perhaps you can detect that an array's index is always tid(), remove that index, and convert the global variable to a thread-local one (with an exception for begin and end probes).


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]