[BIOSAL] Results with Xeon and Xeon Phi

Boisvert, Sebastien boisvert at anl.gov
Mon Nov 3 16:05:21 CST 2014


> From: Fangfang Xia [fangfang.xia at gmail.com]
> Sent: Monday, November 03, 2014 3:47 PM
> To: Boisvert, Sebastien
> Cc: biosal at lists.cels.anl.gov
> Subject: Re: [BIOSAL] Results with Xeon and Xeon Phi
> 
> 
> This is interesting. I’m curious what the call stacks for these spin locks are?
> 
> On Nov 3, 2014, at 3:35 PM, Boisvert, Sebastien <boisvert at anl.gov> wrote:
> 42.42%
>   [kernel]                          [k] _spin_lock    
> 

I traced the job with perf.

[boisvert at bigmem biosal]$ ./performance/latency_probe/latency_probe -threads-per-node 30 | tee log

[boisvert at bigmem biosal]$ perf record -g -e cpu-cycles -o spinlock.data -p 4744
^C[ perf record: Woken up 842 times to write data ]
[ perf record: Captured and wrote 215.969 MB spinlock.data (~9435828 samples) ]

[boisvert at bigmem biosal]$ perf report -i spinlock.data 

Samples: 2M of event 'cpu-cycles', Event count (approx.): 1263624159273
-  61.46%  latency_probe  [kernel.kallsyms]   [k] _spin_lock
   - _spin_lock
      - 57.54% futex_wake
           do_futex
           sys_futex
           system_call_fastpath
         - __lll_unlock_wake_private
              100.00% 0xb54
      - 42.20% futex_wait_setup
           futex_wait
           do_futex
           sys_futex
           system_call_fastpath
         - __lll_lock_wait_private
              100.00% 0xb54


From this alone it is not clear what the problem is, because the biosal code does not use Fast Userspace mutexes (futexes) directly.
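Futex frames in a kernel profile do not necessarily mean the application calls futex() itself: glibc implements pthread mutexes and its own internal library locks on top of the futex system call, so any contended lock inside a library shows up under futex_wait/futex_wake. Here is a minimal standalone sketch (not biosal code) that should reproduce the same pattern when profiled with the same perf command:

/* Standalone sketch (not biosal code): a contended pthread mutex.
 * glibc implements pthread_mutex_lock()/unlock() on top of the futex
 * system call, so profiling this under contention shows futex_wait /
 * futex_wake kernel frames even though the program never calls
 * futex() directly. Build with: gcc -O2 -pthread contention.c */
#include <pthread.h>
#include <stdlib.h>

#define THREAD_COUNT 30
#define ITERATIONS 10000000UL

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static unsigned long counter = 0;

static void *worker(void *arg)
{
    unsigned long i;

    (void)arg;

    for (i = 0; i < ITERATIONS; i++) {
        pthread_mutex_lock(&lock);
        counter++;
        pthread_mutex_unlock(&lock);
    }

    return NULL;
}

int main(void)
{
    pthread_t threads[THREAD_COUNT];
    int i;

    for (i = 0; i < THREAD_COUNT; i++)
        pthread_create(&threads[i], NULL, worker, NULL);

    for (i = 0; i < THREAD_COUNT; i++)
        pthread_join(&threads[i], NULL);

    return EXIT_SUCCESS;
}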

I used gdb:

(gdb) info threads
(gdb)  thread 5

(gdb) bt
#0  0x0000003f196f7fce in __lll_lock_wait_private () from /lib64/libc.so.6
#1  0x0000003f1963651d in _L_lock_10 () from /lib64/libc.so.6
#2  0x0000003f19636361 in random () from /lib64/libc.so.6
#3  0x0000003f196369e9 in rand () from /lib64/libc.so.6
#4  0x0000000000402087 in process_send_ping (self=0x7f3df6e88360) at performance/latency_probe/process.c:278
#5  0x000000000040228c in process_receive (self=0x7f3df6e88360, message=0x7f3de256ada0) at performance/latency_probe/process.c:252
#6  0x0000000000406aa4 in thorium_actor_receive_private (self=0x7f3df6e88360) at engine/thorium/actor.c:1195
#7  thorium_actor_receive (self=0x7f3df6e88360) at engine/thorium/actor.c:1091
#8  thorium_actor_work (self=0x7f3df6e88360) at engine/thorium/actor.c:2077
#9  0x0000000000408b65 in thorium_worker_work (worker=0xf82028) at engine/thorium/worker.c:1858
#10 thorium_worker_run (worker=0xf82028) at engine/thorium/worker.c:1646
#11 0x0000000000408dc3 in thorium_worker_main (worker1=0xf82028) at engine/thorium/worker.c:675
#12 0x0000003f1a2079d1 in start_thread () from /lib64/libpthread.so.0
#13 0x0000003f196e886d in clone () from /lib64/libc.so.6


The problem is that all threads share the same seed: glibc protects the shared seed and/or the current PRNG value with an internal lock, which is implemented on top of a futex.

I am fixing this right away.
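For reference, here is a minimal sketch of that kind of fix (the structure and helper names are hypothetical, not the actual biosal patch): give each actor its own seed and use the reentrant rand_r(), which only touches the caller-provided state and takes no lock.

/* Sketch of the fix (hypothetical names, not the actual biosal patch):
 * per-actor PRNG state plus rand_r(), so concurrent actors never
 * contend on glibc's shared random() state. */
#include <stdlib.h>
#include <time.h>

struct process {
    unsigned int random_seed;   /* per-actor state, nothing shared */
    /* ... other fields ... */
};

static void process_init_seed(struct process *self, int actor_name)
{
    /* Derive a distinct seed per actor so the streams do not collide. */
    self->random_seed = (unsigned int)time(NULL) ^ (unsigned int)actor_name;
}

static int process_random(struct process *self)
{
    /* rand_r() only reads and writes the state passed in by the
     * caller, so no glibc lock (and no futex) is involved. */
    return rand_r(&self->random_seed);
}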


