[BIOSAL] latency_probe results on POWER7 (on dowd at JLSE)

Boisvert, Sebastien boisvert at anl.gov
Tue Nov 4 13:03:18 CST 2014


Version: c541b41c0e

$ make CONFIG_MPI=n CC=cc -j

Hardware: IBM Power 740 Express (8205-E6D)
http://www-03.ibm.com/systems/power/hardware/740/specs.html

I think this is a POWER7+ with 6 cores and 8 threads per core (48 threads).

[boisvert@dowd biosal]$ ./performance/latency_probe/latency_probe > log-1
[boisvert@dowd biosal]$ grep COUNTER log-1
PERFORMANCE_COUNTER node-count = 1
PERFORMANCE_COUNTER worker-count-per-node = 1
PERFORMANCE_COUNTER actor-count-per-worker = 100
PERFORMANCE_COUNTER worker-count = 1
PERFORMANCE_COUNTER actor-count = 100
PERFORMANCE_COUNTER message-count-per-actor = 40000
PERFORMANCE_COUNTER message-count = 4000000
PERFORMANCE_COUNTER elapsed-time = 21.154284 s
PERFORMANCE_COUNTER computation-throughput = 189086.994342 messages / s
PERFORMANCE_COUNTER node-throughput = 189086.994342 messages / s
PERFORMANCE_COUNTER worker-throughput = 189086.994342 messages / s
PERFORMANCE_COUNTER worker-latency = 5288 ns
PERFORMANCE_COUNTER actor-throughput = 1890.869943 messages / s
PERFORMANCE_COUNTER actor-latency = 528857 ns
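
As a sanity check, the derived counters above can be recomputed from message-count and elapsed-time. This is only a sketch: it assumes that each throughput is the message count divided by the elapsed time (and by the number of workers or actors), and that each latency is the reciprocal of the matching throughput. That reading reproduces the logged numbers, but I have not checked it against the latency_probe source. The variable names are made up for this example.

```python
# Values copied from the PERFORMANCE_COUNTER lines in log-1.
message_count = 4_000_000
elapsed_s = 21.154284
worker_count = 1
actor_count = 100

# Assumption: throughput = messages / elapsed time, split evenly
# across workers (or actors).
computation_throughput = message_count / elapsed_s
worker_throughput = computation_throughput / worker_count
actor_throughput = computation_throughput / actor_count

# Assumption: latency is the reciprocal of throughput, in nanoseconds.
worker_latency_ns = 1e9 / worker_throughput
actor_latency_ns = 1e9 / actor_throughput

print("worker-throughput = %.2f messages / s" % worker_throughput)
print("worker-latency = %d ns" % int(worker_latency_ns))
print("actor-throughput = %.2f messages / s" % actor_throughput)
print("actor-latency = %d ns" % int(actor_latency_ns))
```

With one worker, about 5.3 microseconds per message is the cost of a full pass through the single worker thread, and the actor latency is simply 100 times larger because 100 actors share that worker.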


I am using gcc 4.4 (default on the system):

$ cc --version|head -n1
cc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-4)

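With 2 worker threads, the run never finishes (maybe a memory visibility problem?). I had to interrupt it, and by then the progress counters were far past the expected 40000 messages per actor: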

[boisvert@dowd biosal]$ ./performance/latency_probe/latency_probe -threads-per-node 3 > log-3
^C
[boisvert@dowd biosal]$ tail log-3
progress 1000097 276000/40000
progress 1000086 366000/40000
progress 1000078 280000/40000
progress 1000086 368000/40000
progress 1000156 280000/40000
progress 1000086 370000/40000
progress 1000039 280000/40000
progress 1000086 372000/40000
progress 1000184 282000/40000
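
To see which actors overshoot, the progress lines can be scanned with a small script. This is a rough sketch, assuming each line has the form "progress <actor name> <messages so far>/<target>"; that reading of the fields is my guess from the output, and the function name overshooting is made up for this example.

```python
import re

# Assumed line format: "progress <actor> <done>/<target>".
PROGRESS = re.compile(r"^progress (\d+) (\d+)/(\d+)$")

def overshooting(lines):
    """Return {actor: (done, target)} for the latest line of each actor
    whose counter is above its target."""
    latest = {}
    for line in lines:
        match = PROGRESS.match(line.strip())
        if match is None:
            continue  # skip anything that is not a progress line
        actor, done, target = (int(field) for field in match.groups())
        if done > target:
            latest[actor] = (done, target)
    return latest

# The lines shown by "tail log-3" above.
sample = [
    "progress 1000097 276000/40000",
    "progress 1000086 366000/40000",
    "progress 1000078 280000/40000",
    "progress 1000086 368000/40000",
    "progress 1000156 280000/40000",
    "progress 1000086 370000/40000",
    "progress 1000039 280000/40000",
    "progress 1000086 372000/40000",
    "progress 1000184 282000/40000",
]

result = overshooting(sample)
for actor, (done, target) in sorted(result.items()):
    print("actor %d: %d/%d" % (actor, done, target))
```

On the real file, the same function can be fed open("log-3") instead of the sample list.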

