The POWER8 is an impressive design for what it is intended to do, but I am not sure that scientific math is what it is intended for.
"POWER8 is designed to be a massively multithreaded chip, with each of its cores capable of handling eight hardware threads simultaneously, for a total of 96 threads executed simultaneously on a 12-core chip."
This implies to me that it expects a great many 'slow' jobs, where 'slow' means they access main memory, which is so slow that the CPU can productively switch between 8 threads before main memory returns data.
It is not clear that there are 8 sets of SIMD registers per core (it reads as if there are only two), or that a SIMD workload accesses main memory all that often. Many WUs run on video cards with only 512 MB of RAM, and the POWER8 chip may have 128 MB of cache before it goes to main memory. Fast enough caches (and there seem to be 4 levels of cache) and small enough datasets would reduce the number of times the 8 threads per core would need to switch.
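To put rough numbers on the latency-hiding argument above, here is a toy model in Python. It is only a sketch: the function name, the cycle counts, and the hit rates are all made up for illustration and are not POWER8 measurements. It assumes each thread computes for a while, then makes one memory access that stalls only if it misses cache.

```python
import math

def threads_to_hide_latency(compute_cycles, miss_latency_cycles, cache_hit_rate):
    """Threads per core needed so that some thread is always ready to run.

    Toy model (an assumption, not how POWER8 actually schedules threads):
    each thread computes for `compute_cycles`, then issues one memory access
    that stalls for `miss_latency_cycles` only when it misses cache.
    """
    # Average stall per access, weighted by how often the cache misses.
    expected_stall = (1.0 - cache_hit_rate) * miss_latency_cycles
    # One thread runs while the others wait out their stalls.
    return math.ceil(1 + expected_stall / compute_cycles)

# Illustrative numbers only: 50 cycles of work per access, 350-cycle miss.
print(threads_to_hide_latency(50, 350, 0.0))  # every access misses -> 8
print(threads_to_hide_latency(50, 350, 0.5))  # half the accesses hit -> 5
print(threads_to_hide_latency(50, 350, 1.0))  # dataset fits in cache -> 1
```

With these invented numbers, 8 threads per core only pay off when nearly every access goes to main memory. Once the dataset mostly fits in cache, one or two threads keep the core busy and the rest of the SMT capacity sits idle.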
The POWER chips excel at connectivity to multiple peripherals, but again F@H does not need entire banks of hard drives.
One future advantage of POWER is that you can design GPUs that use main memory directly without needing a PCI-E bus. If GPU vendors adopt this protocol, you could do excellent GPU folding.
https://en.wikipedia.org/wiki/POWER8