I just received my copy of the recently released 6th edition of Computer Architecture: A Quantitative Approach. For your information, Chapter 4, "Data-Level Parallelism in Vector, SIMD, and GPU Architectures", runs over 70 pages. It is a nice writeup of the evolution since the 5th edition covered the then new and emerging GPGPU model 10 years ago. There is some updated information about Intel's AVX-512 and NVidia's Pascal architecture, plus a quick explanation by NVidia of how the native instruction sets of new GPUs have evolved underneath the PTX instruction set.
Interestingly, they reran (identical, recompiled) code from a paper that Intel researchers published in 2010, which compared the Core i7-960 CPU with the GTX 280. Hennessy and Patterson carried those comparisons forward to a contemporary CPU (Intel Xeon Platinum 8180 with 28 cores) and NVidia's P100 (Pascal architecture). Based on those examples, they concluded that the speed difference between CPU and GPU has narrowed rather than widened since then (CPU performance progressed faster than GPU performance).
The book:
https://www.elsevier.com/books/computer ... 2-811905-1
The 2010 Intel paper:
http://sbel.wisc.edu/Courses/ME964/Lite ... PU2010.pdf
It would be interesting to see if the "good ol' days" of CPU-driven big WUs might come back. (Probably not, as the comparison did not include economic efficiency. Both parts compared are in the approx. 10k USD range, and a 1080 Ti has a better performance-to-price ratio than a price-comparable Intel CPU.)
Nevertheless, it is always nice to look things up in the seminal CA-QA book.
Andy