%z Article
%K Tullsen96
%A Dean Tullsen
%A Susan Eggers
%A Joel Emer
%A Henry Levy
%A Jack Lo
%A Rebecca Stamm
%T Exploiting choice: Instruction fetch and issue on an implementable simultaneous multithreading processor
%C Proceedings of the 23rd Annual International Symposium on Computer Architecture
%D May 1996
%P 191-202
%O http://www.cs.washington.edu/research/smt/papers/ISCA96.ps

%z Article
%K Tullsen99
%A Dean Tullsen
%A Jack Lo
%A Susan Eggers
%A Henry Levy
%T Supporting fine-grain synchronization on a simultaneous multithreaded processor
%C Proceedings of the 5th International Symposium on High Performance Computer Architecture
%D January 1999
%P 54-58
%O http://www.cs.washington.edu/research/smt/papers/hpca.ps

%z Article
%K Kumar97
%A A. Kumar
%T The HP PA-8000 RISC CPU
%J IEEE Micro
%V 17
%N 2
%D March-April 1997
%P 27-32

%z Article
%K Schlansker00
%A M.S. Schlansker
%A B.R. Rau
%T EPIC: Explicitly parallel instruction computing
%J IEEE Computer
%V 33
%N 2
%D Feb. 2000
%P 37-45

%z Article
%K Smith95
%A James E. Smith
%A Gurindar S. Sohi
%T The microarchitecture of superscalar processors
%J Proceedings of the IEEE
%V 83
%D October 1995
%P 1609-1624

%z Thesis
%K Munoz97
%A Raul E. Silvera Munoz
%T Static instruction scheduling for dynamic issue processors
%I ACAPS Laboratory, School of Computer Science, McGill University
%D 1997

%z Article
%K Agarwal96
%A Ramesh K. Agarwal
%T A super scalar sort algorithm for RISC processors
%C Proceedings 1996 ACM SIGMOD International Conference on Management of Data
%D 1996
%P 240-246
%O http://citeseer.nj.nec.com/agarwal96super.html

%z Article
%K Staelin01a
%A Carl Staelin
%T Analyzing the memory hierarchy
%D October 2001
%I Hewlett-Packard Laboratories
%C Palo Alto, CA

%z Article
%K Staelin01b
%A Carl Staelin
%T lmbench3: Measuring scalability
%D October 2001
%I Hewlett-Packard Laboratories
%C Palo Alto, CA

%z Article
%K Frigo98
%A M. Frigo
%A S.G. Johnson
%T FFTW: An adaptive software architecture for the FFT
%C Proceedings 1998 ICASSP
%V 3
%P 1381-1384
%O http://www.fftw.org/fftw-paper-icassp.pdf

%z Article
%K Whaley98
%A R. Clint Whaley
%A Jack Dongarra
%T Automatically tuned linear algebra software
%C Proceedings of the 1998 ACM/IEEE SC98 Conference
%D 1998
%O http://sourceforge.net/projects/math-atlas

%z Article
%K Staelin98
%A Carl Staelin
%A Larry McVoy
%T mhz: Anatomy of a microbenchmark
%C Proceedings USENIX Annual Technical Conference
%c New Orleans, LA
%D June 1998
%P 155-166

%z Article
%K McVoy96
%A Larry McVoy
%A Carl Staelin
%T lmbench: Portable tools for performance analysis
%C Proceedings USENIX Winter Conference
%c San Diego, CA
%D January 1996
%P 279-284

%z Thesis
%K Prestor01
%A Uros Prestor
%T Evaluating the memory performance of a ccNUMA system
%R Master's Thesis
%I School of Computing, University of Utah
%C Salt Lake City, Utah
%D May 2001
%O http://www.cs.utah.edu/~uros/thesis/thesis.pdf

%z Article
%K Saavedra95
%A R.H. Saavedra
%A A.J. Smith
%T Measuring cache and TLB performance and their effect on benchmark runtimes
%J IEEE Transactions on Computers
%V 44
%N 10
%D October 1995
%P 1223-1235

%z Book
%K Knuth73
%A Donald E. Knuth
%T The Art of Computer Programming, 2nd Edition
%I Addison-Wesley
%D 1973

%z Book
%K Hennessy96
%A John L. Hennessy
%A David A. Patterson
%T Computer Architecture: A Quantitative Approach, 2nd Edition
%I Morgan Kaufmann
%D 1996

%z Article
%K McCalpin95
%A John D. McCalpin
%T Memory bandwidth and machine balance in current high performance computers
%J IEEE Technical Committee on Computer Architecture newsletter
%D December 1995