Martin Schulz


Email: schulzm@llnl.gov
Phone: 925-423-6498


Martin is a Computer Scientist at the Center for Applied Scientific Computing (CASC) at Lawrence Livermore National Laboratory (LLNL). He earned his Doctorate in Computer Science in 2001 from the Technische Universität München (Munich, Germany). He also holds a Master of Science in Computer Science from the University of Illinois at Urbana-Champaign. After completing his graduate studies and a postdoctoral appointment in Munich, he worked for two years as a Research Associate at Cornell University before joining LLNL in 2004.

Martin's research interests include parallel and distributed architectures and applications; performance monitoring, modeling and analysis; memory system optimization; parallel programming paradigms; tool support for parallel programming; power efficiency for parallel systems; optimizing parallel and distributed I/O; and fault tolerance at the application and system level. In his position at LLNL, he focuses especially on scalability for parallel applications, code correctness tools, and parallel performance analyzers, as well as the scalable tool infrastructures that support these efforts.

Martin is a member of LLNL's Scalability Team, which focuses on research towards a scalable software stack for next generation systems, as well as LLNL's ASC CSSE ADEPT (Application Development Environment and Performance Team). He works closely with colleagues in CASC's Parallel Systems Group and in the Development Environment Group (DEG).

Current Activities / Roles

  • Chair of the MPI Forum (the international standardization body for the widely used Message Passing Interface/MPI)
  • Active member of the OpenMP Language Committee, Working group lead for tools
  • Member of the DOE Technical Council on Resilience
  • Member of the Coordination Committee for the US/Japan collaborations on post-peta/exascale computing
  • Lead PI for the DOE/ASCR funded project PIPER: “Performance Insights for Programmers and Exascale Runtimes”
  • LLNL PI for the DOE/ASCR project “Characterizing Faults, Errors, and Failures in Extreme-scale Systems”
  • LLNL Co-PI for DOE/ASCR funded project “ARGO”, focus: power scheduling and optimization
  • LLNL PI for the DOE/ASCR Exascale Co-Design Center CESAR for Nuclear Reactor Modeling
  • PI for internal ISCP project on HPC Tool Productization
  • Thrust lead for tools of the DOE/ASCR Exascale Co-Design Center ExMatEx for Material Simulation
  • Co-Lead of the Performance Analysis and Visualization for Exascale (PAVE) efforts at LLNL
  • Member of the Scalability team at LLNL funded by DOE/NNSA/ASC

Honors and Organizations

  • Member, R&D 100 Winning team for the Stack Trace Analysis Tool (STAT), 2011
  • ACM Service Award 2009
  • Member, Gordon Bell Prize Winning Team, 2006
  • Best Paper Software Track at IPDPS 2013
  • Nominated for best student and best paper at SC 2013
  • Best Student Paper at SC 2012 (also nominated for best paper)
  • Best Paper Software Track at IPDPS 2007
  • Best Paper at PADTAD – IV 2006
  • Member of the Institute of Electrical and Electronics Engineers (IEEE) and IEEE Computer Society
  • Member of the Association for Computing Machinery (ACM) and SIGHPC

Selected Academic Service

  • Subject Area Editor, Journal for Parallel Computing – Systems and Applications
  • Vice Chair for Performance Modeling and Evaluation, CCGrid 2013/2016
  • Co-Organizer of the Workshop on Visual Performance Analysis, November 2014-2016
  • Co-Organizer of Workshop on Extreme-Scale Programming Tools, November 2015/2016
  • Co-Organizer of 1st Workshop on Software Frameworks for Scalable Scientific Simulations, July 2015
  • Global Chair for “Green High Performance Computing” at Euro-Par 2014
  • Co-PC Chair: International Conference on Supercomputing (ICS) 2014
  • Co-Organizer of VAPLS Workshop, held with IEEE Vis, October 2013
  • Area Chair for Performance, SC 2011
  • General Chair, PACT 2009
  • Program Chair, Workshop on Productivity and Performance (PROPER) 2013
  • Steering Committee of PACT (2009-2011)
  • Steering Committee of HIPS (2000-now)
  • Steering Committee of VPA (2014-now)
  • Member of a large number of program committees, incl. PPoPP 2016/2015/2012; PACT 2016/2014/2012; ASPLOS 2012/2016; Supercomputing 2007/2010-2014 (papers) & 2008/09 (tutorials) & 2015/16 (doctoral showcase); IPDPS 2007-2009/2011-2014; GreenComputing 2011-2012; ICPP 2007/2009/2011-2014
  • HIPS: Workshop on High-Level Parallel Programming Models and Supportive Environments (Chair 2000, Co-Chair 2003 & 2008, Program Committee 2001-2002,2004-2006,2009-2012, Steering Committee since 2001)
  • SCI-Europe: International Conference on SCI-based Technology and Research (Steering Committee 2000-2001, Co-Chair 2001, Program Committee 2000-2001)
  • Finance and Registration Chair, PPoPP 2009
  • Workshops and Tutorial Chair, IEEE Cluster 2010

Publications

Journals

  1. Ignacio Laguna, David F. Richards, Todd Gamblin, Martin Schulz, Bronis R. de Supinski, Kathryn Mohror, and Howard Pritchard, "Evaluating and Extending User-Level Fault Tolerance in MPI", International Journal of High Performance Computing Applications (IJHPCA), to appear (LLNL-JRNL-663434).
  2. Aniruddha Marathe, Rachel Harris, David K. Lowenthal, Bronis R. de Supinski, Barry Rountree, Martin Schulz, “Exploiting Redundancy for Cost-Effective, Time-Constrained Execution of Scalable HPC Applications on Amazon EC2”, Transactions on Parallel and Distributed Systems, to appear.
  3. Peer-Timo Bremer, Bernd Mohr, Valerio Pascucci, Martin Schulz, “Connecting Performance Analysis and Visualization to Advance Extreme Scale”, Informatik Spektrum: 37. 2014, 2.
  4. Katherine E. Isaacs, Todd Gamblin, Abhinav Bhatele, Martin Schulz, Bernd Hamann, and Peer-Timo Bremer, “Ordering Traces Logically to Identify Lateness in Message Passing Programs”, IEEE Transactions on Parallel and Distributed Systems. Vol. 27, Issue 3, March 2015.
  5. Ignacio Laguna, Dong H. Ahn, Bronis R. de Supinski, Todd Gamblin, Gregory L. Lee, Martin Schulz, Saurabh Bagchi, Milind Kulkarni, Bowen Zhou, Zhezhe Chen, and Feng Qin. Debugging high-performance computing applications at massive scales. Communications of the ACM, Vol. 58 No. 9, Pages 72-81.
  6. Tanzima Zerin Islam, Kathryn Mohror, Martin Schulz, "Exploring the MPI Tool Information Interface: Features and Capabilities”, International Journal of High Performance Computing Applications, August 19, 2015.
  7. Katherine E. Isaacs, Todd Gamblin, Abhinav Bhatele, Martin Schulz, Ilir Jusufi, Bernd Hamann, Peer-Timo Bremer, “Combing the Communication Hairball: Visualizing Parallel Execution Traces using Logical Time”, IEEE Transactions on Visualization and Computer Graphics, Proceedings of IEEE Symposium on Information Visualization (InfoVis 2014), Paris, France, October 2014
  8. Hilbrich, Tobias, Joachim Protze, Martin Schulz, Bronis R. de Supinski and Matthias S. Mueller, "MPI Runtime Error Detection with MUST: Advances in Deadlock Detection”, Scientific Programming, Vol. 21, No. 3, pp. 109-121, October 2013, (Reprint of SC2012 paper) (LLNL-CONF-555978).
  9. Olivier, Stephen, Bronis R. de Supinski, Martin Schulz and Jan F. Prins, "Characterizing and Mitigating Work Time Inflation in Task Parallel Programs”, Scientific Programming, Vol. 21, No. 3, pp. 123-136, October 2013, (Reprint of SC2012 paper) (LLNL-CONF-555492).
  10. Joshua D. Goehner, Dorian C. Arnold, Dong H. Ahn, Gregory L. Lee, Bronis R. de Supinski, Matthew P. LeGendre, Barton P. Miller and Martin Schulz, "LIBI: A Framework for Bootstrapping Extreme Scale Software Systems”, Parallel Computing, Vol. 39, No. 3, March 2013, (LLNL-JRNL-575496).
  11. Barry Rountree, Guy Cobb, Todd Gamblin, Martin Schulz, Bronis R. de Supinski, and Henry Tufo, "Parallelizing Heavyweight Debugging Tools with MPIecho”, Parallel Computing, Vol. 39, No. 3, March 2013.
  12. Dong Li, Bronis R. de Supinski, Martin Schulz, Dimitrios S. Nikolopoulos and Kirk W. Cameron, "Strategies for Energy Efficient Resource Management of Hybrid Programming Models”, Transactions on Parallel and Distributed Systems, Vol. 24, No. 1, pp. 144-157, January 2013, (LLNL-JRNL-521391).
  13. G. Gopalakrishnan, R. M. Kirby, S. Siegel, R. Thakur, W. Gropp, E. Lusk, B. R. De Supinski, M. Schulz, G. Bronevetsky, “Formal Analysis of MPI-based Parallel Programs”, Communications of the ACM, Vol. 54 No. 12, Pages 82-91.
  14. A. Landge, J. Levine, K. Isaacs, A. Bhatele, T. Gamblin, M. Schulz, S. H. Langer, P.-T. Bremer, V. Pascucci, “Visualizing Network Traffic to Understand the Performance of Massively Parallel Simulations”, IEEE Transactions on Visualization and Computer Graphics, Volume 18, Issue 12, In Proceedings of IEEE Symposium on Information Visualization (InfoVis 2012), Seattle, WA, October 2012.
  15. R. Preissl, M. Schulz, D. Kranzlmueller, B.R. de Supinski and D. Quinlan, “Transforming MPI Source Code based on Communication Patterns”, Future Generation Computer Systems, Vol. 16, Number 1 / 2010, pp. 147-154, (LLNL-JRNL-408081).
  16. M. Noeth, P. Ratn, F. Mueller, M. Schulz and B.R. de Supinski, "ScalaTrace: Scalable Compression and Replay of Communication Traces in Massively Parallel Environments”, Journal of Parallel and Distributed Computing (JPDC), Vol. 69, No. 8, Aug 2009, pages 696-710, (LLNL-JRNL-403992).
  17. M. Schulz, J. Galarowicz, D. Maghrak, W. Hachfeld, D. Montoya, S. Cranford, “Open|SpeedShop: An Open Source Infrastructure for Parallel Performance Analysis”, Scientific Programming, Vol. 16, Number 2-3 / 2008, pp. 105-121, IOS Press, (LLNL-JRNL-234840).
  18. B.R. de Supinski, M. Schulz, V.V. Bulatov, W. Cabot, B. Chan, A.W. Cook, E.W. Draeger, J.N. Glosli, J.A. Greenough, K. Henderson, A. Kubota, S. Louis, B.J. Miller, M.V. Patel, T.E. Spelce, F.H. Streitz, P.L. Williams, R.K. Yates, A. Yoo, G. Almasi, G. Bhanot, A. Gara, J.A. Gunnels, M. Gupta, J. Moreira, J. Sexton, B. Walkup, C. Archer, F. Gygi, T.C. Germann, K. Kadau, P.S. Lomdahl, C. Rendleman, M.L. Welcome, W. McLendon, B. Hendrickson, F. Franchetti, S. Kral, J. Lorenz, C.W. Überhuber, E. Chow and U. Catalyurek, “BlueGene/L Applications: Parallelism on a Massive Scale”, International Journal of High Performance Computing Applications, January 2008, 22:33-51, (UCRL-JRNL-224370).
  19. G. Lee, M. Schulz, D. Ahn, A. Bernat, B.R. de Supinski, S. Ko, and B. Rountree, "Dynamic Binary Instrumentation and Data Aggregation on Large Scale Systems”, International Journal of Parallel Programming, (UCRL-JRNL-226801).
  20. E. Ipek, S.A. McKee, K. Singh, R. Caruana, B.R. de Supinski and M. Schulz, "Efficient Architectural Design Space Exploration via Predictive Modeling”, ACM Transactions on Architecture and Code Optimization, (UCRL-JRNL-227222).
  21. K. Singh, E. Ipek, S.A. McKee, B.R. de Supinski, M. Schulz and R. Caruana, "Predicting Parallel Application Performance via Machine Learning Approaches”, Concurrency and Computation: Practice & Experience, (UCRL-JRNL-222444).
  22. T. Brandes, H. Schwamborn, M. Gerndt, J. Jeitner, E. Kereku, W. Karl, M. Schulz, J. Tao, H. Brunst, W. Nagel, R. Neumann, R. Mueller-Pfefferkorn, B. Trenkler, and H.-C. Hoppe, “Monitoring Cache Behavior on Parallel SMP Architectures and Related Programming Tools”,  Future Generation Computer Systems (FGCS), Vol.21, Nr. 8, October 2005, pp. 1298-1311.
  23. J. Tao, M. Schulz, and W. Karl, “Simulation as a Tool for Optimizing Memory Accesses on NUMA Machines”, Performance Evaluation. Vol.60, No.1-4, May 2005, pp.31-50.
  24. J. Tao, M. Schulz, and W. Karl, “Simulation as a Tool for Optimizing Memory Accesses on NUMA Machines”,  Performance Evaluation, Future Generation Computer Systems. Vol.19, No.5, 2003, pp.761-776.
  25. J. Tao, M. Schulz, and W. Karl, “ARS: An Adaptive Runtime System for Locality Optimizations”, Future Generation Computer Systems (FGCS), Vol 19, No. 5, 2003, pp. 761-776.
  26. M. Schulz, J. Tao, C. Trinitis, and W. Karl, “SMiLE: An Integrated, Multi-Paradigm Software Infrastructure for SCI-based Clusters”, Future Generation Computer Systems (FGCS), Vol. 19, No. 4, pp.521-532, (Special issue with best papers of CCGrid02), 2003.
  27. G. Torralba, V. González, E. Sanchis, J. Tao, M. Schulz, and W. Karl, “Data monitoring in high-performance clusters for computing applications”, IEEE Transactions on Nuclear Science, Vol. 49, No. 2, April 2002.
  28. J. Tao, W. Karl, and M. Schulz, “Memory Access Behavior Analysis of NUMA-based Shared Memory Programs”, Scientific Computing, Special issue on Performance-Oriented Application Development for Distributed Architectures.

Conferences

  1. Simone Atzeni, Ganesh Gopalakrishnan, Zvonimir Rakamaric, Dong Ahn, Gregory Lee, Ignacio Laguna, Martin Schulz, Joachim Protze, Matthias Mueller, “ARCHER: Effectively Spotting Data Races in Large OpenMP Applications”, International Parallel and Distributed Processing Symposium (IPDPS) 2016, Chicago, IL, May 2016, to appear.
  2. Lee Savoie, David Lowenthal, Bronis R. de Supinski, Tanzima Islam, Kathryn Mohror, Barry Rountree, Martin Schulz, “I/O Aware Power Shifting”, International Parallel and Distributed Processing Symposium (IPDPS) 2016, Chicago, IL, May 2016, to appear.
  3. Olga Pearce, Todd Gamblin, Bronis R. de Supinski, Martin Schulz, Nancy Amato, “MPMD Framework for Offloading Load Balance Computation”, International Parallel and Distributed Processing Symposium (IPDPS) 2016, Chicago, IL, May 2016, to appear.
  4. Matthias Weber, Ronny Brendel, Tobias Hilbrich, Kathryn Mohror, Martin Schulz, Holger Brunst, “Structural Clustering: A New Approach to Support Performance Analysis At Scale”, International Parallel and Distributed Processing Symposium (IPDPS) 2016, Chicago, IL, May 2016, to appear.
  5. Ignacio Laguna, Martin Schulz, David F. Richards, Jon Calhoun and Luke Olson. “IPAS: Intelligent Protection Against Silent Output Corruption in Scientific Applications”, International Symposium on Code Generation and Optimization 2016, Barcelona, Spain, March 2016, to appear.
  6. Katherine E. Isaacs, Abhinav Bhatele, Jonathan Lifflander, David Böhme, Todd Gamblin, Martin Schulz, Bernd Hamann, and Peer-Timo Bremer. Analyzing the structure of execution traces from task-based parallel runtimes. SC2015, Austin, TX, November  2015. 
  7. Peter Bailey, Aniruddha Marathe, David Lowenthal, Barry Rountree, Martin Schulz, “Finding the Limits of Power-Constrained Application Performance”, SC2015, Austin, TX, November  2015.
  8. Yuichi Inadomi, Tapasya Patki, Koji Inoue, Mutsumi Aoyagi, Barry Rountree, Martin Schulz, David Lowenthal, Yasutaka Wada, Keiichiro Fukazawa, Masatsugu Ueda, Masaaki Kondo, Ikuo Miyoshi, “Analyzing and Mitigating the Impact of Manufacturing Variability in Power-Constrained Supercomputing”, SC2015, Austin, TX, November  2015.
  9. Kento Sato, Dong Ahn, Ignacio Laguna, Gregory Lee, Martin Schulz, “Clock Delta Compression for Scalable Order-Replay of Non-Deterministic Parallel Applications”, SC2015, Austin, TX, November  2015.
  10. Daniel A. Ellsworth, Allen D. Malony, Barry Rountree, Martin Schulz, “Dynamic Power Sharing for Higher Job Throughput”, SC2015, Austin, TX, November  2015.
  11. Hormozd Gahvari, Martin Schulz, Ulrike Yang, “An Approach to Selecting Thread + Process Mixes for Hybrid MPI + OpenMP Applications”, IEEE Cluster, Chicago, IL, September 2015.
  12. Tobias Hilbrich, Martin Schulz, Holger Brunst, Joachim Protze, Bronis R. de Supinski, Matthias S. Mueller, “Event-Action Mappings for Parallel Tools Infrastructures”, EuroPar 2015, Vienna, Austria, August 2015.
  13. Aniruddha Marathe, Peter E. Bailey, David K. Lowenthal, Barry Rountree, Martin Schulz, Bronis R. de Supinski, “A Run-Time System for Power-Constrained HPC Applications”, International Supercomputing Conference (ISC), Frankfurt, Germany, July 2015.
  14. Tapasya Patki, Anjana Sasidharan, Matthias Maiterth, David Lowenthal, Barry Rountree, Martin Schulz, Bronis de Supinski, “Practical Resource Management in Power-Constrained High Performance Computing”, 24th International ACM Symposium on High-Performance Parallel and Distributed Computing, Portland, Oregon, June 2015.
  15. Daniel A. Ellsworth, Allen D. Malony, Barry Rountree, Martin Schulz, “POW: System-wide Dynamic Reallocation of Limited Power in HPC” (short paper), 24th International ACM Symposium on High-Performance Parallel and Distributed Computing, Portland, Oregon, June 2015.
  16. Swann Perarnau, Rajeev Thakur, Kamil Iskra, Ken Raffenetti, Franck Cappello, Rinku Gupta, Pete Beckman, Marc Snir, Henry Hoffmann, Martin Schulz and Barry Rountree, “Distributed Monitoring and Management of Exascale Systems in the Argo Project”, 10th International Federated Conference on Distributed Computing Techniques (IFIP DAIS), Inria Grenoble – Rhône-Alpes, France, June 2015
  17. Abhinav Bhatele, Andrew R. Titus, Jayaraman J. Thiagarajan, Nikhil Jain, Todd Gamblin, Peer-Timo Bremer, Martin Schulz, and Laxmikant V. Kale. Identifying the Culprits behind Network Congestion. In Proceedings of the International Parallel and Distributed Processing Symposium (IPDPS'15), Hyderabad, India, May 25-29 2015.
  18. Nicklas Jensen, Niklas Nielsen, Gregory L Lee, Sven Karlsson, Matthew LeGendre, Martin Schulz and Dong Ahn, “A Scalable Prescriptive Parallel Debugging Model”, In Proceedings of the International Parallel and Distributed Processing Symposium (IPDPS'15), Hyderabad, India, May 25-29 2015.
  19. Peter E Bailey, David K Lowenthal, Vignesh Ravi, Barry Rountree, Martin Schulz and Bronis R de Supinski, “Adaptive Configuration Selection for Power-Constrained Heterogeneous Systems”, Proceedings of the 43rd International Conference on Parallel Processing (ICPP), September 2014.
  20. T. Islam, K. Mohror and M. Schulz, “Exploring the Capabilities of the New MPI_T Interface”, EuroMPI/Asia 2014, Kyoto, Japan, September 2014.
  21. Ignacio Laguna, David F. Richards, Todd Gamblin, Martin Schulz, and Bronis R. de Supinski, “Evaluating User-Level Fault Tolerance for MPI Applications”, In EuroMPI/Asia 2014, Kyoto, Japan, September 9-12 2014.
  22. Ananta Tiwari, Anthony Gamst, Michael Laurenzano, Martin Schulz and Laura Carrington, “Modeling the Impact of Reduced Memory Bandwidth on HPC Applications”, Proceedings of EuroPar 2014, Porto, Portugal, August 2014.
  23. K. E. Isaacs, A. Bhatele, P.-T. Bremer, T. Gamblin, A. Gimenez, B. Hamann, I. Jusufi, and M. Schulz. State of the Art of Performance Visualization. In R. Borgo, R. Maciejewski, and I. Viola, editors, Eurographics Conference on Visualization (EuroVis’14), Swansea, Wales, UK.
  24. Aniruddha Marathe, Rachel Harris, David K. Lowenthal, Bronis R. de Supinski, Barry Rountree, Martin Schulz, "Exploiting Redundancy for Cost-Effective, Time-Constrained Execution of HPC Applications on Amazon EC2”, In the proceedings of High-Performance Parallel and Distributed Computing (HPDC) 2014.
  25. S. Mitra, I. Laguna, D. H. Ahn, S. Bagchi, M. Schulz, and T. Gamblin, “Accurate application progress analysis for large-scale parallel debugging”, In Programming Language Design and Implementation (PLDI’14), Edinburgh, UK, June 9-11 2014.
  26. T. Hilbrich, J. Protze, M. Wagner, M.S. Mueller, M. Schulz, B.R. de Supinski, and W.E. Nagel, "Memory Usage Optimizations for Online Event Analysis", In the proceedings of the Exascale Applications and Software Conference (EASC 2014), April 2-3 2014, Stockholm, Sweden.
  27. A. Breslow, A. Tiwari, M. Schulz, L. Carrington, L. Tang, J. Mars, “Enabling Fair Pricing on HPC Systems with Node Sharing”, SC2013, Denver, CO, November  2013, (nominated for best paper and best student paper).
  28. M. Schulz, J. Belak, A. Bhatele, P.-T. Bremer, G. Bronevetsky, M. Casas, T. Gamblin, K. Isaacs, I. Laguna, J. Levine, V. Pascucci, D. Richards, B. Rountree, “Performance Analysis Techniques for the Exascale Co-Design Process”, Proceedings of PARCO 2013,  Munich, Germany, September 2013.
  29. Hilbrich, Tobias, Fabian Haensell, Martin Schulz, Bronis R. de Supinski, Joachim Protze, Matthias Mueller and Wolfgang E. Nagel, “Runtime MPI Collective Checking with Tree-Based Overlay Networks”, EuroMPI 2013, Madrid, Spain, September 15-18, 2013.
  30. Weber, Matthias, Kathryn Mohror, Martin Schulz, Bronis R. de Supinski, Holger Brunst and Wolfgang E. Nagel, “Alignment-Based Metrics for Trace Comparison”, Euro-Par 2013, Aachen, Germany, August 26–30, 2013, (LLNL-CONF-586852).
  31. Marathe, Aniruddha, Rachel Harris, David K. Lowenthal, Bronis R. de Supinski, Barry Rountree, Martin Schulz and Xin Yuan, "A Comparative Study of High-Performance Computing on the Cloud”, Twenty Second IEEE International Symposium High Performance Distributed Computing (HPDC 2013), New York, New York, June 17–21, 2013, (LLNL-CONF-634532).
  32. Patki, Tapasya, David K. Lowenthal, Barry Rountree, Martin Schulz and Bronis R. de Supinski, "Exploring Hardware Overprovisioning in Power-Constrained, High Performance Computing”, Twenty Seventh International Conference on Supercomputing (ICS 2013), Eugene, Oregon, June 10–14, 2013, (LLNL-CONF-634672).
  33. Ahn, Dong, Michael J. Brim, Bronis R. de Supinski, Todd Gamblin, Gregory L. Lee, Matthew LeGendre, Barton P. Miller and Martin Schulz, "Efficient and Scalable Retrieval Techniques for Global File Properties”, Twenty Seventh International Parallel and Distributed Processing Symposium (IPDPS 2013), Boston, Massachusetts, May 20–24, 2013, (LLNL-PROC-554055).
  34. Ian Karlin, Abhinav Bhatele, Jeff, Bradford L. Chamberlain, Jonathan Cohen, Zachary DeVito, Riyaz Haque, Dan Laney, Edward Luke, Felix Wang, David Richards, Martin Schulz, Charles Still, “Exploring Traditional and Emerging Parallel Programming Models using a Proxy Application”, IPDPS 2013, Cambridge, MA, May 2013 (best paper, software track).
  35. Abhinav Bhatele, Todd Gamblin, Katherine E. Isaacs, Brian Gunney, Martin Schulz, Peer-Timo Bremer, Bernd Hamann, “Novel Views of Performance Data to Analyze Large-Scale Adaptive Applications”, SC2012, Salt Lake City, Utah, November  2012.
  36. Tobias Hilbrich, Joachim Protze, Martin Schulz, Bronis R. de Supinski, Matthias Mueller, “MPI Runtime Error Detection with MUST - Advances in Deadlock Detection”, SC2012, Salt Lake City, Utah, November  2012, (nominated for best paper).
  37. Stephen L. Olivier, Bronis R. de Supinski, Martin Schulz, Jan F. Prins, “Characterizing and Mitigating Work Time Inflation in Task Parallel Programs”, SC2012, Salt Lake City, Utah, November  2012, (best student paper).
  38. Abhinav Bhatele, Todd Gamblin, Steven H. Langer, Peer-Timo Bremer, Erik W. Draeger, Bernd Hamann, Katherine E. Isaacs, Aaditya G. Landge, Joshua A. Levine, Valerio Pascucci, Martin Schulz, Charles H. Still, “Mapping Applications with Collectives over Sub-Communicators on Torus Networks”, SC2012, Salt Lake City, Utah, November  2012.
  39. Martin Schindewolf, Martin Schulz, John Gyllenhaal, Barna Bihari, Amy Whang, Wolfgang Karl, “What Scientific Applications Can Benefit from Hardware Transactional Memory?”, SC2012, Salt Lake City, Utah, November  2012.
  40. C.-H. Ho, M. de Kruijf, K. Sankaralingam, B. Rountree, M. Schulz, B. R. de Supinski, “Mechanisms and Evaluation of Cross-Layer Fault-Tolerance for Supercomputing”, Proceedings of the 41st International Conference on Parallel Processing (ICPP), September 2012.
  41. H. Gahvari, W. Gropp, K. E. Jordan, M. Schulz, U. M. Yang, “Modeling the Performance of an Algebraic Multigrid Cycle Using Hybrid MPI/OpenMP”, Proceedings of the 41st International Conference on Parallel Processing (ICPP), September 2012.
  42. Spyros Lyberis, Polyvios Pratikakis, Dimitrios Nikolopoulos, Martin Schulz, Todd Gamblin, Bronis de Supinski, “The Myrmics Memory Allocator: Hierarchical, Message-Passing Allocation for Global Address Spaces”, Proceedings of the International Symposium on Memory Management, Beijing, China, June 2012.
  43. M. Casas Guix, B. R. de Supinski, G. Bronevetsky and M. Schulz, “Fault Resilience of the Algebraic Multi-Grid Solver”, Proceedings of the 26th International Conference on Supercomputing (ICS), June 2012.
  44. O. Pearce, T. Gamblin, B.R. de Supinski, M. Schulz and N. Amato, “Quantifying the Effectiveness of Load Balance Algorithms”, Proceedings of the 26th International Conference on Supercomputing (ICS), June 2012.
  45. Tobias Hilbrich, Matthias Mueller, Bronis de Supinski, Martin Schulz, Wolfgang Nagel, “GTI: A Generic Tools Infrastructure for Event Based Tools in Parallel Systems”, Proceedings of IPDPS, May 2012.
  46. David Boehme, Bronis de Supinski, Markus Geimer, Martin Schulz, Felix Wolf, “Scalable Critical-Path Based Performance Analysis”, Proceedings of IPDPS, May 2012.
  47. Ignacio Laguna, Todd Gamblin, Bronis R. de Supinski, Saurabh Bagchi, Greg Bronevetsky, Dong H. Ahn, Martin Schulz, Barry Rountree, "Large Scale Debugging of Parallel Tasks with AutomaDeD", SC2011, Seattle, Washington, November 12–18, 2011.
  48. A. Vo, G. Gopalakrishnan, R. M. Kirby, B. de Supinski, M. Schulz, and G. Bronevetsky, "Large Scale Verification of MPI Programs Using Lamport Clocks with Lazy Update", International Conference on Parallel Architectures and Compilation Techniques (PACT 2011).
  49. Martin Schulz, Joshua A. Levine, Peer-Timo Bremer, Todd Gamblin, and Valerio Pascucci. Interpreting performance data across intuitive domains. In International Conference on Parallel Processing (ICPP'11), Taipei, Taiwan, September 13-16 2011, (LLNL-CONF-476091).
  50. Barry Rountree, David K. Lowenthal, Martin Schulz, and Bronis R. de Supinski. “Practical performance prediction under dynamic voltage frequency scaling”. In Second International Green Computing Conference (IGCC), July 2011.
  51. Hormozd Gahvari, Allison Baker, Martin Schulz, Ulrike Yang, Kirk Jordan and William Gropp, Modeling the Performance of an Algebraic Multigrid Cycle on HPC Platforms, International Conference on Supercomputing, (ICS 2011), Tucson, AZ, June 2011.
  52. Allison H. Baker, Todd Gamblin, Martin Schulz, and Ulrike Meier Yang, “Challenges of Scaling Algebraic Multigrid across Modern Multicore Architectures”, Proceedings of IPDPS, May 2011.
  53. Susmit Biswas, Bronis R. de Supinski, Martin Schulz, Diana Franklin, Timothy Sherwood, Frederic T. Chong, “Exploiting Data Similarity to Reduce Memory Footprints”, Proceedings of IPDPS, May 2011.
  54. Zoltan Szebenyi, Todd Gamblin, Martin Schulz, Bronis R. de Supinski, Felix Wolf, Brian J.N. Wylie, “Reconciling sampling and direct instrumentation for unintrusive call-path profiling of MPI programs”, Proceedings of IPDPS, May 2011.
  55. Anh Vo, Sriram Aananthakrishnan, Ganesh Gopalakrishnan, Bronis R. de Supinski, Martin Schulz and Greg Bronevetsky, "A Scalable and Distributed Dynamic Formal Verifier for MPI Programs”, SC2010, New Orleans, Louisiana, November 13–19, 2010.
  56. Robert Preissl, Bronis R. de Supinski, Martin Schulz, Daniel J. Quinlan, Dieter Kranzlmueller and Thomas Panas, “Exploitation of Dynamic Communication Patterns through Static Analysis”, 2010 International Conference on Parallel Processing (ICPP-10), San Diego, CA, September 13-16, 2010, (LLNL-CONF-438991).
  57. Karan Singh, Matthew Curtis-Maury, Sally A. McKee, Filip Blagojevic, Dimitris S. Nikolopoulos, Bronis R. de Supinski and Martin Schulz, “Comparing Scalability Prediction Strategies on an SMP of CMPs”, Euro-Par 2010, Naples, Italy, August 31–September 3, 2010, (LLNL-CONF-423717).
  58. Greg Bronevetsky, Ignacio Laguna, Saurabh Bagchi, Bronis R. de Supinski, Dong H. Ahn and Martin Schulz, "AutomaDeD: Automata-Based Debugging for Dissimilar Parallel Tasks”, 40th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN 2010), Chicago, IL, June 28 – July 1, 2010, (LLNL-CONF-426270).
  59. Allison Baker, Martin Schulz, and Ulrike Yang, “On the Performance of an Algebraic Multigrid Solver on Multicore Clusters”, VecPar 2010, June 2010.
  60. Frank Mueller, Xing Wu, Martin Schulz, Todd Gamblin and Bronis R. de Supinski, "ScalaTrace: Tracing, Analysis and Modeling of HPC Codes at Scale”, Para 2010: State of the Art in Scientific and Parallel Computing, Reykjavík, Iceland, June 6-9, 2010, (LLNL-CONF-427005).
  61. Todd Gamblin, Bronis R. de Supinski, Martin Schulz, Robert J. Fowler and Daniel A. Reed, "Clustering Performance Data Efficiently at Massive Scales”, Twenty Fourth International Conference on Supercomputing (ICS 2010), Tsukuba, Japan, June 1–4, 2010, (LLNL-CONF-422684).
  62. Bradley Barnes, Jeonifer Garren, David K. Lowenthal, Jaxk Reeves, Bronis R. de Supinski, Martin Schulz and Barry Rountree, "Using Focused Regression for Accurate Time-Constrained Scaling of Scientific Applications”, Twenty Fourth International Parallel and Distributed Processing Symposium (IPDPS 2010), Atlanta, GA, April 19–23, 2010, (LLNL-CONF-422989).
  63. Dong Li, Bronis R. de Supinski, Martin Schulz, Kirk Cameron and Dimitrios S. Nikolopoulos, "Power-aware MPI Task Aggregation Prediction for High-End Computing Systems”, Twenty Fourth International Parallel and Distributed Processing Symposium (IPDPS 2010), Atlanta, GA, April 19–23, 2010, (LLNL-CONF-422991).
  64. Dong Li, Dimitrios S. Nikolopoulos, Kirk Cameron, Bronis R. de Supinski, and Martin Schulz, "Hybrid MPI/OpenMP Power-Aware Computing”, Twenty Fourth International Parallel and Distributed Processing Symposium (IPDPS 2010), Atlanta, GA, April 19–23, 2010, (LLNL-CONF-422990).
  65. D.H. Ahn, B.R. de Supinski, I. Laguna, G.L. Lee, B. Liblit, B.P. Miller, M. Schulz, “Scalable Temporal Order Analysis for Large Scale Debugging”, SC2009, Portland, Oregon, November 2009, (LLNL-CONF-412227).
  66. B.R. de Supinski, Sadaf Alam, D.H. Bailey, L. Carrington, C. Daley, A. Dubey, T. Gamblin, D. Gunter, P.D. Hovland, H. Jagode, K. Karavanic, G. Marin, J. Mellor-Crummey, S. Moore, B. Norris, L. Oliker, C. Olschanowsky, P.C. Roth, M. Schulz, S. Shende, A. Snavely, W. Spear, M. Tikir, J. Vetter, P. Worley, and N. Wright, “Modeling the Office of Science ten year facilities plan: The PERI Architecture Tiger Team”, SciDAC 2009, (LLNL-CONF-413427).
  67. B. Rountree, D.K. Lowenthal, B.R. de Supinski, M. Schulz, V.W. Freeh and T. Bletsch, “Adagio: Making DVS Practical for Complex HPC Applications”, Twenty Third International Conference on Supercomputing (ICS 2009), (LLNL-CONF-412083).
  68. M. Schulz, A. W. Cook, W.H. Cabot, B.R. de Supinski and W.D. Krauss, “On the Performance of the Miranda CFD Code on Multicore Architectures”, Twenty First International Conference on Parallel Computational Fluid Dynamics (ParallelCFD 2009), (LLNL-ABS-411404).
  69. M. Schulz, J. Galarowicz, D. Maghrak, W. Hachfeld, D. Montoya, S. Cranford, “Analyzing the Performance of Scientific Applications with Open|SpeedShop”, Twenty First International Conference on Parallel Computational Fluid Dynamics (ParallelCFD 2009), (LLNL-ABS-418135).
  70. J. Li, M. Xiaosong, K. Singh, M. Schulz, B.R. de Supinski, and S.A. McKee, "Machine Learning Based Online Performance Prediction for Runtime Parallelization and Task Scheduling”, 2009 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS), Boston, Massachusetts, April 26–28, 2009, (LLNL-CONF-407723).
  71. G. Lee, D.H. Ahn, D.C. Arnold, B.R. de Supinski, M. Legendre, B.P. Miller, M. Schulz and B. Liblit, "Lessons Learned at 208K: Towards Debugging Millions of Cores”, SC2008, Austin, Texas, November 15–21, 2008, (LLNL-CONF-402967).
  72. T. Gamblin, B.R. de Supinski, M. Schulz, R.J. Fowler and D. A. Reed, "Scalable Load Balance Measurement for SPMD Codes”, SC2008, Austin, Texas, November 15–21, 2008, (LLNL-CONF-406045).
  73. M. Curtis-Maury, A. Shah, F. Blagojevic, D.S. Nikolopoulos, B.R. de Supinski and M. Schulz, "Prediction Models for Multi-dimensional Power-Performance Optimizations on Many Cores”, Seventeenth International Conference on Parallel Architectures and Compilation Techniques (PACT-2008), Toronto, ON, Canada, October 25–29, 2008, (LLNL-CONF-400453).
  74. D.H. Ahn, D.C. Arnold, B.R. de Supinski, G.L. Lee, B.P. Miller and M. Schulz, “Overcoming Scalability Challenges for Tool Daemon Launching”, 2008 International Conference on Parallel Processing (ICPP-08), Portland, OR, September 8-12, 2008, (LLNL-CONF-401480).
  75. R. Preissl, T. Koeckerbauer, M. Schulz, D. Kranzlmueller, B.R. de Supinski and D.J. Quinlan, “Detecting Patterns in MPI Communication Traces”, 2008 International Conference on Parallel Processing (ICPP-08), Portland, OR, September 8-12, 2008, (LLNL-CONF-401716).
  76. M. Schulz, G. Bronevetsky and B.R. de Supinski, “On the Performance of Transparent MPI Piggyback Messages”, EuroPVM/MPI 2008, Dublin, Ireland, September 7–10, 2008, (LLNL-CONF-402937).
  77. R. Preissl, M. Schulz, D. Kranzlmueller, B.R. de Supinski and D.J. Quinlan, “Using MPI Communication Patterns to Guide Source Code Transformations”, Tools for Program Development and Analysis in Computational Science, Krakow, Poland, June 23-25, 2008, (LLNL-CONF-400356).
  78. B. Barnes, B. Rountree, D.K. Lowenthal, J. Reeves, B.R. de Supinski and M. Schulz, "A Regression-Based Approach to Scalability Prediction”, Twenty Second International Conference on Supercomputing (ICS 2008), Kos, Greece, June 7-12, 2008, (LLNL-CONF-400700).
  79. P. Ratn, F. Mueller, B.R. de Supinski and M. Schulz, "Preserving Time in Large-Scale Communication Traces”, Twenty Second International Conference on Supercomputing (ICS 2008), Kos, Greece, June 7-12, 2008, (LLNL-CONF-400703).
  80. B.R. de Supinski, R.J. Fowler, T. Gamblin, F. Mueller, P. Ratn and M. Schulz, "An Open Infrastructure for Scalable, Reconfigurable Analysis”, International Workshop on Scalable Tools for High-End Computing (STHEC), Kos, Greece, June 7, 2008, (LLNL-CONF-403954).
  81. M. Schulz and B.R. de Supinski, “P^nMPI Tools: A Whole Lot Greater than the Sum of Their Parts”, In Supercomputing 2007, Reno, NV, USA, November 12-18, 2007, (UCRL-CONF-229978).
  82. B. Rountree, D. Lowenthal, S. Funk, V.W. Freeh, B.R. de Supinski and M. Schulz, “Bounding Energy Consumption in Large-scale MPI Programs”, In Supercomputing 2007, Reno, NV, USA, November 12-18, 2007, (UCRL-CONF-233221).
  83. G.L. Lee, D. Ahn, B.R. de Supinski, M. Schulz, D.C. Arnold, and B.P. Miller, “Benchmarking the Stack Trace Analysis Tool for BlueGene/L”, in Parallel Computing: Architectures, Algorithms and Applications, Proceedings of the International Conference ParCo 2007, Aachen, Germany, September 4-7, 2007, (UCRL-CONF-235241).
  84. M. Schulz and B.R. de Supinski, “Practical Differential Profiling”, Euro-Par 2007, Rennes, France, August 28 – 31, 2007, (UCRL-CONF-227812).
  85. D. Arnold, D.H. Ahn, B.R. de Supinski, G.L. Lee, B.P. Miller and M. Schulz, “Stack Trace Analysis for Large Scale Debugging”, Twenty First International Parallel and Distributed Processing Symposium (IPDPS 2007), Long Beach, CA, March 26–30, 2007, (UCRL-CONF-227108).
  86. M. Noeth, F. Mueller, M. Schulz and B.R. de Supinski, “Scalable Compression and Replay of Communication Traces in Massively Parallel Environments”, Twenty First International Parallel and Distributed Processing Symposium (IPDPS 2007), Long Beach, CA, March 26–30, 2007, (UCRL-CONF-227098), (Best Paper Award).
  87. B.C. Lee, D.M. Brooks, B.R. de Supinski, M. Schulz, K. Singh and S.A. McKee, “Methods of Inference and Learning for Performance Modeling of Parallel Applications”, ACM SIGPLAN 2007 Symposium on Principles and Practice of Parallel Programming (PPoPP 2007), San Jose, CA, March 14-17, 2007, (UCRL-CONF-227097).
  88. F. Gygi, E.W. Draeger, M. Schulz, B.R. de Supinski, J.A. Gunnels, V. Austel, J.C. Sexton, F. Franchetti, S. Kral, C.W. Überhuber and J. Lorenz, "Large-Scale Electronic Structure Calculations of High-Z Metals on the BlueGene/L Platform”, SC2006, Tampa, FL, November 11-17, 2006, (UCRL-PROC-220592), (Gordon Bell Prize Winner).
  89. E. Ipek, S.A. McKee, B.R. de Supinski, M. Schulz and R. Caruana, “Efficiently Exploring Architectural Design Spaces via Predictive Modeling”, Twelfth International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS XII), San Jose, CA, October 21-25, 2006, (UCRL-CONF-223240).
  90. M. Schulz, D. Kranzlmueller and B.R. de Supinski, “Exploring Unexpected Behavior in MPI”, 2006 International Conference on High Performance Computing and Communications (HPCC-06), Munich, Germany, September 13-15, 2006, (UCRL-CONF-222368).
  91. M. Schulz and B.R. de Supinski, “A Flexible and Dynamic Infrastructure for MPI Tool Interoperability”, 2006 International Conference on Parallel Processing (ICPP-06), Columbus, OH, August 14-18, 2006, (UCRL-CONF-221608).
  92. M. Noeth, F. Mueller, M. Schulz and B.R. de Supinski, "Scalable Compression and Replay of Communication Traces in Massively Parallel Environments", P=ac2 Conference, IBM T.J. Watson, Oct 2006.
  93. M. Schulz, "Extracting Critical Path Graphs from MPI Applications", IEEE Cluster 2005, September 2005, (UCRL-CONF-214107).
  94. E. Ipek, B. R. de Supinski, M. Schulz, S.A. McKee, and R. Caruana, "An Approach to Performance Prediction for Parallel Applications", Euro-Par 2005, Springer LNCS, 3648, August 2005, (UCRL-CONF-212365).
  95. B. S. White, S. A. McKee, B. R. de Supinski, B. Miller, D. Quinlan, and M. Schulz, "Improving the Computational Intensity of Unstructured Mesh Applications", The 19th ACM International Conference on Supercomputing, June 2005, (UCRL-CONF-212479).
  96. M. Schulz, B. S. White, S.A. McKee, H.-H. Lee, and J. Jeitner, “Owl: Next Generation System Monitoring”, Proceedings of ACM Computing Frontiers, April 2005, (UCRL-CONF-209855).
  97. M. Schulz, G. Bronevetsky, R. Fernandes, D. Marques, K. Pingali, and P. Stodghill, "Implementation and Evaluation of a Scalable Application-level Checkpoint-Recovery Scheme for MPI Programs", Proceedings of Supercomputing 2004, November 2004, (UCRL-CONF-205612).
  98. G. Bronevetsky, M. Schulz, P. Szwed, D. Marques, and K. Pingali, "Application-level Checkpointing for Shared Memory Programs", Proceedings of the Eleventh International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS 2004), October 2004, (UCRL-CONF-205594).
  99. T. Mohan, B.R. de Supinski, S.A. McKee, F. Mueller, A. Yoo, and M. Schulz, “Identifying and Exploiting Spatial Regularity in Data Memory References”, Supercomputing 2003, November 2003.
  100. E. Wheelhouse, C. Trinitis, M. Schulz, and A. Blaszczyk, “CAD Grid: Corporate-Wide Resource Sharing for Parameter Studies”, Euro-Par 2003, European Conference on Parallel Computing, August 2003 (LNCS, Springer Verlag).
  101. T. Mu, J. Tao, M. Schulz, and S.A. McKee, “Interactive Locality Optimizations on NUMA Architectures”, ACM Symposium on Software Visualization (Softvis), June 2003 (ACM Press).
  102. M. Schulz and S.A. McKee, “A Framework for Portable Shared Memory Programming”, International Parallel and Distributed Processing Symposium (IPDPS), April 2003 (IEEE CS Press).
  103. J. Tao, M. Schulz, and W. Karl, “A Simulation Tool for Evaluating Shared Memory Systems”, Annual Simulation Symposium (ASS), May 2003.
  104. D. Kranzlmüller and M. Schulz, “Notes on Nondeterminism in Message Passing Programs”, 9th European PVM/MPI Users' Group Meeting, pp. 357-367, October 2002 (LNCS, Springer Verlag).
  105. C. Trinitis, M. Schulz, and W. Karl, “A Comprehensive Electric Field Simulation Environment on Top of SCI”, 9th European PVM/MPI Users' Group Meeting, pp. 114-121, October 2002 (LNCS, Springer Verlag).
  106. C. Trinitis, M. Schulz, and W. Karl, “Boosting the Performance of Electromagnetic Simulations on a PC-Cluster”, International Conference of Parallel Computing in Electrical Engineering (PARELEC), September 2002 (IEEE CS Press).
  107. M. Schulz, “Using Semantic Information to Guide Efficient Parallel I/O on Clusters”, Proceedings of the 11th IEEE International Symposium on High Performance Distributed Computing (HPDC-11), July 2002 (IEEE CS Press).
  108. M. Schulz, J. Tao, C. Trinitis, and W. Karl, “SMiLE: An Integrated, Multi-Paradigm Software Infrastructure for SCI-based Clusters”, Proceedings of the 2nd IEEE/ACM International Symposium on Cluster Computing and the Grid (CCGrid), (selected for best papers publication in FGCS), May 2002 (IEEE CS Press).
  109. J. Tao, M. Schulz, and W. Karl, “Improving Data Locality Using Dynamic Page Migration based on Memory Access Histograms”, Proceedings of the International Conference on Computational Science (ICCS), session on Tools for Program Development and Analysis, April 2002 (LNCS, Springer Verlag).
  110. M. Schulz, “Parallel Volume Rendering based on Isosurface Extraction Using Commodity Clusters”, Visualization, Imaging, and Image Processing (VIIP), September 2001 (ACTA Press).
  111. J. Tao, W. Karl, and M. Schulz, “Using Simulation to Understand the Data Layout of Programs”, Applied Simulation and Modeling (ASM), September 2001 (ACTA Press).
  112. C. Trinitis, M. Schulz, M. Eberl, and W. Karl, “SCI-based LINUX PC-Clusters as a Platform for Electromagnetic Field Calculations”, 6th International Conference on Parallel Computing Technologies (PaCT 2001), September 2001 (LNCS, vol. 2127, Springer Verlag).
  113. M. Schulz, “DIOM: Parallel I/O for Data Intensive Applications on Commodity Clusters”, Parallel and Distributed Computing and Systems (PDCS), August 2001 (ACTA Press).
  114. W. Karl, M. Schulz, M. Völk, and S. Ziegler, “Meeting the Computational Demands of Nuclear Medical Imaging using Commodity Clusters”, International Conference on Computational Science (ICCS), May 2001 (LNCS, vol. 2074, Springer Verlag).
  115. J. Tao, W. Karl, and M. Schulz, “Visualizing the Memory Access Behavior of Shared Memory Applications on NUMA Architectures”, International Conference on Computational Science (ICCS), May 2001 (LNCS, vol. 2074, Springer Verlag).
  116. M. Schulz, “Efficient Deployment of shared memory models on clusters of PCs using the SMiLEing HAMSTER approach”, International Conference on Algorithms and Architectures in Parallel Processing (ICA3PP), December 2000, (World Scientific Publishing).
  117. M. Schulz, “Multithreaded Programming of PC clusters”, Parallel Architectures and Compilation Techniques (PACT), October 2000 (IEEE Press).
  118. W. Karl, M. Schulz, M. Völk, and S. Ziegler, “NEPHEW: Applying a Toolset for the Efficient Deployment of a Medical Image Application on SCI-based clusters”, Euro-Par 2000, European Conference on Parallel Computing, August 2000 (LNCS, vol. 1900, Springer Verlag).
  119. M. Schulz, “Efficient Coherency and Synchronization Management in SCI based DSM systems”, SCI-Europe 2000, August 2000.
  120. J. Tao, W. Karl, and M. Schulz, “Understanding the Behavior of Shared Memory Applications Using the SMiLE Monitoring Framework”, SCI-Europe 2000, August 2000.
  121. R. Hockauf, J. Jeitner, W. Karl, R. Lindhof, M. Schulz, V. Gonzales, E. Sanchis, and G. Torralba, “Design and Implementation Aspects for the SMiLE Hardware Monitor”, SCI-Europe 2000, August 2000.
  122. W. Karl, M. Schulz, and J. Tao, “Using the SMiLE Monitoring Infrastructure to Detect and Lower the Inefficiency of Parallel Applications”, HPCN-Europe, May 2000 (LNCS, vol. 1823, Springer Verlag).
  123. W. Karl, M. Schulz, and J. Trinitis, “Multilayer Online-Monitoring for Hybrid DSM systems on top of PC clusters with a SMiLE”, 11th International Conference on Modeling Techniques and Tools for Computer Performance Evaluation, USA, March 2000 (LNCS, vol.  1786, Springer Verlag).
  124. W. Karl, M. Leberecht, and M. Schulz, “Optimizing data locality for SCI-based PC-clusters with the SMiLE monitoring approach”, Parallel Architectures and Compilation Techniques (PACT), October 1999 (IEEE Press).
  125. M. Schulz, M. Völk, W. Karl, F. Munz, and S. Ziegler, “Running a spectral analysis code on top of SCI shared memory using the TreadMarks API”, SCI-Europe '99, September 1999.
  126. M. Schulz and H. Hellwagner, “Global Virtual Memory based on SCI-DSM”, SCI-Europe '98, September 1998.
  127. M. Schulz, “SISCI Pthreads: SMP-like programming on an SCI-cluster”, HPCN-Europe, April 1998 (LNCS, vol. 1401, Springer Verlag).
  128. X. Zhang, A. Dasdan, M. Schulz, R. Gupta, and A. Chien, “Architectural Adaptation of Application-Specific Locality Optimizations”, International Conference on Computer Design (ICCD), September 1997.

Workshops

  1. Matthias Maiterth, Martin Schulz, Barry Rountree, Dieter Kranzlmueller, “Power Balancing in an Emulated Exascale Environment”, The 12th IEEE Workshop on High-Performance, Power-Aware Computing (HPPAC) 2016, Chicago, IL, May 2016, to appear.
  2. Daniel Ellsworth, Tapasya Patki, Swann Perarnau, Sangmin Seo, Kazutomo Yoshii, Henry Hoffman, Martin Schulz and Pete Beckman, “System-Wide Power Management With Argo”, The 12th IEEE Workshop on High-Performance, Power-Aware Computing (HPPAC) 2016, Chicago, IL, May 2016, to appear.
  3. Simone Atzeni, Ganesh Gopalakrishnan, Zvonimir Rakamaric, Dong H. Ahn, Ignacio Laguna, Martin Schulz, Gregory L. Lee, Joachim Protze, Matthias S. Mueller, “ARCHER: Effectively Spotting Data Races in Large OpenMP Applications”, EC2 Workshop, June 2015.
  4. Ananta Tiwari, Martin Schulz, Laura Carrington, “Predicting Optimal Power Allocation for CPU and DRAM Domains”, 16th IEEE International Workshop on Parallel and Distributed Scientific and Engineering Computing, May 2015.
  5. K. Shoga, B. Rountree, M. Schulz, “Whitelisting MSRs”, Third Workshop on Extreme-Scale Programming Tools, November 2014.
  6. Hormozd Gahvari, William Gropp, Kirk Jordan, Martin Schulz and Ulrike Yang, “Algebraic Multigrid on a Dragonfly Network: First Experiences on a Cray XC30”, Fifth International Workshop on Performance Modeling, Benchmarking, and Simulation of HPC Systems (PMBS), November 2014.
  7. Joachim Protze, Simone Atzeni, Dong H. Ahn, Martin Schulz, Ganesh Gopalakrishnan, Matthias Mueller, Ignacio Laguna, Zvonimir Rakamaric, Greg L. Lee, “Towards Providing Low-Overhead Data Race Detection for Large OpenMP Applications”, Workshop on the LLVM Compiler Infrastructure in HPC, November 2014.
  8. Howard Pritchard, Ignacio Laguna, Kathryn Mohror, Todd Gamblin, Martin Schulz, Nickolas Davis, “A Global Exception Fault Tolerance Model for MPI”, Workshop on Exascale MPI, November 2014.
  9. Martin Schulz, Abhinav Bhatele, David Boehme, Peer-Timo Bremer, Todd Gamblin, Alfredo Gimenez, Kate Isaacs, “A Flexible Data Model to Support Multi-Domain Performance Analysis”, 8th International Parallel Tools Workshop, Stuttgart, Germany, October 2014.
  10. Joachim Protze, Tobias Hilbrich, Martin Schulz, Bronis R. de Supinski, Wolfgang E. Nagel and Matthias S. Mueller, “MPI Runtime Error Detection with MUST: A Scalable and Crash-Safe Approach”, Fifth International Workshop on Parallel Software Tools and Tool Infrastructures (PSTI 2014), Minneapolis, MN, September 2014.
  11. Dong H. Ahn, Jim Garlick, Mark Grondona, Don Lipari, Becky Springmeyer, Martin Schulz, “Flux: A Next-Generation Resource Management Framework for Large HPC Centers”, 10th International Workshop on Scheduling and Resource Management for Parallel and Distributed Systems, Minneapolis, MN, September 2014.
  12. I. Laguna, E. Leon, M. Schulz and M. Stephenson, “A Study of Application-Level Recovery Methods for Transient Network Faults”, Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems (ScalA) 2013, Denver, CO, November 2013.
  13. D. Ahn, G. Lee, G. Gopalakrishnan, Z. Rakamaric, M. Schulz, I. Laguna, “Overcoming Extreme-Scale Reproducibility Challenges Through a Unified, Targeted, and Multilevel Toolset”, First International Workshop on Software Engineering for High Performance Computing in Computational Science and Engineering (SE-HPCCSE 2013), Denver, CO, November 2013.
  14. M. Schulz, J. Belak, G. Bronevetsky, M. Casas, I. Laguna, D. Richards, B. Rountree, “Analyzing Future Exascale Platforms on Today’s Machines”, Workshop on Extreme-Scale Programming Tools, Denver, CO, November 2013.
  15. T. Gamblin, M. Schulz, T. Bremer, and A. Bhatele, “Building Models from Measurement with Multi-Domain Correlation”, DOE Workshop on Modeling & Simulation of Exascale Systems & Applications (ModSim 2013), Seattle, WA, September 18-19, 2013.
  16. M. Schulz, J. Belak, G. Bronevetsky, I. Laguna, D. Richards, B. Rountree, “Emulating Exascale Conditions on Today’s Platforms”, In DOE Workshop on Modeling & Simulation of Exascale Systems & Applications (ModSim 2013), Seattle, WA, September 18-19 2013.
  17. A. Eichenberger, J. Mellor-Crummey, M. Schulz, M. Wong, N. Copty, J. DelSignore, R. Dietrich, X. Liu, E. Loh, D. Lorenz, and other members of the OpenMP Tools Working Group, “OMPT: OpenMP Tools Application Programming Interfaces for Performance Analysis”, 9th International Workshop on OpenMP, Canberra, Australia, September 2013.
  18. H. Gahvari, W.D. Gropp, K. Jordan, M. Schulz, U. Yang, “Systematic Reduction of Data Movement in Algebraic Multigrid Solvers”, Workshop on Large Scale Parallel Processing (LSPP), Cambridge, MA, May 2013.
  19. Tobias Hilbrich, Joachim Protze, Bronis R. de Supinski, Martin Schulz, Matthias Mueller and Wolfgang E. Nagel, “Intralayer Communication for Tree-Based Overlay Networks”, Fourth International Workshop on Parallel Software Tools and Tool Infrastructures (PSTI 2013), Lyon, France, October 2, 2013, (LLNL-CONF-639390).
  20. Barry Rountree, Dong Ahn, Bronis R. de Supinski, David K. Lowenthal, and Martin Schulz, “Beyond DVFS: A First Look at Performance under a Hardware-Enforced Power Bound”, 8th Workshop on High-Performance, Power-Aware Computing (HPPAC), May 2012.
  21. Martin Schulz, Abhinav Bhatele, Peer-Timo Bremer, Todd Gamblin, Katherine Isaacs, Joshua A. Levine, and Valerio Pascucci, “Creating a Tool Set for Optimizing Topology-Aware Node Mappings”, 5th International Workshop on Parallel Tools, September 2011.
  22. Joshua Goehner, Dorian Arnold, Dong Ahn, Gregory Lee, Bronis R. de Supinski, Matthew Legendre, Martin Schulz and Barton Miller, “A Framework for Bootstrapping Extreme Scale Software Systems”, In the First International Workshop on High-performance Infrastructure for Scalable Tools (WHIST), June 2011.
  23. Barry Rountree, Guy Cobb, Todd Gamblin, Martin Schulz, Bronis R. de Supinski, and Henry Tufo, “Parallelizing Heavyweight Debugging Tools with MPIecho”, In the First International Workshop on High-performance Infrastructure for Scalable Tools (WHIST), June 2011.
  24. A. Humphrey, C. Derrick, B. Tibbitts, A. Vo, S. Vakkalanka, G. Gopalakrishnan, B. de Supinski, M. Schulz and G. Bronevetsky, “Verification for Portability, Scalability, and Grokkability”, (EC)2 2010: Workshop on Exploiting Concurrency Efficiently and Correctly, Edinburgh, UK, July 2010.
  25. Greg Bronevetsky, Ignacio Laguna, Saurabh Bagchi, Bronis R. de Supinski, Dong H. Ahn and Martin Schulz, “Statistical Fault Detection for Parallel Applications with AutomaDeD”, The 2010 IEEE Workshop on Silicon Errors in Logic - System Effects (SELSE 6), Palo Alto, CA, March 23-24, 2010, (LLNL-CONF-426254).
  26. F. Mueller, X. Wu, M. Schulz, B.R. de Supinski, and T. Gamblin, “ScalaTrace: Tracing, Analysis and Modeling of HPC Codes at Scale”, Para 2010: State of the Art in Scientific and Parallel Computing, Reykjavik, June 2010.
  27. T. Hilbrich, M. Schulz, B.R. de Supinski, and M.S. Mueller, “MUST: A Scalable Approach to Runtime Error Detection in MPI Programs”, 3rd Parallel Tools Workshop, Dresden, Germany, September 2009.
  28. S. Biswas, D. Franklin, T. Sherwood, F. Chang, B.R. de Supinski, M. Schulz, “PSMalloc: Content Based Memory Management for MPI Applications”, MEDEA 2009, Raleigh, NC, September 2009 (LLNL-PROC-414508).
  29. B.R. de Supinski, R.J. Fowler, T. Gamblin, F. Mueller, P. Ratn and M. Schulz, "An Open Infrastructure for Scalable, Reconfigurable Analysis”, International Workshop on Scalable Tools for High-End Computing (STHEC), Kos, Greece, June 7, 2008, (LLNL-CONF-403954).
  30. R. Preissl, M. Schulz, D. Kranzlmueller, B.R. de Supinski and D.J. Quinlan, “Using MPI Communication Patterns to Guide Source Code Transformations”, Tools for Program Development and Analysis in Computational Science, Springer LNCS, 5103, May 2008, (UCRL-CONF-400356).
  31. M. Curtis-Maury, K. Singh, S.A. McKee, F. Blagojevic, D.S. Nikolopoulos, B.R. de Supinski, M. Schulz. "Identifying Energy-Efficient Concurrency Levels using Machine Learning". In Proceedings of the International Workshop on Green Computing, Austin, TX, September 2007, (UCRL-CONF-233024).
  32. R. Vuduc, M. Schulz, D. Quinlan, B. R. de Supinski, and A. Saebjornsen, “Improving Distributed Memory Applications Testing by Message Perturbation”, Fourth Workshop on Parallel and Distributed Systems: Testing and Debugging (PADTAD - IV), July 17, 2006, (UCRL-PROC-221395), (Best Paper Award).
  33. E. Ipek, J. Martinez, B. de Supinski, S. McKee, M. Schulz, “Dynamic Program Phase Detection in Distributed Shared-Memory Multiprocessors”, NSF Next Generation Software Program Workshop (an IPDPS 2006 Workshop), April 25, 2006, (UCRL-CONF-219596).
  34. E. Ipek, M. Schulz, B. R. de Supinski, S. A. McKee, and R. Caruana. "Automatic Model Generation for Performance Prediction", Dagstuhl Workshop on Automatic Performance Analysis, Dagstuhl, Germany, December 2005, (UCRL-ABS-217735).
  35. M. Schulz, D. Ahn, A. Bernat, B. R. de Supinski, S. Y. Ko, and B. Rountree, "Scalable Dynamic Instrumentation for BlueGene/L", Workshop on Binary Instrumentation and Applications, St. Louis, MO, USA, September 2005, to be published in ACM SIGARCH News, (UCRL-CONF-215232).
  36. M. Schulz, J. May, and J. Gyllenhaal, "DynTG: A tool for Interactive, Dynamic Instrumentation", Tools for Program Development and Analysis in Computational Science, Springer LNCS, 3515, pp 140‑14, May 2005, (UCRL-CONF-209840).
  37. T. Suh, H.-H. S. Lee, S. A. McKee, and M. Schulz. "Evaluating System-wide Monitoring Capsule Design Using Xilinx Virtex-II Pro FPGA." In Workshop on Architecture Research using FPGA Platforms in conjunction with International Symposium on High-Performance Computer Architecture, San Francisco, CA, February, 2005.
  38. M. Schulz, B. S. White, S. A. McKee, H.H.-S. Lee, and J. Jeitner, "A Vision for Next Generation System Monitoring", HPCA Workshop on Hardware Monitoring, February 2005, (UCRL-ABS-208943).
  39. G. Bronevetsky, M. Schulz, P. Szwed, D. Marques, and K. Pingali, "Checkpointing for Shared Memory Programs at the Application-level", Proceedings of the Sixth European Workshop on OpenMP, Stockholm, Sweden, Oct. 18-22, 2004, (UCRL-CONF-206542).
  40. J. Tao, M. Schulz, W. Karl, “SIMT/OMP: A Toolset to Study and Exploit Memory Locality of OpenMP Applications on NUMA Architectures”, Workshop on OpenMP Applications and Tools (WOMPAT), May 2004.
  41. P. Szwed, D. Marques, Robert M. Buels, S.A. McKee, M. Schulz, “SimCheck: Fast-Forwarding via Native Execution and Application-Level Checkpointing”, Proceedings of the 8th Workshop on Interaction Between Compilers and Computer Architecture (INTERACT 8), February 2004.
  42. T. Brandes, H. Schwamborn, M. Gerndt, J. Jeitner, E. Kereku, W. Karl, M. Schulz, J. Tao, H. Brunst, W. Nagel, R. Neumann, R. Mueller-Pfefferkorn, B. Trenkler, H.-C. Hoppe, „Werkzeuge für die effiziente parallele Programmierung von Cache-Architekturen”, Proceedings of the 19th PARS Workshop, March 2003.
  43. M. Schulz, J. Tao, and S.A. McKee, “Local Relaxed Consistency Schemes on Shared-Memory Clusters”, Proceedings of the 2nd Workshop on System Area Networks (SAN-2), held at HPCA-9, February 2003.
  44. M. Schulz and C. Trinitis, “An Integrated Parallel Simulation Environment for Electrostatic and Electromagnetic Field Distributions in High Voltage Components”, 6th Meeting of the IBM Scientific User Group, August 2002.
  45. M. Schulz, J. Tao, and W. Karl, “Improving the Scalability of Shared Memory Systems through Relaxed Consistency”, Second Workshop on Caching, Coherence and Consistency (WC3 '02) / held together with ICS'02, June 2002.
  46. M. Schulz, J. Tao, J. Jeitner, and W. Karl, “A Proposal for a New Hardware Cache Monitoring Architecture”, ACM SIGPLAN workshop on Memory Systems Performance (MSP), held together with PLDI 2002, June 2002.
  47. M. Schulz, “Overcoming the Problems Associated with the Existence of Too Many DSM APIs”, Proceedings of the 2002 International Workshop on Distributed Shared Memory on Clusters, held together with CCGrid02, May 2002 (IEEE CS Press).
  48. J. Tao, W. Karl, and M. Schulz, “A Novel Approach for Data Distribution on NUMA Machines”, Proceedings of the 6th German workshop on Parallel Systems and Algorithms, April 2002.
  49. M. Gerndt, A. Schmidt, M. Schulz, and R. Wismüller, “Performance Analysis of Teraflop Computers – A Distributed Automatic Approach”, 10th Euromicro Workshop on Parallel, Distributed, and Network Processing (PDP), January 2002.
  50. W. Karl and M. Schulz, “Hybrid-DSM: An Efficient Alternative to Pure Software DSM Systems on NUMA Architectures”, 2nd International Workshop on Software DSM, held together with ICS 2000, May 2000.
  51. M. Schulz, “SCI-VM: A flexible base for transparent shared memory programming models on clusters of PCs”, High level Programming Models and Supportive Environments (HIPS '99), held together with IPDPS 1999, April 1999 (LNCS, vol. 1586, Springer Verlag).
  52. M. Eberl, W. Karl, M. Leberecht, and M. Schulz, „Eine Software-Infrastruktur für Nachrichtenaustausch und gemeinsamen Speicher auf SCI-basierten PC-Clustern“, 2. Workshop on Cluster Computing, March 1999.
  53. W. Karl, M. Leberecht, and M. Schulz, “Supporting Shared Memory and Message Passing on Clusters of PCs with a SMiLE”, 3rd International Workshop on Communication, Architecture and Applications for Network-Based Parallel Computing (CANPC '99), held together with HPCA, January 1999 (LNCS, vol. 1602, Springer Verlag).
  54. M. Eberl, H. Hellwagner, B. Herland, and M. Schulz, “SISCI - Implementing a Standard Software Infrastructure on an SCI Cluster”, 1. Workshop on Cluster Computing, November 1997.

Posters

  1. Simone Atzeni, Ganesh Gopalakrishnan, Zvonimir Rakamaric, Dong H. Ahn, Ignacio Laguna, Martin Schulz, Gregory L. Lee, “ARCHER: Effectively Spotting Data Races in Large OpenMP Applications”, 2016 European LLVM Developers' Meeting, Barcelona, March 2016.
  2. David Boehme, Todd Gamblin, Peer-Timo Bremer, Olga Pearce, Martin Schulz, “Caliper: Composite Performance Data Collection in HPC”, SC 2015, Austin, TX, Nov. 2015.
  3. Alfredo Gimenez, Benafsh Husain, David Boehme, Todd Gamblin, Martin Schulz, “Mitos: A Simple Interface for Complex Hardware Sampling and Attribution”, SC 2015, Austin, TX, Nov. 2015.
  4. Sandra Wienke, Tim Cramer, Matthias Mueller, Martin Schulz, “Quantifying Productivity—Towards Development Effort Estimation in HPC”, SC 2015, Austin, TX, Nov. 2015.
  5. Swann Perarnau, Rinku Gupta, Pete Beckman, Edgar Leon, Barry Rountree, Martin Schulz, et al., “Argo: An Exascale Operating System and Runtime”, SC 2015, Austin, TX, Nov. 2015.
  6. Olga Pearce, Todd Gamblin, Bronis R. de Supinski, Martin Schulz, and Nancy M. Amato, “Decoupled Load Balancing”, In ACM SIGPLAN Symposium on Principles and Practices of Parallel Programming (PPoPP’15), San Francisco, CA, February 7-11 2015.
  7. Nicklas Bo Jensen, Niklas Quarfot Nielsen, Gregory L. Lee, Sven Karlsson, Dong H. Ahn, Matthew Legendre, Martin Schulz, “DySectAPI: Scalable Prescriptive Debugging”, Supercomputing 2014, New Orleans, November 2014.
  8. Katherine E. Isaacs, Todd Gamblin, Abhinav Bhatele, Peer-Timo Bremer, Martin Schulz, and Bernd Hamann, “Extracting Logical Structure and Identifying Stragglers in Parallel Execution Traces”, In ACM SIGPLAN Symposium on Principles and Practices of Parallel Programming (PPoPP’14), Orlando, FL, February 15-19 2014.
  9. M. Weber, K. Mohror, M. Schulz, H. Brunst, B. de Supinski, W. Nagel, “Structural Comparison of Parallel Applications”, SC2013, Denver, Colorado, November, 2013, (nominated for best poster).
  10. I. Laguna, M. Schulz, J. Keasler, D. Richards, J. Belak, “Optimal Placement of Retry-Based Fault Recovery Annotations in HPC Applications”, SC2013, Denver, Colorado, November, 2013.
  11. S. Mitra, I. Laguna, D. Ahn, T. Gamblin, M. Schulz, S. Bagchi, “Scalable Parallel Debugging via Loop-Aware Progress Dependence Analysis”, SC2013, Denver, Colorado, November, 2013.
  12. A. Landge, J. A. Levine, P.-T. Bremer, M. Schulz, T. Gamblin, A. Bhatele, K. E. Isaacs, V. Pascucci, “Interactive Linked Visualizations for Performance Analysis of Heterogeneous Computing Clusters”, GPU Technology Conference (GTC) 2012, San Jose, CA, May 2012.
  13. A. Bhatele, T. Gamblin, B.T. Gunney, M. Schulz, P.T. Bremer and K. E. Isaacs, “Revealing Performance Artifacts in Parallel Codes Through Multi-Domain Visualizations”, SIAMPP2012, Savannah, GA, February 2012.
  14. D. Boehme, Martin Schulz, Bronis R. de Supinski, Markus Geimer and Felix Wolf, "Critical Path Analysis for Large-Scale MPI Programs”, a poster at SC2010, New Orleans, Louisiana, November 13–19, 2010, (LLNL-POST-447564).
  15. O. Pearce, Todd Gamblin, Martin Schulz, Bronis R. de Supinski and Nancy Amato, "Load Balance: Correlating Application-Independent Measurements with Application-Semantic Computational Models”, a poster at SC2010, New Orleans, Louisiana, November 13–19, 2010, (LLNL-POST-432915).
  16. C. Klausecker, Thomas Koeckerbauer, Martin Schulz, and Dieter Kranzlmueller, “A New Generation of Integrated Debugging Tools in Eclipse”, a poster at SC2010, New Orleans, Louisiana, November 13–19, 2010.
  17. D. Li, Bronis R. de Supinski, Martin Schulz, Kirk W. Cameron and Dimitrios S. Nikolopoulos, “Model-Based Hybrid MPI/OpenMP Power-Aware Computing”, a poster at SC2009, Portland, Oregon, November 14–20, 2009, (LLNL-POST-423694).
  18. D. Li, K.W. Cameron, D.S. Nikolopoulos, M. Schulz, and B.R. de Supinski, “Model-Based Hybrid MPI/OpenMP Power-Aware Computing”, Supercomputing 2008, November 2008.
  19. T. Gamblin, B.R. de Supinski, M. Schulz, D. Reed, and R. Fowler, “Scalable Performance Equivalence Class Detection Using Clustering”, Supercomputing 2008, November 2008.
  20. B. Rountree, D. Lowenthal, B.R. de Supinski, M. Schulz, V. Freeh, T. Bletsch, “Adagio: Saving Energy with Runtime Dynamic Voltage Scaling”, Supercomputing 2008, November 2008.
  21. T. Gamblin, P. Ratn, B.R. de Supinski, M. Schulz, F. Mueller, R.J. Fowler, D.A. Reed, “An Open Framework for Scalable, Reconfigurable Performance Analysis”, Supercomputing 2007, November 12-18, 2007, (UCRL-POST-236200).
  22. R. Preissl, M. Schulz, D. Kranzlmueller, B.R. de Supinski, D.J. Quinlan, “Using MPI Communication Patterns To Guide Source Code Transformations”, Supercomputing 2007, November 12-18, 2007, (UCRL-POST-236042).
  23. M. Noeth, F. Mueller, M. Schulz, and B. de Supinski, "Scalable Compression and Replay of Communication Traces in Massively Parallel Environments", Supercomputing 2006, November 11-17, 2006, (UCRL-POST-225759).
  24. B. Aichinger, M. Schulz, D. Kranzlmueller, R. Preissl, T. Koeckerbauer, and B. de Supinski, "Patterns in Parallel Programs - Towards High-level Understanding of Large-Scale Traces", Supercomputing 2006, November 11-17, 2006, (UCRL-POST-225763).
  25. M. Schulz, D. Kranzlmüller, and B. R. de Supinski, "The MPI Test Suite ‑ Unexpected Behavior in a Standardized Programming Environment", Supercomputing 2005, November 2005.
  26. T. Mu, J. Tao, M. Schulz, and S.A. McKee, “Interactive Locality Optimizations on NUMA Architectures”, ACM Symposium on Software Visualization (Softvis), June 2003.
  27. T. Mu, J. Tao, M. Schulz, and S.A. McKee, “Visualizing Data Distributions on NUMA Architectures to Guide Incremental Optimizations”, Supercomputing 2002, November 2002.
  28. M. Schulz, C. Trinitis, J. Tao, W. Karl, “SMiLE: An integrated, multiparadigm infrastructure for High Performance Computing on SCI-based Clusters”, Supercomputing 2001, November 2001.
  29. G. Torralba, V. Gonzáles, E. Sanchis, J. Tao, M. Schulz, and W. Karl, “Data Monitoring in High Performance Clusters”, 12th IEEE International Congress on Real Time for Nuclear and Plasma Sciences, NPSS, June 2001.
  30. M. Schulz, M. Voelk, W. Karl, and S. Ziegler, „Effiziente iterative PET-Bild Rekonstruktion auf einem Cluster von PCs“, Jahreskongresses der DEGRO, ÖGRO, DGMP - Band 176, Sondernummer 1, October 2000.
  31. G. Acher, R. Buchty, M. Eberl, D. Fliegl, W. Karl, M. Leberecht, M. Schulz, and C. Trinitis, “High-Performance Cluster Computing”, International trade fair CeBIT '99, March 1999.
  32. M. Schulz and H. Hellwagner, “Extending NT Virtual Memory by SCI-based Hardware DSM”, Usenix Windows NT Symposium, August 1998.

Tutorials

  1. Martin Schulz, Jim Galarowicz, Don Maghrak, and Mahesh Rajan, “How to Analyze the Performance of Parallel Codes 101”, SC2015, Austin, TX, November 2015.
  2. Dieter Kranzlmueller, David Lowenthal, Barry Rountree, and Martin Schulz, “Power Aware High Performance Computing: Challenges and Opportunities for Application Developers”, SC2015, Austin, TX, November 2015.
  3. Alexandru Calotoiu, Torsten Hoefler, Martin Schulz, Sergei Shudler and Felix Wolf, “Insightful Automatic Performance Modeling”, SC2015, Austin, TX, November 2015.
  4. Alexandru Calotoiu, Torsten Hoefler, Martin Schulz, Sergei Shudler and Felix Wolf, “Insightful Automatic Performance Modeling”, EuroMPI 2015, Bordeaux, France, September 2015.
  5. Martin Schulz, Jim Galarowicz, Mahesh Rajan, “How to Analyze the Performance of Parallel Codes 101”, SC2014, New Orleans, LA, November 2014.
  6. Martin Schulz, Jim Galarowicz, Jennifer Green, Don Maghrak, “How to Analyze the Performance of Parallel Codes 101”, SC2013, Denver, Colorado, November 2013.
  7. Pavan Balaji, Torsten Hoefler, Martin Schulz, “Next Generation MPI Programming: Advanced MPI-2 & New Features in MPI-3”, International Supercomputing Conference (ISC) 2013, Leipzig, Germany, June 2013.
  8. Martin Schulz, Bernd Mohr, Brian Wylie, “Supporting Code Developments on Extreme-scale Computer Systems”, International Supercomputing Conference (ISC) 2013, Leipzig, Germany, June 2013.
  9. Jim Galarowicz, Martin Schulz, “An Introduction into Performance Analysis for HPC Systems with Open/SpeedShop”, UCAR Software Engineering Assembly, Boulder, CO, April 2013.
  10. Martin Schulz, Jim Galarowicz, Don Maghrak, David Montoya, Mahesh Rajan, Matthew LeGendre, “How to Analyze the Performance of Parallel Codes 101”, SC2012, Salt Lake City, Utah, November 2012.
  11. Martin Schulz, Bernd Mohr, Brian Wylie, “Supporting Performance Analysis and Optimization on Extreme-Scale Computer Systems”, SC2012, Salt Lake City, Utah, November 2012.
  12. Martin Schulz, Torsten Hoefler, “Next Generation MPI Programming: Advanced MPI-2 & New Features in MPI-3”, International Supercomputing Conference (ISC) 2012, Hamburg, Germany, June 2012.
  13. Martin Schulz, Bernd Mohr, Brian Wylie, “Supporting Code Developments on Extreme-scale Computer Systems”, International Supercomputing Conference (ISC) 2012, Hamburg, Germany, June 2012.
  14. Martin Schulz, Jim Galarowicz, Matthew Legendre, Don Maghrak, and Mahesh Rajan, “How to Analyze the Performance of Parallel Codes 101 – A Case Study with Open|SpeedShop”, SC 2011, Seattle, WA, November 2011.
  15. Martin Schulz, Jim Galarowicz, Matthew Legendre, Don Maghrak, and Mahesh Rajan, “An Introduction into Performance Analysis for HPC Systems with Open|SpeedShop”, SC 2011, Seattle, WA, November 2011.
  16. Martin Schulz, Bernd Mohr, Brian Wylie, “Supporting Code Developments on Extreme-scale Computer Systems”, SC 2011, Seattle, WA, November 2011.
  17. Martin Schulz, Jim Galarowicz, Don Maghrak, David Montoya, and Mahesh Rajan, “How to Analyze the Performance of Parallel Codes 101 – A Case Study with Open|SpeedShop”, SC 2010, New Orleans, LA, November 2010.
  18. Martin Schulz, Don Maghrak, “How to Analyze the Performance of Parallel Codes 101 – A Case Study with Open|SpeedShop”, SciDAC 2010, Chattanooga, TN, July 2010.
  19. Martin Schulz, Don Maghrak, David Montoya, “Performance Analysis and Optimization with Open|SpeedShop”, LCI Conference 2010, Pittsburgh, PA, USA, March 2010.
  20. Andreas Knuepfer, Dieter Kranzlmueller, Martin Schulz, Christof Klausecker, “Large Scale Communication Analysis: Tools for Understanding Highly Scalable Codes”, Supercomputing 2009, Portland, OR, USA, November 2009.
  21. Martin Schulz, Jim Galarowicz, “Performance Analysis and Optimization with Open|SpeedShop”, IEEE Cluster 2009, New Orleans, LA, USA, August 2009.
  22. Martin Schulz, Jim Galarowicz, Don Maghrak, David Montoya, Scott Cranford, “Parallel Performance Analysis with Open|SpeedShop”, Supercomputing 2008, Austin, TX, USA, November 2008.
  23. Martin Schulz, Jim Galarowicz, Samuel Gutierrez, Scott Cranford, “Parallel Performance Analysis with Open|SpeedShop”, DoD HPCMod Users’ Meeting, Seattle, WA, USA, July 2008.
  24. Martin Schulz, David Montoya, Jim Galarowicz, “Open|SpeedShop: An Open Source Performance Analysis Framework for Cluster Platforms”, High Performance Computer Science Week (HPCSW), Denver, CO, USA, April 2008.
  25. David Brooks, Bronis R. de Supinski, Benjamin Lee, Sally A. McKee, Martin Schulz, Karan Singh, “Methods of Learning and Inference for Large Design and Parameter Spaces”, International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), Seattle, WA, USA, March 2008.
  26. M. Schulz, J. Galarowicz, D. Montoya, “Open|SpeedShop: Open Source Performance Analysis for Linux Clusters”, International Conference on Supercomputing (ICS) 2007, June, 2007.
  27. Brooks, David M., Bronis R. de Supinski, Benjamin C. Lee, Sally A. McKee, Martin Schulz and Karan Singh, “Inference and Learning for Large Scale Microarchitectural Analysis”, The 34th International Symposium on Computer Architecture (ISCA 2007), San Diego, CA, June 9-13, 2007, (UCRL-ABS-229864).
  28. M. Schulz, S. Cranford, N. De Bardeleben, J. Galarowicz, D. Maghrak, “Open|SpeedShop: Open Source Performance Analysis for Linux Clusters”, Supercomputing 2006, November 11-17, 2006.
  29. M. Schulz and J. Worringen, “Tutorial: SCI Low-level Programming: SISCI / SMI”, Held as part of the SCI Summer School 2001, Trinity College Dublin, Ireland, October 2001.
  30. W. Karl and M. Schulz, “Tutorial: SMiLE Shared Memory Programming”, Held as part of the SCI Summer School 2000, Trinity College Dublin, Ireland, October 2000.

Book Chapters

  1. M. Schulz, “Checkpointing”, Encyclopedia of Parallel Computing, D. Padua (ed), Springer Verlag, (LLNL-MI-419844).
  2. B. de Supinski, Martin Schulz and Erik W. Draeger, Flexible Tools Supporting a Scalable First-Principles MD Code, a chapter in Scientific Computer Performance, David H. Bailey, Robert F. Lucas and Samuel Williams, editors, Taylor and Francis, publishers, New York, NY, 2010. (UCRL-JRNL-445511*).
  3. M. Gerndt, A. Schmidt, M. Schulz, and R. Wismüller, “Automatic Performance Analysis on Hitachi SR8000”, in High Performance Computing in Science and Engineering, Munich 2002, S. Wagner, W. Hanke, A. Bode, and F. Durst (eds.), Springer Verlag, January 2003.
  4. M. Schulz, “True shared memory programming on SCI-based clusters”, in Scalable Coherent Interface / SCI, Architecture and Software for High-Performance Compute Clusters, H. Hellwagner and R. Reinefeld (eds.), LNCS State-of-the-Art Survey, vol. 1734, Springer Verlag, October 1999.

Editor

  1. M. Schulz, “Proceedings of the 6th Workshop on Productivity and Performance (PROPER 2013)”, published as part of the Euro-Par 2013 Workshop Proceedings, Aachen, Germany, August 2013.
  2. M. Schulz and S. Midkiff, “Proceedings of the 13th International Workshop on High-level Parallel Programming Models and Supportive Environments”, published as part of the IPDPS workshop proceedings, April 2008.
  3. M. Gerndt and M. Schulz, “Proceedings of the 8th International Workshop on High-level Parallel Programming Models and Supportive Environments”, IEEE Computer Society Press, April 2003.
  4. B. Coghlan, G. Horn, and M. Schulz, “Proceedings of the 4th International Conference on SCI-based Research and Technology”, SINTEF, October 2001.
  5. J. Rolim, M. Schulz, et al., “Parallel and Distributed Processing (15 IPDPS 2000 Workshops)”, Lecture Notes in Computer Science (LNCS) vol. 1800, Springer Verlag, May 2000.

Thesis

  1. M. Schulz, “Shared Memory Programming on NUMA-based Clusters using a General and Open Hybrid Hardware/Software Approach”, PhD thesis, Technische Universität München, July 2001.
  2. M. Schulz, “Application Study for the Illinois Concert C++, A Parallel Volume Renderer”, Master's thesis, University of Illinois at Urbana-Champaign, January 1997.

Other Publications

  1. Peer-Timo Bremer, Bernd Mohr, Valerio Pascucci, Martin Schulz, “Dagstuhl Manifesto: Connecting Performance Analysis and Visualization to Advance Extreme Scale”, 2015.
  2. C. Trinitis, M. Bader and M. Schulz, Proceedings of ParSim´09 – “Special Session on Current Trends in Numerical Simulation for Parallel Engineering Environments”, Held with the 16th European PVM/MPI User's Group Meeting, September 2009 (LNCS, Springer Verlag).
  3. C. Trinitis and M. Schulz, Proceedings of ParSim´08 – “Special Session on Current Trends in Numerical Simulation for Parallel Engineering Environments”, Held with the 15th European PVM/MPI User's Group Meeting, September 2008 (LNCS, Springer Verlag).
  4. C. Trinitis and M. Schulz, Proceedings of ParSim´07 – “Special Session on Current Trends in Numerical Simulation for Parallel Engineering Environments”, Held with the 14th European PVM/MPI User's Group Meeting, September 2007 (LNCS, Springer Verlag).
  5. B. Lee, M. Schulz, and B. de Supinski, “Regression Strategies for Parameter Space Exploration: A Case Study in Semicoarsening Multigrid and R”, Lawrence Livermore National Laboratory, September 29, 2006, (UCRL-TR-224851).
  6. C. Trinitis and M. Schulz, Proceedings of ParSim´06 – “Special Session on Current Trends in Numerical Simulation for Parallel Engineering Environments”, Held with the 13th European PVM/MPI User's Group Meeting, September 2006 (LNCS, Springer Verlag).
  7. C. Trinitis and M. Schulz, Proceedings of ParSim´05 – “Special Session on Current Trends in Numerical Simulation for Parallel Engineering Environments”, Held with the 12th European PVM/MPI User's Group Meeting, September 2005 (LNCS, Springer Verlag).
  8. C. Trinitis and M. Schulz, Proceedings of ParSim´04 – “Special Session on Current Trends in Numerical Simulation for Parallel Engineering Environments”, Held with the 11th European PVM/MPI User's Group Meeting, September 2004 (LNCS, Springer Verlag).
  9. C. Trinitis and M. Schulz, Proceedings of ParSim´03 – “Special Session on Current Trends in Numerical Simulation for Parallel Engineering Environments”, Held with the 10th European PVM/MPI User's Group Meeting, September 2003 (LNCS, Springer Verlag).
  10. T. Mohan, B. de Supinski, S. McKee, F. Mueller, A. Yoo, M. Schulz, "Identifying and Exploiting Spatial Regularity in Data Memory References", Lawrence Livermore National Laboratory, July 2003, TR (UCRL-JC-154597).
  11. C. Trinitis and M. Schulz, Proceedings of ParSim´02 – “Special Session on Current Trends in Numerical Simulation for Parallel Engineering Environments”, Held with the 9th European PVM/MPI User's Group Meeting, October 2002 (LNCS Volume 2474, Springer Verlag).
  12. M. Schulz, K. Inoue, B. Childers, and S.A. McKee, Guest Editors, ACM Computer Architecture News, Summer 2002 (Proceedings of HPCA-2002 WiP Session).
  13. M. Schulz, B. Childers, and S.A. McKee, Guest Editors, IEEE Technical Committee on Computer Architecture (TCCA) Newsletter, Fall 2001 (Proceedings of PACT’01 WiP Session).

Invited Conference/Workshop Presentations

  1. “Characterizing Faults on Production Systems”, SIAM Conference on Parallel Processing for Scientific Computing, Paris, France, April 2016.
  2. “Software standards at the example of the Message Passing Interface (MPI)”, 4th ENES Workshop on High Performance Computing for Climate and Weather, Toulouse, France, April 2016.
  3. “Performance Tuning in a Power-Limited World”, International Workshop on Dynamic Code Auto-Tuning, Barcelona, Spain, March 2016.
  4. “System Software for Power Limited HPC Systems: Challenges and Solutions”, Dagstuhl Seminar on Dark Silicon, February 2016.
  5. “Introspection for Exascale Communication: Needs, Opportunities, Tools and Interfaces”, ExaComm Workshop, Frankfurt, Germany, July 2015.
  6. “Performance Modeling Under a Power Bound: A Tour of the Near Future”, Workshop on Performance Modeling: Methods and Applications, Frankfurt, Germany, July 2015.
  7. “MPI Fault Tolerance: The Good, The Bad, The Ugly”, SIAM CSE, Salt Lake City, March 2015.
  8. “Tuning Challenges at Exascale”, International Workshop on Code Autotuning, San Francisco, February 2015.
  9. “Performance Analysis for the Post-Petascale Era: Going Beyond Just Measuring Flop/s and Cache Misses”, Invited Talk at the JST/CREST International Symposium on Post Petascale System Software, Kobe, Japan, December 2014.
  10. “MPI Fault Tolerance: The Good, The Bad, The Ugly”, Dagstuhl Seminar: Resilience in Exascale Computing, September 2014.
  11. “Providing the Necessary Semantic Context for Performance Tools and Autotuners”, Dagstuhl Seminar on “Auto-Tuning for HPC”, Dagstuhl, Wadern, October 2013.
  12. “LLNL/ASC's Activities and Requirements in Critical Technologies for Modeling and Simulation”, Invited/Motivational talk at the DOE Workshop on Modeling and Simulation, Seattle, WA, September 2013.
  13. “The Message Passing Interface: MPI 3.0 and the road to MPI 4.0”, Invited talk at the International Supercomputing Conference (ISC), Session on Programming Models and Tools, Leipzig, Germany, June 2013.
  14. “Providing More Intuitive Performance Analysis through Scalable Visualizations”, CHANGES Workshop, Juelich, Germany, September 2012.
  15. “Center for Exascale Simulation of Advanced Reactors (CESAR)”, Discovery 2015, HPC Workshop, Berkeley, CA, July 2012.
  16. “Tools and Tool Infrastructures for Co-Design”, Dagstuhl Perspectives Seminar on “Co-Design for Exascale”, Dagstuhl, Wadern, Germany, May 2012.
  17. “A Case for Modular and Intuitive Performance Analysis Tools”, SIAM Parallel Processing, Savannah, GA, February 2012.
  18. “A Case for More Intuitive Performance Analysis”, Conference on High Speed Computing, Salishan, OR, April 2011.
  19. “Performance Tool Efforts at LLNL and the NNSA Tri-Labs - From Research to Production”, JOWOG Workshop, Livermore, CA, May 2010.
  20. “Constructing Application Performance Models Using Neural Networks”, Dagstuhl Seminar “Code Instrumentation and Modeling for Parallel Performance Analysis”, Dagstuhl, Wadern, Germany, July 2008.
  21. “Automatic Model Generation for Performance Prediction”, Dagstuhl, Wadern, Germany, December 2005.
  22. “Distributed Shared Memory: Shared Memory für Cluster Umgebungen”, 2nd Meeting of the KONWIHR Working Group “Tools for Porting Applications to SMP Clusters”, Technische Universität München, München, Germany, December 2001.
  23. “Shared Memory programming on top of SCI”, Open SCI users workshop, Oslo, Norway, August 1999.
  24. “The SMiLE project (Shared Memory in a Lan-like Environment)”, Open SCI users workshop, Oslo, Norway, August 1999.

Invited Seminar Talks

  1. “Performance Analysis for the Exascale Era: From Measurements to Insights”, TU-Dresden, March 2016.
  2. “System Software for Power Limited HPC Systems: Challenges and Solutions”, University of Illinois at Urbana-Champaign, October 2015.
  3. “Performance Analysis for the Exascale Era: From Measurements to Insights”, TU-Darmstadt, July 2015.
  4. “Performance Analysis for the Exascale Era: Going Beyond Just Measuring Flop/s and Cache Misses”, University of Oregon, March 2015.
  5. “Performance Analysis for the Exascale Era: Going Beyond Just Measuring Flop/s and Cache Misses”, University of Oregon, Eugene, OR, March 2015.
  6. “Multi-Domain Performance Analysis”, IBM TJ Watson Research Center, NY, November 2014.
  7. “Performance Analysis Techniques for the Exascale Co-Design Process”, University of Tokyo, Japan, September 2014
  8. “Performance Analysis Techniques for the Exascale Co-Design Process”, Tokyo Institute of Technology, Japan, Sep. 2014
  9. “Performance Analysis Techniques for the Exascale Co-Design Process”, Kyushu University, Japan, September 2014
  10. “Performance Analysis Techniques for the Exascale Co-Design Process”, University of Arizona, Tucson, AZ, April 2014.
  11. “Performance Analysis Techniques for the Exascale Co-Design Process”, Barcelona Supercomputing Center, Barcelona, Spain, January 2014.
  12. “Providing More Intuitive Performance Analysis through Scalable Visualizations”, Virginia Tech, Blacksburg, VA, Feb. 2013
  13. “Providing More Intuitive Performance Analysis through Scalable Visualizations”, Purdue University, West Lafayette, IN, November 2012.
  14. “Fault Tolerance Techniques for HPC Simulations and for Scale”, Guest Lecture Purdue University, West Lafayette, IN, November 2012.
  15. “Tools and Techniques for Scalable Performance Analysis and Optimization”, National Center for Atmospheric Research (NCAR), October 2012.
  16. “Power and Energy Efficient HPC, A Challenge for the Entire System Stack”, Ludwig-Maximilians-Universität München, May 2012.
  17. “More Intuitive Performance Analysis”, Invited Presentation at the FORTH/University of Crete, Greece, September 2011.
  18. “More Intuitive Performance Analysis”, Invited Presentation at the Department of Energy, Office of Science, Germantown, MD, September 2011.
  19. “Performance and Optimization: A Case for more Modular and Intuitive Tools”, INT Exascale Workshop, Seattle, WA, June 2011.
  20. “How can Tools keep up with the Growing Size of HPC Systems?”, Invited Presentation at the Karlsruhe Institute of Technology (KIT), Karlsruhe, Germany, May 2010.
  21. “How can Tools Keep up with the Growing Size of HPC Systems?”, Invited Presentation at the Leibniz Rechenzentrum, München, January 2010.
  22. “Performance Modeling Techniques to Characterize and Optimize Scaling”, North Carolina State University, February 2009
  23. “Keeping up with Growing Machine Sizes: Challenges and Opportunities for Scaling Tools”, Jülich Supercomputing Center, September 2008.
  24. “Developing New Tool Strategies for Scalable HPC Systems”, North Carolina State University, May 2007.
  25. “Leading the Way to Ultrascale Computing, The BlueGene/L System Software Environment”, University of Linz, Austria, Department of Computer Science, December 2005.
  26. “Owl: Reconfigurable, Systemwide Monitoring”, HP, Palo Alto, March 2005.
  27. “Reconfigurable System-wide Monitoring - Laying the Foundations for Autonomous Systems”. Northwestern University, February 2004.
  28. “Reconfigurable System-wide Monitoring - Laying the Foundations for Autonomous Systems”. University of Connecticut, February 2004.
  29. “Reconfigurable System-wide Monitoring - Laying the Foundations for Autonomous Systems”. University of Pittsburgh, February 2004.
  30. “Owl: Rekonfigurierbares, Systemweites Monitoring – Grundstein fuer autonome Systeme”, Universität Karlsruhe, Germany, February 2004.
  31. “Adaptive Systems - Foundations and Opportunities”. Lawrence Livermore National Laboratory, January 2004.
  32. “HAMSTER: A framework for portable shared memory programming”, AT&T Research, NJ, USA, November 2002.
  33. “Shared Memory Programming on SCI-based Clusters”, Trinity College Dublin, Ireland (as part of a lecture series for final year students), February 2002.
  34. “HAMSTER: A Framework for Shared Memory Support in NUMA-based Cluster Environments”, Illinois Institute of Technology, Chicago, IL, USA, February 2002.
  35. “Cluster Computing mit SCI: Von der Hardware bis zur Anwendung“, Informatik Kolloquium, Johannes Kepler Universität Linz, Austria, December 2001.
  36. “HAMSTER: A Framework for Shared Memory Support in NUMA-based Cluster Environments”, Brown University, RI, USA, November 2001.
  37. “Shared Memory Programmierung im SMiLE Projekt: Das HAMSTER System“, Max-Planck-Institut für Neuropsychologische Forschung, Leipzig, Germany, September 2001.
  38. “DSM Softwarearchitekturen und Programmierumgebungen”, Technische Universität Chemnitz, Germany (as part of the lecture "Cluster and Grid-Computing"), June 2001.
  39. “Efficient Shared Memory Support in NUMA-based Cluster Environments”, IBM T.J. Watson Research Center, Yorktown Heights, NY, USA, June 2001.
  40. “Efficient Shared Memory Support in NUMA-based Cluster Environments”, Lawrence Livermore National Laboratory, CASC, Livermore, CA, USA, June 2001.
  41. “The SMiLE Project, An overview”, Held at CVUT Prague, Prague, Czech Republic, May 1998.

 

Selected Ongoing and Concluded Projects

The following is a selection of projects – ongoing and concluded – at Lawrence Livermore National Laboratory, Cornell University, Technische Universität München and University of Illinois at Urbana Champaign.

Projects at Lawrence Livermore National Laboratory

OMPT/D (2014-present)

Design and standardization of new tool interfaces for OpenMP: OMPT, a new interface for performance tools, and OMPD, a new interface for debuggers. These new interfaces will allow tools to be used across different OpenMP implementations, thereby enabling portability. Both interfaces are currently under consideration for inclusion into the OpenMP standard.

PRUNER (2013-present)

Tool set for reproducible debugging at scale. This project includes work on efficient message order replay of MPI programs, dynamic and static race detection in OpenMP codes as well as integration of reproducibility efforts into conventional debuggers.

Fault Tolerance (2011-present)

Efforts around fault characterization for large scale systems and applications as well as resiliency mechanisms in programming models. This work includes efforts on fault injection, error logging and analysis, application hardening, and extensions to the MPI interface for fault tolerance.

GREMLINs (2011-present)

System emulation layer enabling running applications under exascale conditions on petascale platforms. This emulation layer consists of a set of individual modules each targeted at restricting a particular resource (such as available memory, cache size, or power) or at injecting external events (such as faults or noise). I served as the project lead and principal designer within the ExMatEx Exascale Co-Design center.

Performance Analysis and Visualization for Exascale (PAVE) (2010-present)

New approaches and techniques to use visualization for gaining deeper insight into performance data, in particular for large scale scientific applications. The project centers on mapping performance data from the domain the data is collected in to other domains more intuitive to the user, such as the physical simulation domain or the user's data structures. PAVE, which I co-founded, forms the basis for many of the ongoing performance analysis projects at LLNL.

MPI Forum (2009-present)

Chair of the MPI Forum, the standardization body for the Message Passing Interface standard, since 2012 and active participant since 2009 in the role of tools working group chair. I had a leading role in the ratification of MPI 3.1 as well as the development and standardization of the MPI_T interface as part of MPI 3.0.

Performance Modeling (2008-present)

Various projects focusing on performance modeling for numerical algorithms, large scale parameter studies in scientific applications and architectural simulation, as well as HPC networks. This work included both building and using models, and also led to several tutorials presented at various international conferences.

Power-Constrained HPC (2006-present)

Design and implementation of runtime and scheduling techniques for HPC systems under a power bound. This work includes runtime systems for both energy reduction and power optimization, the study of upper bounds for runtime savings, the implementation of new power-aware scheduling systems as well as a wide range of characterization efforts of applications executed under a power bound.

STAT (2006-present)

Novel debugging tool enabling a quick overview of the state of users' parallel applications at large scales. This system uses a tree-based overlay network to quickly aggregate stack trace information from the entire application and provides an interactive GUI for the exploration of this information. This tool, which started as a student project and is now a production tool on all DOE/NNSA machines, as well as part of Cray's standard software stack, was used on test cases running at over one million MPI processes.

PnMPI (2006-present)

Tool layer for the virtualization and modularization of tools written for the MPI Profiling interface (PMPI). This layer, which enables users to transparently compose and customize new tools from existing PMPI tools, is used by several other tool sets to simplify their design and maintenance, as well as to easily share base components. It also forms the basis for current discussions in the MPI Forum on how to redesign the MPI Profiling Interface.

Open|SpeedShop (2004-present)

Comprehensive, easy-to-use performance tool set developed by the Krell Institute with funding and guidance from DOE/NNSA. Open|SpeedShop is deployed on all DOE/NNSA systems as well as many other HPC centers and provides users a simple way to gather performance data without the need for explicit application instrumentation. I serve as the DOE lead coordinator for this long-term effort.

Hardware Transactional Memory (2011-2013)

Exploration of new Transactional Memory (TM) functionality available on the BG/Q systems, in particular the Sequoia and Vulcan systems at LLNL, as well as investigation of the impact of TM on applications.

MPI_T (2009-2012)

New tools interface designed for and included in the MPI 3.0 standard. This interface allows tools to gain portable access to MPI internal information as well as to dynamically configure MPI. In addition, the project included close work with MPI library writers on making the interface available in MPI implementations as well as the design of a first set of tools for the new interface.

ScalaTrace (2006-2010)

New techniques for automatic compression of MPI traces. ScalaTrace works by detecting regular patterns and storing only the information necessary to recreate the patterns. This work led to a best paper award in 2007.

Critical Path Analysis (2005-2010)

Identification of the critical path in applications. This work was used to understand and subsequently optimize performance of parallel applications.

 

Projects at Cornell University

Coherence Mechanisms for Aliased Memories (2002-2004)

Design and evaluation of coherence schemes for systems with memory aliases. I worked closely with colleagues at LLNL to provide coherence support for novel high performance memory systems.

Owl: System Monitoring and Performance Evaluation (2001-2005)

Design and evaluation of a flexible system-wide monitoring framework with initial studies focusing on monitoring of cache activities.

SimSnap: Combining Native Execution and Architectural Simulation (2003-2005)

Techniques to speed up architectural simulation thereby enabling the use of realistic workloads.

Application-level Fault Tolerance / Cornell Checkpointing Compiler (2003-2004)

Hybrid scheme based on compiler technology and run-time mechanism to provide transparent application level fault tolerance for parallel applications. I developed software and participated in the extension of existing techniques for shared memory environments.

 

Projects at Technische Universität München

Relaxed Hardware Coherence for NUMA architectures (1999-2003)

Exploration of relaxed memory consistency schemes in NUMA architectures. This project was inspired by the observation that current architectures often impose overly strict memory coherence schemes and thereby cause unnecessary memory-update traffic. I initiated and managed the project called HAMSTER: Hybrid-dsm based Adaptive and Modular Shared memory archiTEctuRe. As part of the SMiLE project and as a continuation of the efforts in SISCI, this project developed a shared memory framework that can be retargeted to arbitrary shared memory programs and abstractions independent of the architecture.

DIOM: Distributed I/O Management (2000-2002)

Investigation of efficient parallel I/O for data intensive applications on commodity clusters. I implemented a prototype of an I/O management framework and applied it to a medical imaging application.

SMiLE: Shared Memory in a Lan-like Environment (1997-2002)

Implementation and exploitation of SCI (Scalable Coherent Interface) based clusters. My work included both the design of hardware components and extensive software development efforts as well as various administrative and strategic tasks.

NEPHEW: Network of PCs Heterogeneous Windows-NT Engineering Toolset (1999-2000)

Port of a graphical parallel programming package to cluster environments and its evaluation using three real-world applications, including the iterative reconstruction of Positron Emission Tomography images. I developed software, managed the project, and performed extensive dissemination. This ESPRIT funded EU project was done in cooperation with partners in four European countries.

SISCI: Standard software Infrastructure for SCI-based parallel systems (1997-1999)

Design, implementation, and testing of software infrastructure for SCI (Scalable Coherent Interface) based commodity clusters. My tasks included software development, project management, and extensive dissemination. This ESPRIT funded EU project was done in cooperation with seven partners in four European countries.

 

Projects at University of Illinois

MORPH: Configurable Computing for Petaflops (1996)

Design and evaluation of a then next-generation Petaflop architecture using reconfigurable logic. I conducted the initial design studies and performed evaluation using simulation.

Illinois Concert (1995-1996)

High-level and object-oriented parallel programming environment deploying both compiler technology and efficient run-time mechanisms. I conducted a large application study (parallel volume rendering using surface extraction) for the Illinois Concert C++ system.