Perflint: A Context Sensitive Performance Advisor for C++ Programs, 2009 International Symposium on Code Generation and Optimization, 2009. ,
DOI : 10.1109/CGO.2009.36
Binary analysis for measurement and attribution of program performance, PLDI, 2009. ,
Identifying potential parallelism via loopcentric profiling, Proceedings of the 2007 International Conference on Computing Frontiers, 2007. ,
DOI : 10.1145/1242531.1242554
Visualizing potential parallelism in sequential programs, Proceedings of the 17th international conference on Parallel architectures and compilation techniques, PACT '08, 2008. ,
DOI : 10.1145/1454115.1454129
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.142.5696
Benchmark health considered harmful, SIGARCH Computer Architecture News, 2001. ,
DOI : 10.1145/503205.503206
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.32.6730
OptiScope: Performance Accountability for Optimizing Compilers, 2009 International Symposium on Code Generation and Optimization, 2009. ,
DOI : 10.1109/CGO.2009.26
Producing wrong data without doing anything obviously wrong, ASPLOS, 2009. ,
DOI : 10.1145/1508284.1508275
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.163.8395
Dynamic run-time architecture techniques for enabling continuous optimization, Proceedings of the 2nd conference on Computing frontiers , CF '05, 2005. ,
DOI : 10.1145/1062261.1062296
Blind Optimization for Exploiting Hardware Features, Conference on Compiler Construction, 2009. ,
DOI : 10.1145/268424.268469
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.160.1442
Fast, automatic, procedure-level performance tuning, Proceedings of the 15th international conference on Parallel architectures and compilation techniques , PACT '06, pp.173-181, 2006. ,
DOI : 10.1145/1152154.1152182
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.77.4089
Automatically Tuned Linear Algebra Software, Proceedings of the IEEE/ACM SC98 Conference, 1998. ,
DOI : 10.1109/SC.1998.10004
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.108.3487
Confessions of a performance monitor hardware designer, " in Workshop on Hardware Performance Monitor Design and Functionality colocated with HPCA Amd lightweight profiling specification, 2005. ,
Continuous profiling: where have all the cycles gone, SOSP '97: Proceedings of the sixteenth ACM symposium on Operating systems principles, pp.1-14, 1997. ,
DOI : 10.1145/265924.265925
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.124.5957
Profileme : Hardware support for instruction-level profiling on out-of-order processors Instruction-based sampling: A new performance analysis technique for amd family 10h processors, International Symposium on Microarchitecture, pp.292-302, 1997. ,
Refining performance monitor design, Proceedings of the 2004 Workshop on Complexity Effective Design (WCED), 2004. ,
Automatic performance model construction for the fast software exploration of new hardware designs, Proceedings of the 2006 international conference on Compilers, architecture and synthesis for embedded systems , CASES '06, 2006. ,
DOI : 10.1145/1176760.1176765
Performance monitoring hardware will always be a low priority, second class feature in processor designs until, Workshop on Hardware Performance Monitor Design and Functionality colocated with HPCA, 2005. ,
Methods for modeling resource contention on simultaneous multithreading processors, 2005 International Conference on Computer Design, 2005. ,
DOI : 10.1109/ICCD.2005.74
URL : http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.470.4203&rep=rep1&type=pdf
Using Model Trees for Computer Architecture Performance Analysis of Software Applications, 2007 IEEE International Symposium on Performance Analysis of Systems & Software, 2007. ,
DOI : 10.1109/ISPASS.2007.363742
A general compiler framework for speculative optimizations using data speculative code motion, CGO '05: Proceedings of the international symposium on Code generation and optimization, 2005. ,
Hardware performance counters for detailed runtime power and thermal estimations: Experiences and proposals, Workshop on Hardware Performance Monitor Design and Functionality colocated with HPCA, 2005. ,
Adaptive thread scheduling for simultaneous multithreading processors, Boulder, CO, 2006. ,
Analyis of Path Profiling Information Generated with Performance Monitoring Hardware, 9th Annual Workshop on Interaction between Compilers and Computer Architectures (INTERACT'05), pp.34-43, 2005. ,
DOI : 10.1109/INTERACT.2005.3
Learning and leveraging the relationship between architecture-level measurements and individual user satisfaction, ISCA, 2008. ,
DOI : 10.1145/1394608.1382158
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.144.2358
What we need to be able to count to tune programs, Workshop on Hardware Performance Monitor Design and Functionality colocated with HPCA, 2005. ,
Efficient collection of information on the locality of accesses, Workshop on Hardware Performance Monitor Design and Functionality colocated with HPCA, 2005. ,
The NUMA challenge, Workshop on Hardware Performance Monitor Design and Functionality colocated with HPCA, 2005. ,
Us patent no. 5953530. method and apparatus for run-time memory access checking and memory leak detection ,
Hardware-Based Profiling: An Effective Technique for Profile-Driven Optimization, International Journal of Parallel Programming, vol.7, issue.7, 1996. ,
DOI : 10.4135/9781412985451
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.40.9370
Interaction cost and shotgun profiling, ACM Transactions on Architecture and Code Optimization, vol.1, issue.3, 2004. ,
DOI : 10.1145/1022969.1022971
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.70.1364
A programmable co-processor for profiling, Proceedings HPCA Seventh International Symposium on High-Performance Computer Architecture, 2001. ,
DOI : 10.1109/HPCA.2001.903267
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.26.786
Can hardware performance counters be trusted? " in IISWC, 2008. ,
DOI : 10.1109/iiswc.2008.4636099
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.141.1880
Performance monitoring with papi using the performance application programming interface, Dr. Dobb's, 2005. ,
Towards a flexible and realistic hardware performance monitor infrastructure, Workshop on Hardware Performance Monitor Design and Functionality colocated with HPCA, 2005. ,
Managing The Complexity Of Performance Monitoring Hardware: The Brink Andabyss Approach, The International Journal of High Performance Computing Applications, vol.22, issue.4, 2006. ,
DOI : 10.1109/MM.2002.1028477
Memory performance and cache coherency effects on an intel nehalem multiprocessor system ,
Performance hardware if i ran the world, Workshop on Hardware Performance Monitor Design and Functionality colocated with HPCA, 2005. ,
Complementing Missing and Inaccurate Profiling Using a Minimum Cost Circulation Algorithm, HiPEAC, 2008. ,
DOI : 10.1007/978-3-540-77560-7_20
Inferred call path profiling, OOPSLA, 2009. ,
Pin: building customized program analysis tools with dynamic instrumentation, PLDI '05: Proceedings of the 2005 ACM SIGPLAN conference on Programming language design and implementation, pp.190-200, 2005. ,
Valgrind: A framework for heavyweight dynamic binary instrumentation, Proceedings of ACM SIGPLAN 2007 Conference on Programming Language Design and Implementation (PLDI 2007), 2007. ,
Efficient, transparent, and comprehensive runtime code manipulation, 2004. ,
Simics: A full system simulation platform, Computer, vol.35, issue.2, 2002. ,
DOI : 10.1109/2.982916
Microarchitecture-Independent Workload Characterization, IEEE Micro, vol.27, issue.3, pp.63-72, 2007. ,
DOI : 10.1109/MM.2007.56
System support for automated profiling and optimization, SOSP, 1997. ,
DOI : 10.1145/269005.266640
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.137.2900
Bursty tracing: A framework for low-overhead temporal profiling, 4th ACM Workshop on Feedback-Directed and Dynamic Optimization, 2001. ,
A framework for reducing the cost of instrumented code, SIGPLAN Conference on Programming Language Design and Implementation, pp.168-179, 2001. ,
Shadow Profiling: Hiding Instrumentation Costs with Parallelism, International Symposium on Code Generation and Optimization (CGO'07), 2007. ,
DOI : 10.1109/CGO.2007.35
Performance prediction based on inherent program similarity, Proceedings of the 15th international conference on Parallel architectures and compilation techniques , PACT '06, 2006. ,
DOI : 10.1145/1152154.1152174
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.109.2528
Heap profiling for space-efficient java, PLDI '01: Proceedings of the ACM SIGPLAN 2001 conference on Programming language design and implementation, 2001. ,
DOI : 10.1145/381694.378820
Exploring application performance: a new tool for a static/dynamic approach, Los Alamos Computer Science Institute Symp, 2005. ,
URL : https://hal.archives-ouvertes.fr/hal-00141071
Finding parallelism for future epic machines, Proceedings of the Fourth Workshop on Explicitly Parallel Instruction Computer Architectures and Compiler Technology (EPIC), 2005. ,
A fast and accurate method for determining a lower bound on execution time, Concurrency: Practice and Experience, pp.271-292, 2004. ,
DOI : 10.1002/cpe.774