In a method for scheduling instructions executed in a computer system including a processor and a memory subsystem, pipeline latencies and resource utilization are measured by sampling hardware while the instructions are executing. The instructions are then scheduled according to the
A method is provided for optimizing a program by inserting memory prefetch operations in the program executing in a computer system. The computer system includes a processor and a memory. Latencies of instructions of the program are measured by hardware while the instructions are process