| Patent Number |
Title Of Patent |
Date Issued |
| 7555607 |
Program thread syncronization for instruction cachelines |
June 30, 2009 |
| In a method of and system for program thread synchronization, an instruction cache line is determined for each of a plurality of program threads to be synchronized. For each processor executing one or more of the threads to be synchronized, execution of the thread is halted at a barrier |
| 7296136 |
Methods and systems for loading data from memory |
November 13, 2007 |
| According to an exemplary embodiment of the present invention, a method for loading data from at least one memory device includes the steps of loading a first value from a first memory location of the at least one memory device, determining a second memory location based on the first |
| 7194609 |
Branch reconfigurable systems and methods |
March 20, 2007 |
| The invention is a system and method for executing programs. The invention involves a plurality of processing elements, wherein a processing element of the plurality of processing elements generates a branch command. The invention uses a programmable network that transports the branch co |
| 7146480 |
Configurable memory system |
December 5, 2006 |
| A configurable memory system is disclosed, which includes a processor-to-memory network, a memory-to-processor network, and a plurality of memory modules. Both networks in turn include a plurality of transport cells that can be configured to implement various transport networks, one |
| 7086038 |
System and method for creating systolic solvers |
August 1, 2006 |
| One embodiment of the invention is a method for forming a solver for a loop nest of code, the method comprising forming a time and space mapping of a portion of the loop nest, performing at least one optimization that is dependent on the time and space mapping to the portion of the loop |
| 7000091 |
System and method for independent branching in systems with plural processing elements |
February 14, 2006 |
| The invention is a system and method for executing a program that comprises a plurality of basic blocks on a computer system that comprises a plurality of processing elements. The invention generates a branch instruction by one processing element of the plurality of processing elemen |
| 6993639 |
Processing instruction addressed by received remote instruction and generating remote instructio |
January 31, 2006 |
| Embodiments of the invention relate to a processing cell for use in computing systems. Generally, a processing cell generates remote instructions to be received and processed by at least one other processing cell. A processing cell may include a program counter, an instruction memory |
| 6952816 |
Methods and apparatus for digital circuit design generation |
October 4, 2005 |
| A technique for synthesizing digital circuit designs by incorporating timing convergence and routability considerations. In one aspect, the invention provides a system and programmatic method for generating a circuit design from a functional specification according to at least one de |
| 6651222 |
Automatic design of VLIW processors |
November 18, 2003 |
| A VLIW processor design system automates the design of programmable and non-programmable VLIW processors. The system takes as input an opcode repertoire, the I/O format of the opcodes, a register file specification, and instruction-level parallelism constraints. With this input speci |
| 6581187 |
Automatic design of VLIW processors |
June 17, 2003 |
| A VLIW processor design system automates the design of programmable and non-programmable VLIW processors. The system takes as input an opcode repertoire, the I/O format of the opcodes, a register file specification, and instruction-level parallelism constraints. With this input speci |
| 6457173 |
Automatic design of VLIW instruction formats |
September 24, 2002 |
| A computer-implemented method automates the design of efficient binary instruction encodings of VLIW instruction formats. The method automatically finds compact instruction formats that can express and exploit the full parallelism specified in the underlying processor microarchitectu |
| 6408428 |
Automated design of processor systems using feedback from internal measurements of candidate sys |
June 18, 2002 |
| An automated design system for VLIW processors explores a parameterized design space to assist in identifying candidate processor designs that satisfy desired design constraints, such as processor cost and performance. A VLIW synthesis process takes as input a specification of proces |
| 6385757 |
Auto design of VLIW processors |
May 7, 2002 |
| A VLIW processor design system automates the design of programmable and non-programmable VLIW processors. The system takes as input an opcode repertoire, the I/O format of the opcodes, a register file specification, and instruction-level parallelism constraints. With this input speci |
| 5999738 |
Flexible scheduling of non-speculative instructions |
December 7, 1999 |
| A technique for flexible scheduling of a code sequence wherein a set of instructions for determining a fully-resolved predicate for each of a set of non-speculative instructions contained in the code sequence is generated. An optimized code sequence is then generated that includes the |
| 5920716 |
Compiling a predicated code with direct analysis of the predicated code |
July 6, 1999 |
| A compiler of a predicated code includes a data flow analysis system that manipulates and queries predicate expressions of the predicated code to (1) analyze data flow properties of the predicated code and (2) annotate the predicated code with the analyzed data flow properties. A pre |
| 5850553 |
Reducing the number of executed branch instructions in a code sequence |
December 15, 1998 |
| A compiler technique for reducing the number of executed branches in a code sequence. Multiple conditional branch instructions in a program sequence are replaced with a single combined conditional branch instruction, thereby eliminating the time-consuming execution of multiple branch instru |
| 5778219 |
Method and system for propagating exception status in data registers and for detecting exception |
July 7, 1998 |
| A method for supporting speculative execution includes designating operations as speculative or non-speculative, and then deferring exceptions generated by speculative operations while immediately reporting exceptions by non-speculative operations. If a speculative operation uses a r |
| 5710912 |
Method and apparatus for enabling a computer system to adjust for latency assumptions |
January 20, 1998 |
| A method and system are disclosed which allow a computer program to execute properly in object code compatible processing systems which have latencies different from those with which the program was created or compiled. This resulting compatibility of the computer program is achieved bec |
| 5692169 |
Method and system for deferring exceptions generated during speculative execution |
November 25, 1997 |
| A method for supporting speculative execution includes designating operations as speculative or non-speculative, and then deferring exceptions generated by speculative operations while immediately reporting exceptions by non-speculative operations. If a speculative operation uses a r |
| 5664135 |
Apparatus and method for reducing delays due to branches |
September 2, 1997 |
| An improved computer architecture and instruction set that reduces the delays produced by branch instructions. The invention utilizes a branch processor having a branch memory for storing information specifying a plurality of branch instructions that are contained in a code sequence. |
| 5615386 |
Computer architecture for reducing delays due to branch instructions |
March 25, 1997 |
| An improved data processing system for executing branch instructions which has lower latency times and which only rarely requires the instruction pipeline to be flushed is disclosed. The data processing system utilizes a register file to hold the information needed to execute a branch |
| 5404484 |
Cache system for reducing memory latency times |
April 4, 1995 |
| The improved cache system reduces the effects of latency times by utilizing a preload instruction inserted by the compiler into the code. The preload instruction is sent sufficiently in advance of the corresponding load instruction to guarantee that the relevant data is in the cache memo |