
4.4 Instruction Dispersal and Issue

A traditional VLIW with fixed-width MultiOps has no need to disperse operations. However, with a compressed format like Defoe's, the operations must be expanded, and NOPs inserted for function units to which no operation is to be issued. To make the dispersal task easy, we make the following assumptions:
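The expansion step itself can be sketched independently of Defoe's actual encoding. The Python below is a minimal illustration under assumptions that are purely hypothetical and are not the ones this section refers to: each compressed operation names its target function unit through a `unit` index, a `stop` flag marks the last operation of a MultiOp, and an uncompressed MultiOp has 6 slots (the width implied by the figure of 48 operations in 8 MultiOps given in this section). The names `Op`, `disperse` and `NUM_UNITS` are invented for the example.

```python
from typing import List, NamedTuple, Optional, Tuple

NUM_UNITS = 6   # assumed MultiOp width: 48 operations / 8 MultiOps
NOP = "nop"


class Op(NamedTuple):
    unit: int     # hypothetical field: index of the target function unit
    opcode: str   # stands in for the full operation encoding
    stop: bool    # hypothetical field: True on the last op of a MultiOp


def disperse(stream: List[Op], start: int = 0) -> Tuple[List[str], int]:
    """Expand the compressed MultiOp that begins at stream[start].

    Returns the fixed-width MultiOp (one entry per function unit, with
    NOPs in unused slots) and the index of the next MultiOp in the stream.
    """
    slots: List[Optional[str]] = [None] * NUM_UNITS
    i = start
    while i < len(stream):
        op = stream[i]
        if not 0 <= op.unit < NUM_UNITS:
            raise ValueError(f"no function unit {op.unit}")
        if slots[op.unit] is not None:
            raise ValueError(f"two operations target unit {op.unit}")
        slots[op.unit] = op.opcode
        i += 1
        if op.stop:
            # Fill every slot that received no operation with a NOP.
            return [s if s is not None else NOP for s in slots], i
    raise ValueError("stream ended before a stop flag was seen")


# Two compressed MultiOps: {ld, add} followed by {br}.
stream = [Op(0, "ld", False), Op(3, "add", True), Op(5, "br", True)]
first, nxt = disperse(stream)
second, end = disperse(stream, nxt)
```

Here `first` is `["ld", "nop", "nop", "add", "nop", "nop"]` and `second` is five NOPs followed by `"br"`: three operations in the compressed stream become two full-width MultiOps after dispersal.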

Apart from wasting less memory, a compressed-format VLIW has another advantage over an uncompressed one: better I-cache utilization. To improve performance, we use a predecode buffer that can hold up to 8 uncompressed MultiOps. The dispersal network can use a wide interface (say 512 bits) to the I-cache to uncompress up to 2 MultiOps every cycle and save them in the predecode buffer. Small loops of up to 8 MultiOps (a maximum of 48 operations) will hit repeatedly in the predecode buffer. By sparing such loops repeated I-cache accesses and expansion, the predecode buffer may also help lower the power consumption of a low-power VLIW processor.

Defoe supports in-order issue and out-of-order completion. Further, all the operations in a MultiOp are issued simultaneously. If even one operation cannot be issued, issue of the whole MultiOp stalls.
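The behaviour of the predecode buffer on small loops, and the all-or-nothing issue rule, can be illustrated with a small model. This is a rough sketch rather than a description of the Defoe hardware: the text does not say how the buffer is indexed or which entry is replaced when it is full, so the model assumes entries are keyed by MultiOp address and replaced in FIFO order. `PredecodeBuffer`, `run_loop` and `can_issue` are names invented for the example, and the `expand` callback stands in for the dispersal network.

```python
from collections import OrderedDict

BUFFER_CAPACITY = 8   # uncompressed MultiOps, as stated in the text
NUM_UNITS = 6         # 48 operations / 8 MultiOps


class PredecodeBuffer:
    """Holds already-expanded MultiOps, keyed by address (an assumption)."""

    def __init__(self, capacity=BUFFER_CAPACITY):
        self.capacity = capacity
        self.entries = OrderedDict()   # address -> expanded MultiOp
        self.hits = 0
        self.misses = 0

    def fetch(self, address, expand):
        if address in self.entries:
            self.hits += 1
            return self.entries[address]
        self.misses += 1
        if len(self.entries) >= self.capacity:
            self.entries.popitem(last=False)   # assumed FIFO replacement
        self.entries[address] = expand(address)
        return self.entries[address]


def run_loop(loop_len, iterations):
    """Fetch a loop of loop_len MultiOps repeatedly; return (hits, misses)."""
    buf = PredecodeBuffer()

    def expand(address):
        # Placeholder for reading the I-cache and dispersing the MultiOp.
        return ["nop"] * NUM_UNITS

    for _ in range(iterations):
        for address in range(loop_len):
            buf.fetch(address, expand)
    return buf.hits, buf.misses


def can_issue(multiop, unit_free):
    """All-or-nothing issue: every non-NOP slot needs a free function unit."""
    return all(op == "nop" or free for op, free in zip(multiop, unit_free))


fits = run_loop(8, 10)        # loop fits in the buffer
too_big = run_loop(9, 10)     # loop is one MultiOp too long

mop = ["ld", "nop", "add", "nop", "nop", "nop"]
ok = can_issue(mop, [True, False, True, True, True, True])
stalled = can_issue(mop, [True, True, False, True, True, True])
```

With these assumptions, ten iterations of an 8-MultiOp loop miss only on the first pass (`fits` is `(72, 8)`), while a 9-MultiOp loop under FIFO replacement never hits (`too_big` is `(0, 90)`). `ok` is `True` because the only busy unit corresponds to a NOP slot; `stalled` is `False` because one busy unit is enough to hold back the whole MultiOp.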


Binu K. Mathew