The trade-off between energy consumption and performance is a common design choice in modern systems. Increasing performance almost always increases the energy requirements. As a result, it is misleading to compare designs solely on the basis of either energy or performance. This dilemma is even more significant for the real-time embedded perception applications that are the driving force for this work. Processing faster than real time simply means that power is being wasted. A common tactic in such cases is therefore to reduce the clock frequency, the supply voltage, or both. The fine-grain scheduling capability of the perception processor also enables the work rate to be scheduled, which is a more intuitive mechanism and achieves results similar to clock frequency scaling.
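To make the voltage and frequency argument concrete, the following is a minimal sketch based on the standard first-order CMOS dynamic power model, P = C·V²·f. It is not taken from this work: the function names and all numeric values (effective capacitance, supply voltages, clock frequencies, cycles per frame) are hypothetical, and leakage power is ignored.

```python
def dynamic_power(c_eff, vdd, freq_hz):
    """First-order CMOS dynamic power in watts: P = C * V^2 * f."""
    return c_eff * vdd ** 2 * freq_hz


def dynamic_energy_per_frame(c_eff, vdd, cycles_per_frame):
    """Dynamic energy in joules to process one frame: E = C * V^2 * cycles."""
    return c_eff * vdd ** 2 * cycles_per_frame


# Hypothetical design point: every value below is an assumption for illustration.
C_EFF = 1e-9            # effective switched capacitance, farads
CYCLES_PER_FRAME = 5e6  # clock cycles of work per input frame

# Power at full speed, with the clock halved, and with clock and voltage reduced.
full_speed = dynamic_power(C_EFF, vdd=1.2, freq_hz=1.0e9)
freq_scaled = dynamic_power(C_EFF, vdd=1.2, freq_hz=0.5e9)
volt_and_freq_scaled = dynamic_power(C_EFF, vdd=0.9, freq_hz=0.5e9)

# Dynamic energy per frame does not depend on the clock frequency in this model.
energy_nominal_vdd = dynamic_energy_per_frame(C_EFF, 1.2, CYCLES_PER_FRAME)
energy_reduced_vdd = dynamic_energy_per_frame(C_EFF, 0.9, CYCLES_PER_FRAME)
```

Under this simplified model, halving the clock frequency halves the power but leaves the dynamic energy per frame unchanged, whereas lowering the supply voltage reduces both.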
An attractive and intuitive metric is to compare designs based on the energy expended to perform work at some rate. Gonzalez and Horowitz showed that performance squared per watt (SPEC²/W), or its inverse, the energy delay product, is a good metric of architectural merit. Both architecture and semiconductor process influence the energy delay product. Since the feature size of the process, λ, has such a large impact, it is necessary to normalize any design comparison to the same process. The normalization techniques applied to the results were described in Section 3.3.
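The relationship between energy, delay and the energy delay product can be illustrated with a short sketch. The two design points and all function names below are hypothetical and are not measurements from this work. The example assumes that the energy per packet is the average power multiplied by the time per packet, and that the energy delay product is that energy multiplied by the same time.

```python
def energy_per_packet(power_watts, seconds_per_packet):
    """Energy in joules spent on one input packet: E = P * t."""
    return power_watts * seconds_per_packet


def energy_delay_product(power_watts, seconds_per_packet):
    """Energy delay product in joule-seconds: E * t = P * t^2."""
    return energy_per_packet(power_watts, seconds_per_packet) * seconds_per_packet


def inverse_energy_delay_product(power_watts, seconds_per_packet):
    """Inverse of the energy delay product, written as throughput^2 / power."""
    throughput = 1.0 / seconds_per_packet
    return throughput ** 2 / power_watts


# Two hypothetical designs: A is fast and power hungry, B is slow and frugal.
design_a = {"power_watts": 1.0, "seconds_per_packet": 0.010}
design_b = {"power_watts": 0.5, "seconds_per_packet": 0.020}

energy_a = energy_per_packet(**design_a)
energy_b = energy_per_packet(**design_b)
edp_a = energy_delay_product(**design_a)
edp_b = energy_delay_product(**design_b)
```

In this made-up case both designs spend the same energy per packet, yet design A has half the energy delay product of design B because it finishes in half the time. A comparison on energy alone would hide that difference.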
The perception processor and the Pentium 4 are both implemented in 0.13 µm CMOS technology, so their results need not be normalized. The XScale and the custom ASICs are implemented in 0.18 µm and 0.25 µm technologies respectively, and their results are normalized to a 0.13 µm technology using this method. The metrics used for evaluating the perception processor are: IPC, power, throughput, energy consumed to process each input packet, energy delay product and .