3.1 Dynamic Power Consumption

To understand how architectural strategies can provide high performance for perception applications at low power levels, it is necessary to look at the CMOS circuit dynamic power consumption equation:

\begin{displaymath}
P=ACV^{2}F
\end{displaymath} (3.1)

$P$ is the power consumed, $A$ is the activity factor, i.e., the fraction of the circuit that is switching, $C$ is the switched capacitance, $V$ is the supply voltage, and $F$ is the clock frequency [109]. If a capacitance $C$ is charged and discharged by a clock signal of frequency $F$ and peak voltage $V$, then the charge moved per cycle is $CV$ and the charge moved per second is $CVF$. Since the charge packet is delivered at voltage $V$, the energy dissipated per cycle is $CV^{2}$ and the energy dissipated per second, i.e., the power, is $CV^{2}F$. The data power for a clocked flip-flop, which can toggle at most once per cycle, will be $\frac{1}{2}CV^{2}F$. When capacitances are clock gated or when flip-flops do not toggle every cycle, their power consumption will be lower. Hence, a constant called the activity factor ($0\le A\le1$) is used to model the average switching activity in the circuit. Equation 3.1 is derived by incorporating this term into the power consumption.

Custom ASICs can drastically reduce power consumption by using specialized circuit structures and concurrency to lower $C$ and $F$ respectively. The drawback is that custom ASICs are inflexible: once fabricated, they cannot be reprogrammed. Their high production costs and long design times also often make them an unattractive choice. While programmable perception processors are more desirable than ASICs, ASICs still represent the ``gold standard'' against which perception processors should be compared. This is because the specialized nature of an ASIC gives it significant power, performance and die area advantages over a general-purpose processor. ASICs therefore represent the best possible implementation of a particular algorithm for a given CMOS technology.
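As a numerical illustration of Equation 3.1, the following minimal Python sketch evaluates the dynamic power at one operating point and then again with both $C$ and $F$ halved, which is the kind of reduction attributed to ASICs above. The function name `dynamic_power` and all numeric values are hypothetical, chosen only to make the arithmetic easy to follow; they are not measurements from any real circuit.

```python
def dynamic_power(activity, capacitance_f, voltage_v, frequency_hz):
    """Return the dynamic power P = A * C * V^2 * F in watts (Equation 3.1)."""
    if not 0.0 <= activity <= 1.0:
        raise ValueError("activity factor A must lie in [0, 1]")
    return activity * capacitance_f * voltage_v ** 2 * frequency_hz


# Hypothetical operating point: A = 0.5, C = 1 nF, V = 1 V, F = 1 GHz.
baseline = dynamic_power(activity=0.5, capacitance_f=1e-9,
                         voltage_v=1.0, frequency_hz=1e9)

# Same A and V, but with C and F each halved (illustrative values only).
reduced = dynamic_power(activity=0.5, capacitance_f=0.5e-9,
                        voltage_v=1.0, frequency_hz=0.5e9)

print(f"baseline: {baseline:.3f} W, reduced: {reduced:.3f} W")
print(f"power ratio: {baseline / reduced:.1f}x")
```

Because $P$ is linear in both $C$ and $F$, halving each of them at a fixed $A$ and $V$ reduces the power by a factor of four in this sketch.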

Assume that an application is required to perform $N$ operations every $t$ seconds to keep up with real time. Then it should be the case that:

\begin{displaymath}
\frac{N}{IPC_{avg}\times F}\leq t
\end{displaymath}

$IPC_{avg}$ refers to the average number of instructions issued per cycle across the whole application. Further, when $\frac{N}{IPC_{avg}\times F}<t$, the processor has too much performance, i.e., its frequency is too high and it wastes power. When handling constant rate real-time workloads, it is not useful to finish the work early and power down the circuit until the next real-time deadline. The overhead of reloading the state-holding data memories and the instruction memory may be in the range of several thousand cycles. It is better to slow down the processor so that it has just enough performance to meet real-time deadlines than to pay the reload penalty tens or hundreds of times per second, depending on the nature of the constant rate workload. Thus the ideal frequency of operation is:

\begin{displaymath}
F_{ideal}=\frac{N}{IPC_{avg}\times t}
\end{displaymath} (3.2)

Substituting this back in the power equation we get:
\begin{displaymath}
P=ACV^{2}\frac{N}{IPC_{avg}\times t}
\end{displaymath} (3.3)
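Equations 3.2 and 3.3 can be checked with a short Python sketch. It computes the ideal frequency for a constant rate workload and the resulting dynamic power, then repeats the calculation with a doubled $IPC_{avg}$. The function names and the workload numbers are hypothetical. Holding $A$, $C$ and $V$ fixed while $IPC_{avg}$ changes is also a simplifying assumption, since a wider machine would normally switch more capacitance.

```python
def ideal_frequency(num_ops, ipc_avg, deadline_s):
    """Return F_ideal = N / (IPC_avg * t) in hertz (Equation 3.2)."""
    if ipc_avg <= 0 or deadline_s <= 0:
        raise ValueError("IPC_avg and t must be positive")
    return num_ops / (ipc_avg * deadline_s)


def power_at_ideal_frequency(activity, capacitance_f, voltage_v,
                             num_ops, ipc_avg, deadline_s):
    """Return P = A * C * V^2 * N / (IPC_avg * t) in watts (Equation 3.3)."""
    f_ideal = ideal_frequency(num_ops, ipc_avg, deadline_s)
    return activity * capacitance_f * voltage_v ** 2 * f_ideal


# Hypothetical workload: N = 10 million operations every t = 10 ms.
N_OPS = 10_000_000
DEADLINE_S = 0.010

# Hypothetical circuit parameters: A = 0.5, C = 1 nF, V = 1 V.
A, C, V = 0.5, 1e-9, 1.0

for ipc in (4.0, 8.0):
    f_ideal = ideal_frequency(N_OPS, ipc, DEADLINE_S)
    power = power_at_ideal_frequency(A, C, V, N_OPS, ipc, DEADLINE_S)
    print(f"IPC_avg = {ipc:.0f}: F_ideal = {f_ideal / 1e6:.1f} MHz, "
          f"P = {power * 1e3:.2f} mW")
```

Under these assumptions, doubling $IPC_{avg}$ halves both the ideal frequency and the dynamic power, since $IPC_{avg}$ appears only in the denominator of Equation 3.3.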

Binu Mathew