The Stream Virtual Machine (SVM) is an abstract machine model proposed by Labonte et al. to represent the important characteristics of stream architectures and to support techniques for compiling applications and analyzing their performance across different implementation architectures. Their compilation technique proceeds in two stages. First, a high-level compiler (HLC) reads a stream application written in a stream programming language such as StreamIt or ArrayC, together with an abstract SVM model of a stream architecture such as the MIT RAW machine or Stanford Imagine. The HLC uses the abstract machine model to partition the application into kernels that will execute on particular processing resources and into data transfers between the kernels. This mapping is described in terms of functions available in the SVM API, which provide for initializing local memory, scheduling kernels for execution, declaring dependences between kernels, coordinating DMA transfers between different units, and so on. Second, the kernels are compiled into binary form by a low-level compiler (LLC) that is specific to the particular architecture.
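To make the first stage concrete, the following is a minimal sketch of how an HLC-produced mapping might be recorded as a set of kernel launches and DMA transfers with declared dependences. It is not the actual SVM API: the class `SvmProgram`, its methods, and the resource names (`main_memory`, `srf0`, `kernel_proc0`) are hypothetical and chosen only for illustration. The `schedule` method derives one dependence-respecting order with a standard topological sort.

```python
from collections import defaultdict, deque


class SvmProgram:
    """Hypothetical record of SVM-style operations emitted by an HLC."""

    def __init__(self):
        self.ops = {}                 # op name -> (kind, resource)
        self.deps = defaultdict(set)  # op name -> names it must wait for

    def _add(self, name, kind, resource, after=()):
        self.ops[name] = (kind, resource)
        self.deps[name].update(after)
        return name

    def dma_transfer(self, name, src, dst, after=()):
        """Declare a data transfer between two memories."""
        return self._add(name, "dma", f"{src}->{dst}", after)

    def run_kernel(self, name, processor, after=()):
        """Declare a kernel execution on a kernel processor."""
        return self._add(name, "kernel", processor, after)

    def schedule(self):
        """Return one order that respects every declared dependence."""
        for name in self.ops:
            missing = self.deps[name] - set(self.ops)
            if missing:
                raise KeyError(f"{name} depends on undeclared {sorted(missing)}")
        pending = {name: len(self.deps[name]) for name in self.ops}
        users = defaultdict(list)
        for name in self.ops:
            for dep in self.deps[name]:
                users[dep].append(name)
        ready = deque(n for n in self.ops if pending[n] == 0)
        order = []
        while ready:
            name = ready.popleft()
            order.append(name)
            for user in users[name]:
                pending[user] -= 1
                if pending[user] == 0:
                    ready.append(user)
        if len(order) != len(self.ops):
            raise ValueError("cyclic dependence between operations")
        return order


prog = SvmProgram()
load = prog.dma_transfer("load_input", "main_memory", "srf0")
kern = prog.run_kernel("kernel_a", "kernel_proc0", after=[load])
prog.dma_transfer("store_output", "srf0", "main_memory", after=[kern])
print(prog.schedule())  # ['load_input', 'kernel_a', 'store_output']
```

In this sketch the dependences, rather than the order of declaration, determine when an operation may run, which mirrors the role the text assigns to dependence declarations in the SVM API.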
An SVM model for a stream architecture consists of three types of components: processors, memories and links. Processors in turn come in three varieties. Control processors decide the sequence of operations performed by the entire machine and offload the compute-intensive task of stream kernel execution to kernel processors (stream processors). DMA engines are also treated as processors: they execute specialized kernels that transfer data between the different memories in the system. The parameters that describe an SVM processor include its type (control, kernel or DMA), its operating frequency, its function unit mix, its degree of SIMD parallelism, and the number and capacity of its register files. The memories in an SVM system are classified by access mechanism into RAMs (random access allowed), FIFOs (only sequential access allowed) and caches (associative lookup allowed). Since stream processors use a hierarchy of memories that captures producer-consumer locality to economize main memory bandwidth, natural characterization parameters for an SVM memory are the bandwidth and latency it offers to the entities above and below it in the bandwidth hierarchy. Links allow processors and memories to communicate with each other and are characterized by their bandwidth and latency.
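The three component types can be written down as simple records. The sketch below is only illustrative: the type and field names are hypothetical, only a subset of the parameters listed above is included, and the numeric values in the usage lines are made up. The transfer-time formula (fixed latency plus size divided by bandwidth) is a common first-order assumption, not a model taken from the SVM specification.

```python
from dataclasses import dataclass
from enum import Enum


class ProcessorKind(Enum):
    CONTROL = "control"
    KERNEL = "kernel"
    DMA = "dma"


class MemoryKind(Enum):
    RAM = "ram"      # random access allowed
    FIFO = "fifo"    # only sequential access allowed
    CACHE = "cache"  # associative lookup allowed


@dataclass(frozen=True)
class Processor:
    name: str
    kind: ProcessorKind
    frequency_hz: float
    simd_width: int = 1  # degree of SIMD parallelism


@dataclass(frozen=True)
class Memory:
    name: str
    kind: MemoryKind
    capacity_bytes: int


@dataclass(frozen=True)
class Link:
    src: str
    dst: str
    bandwidth_bytes_per_s: float
    latency_s: float

    def transfer_time(self, num_bytes: int) -> float:
        """Assumed first-order cost: latency plus size over bandwidth."""
        return self.latency_s + num_bytes / self.bandwidth_bytes_per_s


# Illustrative values only; they do not describe any real machine.
srf_link = Link("main_memory", "srf0",
                bandwidth_bytes_per_s=1e9, latency_s=1e-6)
print(f"{srf_link.transfer_time(1_000_000) * 1e3:.3f} ms")
```

Even this coarse cost model is enough to compare candidate mappings by the time they spend moving data across each link.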
Figure 2 shows an SVM model to which we can map our face recognizer from Figure 1. It consists of three stream processors, a control processor and a multi-channel DMA engine that can move data between the stream register files (SRFs) and main memory. Solid lines indicate data paths and dotted lines indicate control paths. We next describe different types of task-to-resource mappings for such an application.
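A model of this shape can be captured as a small graph with separate data and control edges. The sketch below is an assumption-laden reading of the description, not a transcription of Figure 2: the component names are hypothetical, and it assumes one SRF per stream processor, one DMA channel per SRF, and control paths from the control processor to every stream processor and to the DMA engine.

```python
from collections import deque

STREAM_PROCESSORS = ["sp0", "sp1", "sp2"]  # hypothetical names


def build_model():
    """Return (nodes, data_paths, control_paths) for the assumed topology."""
    nodes = {
        "control": "control processor",
        "dma": "DMA engine",
        "main_memory": "memory",
    }
    data_paths, control_paths = [], []
    for sp in STREAM_PROCESSORS:
        srf = f"{sp}_srf"
        nodes[sp] = "stream processor"
        nodes[srf] = "memory"
        data_paths.append((sp, srf))            # kernel accesses its SRF
        data_paths.append((srf, "dma"))         # assumed: one channel per SRF
        control_paths.append(("control", sp))   # kernel launch
    data_paths.append(("dma", "main_memory"))
    control_paths.append(("control", "dma"))    # transfer launch
    return nodes, data_paths, control_paths


def data_reachable(data_paths, start, goal):
    """Breadth-first search over data paths, treated as bidirectional."""
    neighbours = {}
    for a, b in data_paths:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    seen, frontier = {start}, deque([start])
    while frontier:
        node = frontier.popleft()
        if node == goal:
            return True
        for nxt in neighbours.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    return False


nodes, data_paths, control_paths = build_model()
print(data_reachable(data_paths, "sp0_srf", "sp2_srf"))  # True, via the DMA engine
```

Keeping data and control edges in separate lists mirrors the solid and dotted lines of the figure, and the reachability check confirms that, under these assumptions, data produced in one SRF can reach any other SRF or main memory through the DMA engine.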