Many embedded, media and digital signal processing applications involve simple, repeated computations on a very long or never-ending sequence of data. Access to past data is limited or absent altogether. A good example is an image processing system such as the simplified face recognition system shown in Figure 1. This surveillance system accepts a video stream from a camera, identifies the pixels that have human skin color, segments the image into skin and non-skin regions, uses a neural-network-based algorithm to identify regions that may contain a face, uses a second neural-network-based algorithm to locate the eyes, and then tries to match the face against a database of known faces to obtain a person's identity. Details of the system may be found in . The application is a well-structured assembly of simple, compute-intensive algorithms. Data flow between the component blocks is regular and predictable. The whole computation may be abstracted as a data-flow graph consisting of a few key procedures, an input stream and an output stream. We say that applications with such simple, regular structures are ``stream-able'', and we call this style of computation ``stream processing''. Other examples include link-level encryption in networks, video transcoding, video compression and cellular telephony, as well as image and speech processing. Although stream-optimized processor hardware is a relatively new area, stream-oriented techniques are ubiquitous in the software world, with Unix pipes being a prime example.
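To make the data-flow abstraction concrete, the following is a minimal Python sketch of a stream pipeline in the spirit of Unix pipes. It is not the face recognition system described above: the stage names (`skin_mask`, `skin_fraction`, `flag_candidates`), the RGB threshold in `is_skin` and the 0.5 cutoff are all hypothetical choices made for illustration. Each stage is a generator that consumes one frame at a time and keeps no access to past frames.

```python
from itertools import count, islice
from typing import Iterable, Iterator, List, Tuple

Pixel = Tuple[int, int, int]   # hypothetical (R, G, B) pixel
Frame = List[Pixel]            # hypothetical flat list of pixels


def is_skin(pixel: Pixel) -> bool:
    """Toy skin-color test; the thresholds are illustrative assumptions."""
    r, g, b = pixel
    return r > 95 and g > 40 and b > 20 and r > g and r > b


def skin_mask(frames: Iterable[Frame]) -> Iterator[List[bool]]:
    """Stage 1: classify every pixel of each incoming frame."""
    for frame in frames:
        yield [is_skin(p) for p in frame]


def skin_fraction(masks: Iterable[List[bool]]) -> Iterator[float]:
    """Stage 2: reduce each mask to the fraction of skin pixels."""
    for mask in masks:
        yield sum(mask) / len(mask) if mask else 0.0


def flag_candidates(fractions: Iterable[float],
                    threshold: float = 0.5) -> Iterator[bool]:
    """Stage 3: flag frames whose skin fraction reaches the threshold."""
    for fraction in fractions:
        yield fraction >= threshold


def pipeline(frames: Iterable[Frame]) -> Iterator[bool]:
    """Compose the stages into one data-flow graph: input -> output."""
    return flag_candidates(skin_fraction(skin_mask(frames)))


def endless_frames() -> Iterator[Frame]:
    """A never-ending input stream alternating skin-like and dark frames."""
    for i in count():
        yield [(200, 120, 100)] * 4 if i % 2 == 0 else [(10, 10, 10)] * 4


# Only the frames that are actually requested are ever computed.
first_four = list(islice(pipeline(endless_frames()), 4))
```

Because every stage is lazy, the pipeline runs on an unbounded input with constant memory, which mirrors the property that a stream-able application needs little or no access to past data.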