
6 The Transmeta Crusoe Processor

The Crusoe processor from Transmeta Corporation represents an interesting point in the development of VLIW processors. Traditionally, VLIW processors were designed to maximize ILP and performance. The designers of the Crusoe, on the other hand, needed a processor with moderate performance compared to a desktop CPU, under the additional constraint that it consume very little power, since it was intended for mobile applications. Another design goal was efficient emulation of the ISAs of other processors, particularly the 80x86 and the Java virtual machine.

The designers left out features such as out-of-order issue and dynamic scheduling, which consume significant power. They set out to replace such complex mechanisms for extracting ILP with simpler and more power-efficient alternatives. The end result was a simple VLIW architecture. Long instructions on the Crusoe are either 64 or 128 bits wide. A 128-bit instruction word, called a molecule in Transmeta parlance, encodes four operations called atoms. The molecule format directly determines how operations are routed to function units. The Crusoe has two integer units, a floating-point unit, a load/store unit and a branch unit. Like the Defoe, the Crusoe has 64 general-purpose registers and supports strictly in-order issue. Unlike the Defoe, which uses predication, the Crusoe uses condition flags identical to those of the x86 for ease of emulation.
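The idea that the molecule format alone decides routing can be illustrated with a minimal Python sketch. The format names (`m128`, `m64`), the slot-to-unit layouts, the two-atom width of the 64-bit format, and the opcode strings are all assumptions made for illustration; the text does not specify the actual encodings.

```python
from dataclasses import dataclass

# Hypothetical slot layouts: each format fixes which function unit
# receives the atom in each slot. The real Crusoe encodings are not
# given in the text, so these tables are illustrative only.
FORMATS = {
    "m128": ("fp", "int0", "ldst", "branch"),  # 128-bit molecule, four atoms
    "m64": ("int0", "ldst"),                   # assumed two-atom 64-bit molecule
}


@dataclass(frozen=True)
class Molecule:
    fmt: str      # key into FORMATS
    atoms: tuple  # one opcode string per slot, "nop" for an empty slot

    def route(self):
        """Map each non-empty atom to the unit its slot position implies."""
        slots = FORMATS[self.fmt]
        if len(self.atoms) != len(slots):
            raise ValueError("atom count does not match molecule format")
        return {unit: atom
                for unit, atom in zip(slots, self.atoms)
                if atom != "nop"}


def issue_in_order(molecules):
    """Strictly in-order issue: one molecule per cycle, in program order."""
    return [(cycle, m.route()) for cycle, m in enumerate(molecules)]
```

Because the routing is a table lookup on the format, no hardware has to inspect the atoms to decide where they go, which is the property the paragraph above attributes to the molecule format.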

Binary x86 programs, firmware and operating systems are emulated with the help of a run-time binary translator called the code morphing software. This makes the classical VLIW software compatibility problem a non-issue: only the native code morphing software needs to change when the Crusoe architecture or ISA changes. As a power and performance optimization, the hardware and software together maintain a cache of translated code. The translations are instrumented to collect execution frequencies and branch history, and this information is fed back to the code morphing software to guide its optimizations.
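A translation cache with profile feedback might be sketched as follows. This is a simplified model, not Transmeta's implementation: the `translate` callback, the `hot_threshold` parameter and the single re-optimization step are hypothetical, and the execution counter lives in the cache wrapper rather than inside the translated code as the text describes.

```python
class TranslationCache:
    """Caches translated blocks by guest address and re-translates hot ones."""

    def __init__(self, translate, hot_threshold=3):
        # translate(addr, optimize) -> callable(state); a hypothetical
        # stand-in for the code morphing software's translator.
        self.translate = translate
        self.hot_threshold = hot_threshold
        self.cache = {}         # guest address -> translated block
        self.counts = {}        # guest address -> execution count
        self.optimized = set()  # addresses already re-translated

    def run(self, addr, state):
        # Translate on first use; later executions reuse the cached block.
        if addr not in self.cache:
            self.cache[addr] = self.translate(addr, optimize=False)

        # Collect execution frequency for this block.
        self.counts[addr] = self.counts.get(addr, 0) + 1

        # Feed the profile back: re-translate once the block becomes hot.
        if (self.counts[addr] >= self.hot_threshold
                and addr not in self.optimized):
            self.cache[addr] = self.translate(addr, optimize=True)
            self.optimized.add(addr)

        return self.cache[addr](state)
```

The point of the design is that translation cost is paid once per block and optimization effort is spent only where the execution counts show it will be repaid.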

To correctly model the precise exception semantics of the x86 processor, the part of the register file that holds x86 register state is duplicated. The duplicate is called a shadow copy. Normal operations affect only the original (working) registers. At the end of a translated section of code, a special commit operation copies the working register values to the shadow registers. If an exception occurs while executing a translated unit, the run-time software uses the shadow copy to recreate the precise exception state. Store operations are handled in a similar manner: they are held in a store buffer and released to memory only at commit. As in the case of IA-64, the Crusoe provides alias detection hardware and data speculation primitives.
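The commit and recovery behaviour can be modelled in a few lines of Python. This is a sketch under stated assumptions: the class and method names are hypothetical, memory is a plain dictionary, and recovery is modelled as a simple rollback that restores the working registers from the shadow copy and discards buffered stores, which is one reading of "in a similar manner" rather than a documented mechanism.

```python
class SpeculativeState:
    """Working/shadow registers plus a store buffer drained at commit."""

    def __init__(self, regs, memory):
        self.working = dict(regs)  # updated by normal operations
        self.shadow = dict(regs)   # last committed x86 register state
        self.memory = memory       # committed memory (addr -> value)
        self.store_buffer = []     # pending (addr, value) stores

    def write_reg(self, name, value):
        self.working[name] = value

    def store(self, addr, value):
        # Stores are held back until the translation commits.
        self.store_buffer.append((addr, value))

    def load(self, addr):
        # Forward from the newest pending store before reading memory.
        for a, v in reversed(self.store_buffer):
            if a == addr:
                return v
        return self.memory.get(addr, 0)

    def commit(self):
        # End of a translated section: make registers and stores permanent.
        self.shadow = dict(self.working)
        for a, v in self.store_buffer:
            self.memory[a] = v
        self.store_buffer.clear()

    def rollback(self):
        # Exception inside a translation: return to the last committed state.
        self.working = dict(self.shadow)
        self.store_buffer.clear()
```

After a rollback the state corresponds exactly to the last commit point, so the run-time software can re-execute the offending x86 code conservatively and report the exception at the correct instruction.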


Binu K. Mathew