SPRU733   Pipeline   4-1
Pipeline
The C67x DSP pipeline provides flexibility to simplify programming and
improve performance. Two factors provide this flexibility:
- Control of the pipeline is simplified by eliminating pipeline interlocks.
- Increased pipelining eliminates traditional architectural bottlenecks in
  program fetch, data access, and multiply operations. This provides
  single-cycle throughput.
This chapter starts with a description of the pipeline flow. Highlights are:
- The pipeline can dispatch eight parallel instructions every cycle.
- Parallel instructions proceed simultaneously through each pipeline
  phase.
- Serial instructions proceed through the pipeline with a fixed relative phase
  difference between instructions.
- Load and store addresses appear on the CPU boundary during the same
  pipeline phase, eliminating read-after-write memory conflicts.
All instructions require the same number of pipeline phases for fetch and
decode, but require a varying number of execute phases. This chapter
contains a description of the number of execution phases for each type of
instruction.
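
The phase relationships listed above can be sketched with a small model. This is an illustration only, not the CPU's actual behavior: the count of six front-end phases, the labels FD1..FD6 and E1..En, and the function names phase_schedule and phase_offset are assumptions made for the example. The real phase names and counts are described in section 4.1.

```python
# Assumption for illustration: every instruction passes through the same
# 6 front-end (fetch + decode) phases; only the execute-phase count varies.
FRONT_PHASES = 6


def phase_schedule(issue_cycle, execute_phases, front_phases=FRONT_PHASES):
    """Map each clock cycle to the phase one instruction occupies (no stalls)."""
    labels = [f"FD{i + 1}" for i in range(front_phases)]
    labels += [f"E{j + 1}" for j in range(execute_phases)]
    return {issue_cycle + offset: label for offset, label in enumerate(labels)}


def phase_offset(first, second):
    """Return the set of cycle differences between phases both instructions share."""
    cycle_of = {label: cycle for cycle, label in first.items()}
    return {
        cycle - cycle_of[label]
        for cycle, label in second.items()
        if label in cycle_of
    }


if __name__ == "__main__":
    a = phase_schedule(issue_cycle=0, execute_phases=1)  # hypothetical 1-phase instruction
    b = phase_schedule(issue_cycle=0, execute_phases=1)  # in parallel with a
    c = phase_schedule(issue_cycle=1, execute_phases=4)  # one cycle later, longer execute
    print(phase_offset(a, b))  # parallel: same phase in every cycle
    print(phase_offset(a, c))  # serial: one constant offset
```

In this model, two instructions issued in the same cycle share every phase (a single offset of 0), and an instruction issued one cycle later trails by a constant offset of 1 regardless of how many execute phases it needs.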
Finally, the chapter contains performance considerations for the pipeline.
These considerations include the occurrence of fetch packets that contain
multiple execute packets, execute packets that contain multicycle NOPs, and
memory considerations for the pipeline. For more information about fully
optimizing a program and taking full advantage of the pipeline, see the
TMS320C6000 Programmer’s Guide (SPRU198).
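
To see why a fetch packet that contains multiple execute packets matters for performance, the following sketch groups the instructions of one fetch packet into execute packets. It assumes the C6000 convention that each instruction carries a p-bit and that a p-bit of 1 means the next instruction executes in parallel with the current one. The function name split_execute_packets and the sample p-bit patterns are hypothetical.

```python
def split_execute_packets(p_bits):
    """Group instruction indices of one fetch packet into execute packets.

    Assumption: p_bits[i] == 1 means instruction i + 1 executes in parallel
    with instruction i; p_bits[i] == 0 ends the current execute packet.
    """
    packets, current = [], []
    for index, p_bit in enumerate(p_bits):
        current.append(index)
        if p_bit == 0:
            packets.append(current)
            current = []
    if current:
        # A trailing p-bit of 1 would chain past the fetch packet; this sketch
        # simply closes the group at the fetch packet boundary.
        packets.append(current)
    return packets


if __name__ == "__main__":
    fully_parallel = [1, 1, 1, 1, 1, 1, 1, 0]  # one execute packet of eight
    fully_serial = [0, 0, 0, 0, 0, 0, 0, 0]    # eight execute packets of one
    mixed = [1, 0, 1, 1, 0, 0, 1, 0]
    for name, bits in [("parallel", fully_parallel),
                       ("serial", fully_serial),
                       ("mixed", mixed)]:
        packets = split_execute_packets(bits)
        print(name, len(packets), packets)
```

Under this assumption, the number of groups is a rough lower bound on the cycles needed to dispatch the fetch packet, ignoring any other stalls. Section 4.4 describes the actual pipeline behavior.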
Topic                                           Page
4.1  Pipeline Operation Overview                4-2
4.2  Pipeline Execution of Instruction Types    4-12
4.3  Functional Unit Constraints                4-33
4.4  Performance Considerations                 4-56
Chapter 4