Texas Instruments TMS320C67X/C67X+ DSP Car Speaker User Manual

Pipeline 4-56 SPRU733
4.4 Performance Considerations
The C67x DSP pipeline is most effective when it is kept as full as the algorithms
in the program allow it to be. It is useful to consider some situations that can
affect pipeline performance.
A fetch packet (FP) is a grouping of eight instructions. Each FP can be split into
one to eight execute packets (EPs). Each EP contains instructions that
execute in parallel. Each instruction executes in an independent functional
unit. The effect on the pipeline of combinations of EPs that include varying
numbers of parallel instructions, or just a single instruction that executes
serially with other code, is considered here.
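The rule that each instruction in an EP occupies its own functional unit can be sketched as a small check. This is only an illustration: the helper name is_valid_execute_packet is hypothetical and not part of any TI tool, the unit names are the eight C67x functional units, and the sketch ignores NOP (which needs no unit) and all other resource constraints.

```python
# Hypothetical helper for illustration only; not a TI API.
# The eight functional units of the C67x CPU (two data paths, four units each).
FUNCTIONAL_UNITS = {".L1", ".S1", ".M1", ".D1", ".L2", ".S2", ".M2", ".D2"}


def is_valid_execute_packet(units):
    """Return True if 1 to 8 instructions each claim a distinct, known unit.

    `units` lists the functional unit used by each instruction in one EP.
    Assumption: every instruction names exactly one unit (NOP is ignored).
    """
    if not 1 <= len(units) <= 8:
        return False
    if any(unit not in FUNCTIONAL_UNITS for unit in units):
        return False
    # Parallel instructions cannot share a unit.
    return len(set(units)) == len(units)
```

For example, is_valid_execute_packet([".L1", ".M1", ".D2"]) accepts a three-instruction EP, while is_valid_execute_packet([".L1", ".L1"]) rejects two instructions that compete for the same unit.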
In general, the number of execute packets in a single FP defines the flow of
instructions through the pipeline. Another defining factor is the instruction
types in the EP. Each type of instruction has a fixed number of execute cycles
that determines when this instruction’s operations are complete. Section 4.4.2
covers the effect of including a multicycle NOP in an individual EP.
Finally, the effect of the memory system on the operation of the pipeline is
considered. The access of program and data memory is discussed, along with
memory stalls.
4.4.1 Pipeline Operation With Multiple Execute Packets in a Fetch Packet
Figure 4-6, page 4-6, shows pipeline operation with eight
instructions in every fetch packet. Figure 4-28, however, shows the pipeline
operation with a fetch packet that contains multiple execute packets. Code for
Figure 4-28 might have this layout:
   instruction A ; EP k       FP n
|| instruction B ;
   instruction C ; EP k + 1   FP n
|| instruction D
|| instruction E
   instruction F ; EP k + 2   FP n
|| instruction G
|| instruction H
   instruction I ; EP k + 3   FP n + 1
|| instruction J
|| instruction K
|| instruction L
|| instruction M
|| instruction N
|| instruction O
|| instruction P
... continuing with EPs k + 4 through k + 8, which have
eight instructions in parallel, like k + 3.
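The layout above can be explored with a short model. The following is a minimal sketch, not a cycle-accurate simulator: the function names are hypothetical, it assumes the four fetch phases (PG, PS, PW, PR) occupy cycles 1 through 4 for the first FP so that its first EP reaches the dispatch (DP) phase in cycle 5, it assumes one EP is dispatched per cycle, and it ignores memory stalls and multicycle NOPs.

```python
LISTING = """\
   instruction A
|| instruction B
   instruction C
|| instruction D
|| instruction E
   instruction F
|| instruction G
|| instruction H
   instruction I
|| instruction J
|| instruction K
|| instruction L
|| instruction M
|| instruction N
|| instruction O
|| instruction P
"""

FP_SIZE = 8         # instructions per fetch packet (from the text)
FIRST_DP_CYCLE = 5  # assumption: PG, PS, PW, PR take cycles 1-4 for the first FP


def parse_execute_packets(listing):
    """Group instructions into EPs; a leading '||' joins the previous EP."""
    eps = []
    for line in listing.splitlines():
        text = line.strip()
        if not text:
            continue
        if text.startswith("||"):
            eps[-1].append(text[2:].strip())
        else:
            eps.append([text])
    return eps


def group_into_fetch_packets(eps):
    """Assign whole EPs to consecutive FPs of FP_SIZE instructions each."""
    fps, current, count = [], [], 0
    for ep in eps:
        if count + len(ep) > FP_SIZE:
            raise ValueError("execute packet crosses a fetch packet boundary")
        current.append(ep)
        count += len(ep)
        if count == FP_SIZE:
            fps.append(current)
            current, count = [], 0
    if current:
        fps.append(current)
    return fps


def dispatch_cycles(fps):
    """Return the cycle in which each EP enters DP under the stated assumptions."""
    cycles, next_free = [], 0
    for j, fp in enumerate(fps):
        earliest = FIRST_DP_CYCLE + j  # when FP j could reach DP with no stall
        for _ in fp:
            cycle = max(earliest, next_free)
            cycles.append(cycle)
            next_free = cycle + 1      # DP handles one EP per cycle
    return cycles


eps = parse_execute_packets(LISTING)
fps = group_into_fetch_packets(eps)
print(dispatch_cycles(fps))
```

Under these assumptions, EPs k, k + 1, and k + 2 enter DP in cycles 5, 6, and 7, and EP k + 3 (all of FP n + 1) enters DP in cycle 8 rather than cycle 6. In this model, the two-cycle difference is the delay that the later fetch packet absorbs while the three EPs of FP n are dispatched one at a time.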