For many of the general telephony stream-processing tasks, the TMS320C6000 C optimizer can yield higher densities with no hand assembly coding. Other technologies require a healthy dose of optimization to reach target densities.
You can take steps to optimize C-coded reference modems to meet higher-density targets. How high? C-baseline modems, for example, can soar from 6 per 200-MHz C6201 to 28 modems per chip in four short project phases. A fifth project phase can take the number of channels to 48 per DSP.
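To put those densities in perspective, the per-channel cycle budget can be worked out with simple arithmetic. The sketch below is illustrative only: the function name `mhz_per_channel` is hypothetical, and it assumes that every cycle of the DSP is available to the modems, which a real system with drivers and framework overhead would not quite achieve.

```c
/* Average CPU budget per modem channel, in MHz (millions of cycles
   per second). Assumption for illustration: the whole DSP clock is
   available to the modems, with no framework or driver overhead. */
double mhz_per_channel(double clock_mhz, int channels)
{
    return clock_mhz / (double)channels;
}
```

On a 200-MHz part, 6 modems leave roughly 33.3 MHz for each channel, 28 modems leave about 7.1 MHz, and 48 modems leave about 4.2 MHz, so the final target implies cutting the per-channel cost to about one-eighth of the baseline.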
In fact, for its MSP MEDIA Gateway line of DSP resource boards based on C6000 DSPs, Commetrex undertook the four phases, and the process worked. Our MSP-320 PCI board, with two C6201s and a quad E1/T1 network interface, needed 48 to 60 channels of processing from each DSP. For many of the general telephony stream-processing tasks, the C6000 C optimizer gave us the densities we needed with no hand assembly coding.
“Out of the box” C-coded modems, which are reference designs written for understandability rather than efficiency, might compile to, say, six simultaneous modems. You should be able to double that by guiding the modems through the Code Composer Studio (CCS) optimizer and by ensuring that your memory layout takes advantage of the C6000’s on-chip RAM.
CCS includes an optimization tutorial that provides a recommended code development flow consisting of four phases (Figure 1). (A similar tutorial is in the TMS320C6000 Programmer’s Guide.)
Figure 1. Code Composer Studio’s optimization tutorial recommends a code development flow consisting of four phases. The first three phases focus on utilizing the optimization abilities of the C6000 compiler to achieve high code performance while maintaining the code in C. The last phase involves linear assembly coding of the portions of the code whose performance needs to be improved further. (This figure is based on the one on p. 1-4 of the TMS320C6000 Programmer’s Guide.)
Phase 1 involves compiling and profiling your baseline C code. Before you begin any optimization effort, use the profiling tools to identify the performance-critical areas in your code.
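The CCS profiler is the tool intended for this step. As a rough stand-in for readers without it, the following sketch times a candidate routine with the standard C `clock()` function. The names `dot_product` and `time_kernel` are hypothetical, and the dot product is only an assumed example of a modem-style inner loop, not code from the reference modems.

```c
#include <stddef.h>
#include <time.h>

/* Hypothetical stand-in for a performance-critical modem loop:
   a dot product over 16-bit samples. */
long dot_product(const short *a, const short *b, size_t n)
{
    long sum = 0;
    size_t i;
    for (i = 0; i < n; i++)
        sum += (long)a[i] * b[i];
    return sum;
}

/* Runs the kernel `reps` times and returns the elapsed clock() ticks.
   The accumulated result is written through `out` so that it is
   observable by the caller. */
clock_t time_kernel(const short *a, const short *b, size_t n,
                    int reps, long *out)
{
    clock_t start = clock();
    long acc = 0;
    int r;
    for (r = 0; r < reps; r++)
        acc += dot_product(a, b, n);
    *out = acc;
    return clock() - start;
}
```

Timing each candidate routine this way and ranking the results gives a first approximation of where the cycles go; a profiler reports the same information with finer granularity.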
Phase 2 involves compiling with the appropriate optimization options and analyzing the feedback provided by the compiler to improve the performance of your code.
Phase 3 is a critical phase during which you use a number of techniques to tune your C code for better performance. One technique is to provide as much information as possible to the compiler so that it can perform adequate software pipelining, especially for MIPS-intensive loops. Another is to analyze the dependencies between instructions. If the compiler determines that two instructions are independent, it attempts to schedule them to execute in parallel. You can help the compiler make those determinations.
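Both techniques can be illustrated on a small loop. The sketch below is an assumed example, not code from the modems discussed here: the function name `dot_q15` and the trip-count values are hypothetical. It uses the `restrict` qualifier to state that the two buffers do not overlap, and the TI compiler's `MUST_ITERATE` pragma to describe the loop trip count. The pragma is guarded by the `_TMS320C6X` macro so that the same source also builds with a host compiler.

```c
/* Dot product of two 16-bit sample buffers.
   `restrict` tells the compiler that x and h never overlap, so
   accesses through one pointer are independent of the other. */
int dot_q15(const short *restrict x, const short *restrict h, int n)
{
    int sum = 0;
    int i;
#ifdef _TMS320C6X
    /* TI C6000 compiler only. Assumed values for illustration:
       the loop runs at least 8 and at most 256 times, and the
       count is a multiple of 4. The caller must honor this. */
    #pragma MUST_ITERATE(8, 256, 4);
#endif
    for (i = 0; i < n; i++)
        sum += x[i] * h[i];
    return sum;
}
```

If a caller ever violates either promise (overlapping buffers, or a trip count outside the stated range), the optimized code may produce wrong results, so these hints belong only where the calling code guarantees them.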
A third technique is to refine your C code to use the C6000 intrinsics, which are special functions that map directly to in-lined C6000 instructions. These functions are usually not easily expressed in C. Intrinsics allow you to have more precise control over the selection of instructions by the compiler.
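Saturating addition is a typical case. The sketch below wraps the C6000 `_sadd` intrinsic in a hypothetical helper, `sat_add32`, and supplies a plain-C fallback, selected when the `_TMS320C6X` macro is absent, so the example can be exercised on a host machine. The fallback is only an illustration of what the operation does.

```c
#include <limits.h>

/* 32-bit saturating add: clamps to INT_MAX or INT_MIN instead of
   wrapping around on overflow. */
int sat_add32(int a, int b)
{
#ifdef _TMS320C6X
    /* TI C6000 compiler: the intrinsic maps to a single
       saturating-add instruction. */
    return _sadd(a, b);
#else
    /* Portable fallback for host builds: widen, add, then clamp. */
    long long s = (long long)a + (long long)b;
    if (s > INT_MAX) return INT_MAX;
    if (s < INT_MIN) return INT_MIN;
    return (int)s;
#endif
}
```

The fallback shows why the intrinsic matters: expressed in standard C, the operation takes a widening add and two comparisons, whereas the intrinsic requests the dedicated instruction directly.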
Phase 4 is needed if the performance of certain areas of your code must be improved beyond what the tuning phase achieves. After yet another profile of the code, you can extract the performance-critical areas and rewrite them in linear assembly language. This form of assembly code doesn’t require that you provide functional unit selection, pipelining, parallelization, or register allocation; those tasks will still be performed by the compiler. It will, however, give you more control over the exact C6000 instructions to be used. You can also pass more useful information to the tools, such as which memory bank is to be used.