The Media Stream Processor: A New Value-Adding Opportunity in Computer Telephony

Is the computer telephony industry passing up an opportunity to be much more efficient, delivering greater value to end users for far less investment? Greater efficiency means lower costs to customers and a greater return on investment for market participants. Imagine for a moment where the computer industry would be today had there never been the PC, only many Apples: every computer manufacturer would have its own proprietary closed-architecture bus and its own proprietary closed-architecture operating system. Add-in board manufacturers would see markets one-tenth the size of today’s, as would application developers. And everything would cost more than it does in today’s open-architecture industry. This is not a pretty picture, but it is just the situation in computer telephony today, only worse.


In computer telephony, not only are system-level media-processing resources (the “boards”: voice, fax, speech recognition, etc.) closed architectures, they are also function-specific. It’s as if you had to buy one computer for spreadsheets and another for word processing. (Remember the stand-alone word processor and how long it lasted after the introduction of the PC?) The situation has begun to change, with companies such as BICOM, Brooktrout, Linkon, Natural MicroSystems (NMS), and PIKA beginning to take advantage of advances in DSP price-performance by developing integrated-media (multi-function) computer telephony boards, usually by adding fax modems supplied by another company. The Dialogic Antares and the Analogic TAP-800 series are open-architecture DSP-resource boards designed to support speech-recognition and speech-compression algorithms. And Dialogic’s recent announcement of its DM-3 architecture and Brooktrout’s announcement of its Boston architecture are both attempts to move from their soon-to-be-obsolete fixed-function architectures to the new integrated-media paradigm championed by Commetrex.


But these DSP-resource boards have one of two limitations: either the board can’t support voice and call processing, and you must use another board to supply those easily implemented functions, or you must use the call management and voice processing supplied with the board. The Dialogic Antares and Analogic TAP-800 series are examples of the former; all others are examples of the latter. According to Jeff Hill, Antares product manager, “The Antares can’t be used in any configuration other than with another voice board.” The TAP-800 is the same. All the other boards include the manufacturer’s voice and call processing. Another vendor’s voice processing could not be added to an NMS board by a third party, for example. Since even the boards that boast integrated media use closed, proprietary architectures, the manufacturer must do the media integration itself, which is a major challenge. Two of these companies, NMS and PIKA, have boards that are at least partially open. But since they are proprietary, other companies cannot develop compatible versions, and neither company has sought to attract large numbers of third-party developers of board-level applications (media-processing technologies).


So why would the industry benefit from a standardized, open-architecture, media-neutral CT resource architecture? Industry efficiency is why. Look at the PC industry. It’s the most efficient value-adding industry ever. Imagine a PC industry with only proprietary PCs, just a bunch of Apple Macintoshes. It would not be an overstatement to say the world would be a different place, and not for the better: there would be less to choose from, and what was available would cost more. If open is good enough for the PC industry, it’s good enough for the CT industry.


What the computer telephony industry needs is what the computer industry has had for over a decade: a standardized environment that allows any company to develop the hardware and the board-level software environment. This will, in turn, allow any other company to develop a higher level of value addition, such as the software that creates basic system-resource functionality (e.g., voice, fax, and data). These media-processing resources would compete for board-level resources, such as DSP MIPS, memory, and PCM streams, in order to provide media-specific services to client processes.
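
As a rough illustration of media-processing resources competing for board-level resources, the sketch below models a board with fixed budgets of DSP MIPS, memory, and PCM streams, and grants or refuses requests against them. The class names, fields, and figures are all hypothetical; none of them come from the MSP specification.

```python
from dataclasses import dataclass


@dataclass
class Budget:
    """A bundle of board-level resources (all figures are made up)."""
    dsp_mips: float
    memory_kb: int
    pcm_streams: int

    def fits_within(self, other: "Budget") -> bool:
        return (self.dsp_mips <= other.dsp_mips
                and self.memory_kb <= other.memory_kb
                and self.pcm_streams <= other.pcm_streams)


class Board:
    """Hypothetical resource module that tracks what is still free."""

    def __init__(self, capacity: Budget) -> None:
        self.free = capacity
        self.granted = {}  # media-processing resource name -> Budget

    def request(self, name: str, need: Budget) -> bool:
        """Grant the request only if every resource is still available."""
        if name in self.granted or not need.fits_within(self.free):
            return False
        self.free = Budget(self.free.dsp_mips - need.dsp_mips,
                           self.free.memory_kb - need.memory_kb,
                           self.free.pcm_streams - need.pcm_streams)
        self.granted[name] = need
        return True

    def release(self, name: str) -> None:
        """Return a resource's allocation to the free pool."""
        need = self.granted.pop(name)
        self.free = Budget(self.free.dsp_mips + need.dsp_mips,
                           self.free.memory_kb + need.memory_kb,
                           self.free.pcm_streams + need.pcm_streams)


board = Board(Budget(dsp_mips=100.0, memory_kb=512, pcm_streams=4))
voice_ok = board.request("voice", Budget(20.0, 64, 2))
fax_ok = board.request("fax", Budget(60.0, 256, 1))
# Refused: only 20 MIPS remain once voice and fax have been granted.
asr_ok = board.request("speech-recognition", Budget(50.0, 128, 1))
```

In this toy model the speech-recognition request succeeds only after the fax resource is released, which is the kind of sharing a function-specific board cannot offer.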


Commetrex and the other members of the MSP Consortium (BICOM, Calibre Industries, Centigram, Cole Technical Services, Computer Communications Specialists, MiBridge, NKO, PIKA, and QNX) have undertaken the task of defining the Media Stream Processor (MSP), a system-resource-level software environment. It gives the computer telephony developer two options: develop an MSP environment (a PC add-in board or other resource module, together with its module-level software), or develop MSP-compliant media-processing products, which will therefore run on any MSP. M.100 (the MSP specification) hides the specifics of the hardware: the media-stream interface, the scalar and signal processors, and the interface to other major system elements, such as the host computer.


So a developer could embed an M.100-compliant DSP-resource module in a closed-architecture system, such as a PBX, and still use M.100-compliant media-processing products to add the desired functions to the system, all without the cost or time-to-market penalty of developing them in-house or porting another company’s technology.


The current draft of the M.100 specification is available from the MSP Consortium. It specifies a comprehensive stream-processing environment that not only meets the goal of dramatically reducing porting costs but also reduces the initial development costs of integrated-media systems and increases resource utilization.


The MSP’s stream-based environment supports Anonymous Inter-Vendor cooperation by allowing the output of one vendor’s Media Stream Transform (MST) software element to feed the input of another vendor’s MST. For example, the output of Vendor A’s vocoder could feed Vendor B’s TCP/IP stack.
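
The chaining described above can be sketched in a few lines. This is only an illustration under stated assumptions: the MST signature (a function from a stream of frames to a stream of frames), the `connect` helper, and both "vendor" functions are hypothetical stand-ins, not anything defined by M.100. The toy vocoder simply drops every other byte, and the toy packetizer stands in for a network stack by prepending a length header.

```python
from typing import Callable, Iterable, Iterator

# Hypothetical MST shape: consume a stream of frames, produce a stream of frames.
Frame = bytes
MST = Callable[[Iterator[Frame]], Iterator[Frame]]


def vendor_a_vocoder(frames: Iterator[Frame]) -> Iterator[Frame]:
    """Stand-in for Vendor A's vocoder: keeps every other byte of each frame."""
    for frame in frames:
        yield frame[::2]


def vendor_b_packetizer(frames: Iterator[Frame]) -> Iterator[Frame]:
    """Stand-in for Vendor B's network stack: adds a 2-byte length header."""
    for frame in frames:
        yield len(frame).to_bytes(2, "big") + frame


def connect(source: Iterable[Frame], *msts: MST) -> Iterator[Frame]:
    """Feed each MST's output stream into the next MST's input."""
    stream = iter(source)
    for mst in msts:
        stream = mst(stream)
    return stream


pcm = [b"\x01\x02\x03\x04", b"\x05\x06\x07\x08"]
packets = list(connect(pcm, vendor_a_vocoder, vendor_b_packetizer))
```

Neither function knows anything about the other; each depends only on the shared stream shape, which is the point of the vendor-to-vendor cooperation the paragraph describes.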


The MSP improves resource utilization through reduced pre-emption, using a technique called Stream-Paced Execution. Stream-Paced Execution means an MST is activated when prior MSTs have completed, and executes until it has processed a pre-defined (through its execution wrapper) increment of stream (usually isochronous) data. This means each MST executes for an optimum period of time: for a modem it might be an integral number of bauds; for a speech-recognition algorithm it might be one utterance. Because an MST relinquishes control when it has completed a “work package” rather than losing it through pre-emption, context-switch time is significantly reduced.
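
A minimal sketch of the idea, assuming a simple two-stage pipeline: each MST is wrapped with a work-package size, runs only when a full package of input is waiting, processes exactly that package, and then returns control. The `WrappedMST` class and `run_pipeline` function are hypothetical names chosen for illustration, and a real scheduler would be considerably more involved.

```python
from collections import deque


class WrappedMST:
    """Hypothetical execution wrapper: an MST plus its work-package size."""

    def __init__(self, name, increment, transform):
        self.name = name
        self.increment = increment  # samples consumed per activation
        self.transform = transform  # function: list of samples -> list of samples
        self.inbox = deque()        # stream data waiting for this MST

    def ready(self):
        return len(self.inbox) >= self.increment

    def run_once(self):
        """Process exactly one work package, then hand control back."""
        package = [self.inbox.popleft() for _ in range(self.increment)]
        return self.transform(package)


def run_pipeline(msts, samples):
    """Feed samples to the first MST; run each MST when a full package is waiting."""
    trace, output = [], []
    for sample in samples:
        msts[0].inbox.append(sample)
        for i, mst in enumerate(msts):
            while mst.ready():
                result = mst.run_once()  # runs to completion, never pre-empted
                trace.append(mst.name)
                sink = msts[i + 1].inbox if i + 1 < len(msts) else output
                sink.extend(result)
    return trace, output


gain = WrappedMST("gain", 2, lambda pkg: [2 * s for s in pkg])
summer = WrappedMST("sum", 4, lambda pkg: [sum(pkg)])
trace, output = run_pipeline([gain, summer], range(1, 9))
```

Here the downstream MST is activated only after the upstream one has completed two of its own packages, so control changes hands at package boundaries rather than at arbitrary points.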


M.100 does not define the host-system bus, the resource module’s or board’s form factor, the media-stream highway, or the on-board processors. But it does define the interfaces to these system elements that are visible to board-level applications. Actually, use of the term “board” should be avoided, since the MSP specification is just as applicable to a host-based implementation (one that uses MMX instructions as a DSP) as it is to a board-level implementation.
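
To make the separation concrete, here is a sketch in which one toy media-processing function is written only against an abstract platform interface and then runs unchanged on a board-style and a host-style implementation. The `Platform` interface and its two methods are invented for this example; the actual interfaces are whatever the M.100 specification defines.

```python
from abc import ABC, abstractmethod
from itertools import islice


class Platform(ABC):
    """Hypothetical view of the hardware that a media-processing product sees."""

    @abstractmethod
    def read_samples(self, count):
        """Return up to `count` samples from the incoming media stream."""

    @abstractmethod
    def write_samples(self, samples):
        """Send samples to the outgoing media stream."""


class BoardPlatform(Platform):
    """Stand-in for an add-in board: samples come from a PCM highway buffer."""

    def __init__(self, pcm_in):
        self._pcm_in = list(pcm_in)
        self.pcm_out = []

    def read_samples(self, count):
        taken, self._pcm_in = self._pcm_in[:count], self._pcm_in[count:]
        return taken

    def write_samples(self, samples):
        self.pcm_out.extend(samples)


class HostPlatform(Platform):
    """Stand-in for a host-based implementation: samples live in host memory."""

    def __init__(self, buffer):
        self._buffer = iter(buffer)
        self.sent = []

    def read_samples(self, count):
        return list(islice(self._buffer, count))

    def write_samples(self, samples):
        self.sent.extend(samples)


def invert(platform, count):
    """A toy media-processing product that knows only the Platform interface."""
    platform.write_samples([-s for s in platform.read_samples(count)])


board = BoardPlatform([1, 2, 3])
host = HostPlatform([4, 5])
invert(board, 2)
invert(host, 2)
```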


The definition of such a platform can have significant consequences: Any company can design and manufacture MSP-compliant hardware or software. The price of hardware will go down. The price of media-processing software will go down. Density and media-diversity will increase. Performance of media-processing technology will increase.


If you are interested in such a development, the MSP Consortium is interested in working with you on refining the specification and promoting the acceptance of the MSP standard. We want the MSP to meet your needs. By working together we can make sure M.100 is open for all and meets the needs of the largest possible set of industry developers at all levels of the value-adding spectrum.


Your investment would be the time to review and comment on the MSP requirements document. You will also be asked to make your participation in the MSP Consortium public.


If you’re interested, call Mike Coffee, the designated “MSP Evangelist,” at Commetrex at 770-449-7775 x310 or