Yes, prior to some recent improvements, BladeWare had a lot in common with ATAs and gateways because of BladeWare’s support of both T.38 and G.711 pass-through fax. Depending on the network, all of these network elements could exhibit some not-so-good outbound-fax completion rates. You may be wondering why Commetrex, the leading vendor of these technologies, is telling you this. But as you can imagine, the story has a happy ending for anyone looking for transaction-completion rates for IP-based fax that are as good as with PSTN fax. (We’re not going to exceed PSTN rates since, ultimately, the calls we’re talking about typically end up on the PSTN.)
Bottom line? Commetrex has developed patent-applied-for technology that brings ATAs, access gateways, tandem gateways, and fax servers using our Smart FoIP technology virtually even with PSTN fax. If you’re interested in IP telephony, especially FoIP, you may find the story of how we made this technology advance interesting.
First, some background: Commetrex has been developing and licensing fax technologies since 1994 when we developed a software-only fax add-in for the NMS DSP-based voice boards. Since then, scores of telecom OEMs have licensed our modems and T.38 relay for their ATAs, such as Grandstream, and gateways, such as Sonus. Likewise, we’ve licensed our first-to-market Terminating T.38 and Multi-Modal Terminating Fax (both G.711 and T.38) to dozens of server and UM vendors for hosted and premises-based fax servers. And, as they say, we’ve eaten our own dog food, as the same technologies are the basis of BladeWare’s terminating-fax feature.
BladeWare is a heavily muscled HMP telephony platform that supports voice, TDM fax with V.34, and IP fax, also with T.38 V3 and V.34 support. Important to this story is that, except for BladeWare and unlike ATAs and gateways, nearly all HMP fax servers support only T.38; G.711 pass-through terminating fax is not supported. On the other hand, BladeWare supports both T.38 and G.711 pass-through.
It’s also important to note that this has nothing to do with the simple case of sending a fax to the PSTN through an enterprise-based gateway. No. We’re talking about what we’ve called in previous white papers (The State of IP Fax) Phase II of IP fax. In Phase I, IP-based fax servers and ATAs were stranded on the enterprise IP island. In Phase II, ATAs and IP-based fax servers are moving beyond on-premises gateways by using SIP trunking or direct SIP peering with a service provider. Phase I was easy; Phase II is where the big problems appear. (By the way, if you are having trouble with ATAs with intra-enterprise applications, avoid needless suffering and get gateways that have a solid T.38 implementation. You can assure yourself of that by insisting on Commetrex technology inside.)
The industry is only in the first few years of Phase II since the long-haul IP carriers have only recently added T.38-capable gateways and support to their networks. With this T.38 infrastructure in place, T.38-capable ATAs, and SIP trunking available, enterprises are trying to get rid of their dedicated PSTN fax lines. But not so fast. We know that G.711 pass-through fax often has problems (it really depends on the IP network). And trying to send a T.38 fax from an ATA-connected fax terminal was a hit-or-miss affair. For example, the Bandwidth.com Website has this on its FAQ page: “…keep faxes running over POTS lines.”
So, is T.38 the problem? When we tested T.38 over the open Internet with “nailed-up” IP connections (no SIP involved), we found that it was completely reliable. When we were operating our T.38 Interoperability Test Lab we sent a 1,000-page test fax to Australia without a single error. Our 11 years of experience with T.38 have convinced us that it is just what the industry needs for IP fax, but many believe it’s the reason for the problems they’ve encountered. Of course, there are plenty of less-than-capable T.38 implementations out there, and some are responsible for interop problems, but they are not the main problem. Although we were convinced T.38 wasn’t the cause (especially if it was ours), we had no idea what the real cause was. But that was before we went to work on it.
Was it SIP? Certainly SIP’s interoperability, or its lack, is responsible for many of the problems being encountered in Phase II. That’s why there is plenty of work for the SIP Forum’s FoIP Task Group. However, once you resolve an interop problem between an ATA and an enterprise gateway or between a fax server and a gateway, you’re off and running. But put the SIP peer inside a service provider’s network, instead of an enterprise network, and it’s a different story. Some calls work and some don’t, even if they are T.38. As the leading developer of T.38 technologies, Commetrex found this a cause for great concern. So when Copia International, one of our BladeWare enterprise-fax OEMs, came to us with the statistical results of some fax broadcasts that were less than impressive, we jumped at the opportunity to work with them to find out why.
Copia’s CopiaFacts server is well known for its fax-broadcast features. If fax broadcast is your business or just a routine marketing tool, you want to know the details of each campaign. You have to give your customers accurate invoices. And efficiency is the key to a profitable operation. Knowing the details of each broadcast allows the vendor to fine-tune his operation and list of recipients. So FaxFacts’ detailed reports were just what we needed to figure out what was going on. We wanted to know: if the problem wasn’t T.38 and it wasn’t SIP, what was it?
Copia had a broadcast list of about 2,000 cleaned and “friendly” fax numbers to send to, an essential prerequisite to the testing we wanted to do. Of course, we had BladeWare enabled for both T.38 and G.711, so each broadcast test was the equivalent of hundreds of ATAs as the faxes could go either G.711 pass-through or be re-invited over to T.38 by the SIP peer embedded in the provider’s network. The BladeWare server was connected via a DS-3 connection to a VoIP service provider that, in turn, routed all calls to an IP carrier that supported T.38, so we assumed all calls would end up being T.38. They did, even those calls that should not have gone T.38, and that turned out to be the problem.
After much data analysis, we found that the fax-completion rate was 10 to 15 percent lower than that of a T1 fax board. (We did a lot of A-B testing.) Moreover, there was a high correlation between the unsuccessful calls and long delays between the off-ramp (called) gateway’s 200 OK, meaning the outbound PSTN call has been placed, and the off-ramp gateway’s subsequent re-invite to T.38 (which means that the off-ramp gateway has determined that the called endpoint is a fax terminal). In some cases, and much to our amazement, the delay was over 20 seconds! The distribution of delays ranged up to 5 seconds for successful calls, but failures averaged over 8 seconds. We were unable to get the carrier to provide an explanation, so we theorized that many call routes involved multiple tandem connections, with each one adding to the delay, since the off-ramp gateway, rather than the tandem gateways, is usually responsible for issuing the SIP re-invite. We also learned that the session border controllers (SBCs) used in some networks can issue a T.38 re-invite, resulting in T.38 “glare” since the re-invite can be sent in both directions. It then takes several seconds for the multiple tandem network elements to sort things out, further aggravating the situation.
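The kind of analysis described above can be sketched in a few lines of Python. This is only an illustration of the measurement, not the tooling Commetrex or Copia used: the `CallRecord` fields and the `delay_summary` helper are hypothetical names, and the sample values are synthetic, chosen only to resemble the pattern reported in the text.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class CallRecord:
    """One outbound fax call (hypothetical record layout)."""
    ok_200_at: float    # seconds into the call when the 200 OK arrived
    reinvite_at: float  # seconds into the call when the T.38 re-invite arrived
    completed: bool     # True if the fax transaction completed

    @property
    def reinvite_delay(self) -> float:
        # Time between the 200 OK and the T.38 re-invite
        return self.reinvite_at - self.ok_200_at


def delay_summary(records):
    """Compare re-invite delays for completed and failed calls."""
    ok = [r.reinvite_delay for r in records if r.completed]
    failed = [r.reinvite_delay for r in records if not r.completed]
    return {
        "completed_mean": mean(ok) if ok else None,
        "completed_max": max(ok) if ok else None,
        "failed_mean": mean(failed) if failed else None,
        "failed_max": max(failed) if failed else None,
    }


# Synthetic sample data, invented for illustration only
SAMPLE = [
    CallRecord(2.0, 3.5, True),
    CallRecord(1.5, 4.0, True),
    CallRecord(2.0, 6.5, True),
    CallRecord(2.5, 9.0, False),
    CallRecord(2.0, 12.0, False),
    CallRecord(1.0, 22.0, False),
]

summary = delay_summary(SAMPLE)
```

Grouping delays by outcome, rather than averaging across all calls, is what makes the correlation between long re-invite delays and failed transactions visible.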
This brings up another interesting point: The problem we’ve described here does not occur in every network. For example, some VoIP service providers or CLECs closely manage the IP transport. Since they control all ingress and egress, SIP peering between multiple networks is not a problem since it doesn’t occur (or occurs much less often). Some CLECs will build proprietary high-performance metro networks. Since all off-net calls enter and leave via the provider’s gateways, signaling and media delays are small. Some regional VoIP service providers do the same thing. But there are many VoIP service providers that hand every call to one of several IP-carrier partners. There is no control over the number of hops (other than the MaxHops IP parameter, which few providers change from a very high number, such as 70). We believe it is this VoIP business model that has given T.38 fax a bad name. And it’s the model Commetrex’ innovative new technology makes viable for ATAs, gateways, and fax media servers.
Here are the details: The key finding is that the time between the network’s 200 OK of the initial SIP invite (presumably the off-ramp gateway’s placing the outbound call), and the gateway’s determination that the answering terminal is a fax (which causes the T.38 re-invite) can vary greatly. So, what is the fax media server or the calling ATA doing during that time? Well, an ATA and a G.711-capable fax server, such as BladeWare, will be in G.711 pass-through mode since that’s how all these calls begin (G.711 for early media, then T.38 re-invite). Fax servers that don’t support G.711 will be playing dumb; they fake it. They will have included G.711 in their initial SIP invite for interoperability reasons, but, since they have no modem capability (they do send out an RTP recording of CNG), they just hang out, hoping that the re-invite to T.38 comes along before the called terminal disconnects due to no response to its DIS.
But the ATA and the G.711-capable server, in the absence of a re-invite to T.38, must assume this is going to be a G.711 call. The calling and called T.30 state machines plow ahead, communicating via RTP. Eventually, the calling fax terminal will receive the called terminal’s DIS (capabilities declaration) and will send DCS and training (what to do on this call). Well, in every ATA and access gateway on earth, if the re-invite is ultimately received and accepted, the G.711 media stream is shut down, and the calling terminal goes back to the beginning in T.38 mode, sometimes even initializing its protocol engine and sending CNG. End result: the fax session fails due to protocol violations or an interrupted RTP media stream, and that’s what folks are seeing.
However, in this situation, the fax server without G.711 will actually perform better up to a point since it can’t participate in any of the G.711-based T.30 session. It’s mute. The called/answering terminal simply continues to retry its DIS until its T1 timer expires (35 sec.). But if the T.38 re-invite happens prior to the called terminal’s last DIS, the on-ramp gateway accepts the re-invite, and the off-ramp gateway (see above) manages to decode the called terminal’s DIS and send it to the on-ramp gateway in T.38 mode, then the fax has a chance of completing in T.38 mode.
However, in the case of the T.38- and G.711-capable IP fax server or ATA, the fax transaction will fail for late-arriving T.38 re-invites because an earlier DIS from the called terminal was probably caught by the calling fax terminal (or the fax server), which then goes ahead with DCS and train. But the gateways will have shut down the G.711 stream for the transition to T.38, effectively killing the session established between the two endpoint terminals. You then have a failed transaction.
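The two failure patterns just described can be captured in a toy model. This is a simplified sketch built only from the behavior described above, not a T.30 implementation: the `Endpoint` enum, the `legacy_outcome` function, and its parameters are hypothetical names, and all times are assumed to be measured in seconds from the 200 OK.

```python
from enum import Enum
from typing import Optional


class Endpoint(Enum):
    T38_ONLY = "fax server without G.711 modem capability"
    DUAL_MODE = "T.38- and G.711-capable fax server or ATA"


def legacy_outcome(endpoint: Endpoint,
                   reinvite_at: Optional[float],
                   dcs_sent_at: float,
                   last_dis_at: float) -> str:
    """Predict the fate of a call WITHOUT Smart FoIP (toy model).

    reinvite_at  when the T.38 re-invite arrives (None if it never does)
    dcs_sent_at  when a dual-mode caller would answer a DIS with DCS
    last_dis_at  when the called terminal sends its final DIS before
                 its T1 timer expires
    """
    if endpoint is Endpoint.T38_ONLY:
        # The mute server never answers DIS over G.711, so the call
        # survives only if T.38 starts before the called terminal gives up.
        if reinvite_at is not None and reinvite_at < last_dis_at:
            return "may complete in T.38"
        return "fails: called terminal times out"

    # Dual-mode endpoint: the G.711 T.30 session proceeds on its own.
    if reinvite_at is None:
        return "continues in G.711"
    if reinvite_at < dcs_sent_at:
        return "may complete in T.38"
    # The re-invite lands after DCS and training have begun, and the
    # gateways tear down the G.711 stream under the live session.
    return "fails: G.711 session interrupted"


# Illustrative (made-up) timings: re-invite 8 s after the 200 OK
mute = legacy_outcome(Endpoint.T38_ONLY, 8.0, dcs_sent_at=5.0, last_dis_at=32.0)
dual = legacy_outcome(Endpoint.DUAL_MODE, 8.0, dcs_sent_at=5.0, last_dis_at=32.0)
```

With the same late re-invite, the mute server still has a chance while the dual-mode endpoint fails, which matches the observation that the server without G.711 performs better up to a point.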
But this does not happen if Commetrex’ patent-applied-for technology, “Smart FoIP”, is utilized by the ATA, gateway, or fax server. The ATA with Commetrex’ proprietary relay technology attaches a V.21 modem (along with other analysis algorithms) to the media streams at the beginning of the call. The on-ramp gateway analyzes the decoded V.21 data to track the T.30 states of the calling and called terminals. The called terminal will repeatedly send its initial message (DIS) until the calling terminal sends its response. Once the calling terminal receives a complete DIS, it sends its response (DCS) within 75 milliseconds. Therefore, once this calling-terminal response (DCS) is received by the called terminal, uninterruptible modem operations have begun, and the gateways can no longer switch the session to T.38 without possible corruption of the T.30 states being maintained in the endpoint terminals.
With Smart FoIP, once the on-ramp gateway detects the preamble to the calling fax terminal’s response, it will no longer accept the T.38 re-invite, continuing the transaction in G.711 mode and avoiding the session failures caused by the transition occurring during a modem session.
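The gateway rule described above amounts to a small state machine. The following is a minimal sketch under stated assumptions, not Commetrex’ implementation: the class, method, and signal names are hypothetical, the V.21 decoding itself is not shown, and answering a refused re-invite with SIP 488 (Not Acceptable Here) is an assumption, since the text does not say which response is used.

```python
from enum import Enum, auto

SIP_OK = 200
SIP_NOT_ACCEPTABLE_HERE = 488  # assumed refusal code; not stated in the text


class T30Phase(Enum):
    AWAITING_DIS = auto()   # nothing decoded yet
    DIS_SEEN = auto()       # called terminal has sent DIS
    DCS_STARTED = auto()    # preamble of the calling terminal's response seen


class OnRampGateway:
    """Tracks T.30 progress from decoded V.21 data (hypothetical model)."""

    def __init__(self) -> None:
        self.phase = T30Phase.AWAITING_DIS
        self.media = "G.711"  # every call starts in G.711 pass-through

    def on_v21(self, sender: str, signal: str) -> None:
        """Feed one decoded V.21 event; sender is 'called' or 'calling'."""
        if sender == "called" and signal == "DIS":
            if self.phase is T30Phase.AWAITING_DIS:
                self.phase = T30Phase.DIS_SEEN
        elif sender == "calling" and signal == "DCS_PREAMBLE":
            # From here on the modem session must not be interrupted
            self.phase = T30Phase.DCS_STARTED

    def on_t38_reinvite(self) -> int:
        """Accept the switch to T.38 only while it is still safe."""
        if self.phase is T30Phase.DCS_STARTED:
            return SIP_NOT_ACCEPTABLE_HERE  # stay in G.711
        self.media = "T.38"
        return SIP_OK


# Early re-invite: accepted, call moves to T.38
early = OnRampGateway()
early.on_v21("called", "DIS")
early_status = early.on_t38_reinvite()

# Late re-invite: refused, call continues in G.711
late = OnRampGateway()
late.on_v21("called", "DIS")
late.on_v21("calling", "DCS_PREAMBLE")
late_status = late.on_t38_reinvite()
```

The design point is that the decision depends on the observed T.30 state rather than on a fixed timer, so a re-invite is refused only when accepting it would interrupt a modem session already in progress.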
In the case of an IP-based fax server, as shown below, there are no modems since the server is connected directly to the IP network, and the T.30 state machine and the T.38 protocol engine are both on the same system (embodied in the server, as shown). Therefore, there is no need for a V.21 modem since the server maintains its own T.30 calling-terminal state. The server will accept T.38 re-invites up to the point where it has received the called terminal’s DIS, refusing all subsequent re-invites.
Once we implemented these changes in BladeWare, we were amazed at the improvement in fax-transaction completion rates. Removing G.711 capability caused a 10-percent improvement; adding Smart FoIP added another five percent, rivaling the completion rates of a multiline fax board.
But we still needed to test the solution on ATAs. For that, we teamed up with our long-time technology partner 8×8, Inc., which owns and operates Packet8, a VoIP service provider. Since the Packet8 ATAs use Commetrex’ T.38 relay, we thought that doing a before-and-after test would be relatively easy, and 8×8 agreed. (To be continued.)