Media servers have had their day
Media servers have played an important role in enabling many of the real-time – and non-real-time – telecommunications applications with which we are all familiar. Those applications include many things we take for granted: network announcements (e.g., the ‘speaking clock’), voicemail, IVR, unified messaging (which has morphed into unified communications), and outbound diallers (think campaigns and collections).
Media servers provide a range of low-level functionality used to underpin many applications. Their functions include echo cancellation, DTMF digit detection and generation, loudest speaker detection, call recording, and the ‘front-ending’ of text-to-speech (TTS) and speech recognition servers, with such essential features as ‘grunt detection’ and ‘barge-in’ (nothing to do with Marines on exercise).
These days, with a plethora of smartphone apps generating increasing media traffic in telecommunications networks, you could be forgiven for thinking that media servers can only grow in importance. The trouble with that is a bit like poor man’s origami; it’s twofold.
For one thing, lots of functionality is taken care of by the smartphone. For example, in a speech-enabled navigation app, you don’t need a TTS server, or a media server to play back its synthesised speech. That’s because the app relies on a synthetic voice on the device, not on a TTS engine and media server in the network. Nor is a media server needed for the GPS signal used by the app.
Most of the traffic involved is data associated with downloads and regular updates, either to the mapping application or to the TTS voice engine on the smartphone, and none of that requires a media server.
The other thing is that the technology behind media servers is outdated. Traditionally, they have been based on boards full of digital signal processors (DSPs). More recently, they have been implemented on servers with licensed software, which offers functionality akin to that provided by DSPs. Along the way, they have been controlled via complex, low-level C APIs or intermediate markup languages, such as MSCML, MSML, and VoiceXML.
None of those technologies is sustainable in a world where many real-time telco applications are no longer revenue generating, merely revenue protecting. I fear subscribers don’t associate service providers with cool applications, and what reduces subscriber churn – or increases subscriber numbers – is free cinema vouchers and early access to tickets for concerts and sports events.
Nowadays, increasing numbers of businesses are relying on cloud-based applications. In the United Kingdom, over 80 per cent are using cloud in one way or another [1]. That is significant adoption and means the concept has been well and truly accepted. What price then an old-fashioned media server that doesn’t lend itself to a cloud environment? Even when you consider software-based alternatives, there are still scalability and licensing constraints, which mean that capacity planning and ‘just in case’ over-provisioning are as much of a problem as they are with hardware-based media servers.
When you have a cloud-based application involving aspects of telecommunications, the only logical place to lodge the functions equivalent to those offered by a media server is in the cloud. It stands to reason. When questions are asked about scalability, the cloud is the only reasonable answer.
DSP boards are scalable, but only at significant incremental cost. Software alternatives benefit from ever increasing processor power, and are scalable in servers and on virtual machines – provided you’ve purchased sufficient licences and can deploy or redeploy them quickly enough for the purpose. In each case, however, you must make a purchase based on your estimate of how much capacity (or how many licences) you will need.
With cloud, there is no such dilemma. The resources you need are available on demand, when you need them, for as long or short a time as you need them. The scalability issue simply disappears. In reality, it’s someone else’s problem. You can rely on that. It’s the beauty of cloud-based resources.
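To put rough numbers on that contrast, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the hourly demand figures are invented, the 20 per cent headroom is an arbitrary ‘just in case’ margin, and the function names are hypothetical. It simply compares the session-hours you would have to buy up front with the session-hours an on-demand model would actually consume.

```python
import math

def fixed_capacity(hourly_peaks, headroom_pct=20):
    """Licences to buy up front: the busiest hour plus a safety margin."""
    return math.ceil(max(hourly_peaks) * (100 + headroom_pct) / 100)

def utilisation(hourly_peaks, capacity):
    """Fraction of the purchased session-hours that is actually used."""
    used = sum(min(peak, capacity) for peak in hourly_peaks)
    return used / (capacity * len(hourly_peaks))

# Invented concurrent-session peaks for one day (24 hourly samples).
demand = [20, 10, 5, 5, 5, 10, 40, 120, 300, 450, 500, 480,
          400, 420, 460, 440, 380, 300, 200, 150, 100, 60, 40, 30]

licences = fixed_capacity(demand)            # sized for the worst hour
provisioned_hours = licences * len(demand)   # paid for, used or not
on_demand_hours = sum(demand)                # consumed only when needed

print(f"licences purchased: {licences}")
print(f"utilisation of fixed capacity: {utilisation(demand, licences):.0%}")
print(f"session-hours provisioned vs used: {provisioned_hours} vs {on_demand_hours}")
```

With these made-up figures, the fixed estate sits idle for most of the day. The exact ratio depends entirely on the traffic profile, so treat the output as an illustration of the estimation problem, not as a benchmark.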
With a cloud-based approach, you needn’t worry about installing duplicated banks of media servers or redundantly paired racks of servers running pre-licensed software in a data centre. Neither do you need to concern yourself with things like brokering media resources, something the unnecessarily complicated IMS has defined along with media resource functions and media resource controllers. Those things are not needed when you’re using the cloud, because of its inherent resilience and scalability. It’s all very well being able to control upwards of 30,000 sessions at a time, but if you haven’t got the underlying media resource functions – scalable and at your fingertips – brokering is a moot point.
A cloud telephony resources platform manages its own, multiple resources and offers redundancy, resilience, and persistence across different locations, different countries and different continents. The user – enterprise or telco – does not need to concern itself with how many media sessions are needed, nor with their availability. Concerns over where to route media service requests and whether they are being handled efficiently simply do not figure. The cloud platform takes care of that.
A further benefit of cloud-based resources is that they make time to market far less of an issue. Instead of presenting interfaces in markup languages such as MSML and VoiceXML, which surely no one wants to learn, cloud platforms offer RESTful APIs and APIs in popular, general purpose programming languages, such as Python and Java. Those are the languages that today’s web developers know and want to use. When they are coding an application using APIs from several sources (a typical mashup scenario), a familiar programming language is a fundamental requirement.
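As a flavour of what that looks like in practice, here is a minimal sketch in Python that builds a RESTful request asking a platform to play a prompt on a live call. The base URL, the path, the JSON fields and the function name are all hypothetical, invented for this example; a real cloud telephony platform will define its own. Only the Python standard library is used, and the request is built but deliberately not sent.

```python
import json
import urllib.request

# Hypothetical endpoint: a real platform's URL scheme and fields will differ.
BASE_URL = "https://cloud-telephony.example.com/v1"

def build_play_request(call_id, prompt_url, api_token):
    """Build (but do not send) a request to play a prompt on a live call."""
    body = json.dumps({"action": "play", "url": prompt_url}).encode("utf-8")
    return urllib.request.Request(
        url=f"{BASE_URL}/calls/{call_id}/media",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_token}",
            "Content-Type": "application/json",
        },
    )

request = build_play_request(
    "call-123", "https://example.com/welcome.wav", "TOKEN"
)
print(request.get_method(), request.full_url)
```

Passing the request to `urllib.request.urlopen` would send it. The point is that a developer who already calls web APIs needs nothing telephony-specific to do this: no markup dialect, and no C library.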
Traditional media servers have had their day. The days of cloud-based telephony media resources are here – today and tomorrow – and that’s the future.