Dialogic Blog

WebRTC MCU Architecture - All For One And One For All

by Jim Machi

Jan 17, 2017 10:01:07 AM

[Image: MCU architecture]

The conferencing market is huge – it was expected to exceed $2B in 2016, and with good reason: it fulfills a business need to talk to and interact with each other through voice, video, and collaboration techniques such as whiteboarding. But we’ve all been on large conference calls at work where, as people are added, you can clearly tell that performance has degraded, and then we’re all in for a bad experience. Often this comes down to one slow connection – one bad apple ruining it for the rest. But it can also stem from the underlying media server or from the conference call architecture being used. Either way, it is avoidable.

A popular, time-tested architecture uses an MCU (Multipoint Control Unit). In an MCU topology, the audio/video streams from the different clients are all sent to a central media server for processing. For a video conference call, for instance, the MCU media server receives each participant’s video stream, decodes it, tiles the decoded frames with the streams from the other participants, and then encodes the tiled video and sends it back to the participant. This simplifies what each client has to send and receive, reducing the number of streams to just one in each direction.
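The mixing cycle described above can be sketched in a few lines. This is an illustrative toy, not a real MCU API: decoded frames are modeled as simple 2D pixel grids (lists of rows), and the names `tile_frames` and `mcu_mix`, along with the `decode`/`encode` callables, are assumptions for the example.

```python
def tile_frames(frames, cols):
    """Compose equally sized decoded frames into one tiled grid, row-major."""
    tile_rows = [frames[i:i + cols] for i in range(0, len(frames), cols)]
    composite = []
    for tile_row in tile_rows:
        # Pad a short final row with blank tiles so every row has `cols` tiles.
        blank = [[0] * len(tile_row[0][0]) for _ in tile_row[0]]
        tile_row = tile_row + [blank] * (cols - len(tile_row))
        # Stitch the tiles in this row together, scanline by scanline.
        for y in range(len(tile_row[0])):
            composite.append([px for tile in tile_row for px in tile[y]])
    return composite

def mcu_mix(encoded_streams, decode, encode, cols=2):
    """One MCU mixing pass: decode every stream, tile, re-encode per client."""
    decoded = [decode(s) for s in encoded_streams]
    composite = tile_frames(decoded, cols)
    # Every participant gets the same tiled frame, encoded once for each.
    return [encode(composite) for _ in encoded_streams]
```

Note the cost structure this implies: one decode per participant plus one encode per participant, all on the server – which is exactly why the MCU is compute-heavy centrally but lightweight for clients.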

Reducing the required encode/decode to a single stream decreases both the client’s compute and bandwidth consumption, which especially benefits mobile devices. Furthermore, since each stream is decoded and re-encoded at the MCU media server, the clients do not all need to share the same codec, frame rate, and resolution profile. The incoming media streams from the various clients can be transcoded to another codec, trans-sized to a different resolution, and trans-rated to a different frame rate, allowing each client to receive its preferred profile.
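Trans-sizing and trans-rating can be illustrated with two tiny helpers – again a sketch on toy 2D frames, not production video code, and the function names are illustrative. Trans-sizing here is a nearest-neighbor resample; trans-rating simply drops frames, assuming the output rate divides the input rate evenly.

```python
def trans_size(frame, out_w, out_h):
    """Nearest-neighbor resample of a 2D pixel grid to out_w x out_h."""
    in_h, in_w = len(frame), len(frame[0])
    return [[frame[y * in_h // out_h][x * in_w // out_w] for x in range(out_w)]
            for y in range(out_h)]

def trans_rate(frames, in_fps, out_fps):
    """Step a frame sequence down from in_fps to out_fps by dropping frames
    (assumes out_fps divides in_fps evenly)."""
    step = in_fps // out_fps
    return frames[::step]
```

A real MCU would run these per-client, so a 1080p/30fps sender can feed a handset that only wants 360p at 15fps.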

In other words, a key benefit of the MCU topology is that it shifts the encode/decode processing from the client to the server, often as part of a cloud compute service where processing resources are less expensive. MCUs typically work well for audio conferences of any size and for smaller-scale video conferences. For larger-scale video conferences, another architecture, the Selective Forwarding Unit (SFU), has been developed.

More on the SFU architecture next week.


Topics: WebRTC