Dialogic Blog

How to Make a WebRTC App: Step-by-Step Guide

by Vince Puglia

Apr 28, 2016 10:54:54 AM



Back in September 2014, I published what turned out to be a popular blog post, “How to Develop a WebRTC Video Conferencing Server Using PowerMedia XMS, Free in An Hour”. As you can imagine, a lot has changed since then in both WebRTC and our PowerMedia XMS media server, so a refresh of the content was long overdue.

The most notable change came from Google’s Chrome security policy for WebRTC: starting in December 2015, the getUserMedia() API requires HTTPS. So I needed to change the web server to use a secure connection for hosting my WebRTC conferencing application. We’ve also made large strides in optimizing PowerMedia XMS for cloud environments, so I wanted to include the option of using Amazon Web Services (AWS) for those interested in moving there. Lastly, I upgraded the code to use HD 720p resolution instead of the default VGA. A 720p frame (1280x720) carries three times the pixels of a VGA frame (640x480), and the difference is very noticeable in a video conference.
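To make the HTTPS and 720p points concrete, here is a minimal sketch that uses the browser's standard navigator.mediaDevices.getUserMedia() API rather than the Dialogic library. The helper names buildHdConstraints and canCaptureMedia are my own for illustration, and the conf720p.js file in the repository may well request the resolution differently.

```javascript
// Build a getUserMedia constraints object that asks for 720p but lets the
// browser fall back to a lower resolution if the camera cannot deliver it.
function buildHdConstraints() {
  return {
    audio: true,
    video: {
      width: { ideal: 1280 },
      height: { ideal: 720 }
    }
  };
}

// Browsers only expose navigator.mediaDevices on secure origins
// (HTTPS or localhost), so check before trying to capture.
function canCaptureMedia(env) {
  return Boolean(
    env && env.isSecureContext &&
    env.navigator && env.navigator.mediaDevices &&
    typeof env.navigator.mediaDevices.getUserMedia === "function"
  );
}

// Browser-only usage; skipped when the script runs outside a browser.
if (typeof window !== "undefined" && canCaptureMedia(window)) {
  window.navigator.mediaDevices.getUserMedia(buildHdConstraints())
    .then(function (stream) {
      console.log("Captured " + stream.getTracks().length + " track(s)");
    })
    .catch(function (err) {
      console.log("getUserMedia failed: " + err.name);
    });
}
```

Using "ideal" instead of "exact" is a deliberate choice in this sketch: a webcam that tops out at VGA still joins the conference instead of failing outright.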


Before getting into building the conferencing application, it’s worth noting that the architecture used for the video conference is a Multipoint Control Unit (MCU). In an MCU architecture, each WebRTC endpoint encodes and transmits its media stream to a centralized point (the PowerMedia XMS in this case), where the media is decoded and mixed with that of the other conference participants before being returned to the WebRTC endpoint. This differs from a WebRTC point-to-point (P2P) mesh conference: with an MCU each participant sends and receives only one media stream, whereas in a P2P mesh each participant sends media to, and receives media from, every other endpoint in the conference. With N participants, that is N-1 outgoing and N-1 incoming streams per endpoint. As you can imagine, mesh conferencing (sometimes called "mess conferencing") has some serious limitations when scaling.

[Figure: Full Mesh Architecture]

[Figure: Multipoint Control Unit (MCU) Architecture]
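A quick back-of-the-envelope sketch of why mesh conferencing scales poorly. The function name streamsPerParticipant is made up for this illustration, and it only counts media streams per endpoint; it says nothing about bandwidth, codecs or CPU load.

```javascript
// Count the media streams a single endpoint has to send and receive
// in a conference of n participants.
function streamsPerParticipant(n, architecture) {
  if (!Number.isInteger(n) || n < 2) {
    throw new RangeError("a conference needs at least 2 participants");
  }
  if (architecture === "mesh") {
    // One outgoing and one incoming stream for every other participant.
    return { sent: n - 1, received: n - 1 };
  }
  if (architecture === "mcu") {
    // One stream up to the media server, one mixed stream back.
    return { sent: 1, received: 1 };
  }
  throw new Error("unknown architecture: " + architecture);
}

[2, 4, 8].forEach(function (n) {
  var mesh = streamsPerParticipant(n, "mesh");
  var mcu = streamsPerParticipant(n, "mcu");
  console.log(n + " participants: mesh sends " + mesh.sent +
              " per endpoint, MCU sends " + mcu.sent + " per endpoint");
});
```

The mesh numbers grow with every participant who joins, while the MCU numbers stay flat because the mixing work moves to the server.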

Making the WebRTC App

Programming Experience: Novice

Time to implement: ~1 hour

What you need:

* Bare metal server OR VMware w/ adequate processor/memory/storage OR Amazon Web Services access (see server requirements in PART 1 below)

* (1) IP address to be assigned for PowerMedia XMS


* Your favorite editor (vi, emacs, notepad++)

* PC client(s) w/ microphone, camera and Chrome installed

PART 1: Download and Install PowerMedia XMS (~5 to 30 mins)

**Two options**

Option 1 (Install using on-premise hardware)

1.) Download the latest PowerMedia XMS ISO here

Note – I used PowerMedia XMS 3.0 SU2 for these instructions

2.) Follow the "PowerMedia XMS Installation and Configuration Guide" for proper system requirements and installation. You’ll need Linux.

Option 2 (Install using Amazon Web Services AWS)

1.) Log on to your Amazon Web Services EC2 console and select Images → AMIs on the left-hand side. Change the search to “Public images” and search for ‘dialogic_xms’. The most current PowerMedia XMS images will be listed and available for use.

Refer to the Dialogic PowerMedia XMS and Amazon Web Services guide for more detailed configuration information.


Note - the PowerMedia XMS developer install comes with a fully enabled four-port license. Contact the Dialogic sales team to discuss licensing options beyond the base four ports.

PART 2: Obtaining the WebRTC HTML/CSS/JS code (~5 mins)

**Two options**


Option 1 (manual download from github.com/Dialogic):

1.) Browse to: https://github.com/Dialogic/dialogic-html-javascript

2.) Select “Download Zip” to pull down the dialogic-html-javascript-master.zip archive


3.) Copy the dialogic-html-javascript-master.zip to the /tmp directory of the PowerMedia XMS

Option 2 (git download direct to PowerMedia XMS server):

1.) Using PuTTY (or another SSH client), SSH into the PowerMedia XMS server

2.) Log in as root with the default password powermedia

3.) (Optional if already installed) Install the git software: yum install git

4.) Change to the tmp directory: cd /tmp

5.) Retrieve the archive from the github repository: wget -O dialogic-html-javascript-master.zip https://github.com/Dialogic/dialogic-html-javascript/archive/master.zip


PART 3: Unzip the archive and move to appropriate directory (~2 mins)

1.) Unzip the archive: unzip /tmp/dialogic-html-javascript-master.zip

2.) Change into the new directory: cd dialogic-html-javascript-master

3.) Copy the entire contents of the ‘confServer’ directory into where the Apache web server will be hosting: cp -r /tmp/dialogic-html-javascript-master/confServer /var/www/rtcweb/html

4.) Copy the 720p conference script (conf720p.js) into the confServer js directory: cp /var/www/rtcweb/html/js/conf720p.js /var/www/rtcweb/html/confServer/js


PART 4: Review the HTML front-end for the user interface (~5 mins)

1.) Open the conference.html for review – there are three main sections to the user interaction:

  • loginPanel: input for the "username" variable and calls the "conferenceLoginClickHandler" function upon onclick
  • videoPanel: defines the "remoteVideo" element used to display the inbound WebRTC video feed from PowerMedia XMS. Also defines the "localVideo" element, which is used to obtain the local user's video feed but is hidden from the HTML display
  • conferencePanel: input for the "confID" variable and calls the "conferenceClickJoinHandler" function upon onclick

<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="utf-8">
    <meta http-equiv="X-UA-Compatible" content="IE=edge">
    <meta name="viewport" content="width=device-width, initial-scale=1">
    <meta name="description" content="">
    <meta name="author" content="">

    <title>XMS Node NetAnn Conferencing Demo</title>

    <!-- Bootstrap CSS -->
    <link href="css/bootstrap.min.css" rel="stylesheet">
    <link href="css/style.css" rel="stylesheet">
    <!-- JavaScript -->
    <script src="js/webrtc.js"></script>
    <script src="js/conference.js"></script>
  </head>
  <body>
    <div class="container">
      <div class="form-signin" role="form">
          <div id="loginPanel">
           <h2 class="form-signin-heading">Login:</h2>
           <input type="text" class="form-control" id="username" placeholder="username" required autofocus>
           <button class="btn btn-lg btn-primary btn-block" onclick="conferenceLoginClickHandler()">Login</button>
          </div>
          <div id="videoPanel" hidden>
            <video id="remoteVideo" height="300" autoplay controls></video>
            <video id="localVideo" height="300" style='visibility:hidden' autoplay controls muted></video>
          </div>
          <div id="conferencePanel" hidden>
            <h2 class="form-signin-heading">Conference ID:</h2>
            <input type="text" class="form-control" id="confID" placeholder="i.e. 1234" required autofocus>
            <button class="btn btn-lg btn-primary btn-block" onclick="conferenceClickJoinHandler()">Join</button>
          </div>
      </div>
    </div><!-- /container -->

    <!-- Bootstrap core JavaScript
     ================================================== -->
    <!-- Placed at the end of the document so the pages load faster -->
  </body>
</html>

PART 5: Configure the JS to interface with PowerMedia XMS and the HTML front-end (~10 mins)

1.) First, we'll need to define the functions being called in PART 4. Let’s start with "conferenceLoginClickHandler()", which runs when the user enters their username and selects login. In this function, we'll create a new instance of the Dialogic JavaScript library and assign it to myConf. Once the instance is created, we'll retrieve the 'username' element from our HTML form and pass its value to myConf.register() along with the secure WebSocket address built from the PowerMedia XMS server IP address.

Note - be sure to change the "my_xms" variable to the proper IP address you set in PART 1.

Note - the "handlers" variable being passed with myConf.setHandlers is defined in step 5 below.  

    var my_xms = "XXX.XXX.XXX.XXX";
    var myConf = null;

    function conferenceLoginClickHandler() {
        console.log("*** conferenceLoginClickHandler - ENTER ***");
        // Create a new instance of the Dialogic JavaScript library
        myConf = new Dialogic();
        // Register the event handlers (the "handlers" variable is defined in step 5)
        myConf.setHandlers(handlers);
        // Retrieve the username from the HTML form and register over a secure WebSocket
        var username = document.getElementById("username");
        var my_xms_ws = "wss://" + my_xms + ":1080";
        myConf.register(username.value, my_xms_ws, '');
    }
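Forgetting to replace the XXX.XXX.XXX.XXX placeholder is an easy mistake to make, so here is a small sketch of a guard you could put in front of the registration call. buildXmsWebSocketUrl is a hypothetical helper of my own, not part of the Dialogic library, and it assumes the same wss scheme and port 1080 used in the code above.

```javascript
// Hypothetical helper: build the secure WebSocket URL used for registration
// and fail early if the server address was never filled in.
function buildXmsWebSocketUrl(host, port) {
  var cleaned = typeof host === "string" ? host.trim() : "";
  if (cleaned === "" || cleaned.indexOf("XXX") !== -1) {
    throw new Error("Set my_xms to the IP address of your PowerMedia XMS server");
  }
  // Default to the port used in this guide when none is given.
  return "wss://" + cleaned + ":" + (port || 1080);
}

// 192.0.2.10 is a documentation-only example address.
console.log(buildXmsWebSocketUrl("192.0.2.10"));
```

Inside conferenceLoginClickHandler() this would replace the string concatenation, for example: var my_xms_ws = buildXmsWebSocketUrl(my_xms);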


2.) Next, we need to define "conferenceClickJoinHandler()", which runs when the user enters the conference ID and selects join. In this function, we'll first convert the "confID" value into the appropriate NetAnn syntax, conf=YYYY@XXX.XXX.XXX.XXX, where YYYY is the conference ID and XXX.XXX.XXX.XXX is the IP address of the PowerMedia XMS server. Once converted, we'll use myConf.call to pass it to PowerMedia XMS.

    function conferenceClickJoinHandler() {
        console.log("*** conferenceClickJoinHandler - ENTER ***");
        // Build the NetAnn address: conf=<conference ID>@<XMS IP address>
        var confID = "conf=" + document.getElementById("confID").value + "@" + my_xms;
        console.log("confID:" + confID);
        // Place a video call into the conference
        var ret = myConf.call(confID, 'video');
        if (ret == 'ok') {
            // Swap the conference form for the video panel
            document.getElementById("conferencePanel").hidden = true;
            document.getElementById("videoPanel").hidden = false;
        } else {
            console.log("Error attempting conferenceClickJoinHandler");
        }
    }
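Since the conference ID comes straight from a text box, it can be worth validating it before building the NetAnn address. The sketch below is one way to do that. buildConferenceAddress is a hypothetical helper, and the rule it enforces (1 to 16 letters, digits, hyphens or underscores) is purely my assumption for the example, not a documented PowerMedia XMS limit.

```javascript
// Hypothetical helper: turn user input into conf=<id>@<host>, rejecting
// input that would produce a malformed address.
function buildConferenceAddress(rawId, host) {
  var id = String(rawId).trim();
  // Assumed rule for this sketch only: 1-16 letters, digits, '-' or '_'.
  if (!/^[A-Za-z0-9_-]{1,16}$/.test(id)) {
    throw new Error("Invalid conference ID: " + rawId);
  }
  return "conf=" + id + "@" + host;
}

// 192.0.2.10 is a documentation-only example address.
console.log(buildConferenceAddress(" 1234 ", "192.0.2.10"));
```

Trimming first means a stray space typed after the ID does not end up inside the address passed to myConf.call.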

3.) The last function we need to define is "initialize()", which gets called as a result of a successful registration (see the registerSuccessHandler in step 4 below). As part of this function, we will retrieve the local and remote video elements and pass them to myConf.initialize() in the spec object, then define our mediaConstraints for audio and video.

    function initialize() {
        console.log("*** initialize - ENTER ***");
        var localVideo = document.getElementById("localVideo");
        var remoteVideo = document.getElementById("remoteVideo");
        // Tell the library which elements carry the local and remote media
        var spec = { 'localVideo': localVideo, 'remoteVideo': remoteVideo, 'remoteAudio': null };
        var ret = myConf.initialize(spec);
        // Constraints requesting both audio and video
        var mediaConstraints = { 'audio': true, 'video': true };
        if (ret != 'ok') {
            console.log("Error initializing user media");
        }
    }

4.) Next, we'll need to define the handler functions. In this case we only need to add logic to the registerSuccessHandler, which calls the "initialize()" function and switches the HTML form inputs from the "loginPanel" to the "conferencePanel". The remaining handlers simply log that they were called.

    var registerSuccessHandler = function () {
        console.log("*** registerSuccessHandler ***");
        // Set up the media elements, then swap the login form for the conference form
        initialize();
        document.getElementById("loginPanel").hidden = true;
        document.getElementById("conferencePanel").hidden = false;
    };
    var registerFailHandler = function () {
        console.log("*** registerFailHandler ***");
    };
    var ringingHandler = function () {
        console.log("*** ringingHandler ***");
    };
    var incomingCallHandler = function (name) {
        console.log("Incoming call from: " + name);
    };
    var callHangupHandler = function () {
        console.log("*** callHangupHandler ***");
    };
    var disconnectHandler = function () {
        console.log("*** disconnectHandler ***");
    };
    var userMediaSuccessHandler = function () {
        console.log("*** userMediaSuccessHandler ***");
    };
    var userMediaFailHandler = function () {
        console.log("*** userMediaFailHandler ***");
    };
    var remoteStreamAddedHandler = function () {
        console.log("*** remoteStreamAddedHandler ***");
    };
    var messageHandler = function () {
        console.log("*** messageHandler ***");
    };
    var infoHandler = function () {
        console.log("*** infoHandler ***");
    };

5.) Lastly, as mentioned in step 1, we'll need to define the handlers variable, which maps each event raised while registering and joining the conference to the function that handles it. Below is a stock list of handlers that can be implemented. Note that each entry refers to one of the handler functions defined in step 4, or is null when the event is not handled.

var handlers = {
    'onRegisterOk': registerSuccessHandler,
    'onRegisterFail': registerFailHandler,
    'onRinging': ringingHandler,
    'onConnected': null,
    'onInCall': incomingCallHandler,
    'onHangup': callHangupHandler,
    'onDisconnect': disconnectHandler,
    'onUserMediaOk': userMediaSuccessHandler,
    'onUserMediaFail': userMediaFailHandler,
    'onRemoteStreamOk': remoteStreamAddedHandler,
    'onMessage': messageHandler,
    'onInfo': infoHandler,
    'onDeregister': null
};


6.) Save the changes and close the conference.js file
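Most of the handlers in step 4 do nothing but log their own name, so if the repetition bothers you, the handlers object could be generated instead of written out by hand. This is only a sketch: makeLoggingHandler and buildHandlers are hypothetical helpers of my own, the event names are the ones listed in step 5, and I'm assuming the library accepts the generated object the same way it accepts the hand-written one.

```javascript
// Hypothetical helper: create a handler that only logs which event fired.
function makeLoggingHandler(name, log) {
  var write = log || console.log;
  return function () {
    write("*** " + name + " ***");
  };
}

// Build a handlers object: logging stubs by default, replaced by an entry
// from "overrides" wherever real logic (or an explicit null) is needed.
function buildHandlers(eventNames, overrides, log) {
  var result = {};
  eventNames.forEach(function (eventName) {
    var hasOverride = Object.prototype.hasOwnProperty.call(overrides, eventName);
    result[eventName] = hasOverride ? overrides[eventName]
                                    : makeLoggingHandler(eventName, log);
  });
  return result;
}

var eventNames = [
  "onRegisterOk", "onRegisterFail", "onRinging", "onConnected", "onInCall",
  "onHangup", "onDisconnect", "onUserMediaOk", "onUserMediaFail",
  "onRemoteStreamOk", "onMessage", "onInfo", "onDeregister"
];

var generatedHandlers = buildHandlers(eventNames, {
  // In the real app this is where registerSuccessHandler from step 4 would go.
  onRegisterOk: function () { console.log("registered"); },
  onConnected: null,
  onDeregister: null
});

generatedHandlers.onRinging(); // logs "*** onRinging ***"
```

The trade-off is readability: the explicit list in step 5 makes every event visible at a glance, which is probably the better choice for a first pass through the code.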

PART 6: Testing your new WebRTC Conferencing Server (~3 mins)

1.) Open a Chrome browser session and navigate to: https://<IP address of your XMS>/rtcweb/confServer/conference.html

2.) Enter your name and press <Enter>

3.) Select "Allow" for the browser to access the microphone and camera

4.) Enter the conference ID and press <Enter>

5.) BOOM – we’re done! Share the URL of your new video server and Enjoy!


For a high-level overview of how to make a WebRTC app, watch this video.
