Currently you will find support in the latest stable version of Google Chrome, and in the nightly builds of Firefox. Both Chrome and Firefox use vendor prefixes for the WebRTC API, and each has a couple of syntax quirks. However, the official WebRTC examples include adapter.js, a shim which maps both browsers to the proposed API naming and smooths out the syntax differences. I will be assuming usage of this shim going forward.
Creating Your First Media Stream
Obtaining and displaying video and audio streams from your webcam is really a piece of cake. WebRTC includes a MediaStream API, which provides a function called 'getUserMedia'. This function allows you to prompt the user for permission to enable and access their webcam and microphone streams.
You need to specify which streams you would like access to (video and/or audio), a success handler which exposes the stream after it is opened, and a failure handler in case something goes wrong.
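A minimal sketch of that call, assuming the adapter.js shim is loaded (it exposes a unified 'getUserMedia' function that hides the vendor prefixes):

```javascript
// Ask for both video and audio. adapter.js hides the vendor-prefixed
// implementations behind a single getUserMedia(constraints, success, failure).
function requestLocalStream(onStream) {
  getUserMedia(
    { video: true, audio: true },   // which streams we would like access to
    onStream,                       // success: called with the opened MediaStream
    function (error) {              // failure: access denied, hardware in use, etc.
      console.log('getUserMedia failed: ' + error.name);
    }
  );
}
```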
Invoking the method will cause the user's browser to ask for permission to access the requested hardware. In Chrome, this shows up as a permission prompt at the top of the page.
If the user chooses to allow access, the success callback is invoked and passes a MediaStream object which can be used to access the live stream. This is consumable by the HTML5 video tag, but it needs to be attached a bit differently depending on your browser. Adapter.js masks the differences for you, and instead provides the 'attachMediaStream' function, which hooks your stream up to a specified video tag.
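With adapter.js in place, hooking the stream up is a single call. A quick sketch - the 'localVideo' element id here is just an assumption for illustration:

```javascript
// adapter.js provides attachMediaStream(element, stream), which hides the
// src / mozSrcObject / createObjectURL differences between browsers.
function showLocalStream(stream) {
  var video = document.getElementById('localVideo'); // assumed <video> tag id
  attachMediaStream(video, stream);
  video.play();
  return video;
}
```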
If the user declines your request, your failure callback will be called with an access denied error. You can also receive a hardware unavailable error if another browser or application is already using your microphone or camera. Note, however, that multiple instances of the same browser are able to share the stream from your hardware without a problem.
Sharing Your MediaStream
Staring at our own video stream isn't really exciting, so let's look at how another WebRTC API can be used to share a MediaStream with a second client. We are going to share a video stream between two clients using the PeerConnection API and a simple SignalR signaling server.
The first thing we need to do is create an RTCPeerConnection. The peer connection is going to handle negotiating a network connection with another client, and keep an open session allowing the two to communicate directly.
Notice that you can either pass 'null' into the constructor for an RTCPeerConnection, or a collection of servers for Interactive Connectivity Establishment (ICE) to use. We will come back to what that entails in just a little bit.
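A sketch of both options - the STUN server URL below is Google's public one, used here purely as an example:

```javascript
// Either pass null, or a set of STUN/TURN servers for ICE to try.
function createConnection() {
  var config = { iceServers: [{ url: 'stun:stun.l.google.com:19302' }] };
  return new RTCPeerConnection(config);
  // or, with no ICE servers at all:
  // return new RTCPeerConnection(null);
}
```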
After creating our RTCPeerConnection, we need to fire up the connection process, which is done simply by adding the stream we received from 'getUserMedia' into the connection object with 'addStream'. Then we need to create a WebRTC offer and send it over to the peer we would like to connect with. The offer is sent via SignalR, which I'll touch on a bit further down the page - but know that this could really be done via any communication mechanism.
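Those steps might be sketched like this - 'sendSignal' is a hypothetical placeholder for whatever delivers the message to the other client (SignalR in this post's sample):

```javascript
// Kick off the call: add our local stream, then create an offer and ship it
// to the remote peer over our signaling channel.
function startCall(connection, localStream, sendSignal) {
  connection.addStream(localStream);
  connection.createOffer(
    function (offer) {                           // success: offer created
      connection.setLocalDescription(offer);
      sendSignal(JSON.stringify({ sdp: offer })); // relay the SDP as JSON
    },
    function (error) {                           // failure
      console.log('createOffer failed: ' + error);
    }
  );
}
```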
Creation of this offer is where the ICE process comes into play. ICE is the process that drives the underlying connection discovery and negotiation between clients. Before sending over an offer, it will attempt to find possible IP/Port combinations that the other client could use to create a connection with you. Each one found is called a 'candidate', and triggers the 'onicecandidate' callback.
Each time the client finds a new candidate, it will send it over to the remote peer. When the peer receives the candidate message, it constructs an RTCIceCandidate object and adds it to the collection of ICE candidates in its RTCPeerConnection.
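Both sides of the candidate exchange might look something like this sketch, with 'sendSignal' again standing in for your signaling transport:

```javascript
// Outgoing: each candidate ICE discovers triggers onicecandidate, and we
// relay it to the remote peer.
function wireIceCandidates(connection, sendSignal) {
  connection.onicecandidate = function (event) {
    if (event.candidate) {   // a null candidate signals the end of gathering
      sendSignal(JSON.stringify({ candidate: event.candidate }));
    }
  };
}

// Incoming: rebuild the candidate and hand it to our own connection.
function receiveIceCandidate(connection, message) {
  var signal = JSON.parse(message);
  connection.addIceCandidate(new RTCIceCandidate(signal.candidate));
}
```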
ICE will start with locally available connection points, and will then branch out to external possibilities. This is where the TURN and STUN servers can be utilized. At a basic level, a STUN server simply accepts a request from the client that has gone through Network Address Translation (NAT), and sends back a response containing the IP and port that the call came from. This gives the client a picture of how it can be accessed from outside its local network. A TURN server takes this a bit further, and instead of returning the client's connection info, it returns its own. It then becomes a relay for messages between the two connected parties.
As shown during creation of the RTCPeerConnection object, you do not need to supply any ICE servers. Leaving it set to 'null' simply means you have a lower chance of establishing a successful connection when you are communicating between clients that are walled off by firewalls and routers.
Once ICE finishes its job, the communication details that were found, along with details about the stream you added, are summarized into an SDP message, which is sent over to the other client as the offer.
When the remote client receives your SDP message, it constructs its own SDP message in response, following the same ICE-gathering procedure that the originating client performed. We also add our video stream to the connection, since we want to share it with the call originator. This time the SDP message is sent back as an answer instead of an offer.
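The answering side could look something like this sketch ('sendSignal' remains a placeholder for your signaling transport):

```javascript
// Called when an offer arrives: record the remote description, attach our
// own stream, then answer back over the same signaling channel.
function receiveOffer(connection, localStream, message, sendSignal) {
  var signal = JSON.parse(message);
  connection.setRemoteDescription(new RTCSessionDescription(signal.sdp));
  connection.addStream(localStream);
  connection.createAnswer(
    function (answer) {                            // success: answer created
      connection.setLocalDescription(answer);
      sendSignal(JSON.stringify({ sdp: answer })); // send it back as the answer
    },
    function (error) {                             // failure
      console.log('createAnswer failed: ' + error);
    }
  );
}
```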
Once the originating client receives an answer, WebRTC has everything it needs to determine a set of connection points that can successfully communicate, and fire up our call.
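Back on the originating side, handling the answer is a single step - a sketch:

```javascript
// Once the answer arrives, setting it as the remote description gives
// WebRTC everything it needs to bring the call up.
function receiveAnswer(connection, message) {
  var signal = JSON.parse(message);
  connection.setRemoteDescription(new RTCSessionDescription(signal.sdp));
}
```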
When the connection becomes active, each client will be notified by their RTCPeerConnection about any streams that were added by a remote peer. This occurs through the 'onaddstream' callback, which provides an event that contains the MediaStream object. From here, we can handle it however we like - in our case, we are going to create a new video element and just toss it in the body of the document, which should make for a pretty beautiful UI.
Likewise, if the remote stream is closed, the 'onremovestream' callback will fire, indicating the stream is no longer available.
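A sketch of both callbacks, again leaning on adapter.js's 'attachMediaStream':

```javascript
// When the remote stream shows up, drop a new video element into the page.
function wireRemoteStream(connection) {
  connection.onaddstream = function (event) {
    var video = document.createElement('video');
    video.autoplay = true;
    attachMediaStream(video, event.stream); // adapter.js hooks the stream up
    document.body.appendChild(video);       // our "pretty beautiful UI"
  };
  connection.onremovestream = function () {
    console.log('Remote stream closed');
  };
}
```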
A Note on Signaling
WebRTC doesn't care how the initial setup messages get sent around, so it is your responsibility to find a way of delivering them from one client to the other. There are a lot of methods that can be used for this, and nothing very complex is required. You just need to be able to transfer some JSON between client browsers. SignalR is shown here, but there are a lot of other options - WebSockets, Google Gears, or heck - even email, which I wouldn't highly recommend.
Keep in mind that the signaling server is only used for ICE setup - once a valid pair of ICE candidates is determined and connectivity is established, all communication is directly peer-to-peer.
We aren't going to get into the details of SignalR, but since we are using it as our signaling mechanism, I will quickly explain what is happening here. Notice the 'hub.server.send' calls in the snippets above - these call a function named 'send', defined in the server-side SignalR hub, whose implementation is about as simple as it gets.
All we are doing is taking a string, which is our WebRTC signal in JSON form, and sending it back down to the other clients connected to the hub. Back on the client side, the 'newMessage' function will be invoked on each connected client and passed the JSON signal message. Please note that while this hub will serve the message up to as many clients as are on the page, I am pretty sure more than 2 connected clients will break the sample code :)
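For reference, the client-side wiring might be sketched like this - it assumes jQuery and the SignalR scripts are loaded, and the hub name 'signalHub' plus the 'onSignal' handler are assumptions for illustration:

```javascript
// Wire up the SignalR hub proxy. Client callbacks must be registered
// before the connection is started.
function connectSignaling(onSignal) {
  var hub = $.connection.signalHub;     // hub name is assumed for illustration
  // Every relayed message lands here as a JSON string; parse it and let the
  // caller route it (SDP vs. ICE candidate).
  hub.client.newMessage = function (message) {
    onSignal(JSON.parse(message));
  };
  // Once started, hub.server.send(json) relays a signal to the other peers.
  return $.connection.hub.start();
}
```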
If you are looking for an introduction to SignalR, Skyline's Jeremiah Billmann has written a great post on the subject.
This stuff is really fun to play with and has tons of potential, so I encourage you to take a bit of time to go hands-on with WebRTC. If you are interested in taking a look at the entire working program that this post is based on, you can grab it at GitHub.
The solution on GitHub is a .NET MVC 4 project, since it includes a simple SignalR implementation as the signaling server. However, you could very easily pull out the couple of calls and replace them with whatever signaling mechanism you'd like.
Just open up a couple browser windows and hit the "Start Call" button in one of them, and hopefully things will fire right up!
Building Something Bigger
In addition to the simple sample project mentioned above, I have also started putting together a more advanced video conferencing project that you may want to check out.
You can hit the site live or find it on GitHub.
This project shows how you can use something like SignalR to put a management layer on top of the WebRTC calls. It also allows you to target specific clients, instead of just the first other person to hit the page.
Currently the site acts as a sort of lobby for video chat users, but transforming it into a multi-user conferencing site is a work in progress. Feel free to join in on GitHub if you have ideas, or want to use it as a starting point for your own project!