WebRTC - the unknown hero of modern pandemic times
The year is 2020, the month is January. Many people are still in holiday mode. There is more and more talk about something called "Coronavirus" in the news, in the media and among people. Obviously it is a virus, but what kind, how it is treated, how it spreads, no one knows. Fast forward a couple of months: the world is in lockdown, kids cannot go to school, students cannot go to college, many people cannot go to their jobs, movement is restricted, and in some countries people are not allowed on the streets. But the economy still works, kids and students still attend classes, and most people keep working. How is that possible? The answer is "modern technology", and in large part live video streaming.
What is "(live) video streaming"?
In simple terms, video streaming is the sharing of a moving image between two sides, most often over the internet. But here we won't be talking in simple terms, that's for noobs. The real definition for us is "Moving chunks of encoded data with special protocols through many network layers, whose accumulation forms a moving image presented in video format". Let's break that expression down:
- Moving - by the dictionary, "changing or capable of changing position"
- chunks of encoded data - small parts of specific information (or a message), converted to a specific format
- special protocols - sets of rules for a specific process
- network layers - parts of the global network infrastructure in charge of transmitting information
- moving image presented in video format - a set of many chunks which together form an image; a sequence of such images forms the video (moving visual media)
Our "unknown hero" hides in the expression above, under special protocols. Furthermore, live video streaming is video streaming in real time, with no or minimal latency between the side that transmits the video and the side that receives it.
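To make "chunks of encoded data" a bit more concrete, here is a minimal sketch of cutting a buffer into numbered chunks and putting it back together, even when the chunks arrive out of order. The names `Chunk`, `splitIntoChunks` and `reassemble` are made up for this post, and real streaming protocols use far richer packet formats; treat it only as an illustration of the idea.

```typescript
// A numbered piece of a larger encoded buffer (hypothetical shape).
interface Chunk {
  seq: number;
  payload: Uint8Array;
}

// Cut the data into pieces of at most chunkSize bytes, numbering each piece.
function splitIntoChunks(data: Uint8Array, chunkSize: number): Chunk[] {
  if (chunkSize <= 0) {
    throw new Error("chunkSize must be positive");
  }
  const chunks: Chunk[] = [];
  let seq = 0;
  for (let offset = 0; offset < data.length; offset += chunkSize) {
    chunks.push({ seq, payload: data.slice(offset, offset + chunkSize) });
    seq++;
  }
  return chunks;
}

// Put the pieces back in sequence order and join them into one buffer.
function reassemble(chunks: Chunk[]): Uint8Array {
  const ordered = chunks.slice().sort((a, b) => a.seq - b.seq);
  const total = ordered.reduce((sum, c) => sum + c.payload.length, 0);
  const out = new Uint8Array(total);
  let offset = 0;
  ordered.forEach((c) => {
    out.set(c.payload, offset);
    offset += c.payload.length;
  });
  return out;
}
```

The sequence number is what lets the receiving side rebuild the original data no matter which path each chunk took through the network.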
It just works...
As already stated, video streaming works over a network; it needs a moving image and at least two sides for the data transfer (computers, phones, tablets..., or a mix of devices). Computers can stream to phones, phones can stream to computers, video cameras can stream to TVs; the combinations are limitless.
Live video streaming introduces more complexity into that process. To have a seamless experience across multiple devices, we had to agree on a set of rules for how that real-time streaming is accomplished, so we needed to write protocols that describe the process. There are many live video streaming protocols, but one stands out from the others, and that is of course WebRTC (Web Real-Time Communication). To keep up with increasingly demanding web pages and applications, browser and browser engine developers needed a technology that is secure, fast, heavily optimized for the web and able to work on almost any network-attached device, even outside the browser. WebRTC is their solution, and considering faster hardware and more complex software, especially within the browser, it is no wonder it has become one of the most popular technologies for real-time video streaming.
Let's get technical
If you are a software developer, you probably know that the web is not the place for low-level, high-performance code. Heck, the "entire" client-side web runs on HTTP over TCP, which relies on request-response communication and was by no means created for low-latency live video streaming. But did you notice that I put entire in quotes? That's because the claim is not entirely accurate: we have a gray sheep in the herd of web data transfer technologies. Because it needs high performance and the lowest possible latency, WebRTC sends audio and video as SRTP packets over UDP, and its data channels use a transport protocol called SCTP (Stream Control Transmission Protocol), tunneled over DTLS and UDP. SCTP can deliver messages reliably and in order like TCP, or give up those guarantees for speed like UDP (hence the gray sheep). Because of that, WebRTC is able to transmit chunks of image/data extremely fast, and it is mostly limited by internet speed and network or computer performance, with little or no bottleneck on the software side, depending on the situation. There are also other reasons why WebRTC may be the superior live streaming technology among web-based streaming solutions, which we shall see below.
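The "gray sheep" behavior shows up when a data channel is created: the options decide whether it behaves more like TCP or more like UDP. Below is a small sketch of that choice. The field names `ordered`, `maxRetransmits` and `maxPacketLifeTime` are taken from the browser's `RTCDataChannelInit` dictionary; the `ChannelOptions` interface, the `DeliveryMode` names, the `channelOptionsFor` helper and the 150 ms default are assumptions made up for this example.

```typescript
// Mirrors three fields of the browser's RTCDataChannelInit dictionary.
interface ChannelOptions {
  ordered?: boolean;
  maxRetransmits?: number;
  maxPacketLifeTime?: number; // milliseconds
}

// Hypothetical names for three common trade-offs.
type DeliveryMode = "reliable" | "lossy-unordered" | "time-limited";

function channelOptionsFor(mode: DeliveryMode, lifetimeMs: number = 150): ChannelOptions {
  if (mode === "reliable") {
    // TCP-like: every message arrives, in the order it was sent.
    return { ordered: true };
  }
  if (mode === "lossy-unordered") {
    // UDP-like: never retransmit, deliver in whatever order messages arrive.
    return { ordered: false, maxRetransmits: 0 };
  }
  // In between: retry a lost message only while it is still fresh.
  // maxRetransmits and maxPacketLifeTime must not be set together.
  return { ordered: false, maxPacketLifeTime: lifetimeMs };
}

// In a browser this would be passed to an existing RTCPeerConnection, e.g.:
//   peerConnection.createDataChannel("positions", channelOptionsFor("lossy-unordered"));
```

For a chat message you would probably pick the reliable mode, while for something like cursor positions, where only the newest value matters, dropping stale messages is usually the better deal.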
Here is my address, give me yours!
To get a clearer picture of how WebRTC works, let's take a rough example. We have user A and user B. User A uses a PC while user B uses a mobile phone. Both use [insert popular video chat app] inside their browsers. The connection process starts when one of the users' devices (it doesn't matter which one, but let's say user A's device) initiates the connection and sends its connection info, external IP address and some other relevant information to a server, with the intention that the server forwards it to the other user's device. This server is called a signaling server. For the server to be able to push data to a client (web application), the most common approach is WebSockets, which allow bi-directional data transfer.
But why does user A need to send connection info to user B in the first place? Well, that's one of the strongest points of WebRTC compared to other solutions.
First, we should understand how common web communication works. We have a client and we have a server. The client sends requests to the server and the server sends responses back to the client with specific data. The client part is usually on the user's computer, inside the browser, and the server part is somewhere "in the cloud". There can be many clients but there is just one main server (which can have multiple instances). As we already know, live video streaming requires very low latency for the best experience, but if the data travels through a central server 1000 miles away every time you video chat with your neighbor, even if he's next door to you, that will introduce noticeable latency and ruin the experience for you both. Because of that, WebRTC establishes a direct connection between two (or more) devices, commonly known as a peer-to-peer connection. To be able to find each other, the devices must know the connection info and IP address of the other device(s).
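The server's job in this step is tiny: take a message from one device and hand it to the other. Here is a rough in-memory sketch of that relay. WebRTC does not prescribe any message format for this part, so the `SignalMessage` shapes, the `SignalingRoom` class and the `Deliver` callback are all invented for illustration; in a real app the callback would typically write to that user's WebSocket.

```typescript
// Hypothetical message shapes; WebRTC leaves the signaling format up to the app.
type SignalMessage =
  | { kind: "offer"; sdp: string }
  | { kind: "answer"; sdp: string }
  | { kind: "candidate"; candidate: string };

// How the room hands a message to one participant,
// e.g. by calling send() on that participant's WebSocket.
type Deliver = (from: string, message: SignalMessage) => void;

class SignalingRoom {
  private peers: { id: string; deliver: Deliver }[] = [];

  join(id: string, deliver: Deliver): void {
    this.peers.push({ id, deliver });
  }

  // Forward a message to everyone in the room except its sender.
  relay(from: string, message: SignalMessage): void {
    this.peers.forEach((peer) => {
      if (peer.id !== from) {
        peer.deliver(from, message);
      }
    });
  }
}

// Usage: user A's connection info reaches user B, and only user B.
const room = new SignalingRoom();
room.join("A", (from, msg) => console.log("A got", msg.kind, "from", from));
room.join("B", (from, msg) => console.log("B got", msg.kind, "from", from));
room.relay("A", { kind: "offer", sdp: "<connection info of user A>" });
```

Notice that the server never looks inside the message. It only passes notes between the two devices, and the video itself will not travel through it.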
Now that that's clear, let's go back to our example. User B has received the connection info from user A, so now it's his turn to send his info to user A. It should be mentioned that devices do not automatically know the relevant connection data about themselves, such as their external IP address; they need to "ask" specific servers called STUN servers to find out their own connection info.
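The connection info a device collects this way takes the form of "candidates": short text lines, each describing one address where the device might be reachable. As a hedged illustration, here is a small parser for such a line. The field order follows the standard `candidate:` attribute layout, but `ParsedCandidate` and `parseCandidate` are names I made up, the addresses in the usage example are documentation placeholders, and a production parser would need to be stricter.

```typescript
interface ParsedCandidate {
  foundation: string;
  componentId: number;
  transport: string;
  priority: number;
  address: string;
  port: number;
  type: string; // "host", "srflx", "prflx" or "relay"
  relatedAddress?: string;
  relatedPort?: number;
}

// Parse "candidate:<foundation> <component> <transport> <priority> <address> <port> typ <type> ..."
function parseCandidate(line: string): ParsedCandidate | null {
  const body = line.trim().replace(/^a=/, "").replace(/^candidate:/, "");
  const parts = body.split(/\s+/);
  if (parts.length < 8 || parts[6] !== "typ") {
    return null;
  }
  const parsed: ParsedCandidate = {
    foundation: parts[0],
    componentId: Number(parts[1]),
    transport: parts[2].toLowerCase(),
    priority: Number(parts[3]),
    address: parts[4],
    port: Number(parts[5]),
    type: parts[7],
  };
  // Everything after the type comes as name/value pairs.
  for (let i = 8; i + 1 < parts.length; i += 2) {
    if (parts[i] === "raddr") parsed.relatedAddress = parts[i + 1];
    if (parts[i] === "rport") parsed.relatedPort = Number(parts[i + 1]);
  }
  return parsed;
}

// "srflx" (server reflexive) marks an address that a STUN server reported back.
const example = parseCandidate(
  "candidate:842163049 1 udp 1694498815 203.0.113.7 46154 typ srflx raddr 192.168.1.10 rport 46154"
);
console.log(example);
```

In this example `203.0.113.7` plays the role of the external address as seen from the outside, while `raddr` holds the local address the device already knew about.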
When user A has the connection info of user B and vice versa, the search for the best network path between the two devices begins (a procedure known as ICE, Interactive Connectivity Establishment). When this is finished, the two devices connect and live video streaming can start.
This all sounds complicated, right? Well, if everything is set up as it should be, most of these steps are done automatically within WebRTC, securely and fast. That's also a big advantage over other streaming solutions: it's heavily standardized, and when done right, it just works.
And that's why WebRTC is such a popular live streaming solution and is becoming even more popular, especially when people have to turn to technology in order to communicate with others. Most people will never know what's happening in the background when they receive the image of the person they communicate with, or what is responsible for the seamless experience they have.
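How does the search decide which paths to try first? Each candidate address gets a priority, and pairs of addresses (one from each device) are ordered by those priorities. The sketch below follows the formulas and recommended type preferences from RFC 8445 as I understand them; the function and interface names are my own, and it models only the ordering. A real implementation also has to test every pair with connectivity checks before it can use one.

```typescript
type CandidateType = "host" | "prflx" | "srflx" | "relay";

// Type preferences recommended by RFC 8445: direct addresses first, relays last.
const TYPE_PREFERENCE: Record<CandidateType, number> = {
  host: 126,
  prflx: 110,
  srflx: 100,
  relay: 0,
};

// priority = 2^24 * typePreference + 2^8 * localPreference + (256 - componentId)
function candidatePriority(
  type: CandidateType,
  localPreference: number = 65535,
  componentId: number = 1
): number {
  return 16777216 * TYPE_PREFERENCE[type] + 256 * localPreference + (256 - componentId);
}

interface CandidatePair {
  label: string;
  controlling: number; // priority of the controlling side's candidate (G)
  controlled: number; // priority of the controlled side's candidate (D)
}

// The RFC's pair priority is 2^32 * min(G, D) + 2 * max(G, D) + (G > D ? 1 : 0).
// That value can exceed what a JavaScript number holds exactly, so this compares
// min, then max, then the tie-breaker, which gives the same ordering because
// candidate priorities stay below 2^31.
function comparePairs(a: CandidatePair, b: CandidatePair): number {
  const minA = Math.min(a.controlling, a.controlled);
  const minB = Math.min(b.controlling, b.controlled);
  if (minA !== minB) return minB - minA;
  const maxA = Math.max(a.controlling, a.controlled);
  const maxB = Math.max(b.controlling, b.controlled);
  if (maxA !== maxB) return maxB - maxA;
  const tieA = a.controlling > a.controlled ? 1 : 0;
  const tieB = b.controlling > b.controlled ? 1 : 0;
  return tieB - tieA;
}

// Highest-priority pair first.
function orderPairs(pairs: CandidatePair[]): CandidatePair[] {
  return pairs.slice().sort(comparePairs);
}
```

With the default arguments a direct (host) address ends up with a much higher priority than one that goes through a relay, which is why a direct path is tried first whenever the two devices can reach each other.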
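To show how little of the exchange you actually write yourself, here is a sketch of the whole handshake from our example. The four method names are the promise-based ones found on the browser's `RTCPeerConnection`; the `PeerLike` and `SessionDescription` interfaces, the `Send` callback and the three functions are simplified stand-ins I made up so the order of the steps is visible, and wiring in a real `RTCPeerConnection` may need a small adapter.

```typescript
interface SessionDescription {
  type: "offer" | "answer";
  sdp: string;
}

// The subset of RTCPeerConnection's promise-based methods used here.
interface PeerLike {
  createOffer(): Promise<SessionDescription>;
  createAnswer(): Promise<SessionDescription>;
  setLocalDescription(description: SessionDescription): Promise<void>;
  setRemoteDescription(description: SessionDescription): Promise<void>;
}

// Hands a description to the other side, e.g. through the signaling server.
type Send = (description: SessionDescription) => Promise<void>;

// User A: describe my side of the call, remember it, send it to user B.
async function startCall(caller: PeerLike, sendToCallee: Send): Promise<void> {
  const offer = await caller.createOffer();
  await caller.setLocalDescription(offer);
  await sendToCallee(offer);
}

// User B: store A's offer, build a matching answer, remember it, send it back.
async function acceptCall(
  callee: PeerLike,
  offer: SessionDescription,
  sendToCaller: Send
): Promise<void> {
  await callee.setRemoteDescription(offer);
  const answer = await callee.createAnswer();
  await callee.setLocalDescription(answer);
  await sendToCaller(answer);
}

// User A: store B's answer. Now both sides know about each other.
async function finishCall(caller: PeerLike, answer: SessionDescription): Promise<void> {
  await caller.setRemoteDescription(answer);
}
```

Everything else, like gathering addresses, testing network paths and encrypting the media, happens inside the browser once both descriptions are in place.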
Final words...
The idea of this post was to briefly introduce WebRTC as an important technology in today's communication, with some basic technical info. I am also thinking of writing a detailed series on this excellent technology, with a deep dive into the concepts and other protocols around WebRTC, with practical examples. If you are interested in something like that, please let me know.
*cover photo by israel palacio / Unsplash