How do we screenshare in a web browser without transferring images?

December 20, 2018

Traditional screensharing tools clone the screen by transferring images. Because of browser security constraints, this is not that easy to do from within a web page. Modern browsers provide the Screen Capture API, which is closely tied to WebRTC and supports capturing the screen out of the box (https://www.w3.org/TR/screen-capture/). Nevertheless, because of performance, privacy and compatibility issues, we do not use this API in our core technology.

Performance: Taking the screenshot is fast because it is a native, built-in browser function. Afterwards, however, you have to process and transfer the image. You therefore need either a lot of bandwidth to send whole images or more computing power to calculate the differences between consecutive images.
Privacy: A website or web application often contains sensitive data that shouldn’t be visible to support agents. Even in critical systems, though, you still need to support your users and guide them to the right section while maintaining the highest standards of privacy and security.
Compatibility: As mentioned above, this API is only available in newer browser versions.

Given the reasons above and the limitations of current browsers, we decided to build a capture and restore approach. This means we capture all the content under the hood and try to restore the whole page from it. Imagine you have a house built of many different bricks and you have to rebuild it. What you do is track all the bricks: their appearance, size, position and so on. You need all of this information to rebuild the house in another place. Now imagine that the web page is the house and the bricks are all its small parts (the HTML code). We need every piece of information to rebuild the web page in the same way. Instead of taking pictures of the whole house and its changes, we use the architecture, the construction plan of the house, to rebuild it.
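
To make the capture and restore idea more concrete, here is a minimal sketch. It is an illustration only, not our actual implementation: the names `SourceNode`, `PlanNode`, `capture` and `restore` are made up for this example, and a plain object tree stands in for the browser's DOM so that the snippet also runs outside a browser. Void elements such as `<br>` and everything related to styles are left out.

```typescript
// Hypothetical "construction plan" for one brick: an element or a piece of text.
type PlanNode =
  | { kind: "element"; tag: string; attrs: Record<string, string>; children: PlanNode[] }
  | { kind: "text"; value: string };

// Minimal stand-in for a DOM node (assumption: a real capture would walk the DOM).
interface SourceNode {
  tag?: string; // absent for text nodes
  attrs?: Record<string, string>;
  text?: string;
  children?: SourceNode[];
}

// Capture: walk the source tree and record every brick in the plan.
function capture(node: SourceNode): PlanNode {
  if (node.tag === undefined) {
    return { kind: "text", value: node.text ?? "" };
  }
  return {
    kind: "element",
    tag: node.tag,
    attrs: { ...(node.attrs ?? {}) },
    children: (node.children ?? []).map(capture),
  };
}

// Escape characters that would otherwise be interpreted as markup.
function escapeHtml(s: string): string {
  return s
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Restore: rebuild the markup from the plan on the viewer's side.
function restore(plan: PlanNode): string {
  if (plan.kind === "text") return escapeHtml(plan.value);
  const attrs = Object.entries(plan.attrs)
    .map(([key, value]) => ` ${key}="${escapeHtml(value)}"`)
    .join("");
  const inner = plan.children.map(restore).join("");
  return `<${plan.tag}${attrs}>${inner}</${plan.tag}>`;
}

const page: SourceNode = {
  tag: "div",
  attrs: { class: "card" },
  children: [{ tag: "h1", children: [{ text: "Hello" }] }, { text: "a < b" }],
};

// The plan is plain JSON, so it can be sent over any text channel.
const wire = JSON.stringify(capture(page));
const rebuilt = restore(JSON.parse(wire) as PlanNode);
console.log(rebuilt); // <div class="card"><h1>Hello</h1>a &lt; b</div>
```

The point of the sketch is that only the plan travels over the wire as text. The receiving side does the work of turning it back into a page.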

With this in mind, we can revisit the three points (performance, privacy, compatibility).

Performance: Transferring only the architecture (no “real bricks”) means more complexity on the other side, where the house is rebuilt: you have to fetch or build all the bricks on your own and put each one in the right place instead of simply showing pictures. On the other hand, it saves a lot of work on the side where it really matters. It is very important not to slow down the original house owner (the customer’s website).
Privacy: Because we transfer only the plans and their content, we know exactly which data we are sending and can mask anything that shouldn’t be transferred. Think of a window in the house that shouldn’t be transparent: we can simply rebuild it as a black window because we have the exact building plan. We know how many windows there are, where they are and what they should look like, so it is easy to replace them. In our use case, the “sensitive” windows are, for example, a customer’s credit card information.
Compatibility: To capture the architecture of the house (the web page), we technically don’t need the user’s permission, and we don’t need highly sophisticated cameras to photograph the house from every angle. In our case, we can rely on the basic APIs of the browser to track the architectural details.
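
The “black window” idea from the privacy point can be sketched in a few lines. Again, this is only an illustration under assumptions: the `PlanElement` shape, the `maskPlan` function and the `data-mask` marker attribute are hypothetical names chosen for this example, not our real API. The sketch replaces the text and the `value` attribute of every marked element (and everything inside it) before the plan leaves the page.

```typescript
// Hypothetical plan entry for one element of the captured page.
interface PlanElement {
  tag: string;
  attrs: Record<string, string>;
  text: string;
  children: PlanElement[];
}

// Hypothetical marker that the site owner puts on sensitive elements.
const MASK_ATTR = "data-mask";

// Return a copy of the plan in which marked elements and all of their
// descendants have their text and `value` attribute replaced by asterisks.
// The length is kept so that the restored layout stays roughly the same.
function maskPlan(node: PlanElement, inherited = false): PlanElement {
  const masked = inherited || MASK_ATTR in node.attrs;
  const attrs = { ...node.attrs };
  if (masked && "value" in attrs) {
    attrs.value = "*".repeat(attrs.value.length);
  }
  return {
    tag: node.tag,
    attrs,
    text: masked ? "*".repeat(node.text.length) : node.text,
    children: node.children.map((child) => maskPlan(child, masked)),
  };
}

const checkout: PlanElement = {
  tag: "form",
  attrs: {},
  text: "",
  children: [
    { tag: "label", attrs: {}, text: "Card number", children: [] },
    {
      tag: "input",
      attrs: { "data-mask": "", value: "4111111111111111" },
      text: "",
      children: [],
    },
  ],
};

// Only the masked copy would be transferred; the original stays untouched.
const safeToSend = maskPlan(checkout);
console.log(safeToSend.children[1].attrs.value); // ****************
```

Because the masking happens on the plan before it is sent, the sensitive value never has to leave the customer's browser. With image-based sharing, the same result would require finding and blacking out the right pixels in every frame.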