How do we keep data traffic low and efficient?
As I wrote in the last blogpost, keeping traffic and bandwidth usage low is very important. I now want to explain how we achieve it.
1. JavaScript size
The first factor when it comes to traffic is the file size of the JavaScript snippet. Instead of explaining the why, what, and how, I just want to give you a number: in the basic configuration, our JavaScript snippet needs 12 KB (uncompressed). If the browser supports gzip compression, the transferred size gets even lower. We only reach this size because we don't rely on any external library or other third-party code. We put a lot of focus on code quality, which results in a small script size and low overhead.
Keeping it stupid and simple is often more complicated than building big, overloaded code.
2. CDN and Caching
To give you the best performance all over the world, we leverage the AWS CloudFront content delivery network. That means the JavaScript code is distributed across multiple data centers around the world, which results in the best performance regardless of where your users are. But don't worry about privacy and security: the JavaScript code doesn't contain or hold any private data.
3. Hashing
The best way to keep traffic usage low is to not transfer any data at all, and that's exactly what we try to achieve. As I wrote in the first blogpost, the data our script transfers is not image data; it's the structure of the website. This allows us to check whether we have already transferred the website. We do this by calculating a hash over some specific data points of the website, which ensures that the structure and the content match. We then check server-side whether this hashcode already exists (only within your tenant). As you might think, it wouldn't make much sense to simply calculate a hash of the whole structure, because then we would almost always get a different hashcode and the data would always be transferred. That's because the structure of most websites differs in small areas that are mostly not visible to the user. Therefore we only hash specific data points, which gives us a high match rate while ensuring that the hashcode differs whenever the content visible to the user differs.
If a hashcode matches, we only transfer the subsequent user activities.
Especially for static websites, our hashing algorithm works very well.
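The server-side check behind this can be sketched as a per-tenant lookup. Again a simplified illustration with hypothetical names, not our real backend:

```javascript
// Hypothetical server-side sketch: remember which structure hashes
// each tenant has already stored. If a hash is known, the client only
// needs to send user activities, not the page structure again.
const knownHashes = new Map(); // tenantId -> Set of structure hashes

function shouldSendStructure(tenantId, hash) {
  if (!knownHashes.has(tenantId)) knownHashes.set(tenantId, new Set());
  const seen = knownHashes.get(tenantId);
  if (seen.has(hash)) return false; // already stored: activities only
  seen.add(hash);
  return true; // first time seen: transfer the full structure
}

console.log(shouldSendStructure("tenant-1", "abc123")); // first visit
console.log(shouldSendStructure("tenant-1", "abc123")); // repeat visit
```

Note that the lookup is scoped per tenant, so one customer's stored structures never affect another's.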