Upgrading my crawler to do HTTP/2.0, TLSv1.3 and other cool stuff!

I've upgraded my crawler fleet many times over the years to look at new security mechanisms and to gather deeper insight and more data, and this change continues that trend. My crawler now supports HTTP/2.0 and TLSv1.3, plus a few more metrics to make the data available for analysis.



The Crawler

For those not familiar, I run a crawler fleet over at Crawler.Ninja and it crawls the Top 1 Million websites in the world every day. I gather a heap of data every single day, all of which is publicly available, and every 6 months I produce a report to see what progress we have made. The latest additions to the capabilities of the crawlers are support for HTTP/2.0 and TLSv1.3, which I'm pretty excited about, so let's dig in!



HTTP/2.0

I originally wrote about h2 way back in 2015 in "HTTP/2 is here!" and since then I've looked at it several times on my blog, but now it's time to look at how everyone else is using it.



Out of all of the sites scanned, a good portion are already using HTTP/2.0, at 37.3% of the total. HTTP/1.1 still remains dominant at 63.1% and, for some reason, there are still sites out there that only support HTTP/1.0, at 0.3%!
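For anyone wondering how a client can tell whether a site speaks HTTP/2.0 over TLS, here is a minimal sketch using ALPN from Python's standard library. This is not the crawler's actual code: the function names and the "HTTP/1.x" label are my own assumptions, and ALPN alone can't distinguish HTTP/1.1 from HTTP/1.0, which would need a real request.

```python
import socket
import ssl

def label_for_alpn(alpn):
    """Map a negotiated ALPN value to a coarse HTTP version label.

    "h2" is the registered ALPN id for HTTP/2 over TLS. Anything else
    (including None, when the server ignores ALPN) is treated as HTTP/1.x.
    """
    return "HTTP/2.0" if alpn == "h2" else "HTTP/1.x"

def negotiated_http_version(host, port=443, timeout=10.0):
    """Connect to host, offer h2 and http/1.1, and report what was chosen."""
    ctx = ssl.create_default_context()
    ctx.set_alpn_protocols(["h2", "http/1.1"])
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with ctx.wrap_socket(sock, server_hostname=host) as tls:
            return label_for_alpn(tls.selected_alpn_protocol())
```

Calling `negotiated_http_version("example.com")` needs network access, so the result depends entirely on the server you point it at.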

Looking at the distribution of support, we can see what I would expect: a huge portion of the sites at the top of the ranking support HTTP/2.0 and, as you start to look towards the lower ranked sites, HTTP/2.0 support tails off and HTTP/1.1 support starts to increase.
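If you want to reproduce this kind of distribution yourself, one way is to group sites into rank bands and work out the share of each protocol within each band. The sketch below assumes you have `(rank, label)` pairs to hand. The function name and the default `bucket_size` are hypothetical and not taken from the crawler.

```python
from collections import Counter, defaultdict

def share_by_rank_bucket(observations, bucket_size=100_000):
    """Return {bucket_start_rank: {label: percent}} for (rank, label) pairs.

    Ranks start at 1, so with bucket_size=100_000 the first bucket covers
    ranks 1..100000 and is keyed by 1, the next by 100001, and so on.
    """
    buckets = defaultdict(Counter)
    for rank, label in observations:
        start = ((rank - 1) // bucket_size) * bucket_size + 1
        buckets[start][label] += 1

    result = {}
    for start in sorted(buckets):
        counts = buckets[start]
        total = sum(counts.values())
        result[start] = {label: 100.0 * n / total for label, n in counts.items()}
    return result
```

The same function works for the TLS protocol and cipher suite distributions, since it only cares about the label attached to each rank.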



TLS Protocol Support

The crawler fleet also got an upgrade to be able to support TLSv1.3, the latest version of the protocol, published as RFC 8446 in August 2018, and as such the TLS Protocol Support numbers were updated.



This one is much closer, with TLSv1.3 being used on 40.62% of sites that are redirecting to HTTPS and TLSv1.2 still being used on more than half, at almost 54%. It's great to see that legacy TLS is really not hanging around, with no sites depending on TLSv1.1 and only 0.37% of sites depending on TLSv1.0.
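As a rough illustration of where numbers like these can come from, here is a sketch that records the negotiated protocol version and cipher suite for a host and turns a list of observed versions into percentage shares. It is an assumption-laden example rather than the crawler's implementation: the function names are mine, and Python's default context verifies certificates and typically refuses legacy protocols, so it would not see sites that only offer TLSv1.0.

```python
import socket
import ssl
from collections import Counter

def negotiated_tls(host, port=443, timeout=10.0):
    """Return (protocol_version, cipher_name) negotiated with host."""
    ctx = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with ctx.wrap_socket(sock, server_hostname=host) as tls:
            version = tls.version()        # e.g. "TLSv1.3"
            cipher_name = tls.cipher()[0]  # first item is the suite name
    return version, cipher_name

def protocol_shares(versions):
    """Return {version: percent of observations}, rounded to 2 places."""
    counts = Counter(versions)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {v: round(100.0 * n / total, 2) for v, n in counts.items()}
```

The client and server settle on the highest version both support, so a single handshake per site is enough to say which version is "being used".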

Again, looking at the distribution of support for each protocol version we can see a similar trend of more support for higher protocol versions in the higher ranked sites, but it's not quite as pronounced as I'd have expected.



Cipher Suites

With the addition of TLSv1.3 there comes a requirement to update cipher suite information too. I decided to split them out into pre-TLSv1.3 ciphers and TLSv1.3 ciphers so we could see what was happening between the two sets of ciphers.
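To show what that split might look like in practice, here is a small sketch that separates suite counts into the two groups. It relies on the fact that RFC 8446 defines exactly five TLSv1.3 cipher suites. The function name and the input shape (a dict of suite name to count) are assumptions for illustration.

```python
# The five cipher suites defined for TLS 1.3 in RFC 8446.
TLS13_SUITES = frozenset({
    "TLS_AES_128_GCM_SHA256",
    "TLS_AES_256_GCM_SHA384",
    "TLS_CHACHA20_POLY1305_SHA256",
    "TLS_AES_128_CCM_SHA256",
    "TLS_AES_128_CCM_8_SHA256",
})

def split_by_generation(suite_counts):
    """Split {suite_name: count} into (pre_tls13, tls13) dicts."""
    pre_tls13, tls13 = {}, {}
    for name, count in suite_counts.items():
        target = tls13 if name in TLS13_SUITES else pre_tls13
        target[name] = count
    return pre_tls13, tls13
```

Matching against an explicit set rather than the `TLS_` prefix avoids misfiling older suites when they are reported under their IANA names, which also begin with `TLS_`.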



As expected, ECDHE-RSA-AES128-GCM-SHA256 is the most popular suite by quite some margin at the top of the ranking. Interestingly, though, it very quickly loses out to ECDHE-RSA-AES256-GCM-SHA384, which is overall the most popular suite by far.



Things are a little clearer when we look at TLSv1.3 cipher suite support, though, and TLS_AES_256_GCM_SHA384 is by far the most popular suite choice. There is a small showing for TLS_AES_128_GCM_SHA256, and TLS_CHACHA20_POLY1305_SHA256 just makes an appearance down at the bottom of the X axis, but that's about it.



Looking at the share for the TLSv1.3 suites, there is again a huge portion going to AES256. Don't forget that this is the % of sites redirecting to HTTPS that use these suites, so you'd need to add in the pre-TLSv1.3 numbers above to get the total.



Going Forwards

The HTTP/2.0 and TLSv1.3 protocol/cipher support is now being pumped out by the crawler fleet in the daily data so if you want to go and play around with that you can do. I've also updated the TLS Protocols file, updated the TLS Ciphers file and updated the HTTP Versions file too. Have fun!