Top 1 Million Analysis - September 2019

Yes, it's that time of year again and the last 6+ months have flown by. Time for a look at the state of security in the Top 1 Million sites on the web! Here are the results, updates, trends and analysis for the progress we've made over the last 6 months.



But first an update

It's worth pointing this out right up front as it's pretty important. The data source for the Top 1 Million sites has changed. I used to use the Alexa Top 1 Million for this research but I've been having issues with the list. They tried to remove access at one point and while I managed to have it restored, there are other issues too. The accuracy of the data has been called into question and the list itself has been having weird issues recently, like not returning 1 million entries... Yep, that's right, the Alexa Top 1 Million list has been returning, in some cases, only ~650,000 entries recently, which is of course a problem. You can still access the list for now, here's the link, but generally speaking I've just not been happy with the source of data, so I changed it. I've swapped it out for the Tranco Top 1 Million, which is a much more reliable source, and you can read all about it on their site. The list takes the same format, so it's a drop-in replacement for the Alexa list, and it's accessed via a similar URL too. You can download the latest copy of the Tranco list here and this scan is based on the 13th September list.
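To illustrate what a drop-in replacement means in practice, here is a minimal sketch of loading a list in the shared format and checking that it is complete. It assumes the file is plain CSV with one `rank,domain` pair per line; the function names and the sample domains are hypothetical and are not taken from the crawler itself.

```python
import csv
import io


def parse_top_sites(csv_text: str) -> list[tuple[int, str]]:
    """Parse 'rank,domain' rows into (rank, domain) tuples."""
    sites = []
    for record in csv.reader(io.StringIO(csv_text)):
        if len(record) != 2:
            continue  # skip blank or malformed lines
        rank, domain = record
        sites.append((int(rank), domain.strip().lower()))
    return sites


def is_complete(sites: list[tuple[int, str]], expected: int = 1_000_000) -> bool:
    """Return True only if the list holds the expected number of entries."""
    return len(sites) == expected


# Illustrative sample only, not real ranking data.
sample = "1,example.com\n2,example.org\n3,example.net\n"
sites = parse_top_sites(sample)
print(sites[0])            # (1, 'example.com')
print(is_complete(sites))  # False, a 3-entry sample is far short of 1 million
```

A check like `is_complete` is the kind of guard that would flag a list returning only ~650,000 entries before a scan is run against it.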


The crawl

As always, the data for this scan is taken from Crawler.Ninja and it's all available, in raw form too, over on that site. This is now the 9th report I've done on the Top 1 Million sites over 5 years!



September 2019

I am a little late on this report so it's a September report instead of August as usual, but I just wasn't happy with the Alexa data and wanted to give the Tranco list a little time to test it out. With that in mind, the change in data source means some of the trends may not be quite where they were. That said, things look very familiar, which is a great sign when using a different data source.



HTTPS

One thing we do see with the new crawler data is that the HTTPS metrics have actually dropped very slightly. Another thing that's clear is that it's also smoothed out the line and removed all of the 'noise' that had become apparent in recent Alexa based scans. I've changed the line colour for the new HTTPS scan to red so it's more apparent here and you can see the same trend is still present too.



This has of course also had an effect on the % of sites redirecting to HTTPS but again, given the change in data source it's good to see that things are still as close as they are.



Given the drop in HTTPS it was also expected to see a drop in something like HSTS, and we did, but we're still seeing solid usage of the header, especially in the higher ranked sites.
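For anyone wanting to look at HSTS usage themselves, here is a minimal sketch of parsing a `Strict-Transport-Security` header value into its directives. The function name and the returned dictionary shape are assumptions made for illustration; this is not how the crawler necessarily processes the header.

```python
def parse_hsts(value: str) -> dict:
    """Split an HSTS header value into max-age, includeSubDomains and preload."""
    policy = {"max_age": None, "include_subdomains": False, "preload": False}
    for part in value.split(";"):
        name, _, arg = part.strip().partition("=")
        name = name.strip().lower()    # directive names are case-insensitive
        arg = arg.strip().strip('"')   # max-age may be given as a quoted string
        if name == "max-age" and arg.isdigit():
            policy["max_age"] = int(arg)
        elif name == "includesubdomains":
            policy["include_subdomains"] = True
        elif name == "preload":
            policy["preload"] = True
    return policy


print(parse_hsts("max-age=31536000; includeSubDomains; preload"))
```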



Certificates

With the results for HTTPS being down you'd expect a similar story in certificates, and while that is true, there's still an interesting trend that shows in the data. Look at Let's Encrypt's presence in the Top 1 Million now.



Despite an overall reduction in HTTPS, they've still managed a quite significant gain in the higher ranked sites, even though they show a drop across the list overall. That's similar to EV certs, which have gained slightly at the top, but only marginally, and seen a drop across the rest of the Top 1 Million sites.



Another interesting change is in the issuers of certs, with Cloudflare now jumping straight up to 2nd in the ranking and pushing one of the Comodo certs out as Cloudflare switches its issuance to its own CA.


Top 10 Certificate Issuers:
C = US, O = Let's Encrypt, CN = Let's Encrypt Authority X3 161,092
C = US, ST = CA, L = San Francisco, O = "CloudFlare, Inc.", CN = CloudFlare Inc ECC CA-2 47,844
C = GB, ST = Greater Manchester, L = Salford, O = COMODO CA Limited, CN = COMODO ECC Domain Validation Secure Server CA 2 45,542
C = US, ST = Arizona, L = Scottsdale, O = "GoDaddy.com, Inc.", OU = http://certs.godaddy.com/repository/, CN = Go Daddy Secure Certificate Authority - G2 36,050
C = GB, ST = Greater Manchester, L = Salford, O = COMODO CA Limited, CN = COMODO RSA Domain Validation Secure Server CA 24,969
C = GB, ST = Greater Manchester, L = Salford, O = Sectigo Limited, CN = Sectigo RSA Domain Validation Secure Server CA 23,911
C = US, O = Amazon, OU = Server CA 1B, CN = Amazon 21,118
C = US, O = DigiCert Inc, CN = DigiCert SHA2 Secure Server CA 16,221
C = US, ST = TX, L = Houston, O = "cPanel, Inc.", CN = "cPanel, Inc. Certification Authority" 15,974
C = US, O = DigiCert Inc, OU = www.digicert.com, CN = RapidSSL RSA CA 2018 11,614


CAA

Despite the hits on HTTPS and certificates, CAA still saw strong growth going from 12,991 sites to 14,138 sites using it.
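To put that growth into relative terms, here is a quick sketch of the arithmetic using the two CAA counts quoted above. The helper name is just for illustration.

```python
def relative_change(before: int, after: int) -> float:
    """Percentage change from `before` to `after`."""
    return (after - before) / before * 100


caa_growth = relative_change(12_991, 14_138)
print(f"CAA growth: {caa_growth:.1f}%")  # prints "CAA growth: 8.8%"
```

Bear in mind that the two counts come from different source lists (Alexa and Tranco), so the percentage is indicative rather than a like-for-like comparison.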



General Stats

Total Rows: 873,390

Security Headers Grades:
A 19,784
A+ 3,003
B 16,552
C 28,629
D 113,110
E 14,092
F 678,104
R 116

Sites using strict-transport-security:
117,763

Sites using content-security-policy:
45,031

Sites using content-security-policy-report-only:
2,289

Sites using x-webkit-csp:
664

Sites using x-content-security-policy:
1,804

Sites using public-key-pins:
647

Sites using public-key-pins-report-only:
38

Sites using x-content-type-options:
139,079

Sites using x-frame-options:
152,231

Sites using x-xss-protection:
109,475

Sites using x-download-options:
16,523

Sites using x-permitted-cross-domain-policies:
15,245

Sites using access-control-allow-origin:
35,918

Sites using referrer-policy:
30,753

Sites using feature-policy:
2,656

Sites using report-to:
11,080

Sites using nel:
10,934

Sites using security.txt:
1,615

Sites redirecting to HTTPS:
503,556

Sites using Let's Encrypt certificate:
161,092

Top 10 Server headers:
Apache 198,501
nginx 139,603
cloudflare 134,633
Microsoft-IIS/7.5 25,972
Microsoft-IIS/8.5 25,832
LiteSpeed 17,767
Microsoft-IIS/10.0 15,556
openresty 10,937
nginx/1.14.1 10,099
Apache/2 9,414

Top 10 TLDs:
.com 487,280
.org 62,858
.net 41,338
.ru 31,529
.de 15,094
.uk 14,471
.cn 10,569
.br 9,267
.jp 8,968
.in 7,643

Top 10 Certificate Issuers:
C = US, O = Let's Encrypt, CN = Let's Encrypt Authority X3 161,092
C = US, ST = CA, L = San Francisco, O = "CloudFlare, Inc.", CN = CloudFlare Inc ECC CA-2 47,844
C = GB, ST = Greater Manchester, L = Salford, O = COMODO CA Limited, CN = COMODO ECC Domain Validation Secure Server CA 2 45,542
C = US, ST = Arizona, L = Scottsdale, O = "GoDaddy.com, Inc.", OU = http://certs.godaddy.com/repository/, CN = Go Daddy Secure Certificate Authority - G2 36,050
C = GB, ST = Greater Manchester, L = Salford, O = COMODO CA Limited, CN = COMODO RSA Domain Validation Secure Server CA 24,969
C = GB, ST = Greater Manchester, L = Salford, O = Sectigo Limited, CN = Sectigo RSA Domain Validation Secure Server CA 23,911
C = US, O = Amazon, OU = Server CA 1B, CN = Amazon 21,118
C = US, O = DigiCert Inc, CN = DigiCert SHA2 Secure Server CA 16,221
C = US, ST = TX, L = Houston, O = "cPanel, Inc.", CN = "cPanel, Inc. Certification Authority" 15,974
C = US, O = DigiCert Inc, OU = www.digicert.com, CN = RapidSSL RSA CA 2018 11,614

Top 10 Protocols:
TLSv1.2 307,766
TLSv1 3,109
TLSv1.1 54

Top 10 Cipher Suites:
ECDHE-RSA-AES256-GCM-SHA384 129,484
ECDHE-RSA-AES128-GCM-SHA256 97,367
ECDHE-ECDSA-AES128-GCM-SHA256 55,438
ECDHE-RSA-AES256-SHA384 11,663
DHE-RSA-AES256-GCM-SHA384 2,728
0 2,416
ECDHE-RSA-AES256-SHA 1,785
AES256-SHA 1,374
ECDHE-ECDSA-AES256-GCM-SHA384 1,313
AES256-GCM-SHA384 1,195

Top 10 PFS Key Exchange Params:
ECDH, P-256, 256 bits 284,064
ECDH, P-384, 384 bits 9,467
ECDH, P-521, 521 bits 5,148
DH, 1024 bits 3,014
DH, 2048 bits 995
DH, 4096 bits 117
ECDH, B-571, 570 bits 31
DH, 3072 bits 7
ECDH, brainpoolP512r1, 512 bits 1
ECDH, brainpoolP256r1, 256 bits 1

Top Key Sizes:
2048 bit 230,917
256 bit 56,253
4096 bit 20,271
384 bit 499
3072 bit 373
1024 bit 175
8192 bit 14
4056 bit 3
4048 bit 2
4086 bit 1

Sites using CAA:
14,138
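The raw counts are easier to compare as shares of the total. As a small sketch, the Security Headers grade counts listed under General Stats can be turned into percentages like this; the variable names are only for illustration.

```python
# Grade counts copied from the Security Headers Grades list in General Stats.
grades = {
    "A+": 3_003,
    "A": 19_784,
    "B": 16_552,
    "C": 28_629,
    "D": 113_110,
    "E": 14_092,
    "F": 678_104,
    "R": 116,
}

total = sum(grades.values())  # 873,390, which matches Total Rows
shares = {grade: round(count / total * 100, 2) for grade, count in grades.items()}

for grade, share in shares.items():
    print(f"{grade:>2}: {share:5.2f}%")
```

The grade counts sum to exactly the Total Rows figure, and the F grade alone accounts for roughly 77.6% of scanned sites.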


Get the data

My Google Sheet with all of the raw data for this particular scan includes all of the tables and graphs, and a lot more, so check that out if you want to look at a deeper analysis. The raw data from each daily scan can be found at https://crawler.ninja/ and if you'd like to download the daily backup of all scan data you can get those here, but be aware that it's often 2.5GB+ of data per day!