Tell me I'm reading this blog post wrong. It reads as if Cloudflare is admitting to reading the login credentials of users of sites that use Cloudflare.
"Our data reveals that 52% of all detected authentication requests contain leaked passwords found in our database of over 15 billion records, including the Have I Been Pwned (HIBP) leaked password dataset."
h/t: @0xF21D
https://blog.cloudflare.com/password-reuse-rampant-half-user-logins-compromised/
I too was looking for a methodology, but from what they wrote, they are clearly matching passwords against a list of leaked passwords.
Cloudflare is a MITM for all traffic that passes through their network, though. TLS traffic is terminated at their edge nodes first and then re-encrypted (or not, depending on the origin setup) before it heads to the origin.
@mookie @jonathankoren @0xF21D OTOH: "To understand human behavior, we focus on successful login attempts (those returning a 200 OK status code), as this provides the clearest indication of user activity and real account risk."
I wonder if they considered how many poorly architected systems are out there that will return a 200 "Login Failed" page?
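To make that concern concrete, here is a minimal sketch of how a status-code-only check would count a soft failure as a successful login. Everything in it (the `Response` class, the function names, the failure marker strings) is hypothetical and for illustration only; it says nothing about how Cloudflare actually classifies responses.

```python
from dataclasses import dataclass


@dataclass
class Response:
    """Hypothetical stand-in for an origin's HTTP response."""
    status: int
    body: str


def looks_successful_by_status(resp: Response) -> bool:
    # Naive rule: any 200 OK counts as a successful login.
    return resp.status == 200


# Assumed marker strings; real sites word their error pages differently.
FAILURE_MARKERS = ("login failed", "invalid password", "incorrect username")


def looks_successful_by_body(resp: Response) -> bool:
    # Stricter rule: a 200 whose body contains a failure message is not a success.
    if resp.status != 200:
        return False
    text = resp.body.lower()
    return not any(marker in text for marker in FAILURE_MARKERS)


soft_fail = Response(status=200, body="<h1>Login Failed</h1>")
real_success = Response(status=200, body="<h1>Welcome back</h1>")
hard_fail = Response(status=401, body="Unauthorized")
```

With these samples, `soft_fail` passes the status-only rule but is rejected by the body-aware rule, which is exactly the case that would inflate a "successful logins" count.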
@mookie @jonathankoren @0xF21D They are comparing *hashes* of the passwords (they see the passwords in the clear because they handle the TLS layer for customers) to the dumps of password hashes associated with various breaches, such as the HaveIBeenPwned data.
Details (linked to in their post) at https://blog.cloudflare.com/helping-keep-customers-safe-with-leaked-password-notification/#how-does-cloudflare-check-for-leaked-credentials
It is a feature site owners can choose to not use.
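For anyone who wants to see what "comparing hashes" looks like in general, here is a rough sketch. It assumes SHA-1, which is what the public HIBP Pwned Passwords data uses; the function names and the tiny in-memory "dump" are made up for illustration, and this is not a description of Cloudflare's actual pipeline (their linked post has the details).

```python
import hashlib


def sha1_hex(password: str) -> str:
    # Uppercase hex SHA-1 digest, the format HIBP's password list uses.
    return hashlib.sha1(password.encode("utf-8")).hexdigest().upper()


def build_leaked_index(leaked_passwords):
    # Hypothetical helper: a real breach dump would already contain hashes,
    # so this step only exists to fabricate a small test set.
    return {sha1_hex(p) for p in leaked_passwords}


def is_leaked(password: str, leaked_hashes) -> bool:
    # Hash the submitted password and look it up; the dump itself
    # never needs to hold plaintext.
    return sha1_hex(password) in leaked_hashes


# Toy stand-in for a breach dump (assumption, not real data).
LEAKED = build_leaked_index(["password", "123456", "qwerty"])

reused = is_leaked("password", LEAKED)
fresh = is_leaked("correct horse battery staple", LEAKED)
```

Note that whoever runs the lookup still has to see the plaintext password at the moment of hashing, which is the point raised earlier in the thread about TLS termination.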