July 26, 2021

Phishing Test Click-Rate Metrics: a Measure of Email Marketing, not Phishing Resilience

Note: Verizon Media is now known as Yahoo.

The Paranoids' mascot fishing for passwords.

Question: What could be worse than making people feel that cold dread in the pit of their stomach when they realize they just FAILED a phishing assessment?

Answer: Doing so for no good reason and little impact on an overall defensive posture.

We have to think more critically about how we construct phishing simulation programs. For the last decade, I’ve observed an overwhelming majority of both professionals and vendors build phishing simulation programs around the goal of measuring and reducing “click-rate”, which is the chance that someone will click on a link in a shady email.

This approach is so commonplace that most employees, executives, and info-sec professionals alike would describe a phishing test this way, and I am regularly asked for board-level metrics that compare our click rates to an industry average. Despite this, I have successfully stood by my decision to abandon the “click-rate” phishing metric for multiple years, and as a result, I know firsthand that not only can it be done, but it’s definitely worth doing.

Why? For starters, training users not to click links might have been the best advice we had 10 years ago, but today it’s a misguided and futile goal. Clicking on links is how the internet works; end users today are exposed to opportunities to click a link hundreds of times a day.

It isn’t just the web: nearly every email you get arrives with a link returning you to a web application to complete an action, because online marketers know that works. Nir Eyal outlines how well this works in his 2014 book, Hooked: How to Build Habit-Forming Products.

We also know our brains reduce cognitive load by automating common, repeatable actions into habits. If you thought about every click, you wouldn’t be very productive at work; trying to do so would not only be a fight against human nature but would also be a tough sell in any results-driven work environment. This is bad news for anyone who makes it their mission to make people “stop clicking links”. To make matters worse, telling people to “always be alert” is essentially asking them to harm themselves: we know for certain that a prolonged stress response, a constant state of fight-or-flight, is bad for our health.

But the greatest tragedy is when we realize that measuring who clicks on a deceptive hyperlink isn’t even measuring a behavior that would impact your security posture.

I understand the logic that got us here. I remember a time when a single click on a link could easily result in malicious code execution in the browser or on a workstation. But as the saying goes, what got us here won’t get us there. If your organization has reasonable patch management on endpoints, congratulations, you are “here” already. People clicking links should not be keeping you up at night. It’s time to think about the actions and behaviors beyond the click. If you’re reading this and you are without patch management, I suggest using your phishing program to drive users towards installing updates.

One-click exploits still fuel news headlines precisely because, in today’s world, they are increasingly rare and novel. But by focusing on these attacks, we miss the more common threats.

We most often see attacks aimed at trying to capture credentials or alter payment routing information. So, we train and condition our people to directly combat those attacks, rather than continuing to request a constant state of distrust for the most common action required to perform their jobs.

So, I ask you, what behavior could you measure with your phishing program that would be most likely to impact your security posture? (And why is it not clicking on a link?)

Our answer: Measure how many users give up their credentials on phishing simulations, because bad actors aim to do just that and SSO portals are ubiquitous. Plus, it gives us a number that we can improve upon, one rooted in a real security outcome.

Here’s how we do it: Our tests all lead to a fake single sign-on (SSO) page simulating a feasible corporate credential capture attack.

We try to get our click rate as high as possible when we run a phishing simulation because that’s what an attacker would do.

We craft the most advanced lures we can while keeping them aligned with what would be realistically possible for an external attacker. For example, we don’t cheat or bypass things like external email labeling, but we also don’t throw in purposeful misspellings.

We then collect data on who actually enters their username and password and calculate the following metrics:

  • Credential Capture Rate aka “Chance to be Tricked”: the number of users who gave up their credentials out of the number of users who landed on the fake page. This measures and often confirms an automatic behavior. I suspect you will see numbers above 50%. But this can be driven down with efforts that focus on effective password manager use, adoption of compatible WebAuthn technology, or anything that adds friction to the habit of typing a password.
  • Susceptibility Rate or “Chance for Attack Success”: the number of users who gave up their credentials out of the number of emails sent. This has replaced our “click-rate” metric since it’s measured in a similar way and typically falls into a comparable range (0-20%). It will always be lower than your previous “click-rate” measures because it requires two actions, clicking and then submitting credentials, but it gives us a more accurate picture of the real risk. The users who clicked the link but did not give up credentials are your superheroes, and you should celebrate them.
  • Report Rate: the number of users who report the phish out of the number of emails sent. You may even measure this against emails opened if you can reliably track that. Either way, it should go without saying that you want this to be as high as possible. Strive towards 100%.
  • Credentials Captured vs. Reports: the raw count of credential captures stacked against the raw count of reports. It is best expressed as a 100% stacked bar graph, like this:
     
An example of a stacked bar graph.


  • This is my personal favorite metric because it visually shows the contrast between possible outcomes rather than a raw percentage. When you have far more reports than you have compromises, you can celebrate not only the value of your metrics but the impact of your approach.
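All four metrics reduce to simple ratios over the same simulation data. As a quick illustration, here is a minimal sketch in Python; the counts and variable names below are hypothetical, not real results or part of any actual tooling:

```python
# Hypothetical results from a single phishing simulation run.
emails_sent = 1000
landed_on_fake_page = 180    # users who clicked and reached the fake SSO page
credentials_captured = 95    # users who submitted a username and password
reports = 320                # users who reported the phish

# Credential Capture Rate ("Chance to be Tricked"):
# captures out of users who landed on the fake page.
capture_rate = credentials_captured / landed_on_fake_page

# Susceptibility Rate ("Chance for Attack Success"):
# captures out of all emails sent.
susceptibility_rate = credentials_captured / emails_sent

# Report Rate: reports out of all emails sent.
report_rate = reports / emails_sent

# Credentials Captured vs. Reports: shares of a 100% stacked bar.
total = credentials_captured + reports
captured_share = credentials_captured / total
reported_share = reports / total

print(f"Capture rate:        {capture_rate:.1%}")
print(f"Susceptibility rate: {susceptibility_rate:.1%}")
print(f"Report rate:         {report_rate:.1%}")
print(f"Stacked bar: {captured_share:.0%} captured vs. {reported_share:.0%} reported")
```

Note how the two denominators differ: the capture rate conditions on reaching the fake page, while the susceptibility rate conditions on the whole send, which is why the latter is always the smaller number.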

We use this data not only to inform our strategy but also as a mechanism to incentivize behavioral change. All of these metrics are reported to executive leadership in a way that incentivizes friendly competition.

You can read more about our approach to influencing Cybersecurity Culture in a jointly published case study with MIT’s CAMS research consortium.

But that doesn’t mean you have to focus on any of the measures I outlined above. Instead, consider attacks that matter to your organization. A Red Team can help you figure this out.

You may decide to measure anything from replying to a phishing test with specific information, to downloading software in response to a prompt, to focusing solely on reporting phish, or any other action that represents the success of a real-world attack.

Remember, in most cases, measuring the rate at which you can trick a person to click a link is a measure of email marketing success. Not security.

About the Author

Josh Schwartz is a Senior Director of Technical Security for the Paranoids, the information security team at Verizon Media. He oversees an organization focused on offensive security assessments; red team methodology; building products that support security culture; and behavioral change initiatives.