The direct answer: no, a rigorously sourced, methodologically transparent statistic isolating the percentage of data breaches specifically caused by certificate or encryption failures, paired with a dedicated average cost figure, does not currently exist in any major breach research body’s published data. This article explains what was searched for, what was found instead, why one frequently circulated number should not be trusted, and what legitimately adjacent data does exist.
The Number That Circulates But Has No Traceable Source
A specific figure appears repeatedly across SSL statistics listicle sites: ‘12% of security breaches were caused by incorrect SSL settings or expired SSL certificates.’ This exact phrase, often word for word, appears on multiple aggregator pages. None of them attribute it to a named study, survey, sample size, methodology, or even a specific year. It is presented as a freestanding fact with no citation trail.
This is the same pattern this site’s other research has flagged repeatedly when examining statistics circulating in this space: a number with no traceable primary source, repeated across content aggregator pages that compile broad ‘SSL statistics’ listicles without distinguishing rigorously sourced findings from unsourced assertions. Searching specifically for the original study, survey, or report behind this 12% figure did not surface one. Until a primary source for this specific claim is identified, it should not be cited as fact, and this article does not repeat it as one.
What Legitimately Exists Instead: Certificate Population Characteristics
A real, named, methodologically described study does exist in this space, but it measures a different question than breach root-cause attribution. Enterprise Management Associates (EMA), in a study commissioned by AppViewX in 2023, measured the state of SSL/TLS certificates currently in use across the internet, not the cause of past breaches.
- Methodology: EMA gathered data from Google Trends (May 2018 to April 2023), Stack Exchange (2009 to 2022), and a Shodan scan conducted in May 2023 specifically targeting servers with SSL/TLS certificates on port 443. Ken Buckler, Director of Information Security Research at EMA, is named as leading the research.
- Finding: up to 25% of all certificates on the internet are expired or self-signed at any given time, broken down as 10% expired and 15% self-signed.
- Finding: nearly 80% of TLS certificates in use are vulnerable to man-in-the-middle attacks, because only 21% of internet-facing servers use TLS 1.3.
- Finding: 45% of IP addresses exposed to the top 10 known vulnerabilities also had expired (22%) or self-signed (23%) certificates, suggesting correlation between poor certificate hygiene and broader unpatched vulnerability exposure on the same infrastructure.
This is genuinely valuable, real research. But it measures certificate population health (what percentage of certificates currently in existence are expired or weak), not breach attribution (what percentage of actual confirmed breaches were caused by a certificate failure). These are different questions, and conflating them, as the uncited 12% figure appears to do, produces a statistic that sounds precise but is not actually traceable to evidence answering the specific question it claims to answer.
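As a quick consistency check on the EMA figures, and to make the denominator behind each one explicit, the arithmetic can be sketched in a few lines of Python. The dictionary layout, the variable names, and the denominator labels are illustrative assumptions rather than anything taken from the EMA report, and the sums assume that "expired" and "self-signed" are counted as disjoint groups.

```python
# Percentages as summarized from the EMA study above. The structure and the
# "denominator" labels are illustrative, not taken from the report itself.
EMA_FIGURES = {
    "expired": {"pct": 10, "denominator": "certificates observed"},
    "self_signed": {"pct": 15, "denominator": "certificates observed"},
    "tls_1_3": {"pct": 21, "denominator": "internet-facing servers"},
    "exposed_expired": {"pct": 22, "denominator": "IPs exposed to top 10 vulnerabilities"},
    "exposed_self_signed": {"pct": 23, "denominator": "IPs exposed to top 10 vulnerabilities"},
}


def pct(key: str) -> int:
    """Return the reported percentage for one figure."""
    return EMA_FIGURES[key]["pct"]


# Assumption: expired and self-signed certificates are disjoint groups.
poor_hygiene_pct = pct("expired") + pct("self_signed")
not_tls_1_3_pct = 100 - pct("tls_1_3")
exposed_poor_hygiene_pct = pct("exposed_expired") + pct("exposed_self_signed")

# None of the denominators is "confirmed breaches".
denominators = {entry["denominator"] for entry in EMA_FIGURES.values()}
measures_breaches = any("breach" in d for d in denominators)

print(poor_hygiene_pct, not_tls_1_3_pct, exposed_poor_hygiene_pct, measures_breaches)
# prints: 25 79 45 False
```

The sums reproduce the 25%, 79%, and 45% headline figures. Every denominator is a population of certificates, servers, or IP addresses, and none is a population of breaches.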
A separate, narrower 2025 study on TLS in healthcare web applications, cited by Paubox, found that 17% of healthcare websites examined had certificate-related weaknesses, specifically impending expiry within 30 days or incomplete certificate chains. This is, again, a certificate population characteristic specific to one industry’s web infrastructure, not a breach-cause attribution statistic.
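To make the "impending expiry within 30 days" criterion concrete, here is a minimal sketch of how such a check could be expressed in Python. It is not the healthcare study's method, which is not described here. The function name `expiry_status` and its return labels are hypothetical, and the sketch assumes the expiry date arrives in the `notAfter` text format that Python's `ssl` module uses. It does not cover the incomplete-chain weakness.

```python
import ssl
from datetime import datetime, timedelta, timezone


def expiry_status(not_after: str, now: datetime, window_days: int = 30) -> str:
    """Classify a certificate by its notAfter string.

    `not_after` uses the format "Jan 15 00:00:00 2030 GMT".
    `now` must be a timezone-aware datetime.
    """
    # ssl.cert_time_to_seconds parses the notAfter format into epoch seconds.
    expiry = datetime.fromtimestamp(ssl.cert_time_to_seconds(not_after), tz=timezone.utc)
    if expiry <= now:
        return "expired"
    if expiry - now <= timedelta(days=window_days):
        return "expiring_soon"
    return "ok"
```

A check like this describes the state of one certificate at one moment. Running it across many servers yields a population statistic of the kind discussed above, not a breach-cause figure.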
The Equifax Case: The Most Commonly Cited Example, Correctly Understood
Equifax’s 2017 breach is the example most frequently invoked when certificate failures and data breaches are discussed together, and it deserves precise treatment rather than being flattened into ‘a certificate caused the Equifax breach.’
The Equifax breach was caused by a software vulnerability in Apache Struts that attackers exploited to gain unauthorized access. A separate, contributing factor was an expired SSL certificate on an internal monitoring device. The expired certificate left that device unable to inspect encrypted traffic, so the unauthorized access through the Apache Struts vulnerability went undetected for an extended period, and the breach continued and expanded for longer than it otherwise would have.
The accurate framing, consistent with this site’s other coverage of the Equifax incident, is that the certificate expiry was a detection-delay contributing factor, not the root cause of the breach. Even sources that cite Equifax as an SSL-related breach example tend to hedge this correctly when describing the mechanism in detail, noting that most high-profile breaches result from a combination of vulnerabilities and security oversights rather than any single cause.
Why a Clean, Isolated Statistic Is Structurally Hard to Produce
The major breach research reports, including IBM’s Cost of a Data Breach Report and Verizon’s Data Breach Investigations Report, both covered in detail elsewhere on this site, categorize breaches by initial access vector and contributing factors using their own defined taxonomies. Neither report’s published category breakdown includes a standalone ‘certificate failure’ or ‘encryption failure’ root-cause bucket with its own dedicated percentage and cost figure.
The Verizon DBIR’s categories include vulnerability exploitation, credential abuse, phishing, supply chain compromise, and several others, none of which is specifically ‘certificate or TLS failure.’ This is not necessarily because certificate failures never play a role; the Equifax case demonstrates they sometimes do. It is because certificate failures most often appear as a contributing or detection-delay factor embedded within a different primary attack vector category, rather than as a standalone initial access method that a research taxonomy would track as its own first-order category. Equifax fits this pattern exactly: the primary vector was vulnerability exploitation via Apache Struts, and the contributing factor was a certificate expiry that disabled detection.
This structural reality is itself the most useful and honest finding this research can offer in place of a fabricated isolated statistic. Certificate and encryption failures are real, documented, and meaningfully damaging, but they function predominantly as force multipliers and detection-delay factors layered onto other root causes, not as a freestanding category of breach origin large enough to warrant its own line item in major breach research taxonomies. A rigorous researcher attempting to isolate this percentage would need to go back through individual breach post-mortems and judge, case by case, whether a certificate issue was a meaningful contributing factor. As far as this search was able to determine, no major published report has undertaken that project at scale.
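The distinction between a primary vector and a contributing factor can be illustrated with a small data model. This is a sketch under stated assumptions, not how IBM or Verizon actually structure their datasets: the `BreachRecord` class, its field names, and the category strings are all hypothetical, and the single record encodes only the Equifax description given above.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass(frozen=True)
class BreachRecord:
    """Hypothetical record shape: one primary vector, any number of contributing factors."""
    name: str
    primary_vector: str
    contributing_factors: Tuple[str, ...] = ()


# Encodes the Equifax case as described in this article.
equifax = BreachRecord(
    name="Equifax 2017",
    primary_vector="vulnerability_exploitation",
    contributing_factors=("expired_certificate_disabled_detection",),
)


def tally_primary_vectors(records: Iterable[BreachRecord]) -> Counter:
    """Count breaches by first-order category, as a headline taxonomy would."""
    return Counter(r.primary_vector for r in records)


def tally_contributing_factors(records: Iterable[BreachRecord]) -> Counter:
    """Count contributing factors, which a first-order tally never surfaces."""
    return Counter(f for r in records for f in r.contributing_factors)
```

In a tally by primary vector, the Equifax record counts once under vulnerability exploitation and the certificate expiry does not appear at all. It only becomes visible if contributing factors are recorded and counted separately.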
What a Rigorous Version of This Statistic Would Require
For context on why this gap exists and what would need to happen to fill it credibly:
- A large sample of confirmed, publicly disclosed breaches with detailed root-cause and contributing-factor documentation for each, similar in scale to Verizon’s DBIR dataset
- A consistent definition of what counts as a certificate or encryption related contributing factor, distinguishing primary cause from detection-delay factor from unrelated coincidence
- Independent verification of each classification, since post-mortem reports from breached organizations do not always disclose this level of technical detail publicly
- A cost attribution methodology that can isolate the marginal cost impact specifically attributable to the certificate-related factor within a breach that also had other causes, which is methodologically difficult even when the underlying breach data is complete
This is a genuinely hard research project, which is likely part of why no major report has published it. Anyone encountering a confident-sounding isolated number for this specific question in the future should ask for exactly this kind of methodological detail before treating the figure as reliable.
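The first three requirements can be sketched as a classification scheme plus a share calculation that only counts independently verified records. Everything here is a hypothetical illustration: the `CertRole` categories, the `ClassifiedBreach` fields, and the `share_with_role` helper are assumptions made for this sketch, and no real breach data is included. Cost attribution, the fourth requirement, is deliberately left out.

```python
from dataclasses import dataclass
from enum import Enum
from fractions import Fraction
from typing import Iterable, Optional


class CertRole(Enum):
    """Hypothetical roles a certificate issue could play in a breach."""
    PRIMARY_CAUSE = "primary_cause"
    DETECTION_DELAY = "detection_delay"
    UNRELATED = "unrelated"


@dataclass(frozen=True)
class ClassifiedBreach:
    breach_id: str
    cert_role: CertRole
    independently_verified: bool


def share_with_role(
    records: Iterable[ClassifiedBreach], role: CertRole
) -> Optional[Fraction]:
    """Share of verified breaches with the given role, or None if nothing is verified."""
    verified = [r for r in records if r.independently_verified]
    if not verified:
        # Refuse to report a percentage with no verified sample behind it.
        return None
    matching = sum(1 for r in verified if r.cert_role is role)
    return Fraction(matching, len(verified))
```

Returning `None` when no record is verified is a design choice: it forces the caller to deal with the absence of evidence rather than receive a number that looks precise.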
Frequently Asked Questions
If this statistic doesn’t exist, why does it keep appearing on SSL statistics pages?
The pattern observed in this research, and in other statistics-focused articles on this site, is that aggregator and listicle-style content frequently compiles numbers from a wide range of sources without consistently verifying the primary source of each claim. Once an unsourced number appears on one site, it tends to be copied, sometimes verbatim, to similar pages, creating an appearance of consensus that can be mistaken for verification. The repetition of the same uncited 12% figure across multiple near-identical pages is evidence of this copying pattern, not of multiple independent confirmations of the underlying claim.
Does this mean certificate-related breaches are not a real or significant problem?
No. This site’s other coverage demonstrates that certificate mismanagement is a real, well-documented, and costly problem: see the detailed analysis of the Equifax case, the documented certificate outage cost data from Red Sift and CyberArk’s research, and the EMA certificate population study summarized in this article. What does not exist is a single clean figure combining percentage of breaches and dollar cost for this category specifically, distinct from the broader breach cost and cause statistics published by IBM and Verizon. The absence of one specific statistic does not mean the underlying risk category is not real or not worth addressing.
Should I cite the EMA certificate population statistics (25% expired or self-signed, 79% MitM-vulnerable) as breach-cause data?
No, and this article specifically distinguishes the two to prevent that conflation. The EMA figures describe what percentage of certificates currently in existence on the internet are expired, self-signed, or vulnerable to specific attack types. They do not describe what percentage of confirmed past breaches were caused by those certificate weaknesses. Citing the EMA population statistics is accurate when the claim being made is about certificate hygiene across the internet; it becomes inaccurate if restated as a breach-cause attribution claim, which is a different measurement the EMA study was not designed to produce.
