AI coding assistants promise speed and productivity, but they’re also leaking credentials at alarming rates. GitGuardian’s 2026 State of Secret Sprawl Report reveals 29 million hard-coded secrets hit public GitHub in 2025 alone—a 34% year-over-year increase. Claude Code-assisted commits showed a 3.2% leak rate versus 1.5% for human-only commits. Meanwhile, supply chain attacks are stealing secrets directly from developer laptops, and over 24,000 secrets have already leaked from MCP config files.
The shift from perimeter-based security to assume-breach strategies has never been more urgent. Attackers are no longer breaking in—they’re logging in with stolen credentials.
The Guest: Dwayne McDaniel, Senior Developer Advocate at GitGuardian
Key Takeaways
- AI coding tools are leaking secrets at 2x the baseline rate, with Claude Code-assisted commits showing a 3.2% leak rate in 2025
- Supply chain attacks like Shai-Hulud targeted developer laptops—44% of compromised machines contained 10+ secrets, including GitHub and GitLab tokens
- Over 24,000 secrets leaked from MCP config files in the first year, including Postgres credentials and AI platform API keys
- Internal repos are 6x more likely to leak secrets than public ones due to a false sense of security
- Non-human identity (NHI) governance is the next evolution beyond detection: inventory, lifecycle management, and workload-based authentication
***
In this exclusive interview with Swapnil Bhartiya at TFiR, Dwayne McDaniel, Senior Developer Advocate at GitGuardian, discusses the escalating secret sprawl crisis, how AI coding tools are changing the threat landscape, and why enterprises must move beyond detection-only strategies to comprehensive non-human identity governance.
The State of Secret Sprawl: 29 Million Credentials Leaked in 2025
GitGuardian has been monitoring public GitHub commits since 2018, scanning every new commit and tracking credential leaks in real time. The 2025 data reveals an unprecedented surge in hard-coded secrets, driven by AI coding tools, supply chain attacks, and a fundamental misunderstanding of where secrets are safe.
Q: What exactly is GitGuardian tracking, and why should developers care about these numbers?
Dwayne McDaniel: “GitGuardian has been around since 2017. We first got our start looking at GitHub public events, anything that’s in the public event stream, and we realized pretty quickly, the founders realized, hey, there’s a lot of secrets in here. There’s a lot of credentials, access tokens, API keys and whatnot. So they started making phone calls and realized companies weren’t aware of this. Since 2018 we’ve looked at every new commit that hits GitHub publicly, and anything that became public that used to be private. And we collect those stats when we find a secret right then and there, we email the committer and say, hey, you did this publicly. You should know about this.”
Dwayne McDaniel: “We’re talking about leaked credentials that allow direct access. One of my favorite definitions of a secret is it’s a piece of data that, by itself, allows access. So if you can read it in plain text, anyone that gets it can read it in plain text, be it a database connection URL, a username password pair, your old classic API key. The fact is, we’re building in such a way now that there’s no perimeter anymore. You can basically make a call to an endpoint, and if you can reach that endpoint and the key works, then you can exfiltrate data, you can take over machines, and do a lot of nefarious things.”
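The kind of plain-text credential McDaniel describes is exactly what pattern-based scanning catches. As a rough illustration only (real scanners like GitGuardian's combine hundreds of provider-specific detectors with entropy and context checks, and these three regexes are simplified stand-ins), a minimal sketch might look like:

```python
import re

# Illustrative detection patterns -- simplified stand-ins, not any
# vendor's actual detectors.
PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "github_pat": re.compile(r"\bghp_[A-Za-z0-9]{36}\b"),
    "postgres_url": re.compile(r"postgres(?:ql)?://[^:\s]+:[^@\s]+@[^\s\"']+"),
}

def find_secrets(text: str) -> list[tuple[str, str]]:
    """Return (detector_name, matched_string) pairs found in a text blob."""
    hits = []
    for name, pattern in PATTERNS.items():
        for match in pattern.findall(text):
            hits.append((name, match))
    return hits

# A database connection URL is a secret by the definition above:
# anyone who can read it can use it.
diff = 'DATABASE_URL = "postgresql://admin:hunter2@db.internal:5432/app"'
print(find_secrets(diff))
# -> [('postgres_url', 'postgresql://admin:hunter2@db.internal:5432/app')]
```

In practice a scanner runs this kind of matching over every new commit as it lands, which is what makes real-time alerting to the committer possible.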
The report documented 28.64 million hard-coded credentials across public GitHub repositories, a 34% increase over the previous year and a 152% increase over GitGuardian’s first report in 2021, measured with the same methodology. The developer population grew by only 98% over the same period, meaning secrets are leaking faster than new developers are arriving.
AI Coding Tools: 2x the Leak Rate, 3-4x the Code Volume
The rise of AI coding assistants like Claude Code, GitHub Copilot, and Cursor has promised unprecedented developer productivity. GitGuardian’s data reveals a more complex picture: AI-assisted commits are producing more code—and more leaks.
Q: Your report shows Claude Code-assisted commits had a 3.2% secret leak rate versus 1.5% baseline. Are AI tools making us less secure?
Dwayne McDaniel: “I want to be crystal clear. We’re not saying AI in and of itself is bad. What we have is the data. The data says, if you co-signed your commits with Claude Code—that’s an ability you have in Git; multiple people can sign the same commit with annotations. If you let Claude Code go ahead and sign the commit, or co-sign the commit, over the course of 2025, you were over twice as likely to commit a secret. Also, you were much more likely to commit a lot more code, because you produced two to four times the number of lines of code per commit across the year.”
Dwayne McDaniel: “What we’re inferring from that is, if you have Claude do the work and then you’re checking it, you’re more than likely just signing the commit yourself. But if you’re letting Claude do the work, running a quick test, and saying, yep, that functions okay—Claude, just complete the transaction—that’s when we see the co-commit signature showing up. We saw nothing at the beginning of the year because no one was signing their commits with Claude Code at the beginning of the year. And by August, it had ramped up dramatically, almost 4x the baseline. Then they released a new model, Opus, and it started edging back down. It never crossed back to parity. It never went back below the base human level, but the models are making it slightly better.”
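The co-signing McDaniel refers to is implemented as a Git commit trailer; Claude Code by default appends a `Co-Authored-By: Claude <...>` line to commits it completes (the exact email address may vary by version, so the sketch below matches loosely). A minimal way to measure the share of AI co-signed commits in a history, assuming you have already collected the commit messages, e.g. via `git log --format=%B%x00`:

```python
import re

# Claude Code appends a "Co-Authored-By: Claude <...>" trailer by
# default; match loosely since the exact address may vary.
AI_TRAILER = re.compile(r"^Co-Authored-By:.*\bClaude\b",
                        re.IGNORECASE | re.MULTILINE)

def ai_cosigned_share(commit_messages: list[str]) -> float:
    """Fraction of commit messages carrying a Claude co-author trailer."""
    if not commit_messages:
        return 0.0
    cosigned = sum(1 for msg in commit_messages if AI_TRAILER.search(msg))
    return cosigned / len(commit_messages)

messages = [
    "Fix login bug",
    "Add retry logic\n\nCo-Authored-By: Claude <noreply@anthropic.com>",
]
print(ai_cosigned_share(messages))  # 0.5
```

This is the same signal GitGuardian used to split the commit population: trailer present versus absent, then compare leak rates between the two groups.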
The types of leaked secrets also shifted dramatically. Twelve of the top 15 most commonly leaked secrets with the largest year-over-year growth were AI tools themselves: OpenAI keys, DeepSeek credentials, OpenRouter tokens, and the infrastructure surrounding AI development platforms.
Q: When Anthropic shipped the new Claude model version, did it change the leak rate, or did developer behavior stay the same?
Dwayne McDaniel: “We did see a marked improvement. The tracking over the year starts with nothing being co-signed by Claude Code. We see a precipitous drop-off after August where it edges back down. It never gets quite to the same level of the base human level, but it did improve. The ultimate issue with letting Claude just run and just build this stuff is it’s still trained on templates. It’s still trained off of how we’ve built everything before. If it’s trained on a lot of data that says hard-code the credential, eventually, sometimes, even if it has explicit instructions to never do that, it’s still going to do that sometimes.”
Supply Chain Attacks: The Shai-Hulud Worm and Developer Laptop Compromise
While public repository leaks dominate headlines, the more insidious threat comes from supply chain attacks targeting developer machines directly. The Shai-Hulud worm, distributed via npm packages, executed credential-stealing malware upon installation and exfiltrated secrets to public repositories via double Base64 encoding.
Q: You used the Shai-Hulud supply chain attack to look inside developer laptops. What did you find, and what does that tell us about where secrets are actually living today?
Dwayne McDaniel: “Shai-Hulud was not the first in the line of attacks. It really started with the NX attack and maybe TJ Actions. The goal of the supply chain attack is to steal developer credentials. We’ve seen that here in 2026 with npm packages, such as the Trivy attack, Hix Lite, and LLM-related attacks a few weeks ago. We looked at the same paths, like, what did the actual malware try to do? Where did it look? What credentials did it try to steal?”
Dwayne McDaniel: “It was an npm package. It would execute upon install as a credential stealer, take any credentials it found, double-Base64-encode them, and then push them to a public repo. Since it’s public and we could find those, we did, and we looked inside. 44% of the compromised machines that we could see evidence of contained 10 or more secrets—10 or more credentials to something. These could be production instances or GitHub tokens. In fact, that’s the majority of the actual secrets we found: 581 Personal Access Tokens and 101 GitLab tokens that were over-permissioned. So if someone had that, they basically could take over your entire codebase.”
Dwayne McDaniel: “The tactic has shifted from attacking the running instance of production software to try to get directly to that customer data to—since the perimeter is literally everywhere—attacking the person who has the most permissions and the most privilege out there, and that is the developer. That is the CI/CD workflow. No matter how the actual code is executed, no matter how the prompt is injected, the end result is the same: gather all the credentials from the local machine and ship them somewhere so the cycle can keep on going.”
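The exfiltration path McDaniel describes, double Base64 encoding pushed to a public repository, is trivially reversible, which is how researchers could look inside the stolen data. A sketch of the round trip (the JSON payload shape here is illustrative, not the worm's actual format):

```python
import base64
import json

def decode_exfil(blob: str) -> dict:
    """Undo a double Base64 encoding of a JSON payload, the kind of
    light obfuscation Shai-Hulud-style stealers used before pushing
    stolen credentials to a public repo (payload shape is illustrative)."""
    once = base64.b64decode(blob)
    twice = base64.b64decode(once)
    return json.loads(twice)

# Simulate the attacker side to show the round trip.
stolen = {"GITHUB_TOKEN": "ghp_example000000000000000000000000000000"}
payload = base64.b64encode(
    base64.b64encode(json.dumps(stolen).encode())
).decode()
print(decode_exfil(payload))
```

Double encoding defeats naive grep-style detection of token prefixes in the exfiltrated blob, but once the repo is public, anyone, defender or attacker, can decode it in two lines.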
MCP Security: 24,000 Secrets Leaked in Year One
Anthropic’s Model Context Protocol (MCP) revolutionized how AI assistants interact with development tools and external services. It also introduced a new attack surface. GitGuardian discovered over 24,000 unique secrets leaked from MCP configuration files within the first year of MCP’s release.
Q: Over 24,000 secrets leaked from MCP config files in the first year. Is this a problem with MCP design or how configs are configured?
Dwayne McDaniel: “We found over 24,000 unique secrets. But if you look at the distribution, almost 20% of that was Google API keys. There’s this truth that Google API keys, at one time, depending on the service we were talking about, weren’t really a big deal. Google Maps is a classic example—they publish your key and it’s just more of an identifier. But that actually changed with Gemini, so now Gemini is relying on the same API key setup. How many of those keys now allow access? That’s further research that needs to be done.”
Dwayne McDaniel: “Postgres database credentials and connection strings, Firecrawl API keys, Perplexity, Brave Search—these are the kind of secrets that we’re seeing leaked out there. In the case of Postgres, we know exactly what you do with that: you get into the production database. We’ve never handled identity-based authentication very well at scale, and we’re seeing the evidence of that with this new round of technology. MCP servers are just API connectors at the end of the day, and if you’re relying on API keys, things that hold authorization and authentication information with standing privilege—that, in and of itself, is a much bigger, broader issue. We’re just seeing it manifested in a new way with new tech.”
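The leaks McDaniel describes typically live in MCP client config JSON, where server definitions carry an `env` block or command arguments, and that is where literal keys and credentialed connection strings end up. As a hedged sketch (the config shape below mirrors common MCP client files, but field names and server entries are hypothetical examples), a pre-commit check could flag values that look like literal secrets rather than `${VAR}` references:

```python
import json

# Hypothetical MCP config snippet -- server names, keys, and the
# connection string are made-up examples of the leak patterns described.
config = json.loads("""
{
  "mcpServers": {
    "postgres": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-postgres",
               "postgresql://admin:hunter2@db.internal/app"]
    },
    "search": {
      "command": "brave-search",
      "env": {"BRAVE_API_KEY": "BSA-not-a-real-key-1234567890"}
    }
  }
}
""")

def flag_inline_secrets(cfg: dict) -> list[str]:
    """Flag env values and args that look like literal credentials
    rather than ${VAR}-style references resolved at runtime."""
    findings = []
    for name, server in cfg.get("mcpServers", {}).items():
        for key, value in server.get("env", {}).items():
            if value and not value.startswith("${"):
                findings.append(f"{name}: env {key} is hard-coded")
        for arg in server.get("args", []):
            # user:password@host in a URL is a credentialed connection string
            if "://" in arg and "@" in arg:
                findings.append(f"{name}: credentialed connection string in args")
    return findings

print(flag_inline_secrets(config))
```

Both findings here map directly to the leak categories in GitGuardian's data: AI-service API keys and Postgres connection strings with standing privilege baked in.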
Internal Repos: 6x More Likely to Leak Secrets
Conventional wisdom suggests private repositories are inherently safer than public ones. GitGuardian’s data reveals the opposite: internal repos are six times more likely to leak secrets than public repositories, driven by a false sense of security.
Q: Internal repos are six times more likely to leak secrets than public ones. Why is that assumption wrong, and what’s actually happening inside internal repos?
Dwayne McDaniel: “Short answer is, there’s a false sense of security. There was a time when private meant it was really hard to get to, and meant that the likelihood of it being a danger was fairly minimal. We had closed networks, but the perimeter is the internet now. We’re seeing cases where internal repos that were never supposed to see the light of day suddenly see the light of day. Claude’s code itself saw the light of day. That code shouldn’t have been exposed publicly. The Cisco breach—over 300 repos have been adversarially made public.”
Dwayne McDaniel: “The idea that it will be okay because only my team will see it—we have to assume breach. That’s one of the tenets of zero trust. We simply must assume, if I can see it in plain text, then an attacker probably already has it. An attacker is probably already doing something nefarious with it. If it’s never in plain text, or it’s only in plain text for the second I make it and I paste it into the vault, and that’s it, and then it’s rotated as quick as I can after that, that is the only real defense we have at this point—to just eliminate these plain text secrets floating around.”
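The practical first step toward eliminating plain-text secrets from code is to make every credential a runtime lookup that fails loudly when missing. A minimal sketch (reading from the process environment as a stand-in; in production the same call would hit a vault such as HashiCorp Vault or AWS Secrets Manager):

```python
import os

def require_secret(name: str) -> str:
    """Fetch a credential at call time instead of hard-coding it.
    Here the environment is a stand-in for a vault lookup; the point
    is that the value never appears in the codebase in plain text."""
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(
            f"{name} is not set -- inject it at runtime, do not commit it"
        )
    return value

# Stand-in for a deploy-time injection from a vault or CI secret store.
os.environ["DB_PASSWORD"] = "injected-at-deploy-time"
print(require_secret("DB_PASSWORD"))
```

Pair this with short-lived, rotated credentials and the window McDaniel describes, plain text only for the second between creation and the vault, becomes the norm rather than the exception.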
From Detection to Non-Human Identity Governance
GitGuardian scans billions of commits and sends millions of alerts annually, yet the leak numbers continue to rise. McDaniel argues that detection alone cannot solve the secret sprawl crisis—enterprises must adopt comprehensive non-human identity governance frameworks.
Q: You folks scan billions of commits and send millions of alerts, yet the numbers keep going up. At what point do we admit that detection alone is never going to solve this problem?
Dwayne McDaniel: “Detection is part of the answer, and we shouldn’t discredit that. Not knowing about the problem means you can’t do anything about it. But it’s definitely not the solution. Noise without signal is just noise. All these alerts that you should do something about, but there’s no plan to do something about it. We’ve architected our platform so it starts with detection absolutely, but then we also want to build a full inventory—what’s in the vaults as well, and what is your plan to move to a better system.”
Dwayne McDaniel: “If all you’re doing is swapping secret one for a new secret, but it’s also hard-coded, you haven’t solved anything. You’ve kicked the can down the road. If you’ve moved everything to vaults, you’ve taken a great step. If you’ve moved completely away from secrets and you’re using detection just to see what we have left to do, moving to SPIFFE/SPIRE or those kind of workload-based or identity-based workload authentication mechanisms—you can’t get there immediately. The messy reality of the enterprise is you have to have a plan on what you’re doing to be more safe, and the more you can automate that, the safer you’re going to be.”
Q: From a practical practitioner’s point of view, where does a security team start Monday morning?
Dwayne McDaniel: “Governance is an outcome, it’s not a product. The first thing you should do in threat modeling, the first thing you should do in any good security exercise: understand what you have. Take an inventory, figure out what secrets are out there. Start with your local machine, start with your production environments, and move out from there. Governance is the observable state of things, and whether each one is in or out of your plan. We give full visibility into secrets themselves and unify that view. Is this owned by a person? Who is that person? When was it issued? When should it sunset? What permissions does this have? Is it over-permissioned?”
Dwayne McDaniel: “In an emergency, when a breach does happen, you need to reach for that information immediately and automate the response as much as possible. So: is the secret, and therefore the non-human identity that uses that secret, in a good state? Non-human identities are all of the running instances of software, the physical machines themselves, anything that’s not a person that exists within a larger ecosystem that needs to connect to other systems to do its work. In order to make those connections happen, you need some way to authenticate. The way we’ve been doing that traditionally has been secrets, API keys, certificates—things that can be found publicly and in private repos, found in plain text, and then reused.”
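The governance questions above (owner, issue date, sunset date, permissions) translate directly into an inventory record. As a sketch of what such a record might track, with hypothetical field names and an example CI token, not any vendor's schema:

```python
from dataclasses import dataclass, field
from datetime import date

# Hypothetical inventory record mirroring the governance questions:
# who owns the credential, when it was issued, when it should sunset,
# and what it is allowed to do.
@dataclass
class NonHumanIdentity:
    name: str
    owner: str               # responsible human or team
    issued: date
    sunset: date             # planned rotation/retirement date
    permissions: set[str] = field(default_factory=set)

    def needs_rotation(self, today: date) -> bool:
        return today >= self.sunset

    def over_permissioned(self, required: set[str]) -> set[str]:
        """Permissions granted beyond what the workload actually needs."""
        return self.permissions - required

ci_token = NonHumanIdentity(
    name="gitlab-ci-deploy",
    owner="platform-team",
    issued=date(2025, 1, 10),
    sunset=date(2025, 4, 10),
    permissions={"read_repository", "write_repository", "admin"},
)
print(ci_token.needs_rotation(date(2026, 2, 1)))          # True
print(ci_token.over_permissioned({"read_repository"}))
```

With this kind of inventory in place, breach response becomes a query (which identities touch the compromised system, who owns them, what can they do) followed by automated revocation, rather than a scramble.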
Watch the Full Interview
Watch the complete conversation with Dwayne McDaniel on TFiR’s YouTube channel