Mexico Government Cyberattack via Anthropic's Claude

Summary

From December 2025 to January 2026, an attacker used two well-known AI tools, Anthropic's Claude and OpenAI's ChatGPT, to breach 10 Mexican government agencies and a financial institution. By jailbreaking Claude through persistent prompting framed as a bug-bounty exercise, the attacker generated working SQL-injection payloads, credential-stuffing scripts, and data-exfiltration tools. Over roughly one month, 150 GB of data was stolen, exposing an estimated 195 million identities, including civil registry files, tax records, and voter data. Israeli cybersecurity firm Gambit Security discovered the breach.
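
The reporting doesn't share the attacker's actual payloads, but to make the SQL-injection piece concrete, here is a minimal, hypothetical sketch of the flaw class in Python. Everything in it (the in-memory database, the login functions, the table) is invented for illustration; it shows the textbook pattern, not anything from this incident.

```python
import sqlite3

# Illustrative sketch only: an in-memory SQLite database standing in for a
# real backend. None of this comes from the breached systems.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (username TEXT, password TEXT)")
conn.execute("INSERT INTO users VALUES ('admin', 'secret')")

def login_vulnerable(username: str, password: str) -> bool:
    # Building the query by string concatenation lets user input rewrite
    # the SQL itself -- this is the root cause of SQL injection.
    query = (
        "SELECT * FROM users WHERE username = '" + username + "' "
        "AND password = '" + password + "'"
    )
    return conn.execute(query).fetchone() is not None

def login_safe(username: str, password: str) -> bool:
    # Parameterized queries bind input as data, never as SQL.
    query = "SELECT * FROM users WHERE username = ? AND password = ?"
    return conn.execute(query, (username, password)).fetchone() is not None

# The classic payload turns the WHERE clause into a tautology.
print(login_vulnerable("admin", "' OR '1'='1"))  # True -- check bypassed
print(login_safe("admin", "' OR '1'='1"))        # False -- payload is inert
```

The fix has been known for decades (parameterized queries), which is part of what makes this breach so striking: the AI didn't need to invent anything novel, just apply a well-documented attack against systems that hadn't applied a well-documented defense.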

Why it matters

This incident shows how ‘simple’ cyberattacks have become. The threat actor did not demonstrate advanced technical skills, nor did he rely on nation-state resources or custom-built malware. Instead, he used two widely available AI tools, the same tools millions of people use every day. This underscores a growing concern: with only basic computer knowledge, individuals can now carry out attacks that previously required significant skill.

The case also demonstrates that jailbreaks remain a persistent challenge. Modern AI models are designed with guardrails to prevent harmful outputs, yet determined users can still bypass these protections through repeated or carefully reframed prompts. In this instance, the attacker eventually coerced the model into generating content it should have refused. Similar incidents have occurred before, including cases where users manipulated prompts to extract sensitive information such as software license keys. These examples reveal a meaningful gap in current AI safety mechanisms.

AI continues to be an incredibly powerful tool in cybersecurity, capable of processing workloads in minutes that would take human teams months. However, I think we can all agree AI is a double-edged sword. In this attack, the AI system assisted with reconnaissance, exploit development, and scripting, playing several roles at once. Cyber incidents occur daily, whether driven by sophisticated adversaries or inexperienced script kiddies experimenting with malicious activity. As AI capabilities increase, so does the potential for misuse.

My thoughts

Honestly, I think it’s wild. Reading about this attack made me realize that I could have done the same thing with the tools I already use. This wasn’t some skilled hacker backed by a government. He didn’t write custom malware or build anything that sophisticated. He literally just used AI tools that anyone can access. Someone out there with the same level of skill could get inspired by this attack.

I remember asking ChatGPT about malware once, just a hypothetical scenario, and immediately getting a warning about violating guidelines. That’s what I expected. But then you look at tools like WormGPT, which was basically ChatGPT’s evil twin: no guardrails, no ethics, nothing. I know it got shut down, but copycats keep popping up. I checked one out because it was mentioned in a course I was taking, and the comments section was full of people from all over the world asking for the “right prompt” to generate new malware. It was kind of funny.

Then there’s HackerGPT. The name alone sounds sketchy, and I’m sure people have tried to use it for malicious stuff. When I tested it and asked the same question I asked ChatGPT, it didn’t warn me; it just answered. It is a great tool when used ethically, though. That was the moment I realized how easy it is for someone to misuse these tools if they really want to.

People always talk about AI taking over jobs, but I don’t think AI will replace humans. Humans are still smarter, more adaptable, and more creative. AI can support analysts and researchers, but it can’t fully replace human judgment or intuition. Still, this whole situation shows how AI can be a double-edged sword: it can help defend systems, but it can also make it easier for people to launch nefarious attacks.
