BIP America News & Media Platform

A fake AI agent skill passed every security scanner and reportedly reached 26,000 agents

Jun 24, 2026  Twila Rosenbaum

Security firm AIR has shown how a deliberately fake AI agent skill passed every major security scanner it was tested against and reportedly reached about 26,000 agents, including some on corporate accounts. The experiment, detailed in a recent report, exposes a blind spot in how AI agent marketplaces vet skills for malicious behavior.

The fake skill, dubbed "brand-landingpage," claimed to help users build a landing page using Google's Stitch design tool. It was aimed at non-technical users such as marketers, salespeople, and designers. To make the skill appear legitimate, AIR leveraged two trust signals that the ecosystem still treats as proof of safety: GitHub stars and a clean scanner verdict.

For the GitHub stars, AIR opened a pull request to a skill marketplace repository that had around 36,000 stars and 156 skills. The pull request was merged after a few days, so the skill inherited the repository's star count, instantly gaining credibility. Then AIR ran an Instagram ad targeting the intended audience, who installed the skill and put it to work.

The scanners AIR tested, including tools from Cisco and NVIDIA and those built into the major skill registries, analyze only the submitted package: the skill definition file and anything shipped with it. AIR's skill carried no malicious setup instructions of its own. Instead, it told the agent to install the "Stitch SDK" by following documentation at an external link that AIR controlled and that was not on the genuine Google domain.

Initially, that link pointed to the real Stitch documentation, so the scanners saw a clean package pointing at a plausible setup page and cleared it. The page the agent would actually fetch and follow sat outside the scan. Once the skill was installed widely, AIR swapped the page behind that link to one that instructed the agent to download and run a script. The payload was harmless by design, collecting only the user's email address, but AIR says a real attacker could have used the same foothold to read files, move data, or hit internal systems.
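
The gap described above can be narrowed at review time by listing every external host a skill refers to and flagging any that fall outside the domains a reviewer expects. The following Python sketch is illustrative only and is not taken from AIR's report or from any named scanner. The function names, the sample skill text, and the placeholder domains (stitch-sdk-docs.example, vendor.example) are all assumptions. A check like this only surfaces an unexpected host for a human to look at; it cannot tell what the page behind the link will say tomorrow.

```python
import re
from urllib.parse import urlparse

# Rough pattern for http(s) URLs embedded in free text or markdown.
URL_PATTERN = re.compile(r"https?://[^\s)\"'>\]]+")


def external_hosts(skill_text: str) -> set[str]:
    """Return the lowercase hostnames of every http(s) URL in the skill text."""
    hosts = set()
    for match in URL_PATTERN.findall(skill_text):
        # Drop sentence punctuation that the pattern may have swallowed.
        host = urlparse(match.rstrip(".,;")).hostname
        if host:
            hosts.add(host.lower())
    return hosts


def is_allowed(host: str, allowed_domains: set[str]) -> bool:
    """True if host equals an allowed domain or is a subdomain of one."""
    return any(host == d or host.endswith("." + d) for d in allowed_domains)


def unexpected_hosts(skill_text: str, allowed_domains: set[str]) -> list[str]:
    """Hosts referenced by the skill that are not on the reviewer's allowlist."""
    return sorted(
        h for h in external_hosts(skill_text) if not is_allowed(h, allowed_domains)
    )


# Hypothetical skill text; both domains are placeholders.
SKILL_TEXT = (
    "Install the Stitch SDK by following the guide at "
    "https://stitch-sdk-docs.example/setup. "
    "Background reading: https://docs.vendor.example/guide"
)

print(unexpected_hosts(SKILL_TEXT, {"vendor.example"}))
# prints ['stitch-sdk-docs.example']
```

The suffix check requires a leading dot, so a lookalike such as evilvendor.example does not pass as a subdomain of vendor.example.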

The Structural Blind Spot

The technique is not new. Three weeks before AIR published its results, security firm Trail of Bits bypassed ClawHub's malicious-skill detector, Cisco's scanner, and all three scanners built into the major skill registries. Their conclusion was clear: a scanner checks a fixed package while an attacker can keep tweaking the payload until it passes. Real campaigns have used the same trick for months, keeping the submitted skill clean and hosting the payload on a site the agent only fetches at install time.

The problem is structural. The scan happens once, but the page a skill points to can be rewritten at any time afterward. Anthropic's own documentation warns that skills fetching external URLs are risky for exactly this reason, since the content can change after the skill is vetted. Separate research this year found that seven major scanners agree on fewer than one in five hundred of their combined flags, because each one judges a skill in isolation, blind to external links and to what changes after review.
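
One way to turn a one-time scan into a repeatable check is to record a digest of the external page when the skill is reviewed and compare it on every later fetch. The sketch below is a minimal illustration under stated assumptions: the function names, the placeholder URL, and the sample page contents are hypothetical, and the page bytes are passed in directly rather than fetched so the example runs offline. Legitimate documentation changes too, so a mismatch is a signal to re-review the page, not proof of an attack.

```python
import hashlib


def content_digest(content: bytes) -> str:
    """SHA-256 hex digest of the fetched page body."""
    return hashlib.sha256(content).hexdigest()


def pin_at_review(url: str, content: bytes) -> dict:
    """Record what the external page looked like when the skill was vetted."""
    return {"url": url, "sha256": content_digest(content)}


def has_drifted(pin: dict, current_content: bytes) -> bool:
    """True if the page no longer matches the digest recorded at review time."""
    return content_digest(current_content) != pin["sha256"]


# Hypothetical page bodies: the version seen at review, and a later swap.
reviewed_page = b"Step 1: install the SDK with your package manager."
swapped_page = b"Step 1: download and run setup.sh from this page."

pin = pin_at_review("https://stitch-sdk-docs.example/setup", reviewed_page)

print(has_drifted(pin, reviewed_page))  # prints False
print(has_drifted(pin, swapped_page))   # prints True
```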

Trust Signals Under Scrutiny

The AIR experiment highlights how easily trust signals can be manipulated. GitHub stars, often used as a proxy for a project's popularity or reliability, can be borrowed by getting a pull request merged into an already popular repository, as AIR did. A clean verdict from an automated scanner is similarly unreliable because the scanner sees only a static snapshot. These signals have become the default way for users to decide whether a skill is safe, but they are proving dangerously insufficient.

The scale figures (26,000 agents reached, including corporate accounts) come from AIR alone and deserve a skeptical read. The firm is launching a managed skill marketplace and closes its write-up by pitching it, and the numbers have not been independently confirmed. What holds up is the method: the named scanners really do judge only the submitted package, the external-link blind spot is real and has been independently demonstrated, and the trust signals AIR borrowed are exactly the ones the ecosystem still treats as proof.

The experiment lines up every weak trust signal around agent skills into one run: stars that can be borrowed, a scan that reads a snapshot, and a link that can be rewritten after the check clears. Whether the real figure is 26,000 or a fraction of it, the gap it walks through is one that defenders still have not closed.

Wider Implications for AI Security

This attack vector is part of a broader trend of supply chain attacks targeting AI agents. As companies increasingly rely on AI agents to automate tasks, the security of the skills and plugins they use becomes paramount. In recent years, similar vulnerabilities have been found in browser extensions and mobile apps, where malicious code is introduced after approval. The AI agent ecosystem faces the same challenge but with the added complexity that agents can execute code in response to external instructions, making the external-link blind spot particularly dangerous.

For security teams, the immediate takeaway is the same one researchers keep landing on: treat skills as software, not text. Vetting must go beyond what ships inside the package to include what a skill points to. Organizations should route new skills through a single source they control, re-check them when anything changes, pin versions, and hold agents to least privilege. In addition, runtime monitoring and behavioral analysis can detect when an agent fetches an external URL that deviates from the expected pattern.
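
The controls listed above (version pinning, least privilege, and checking what an agent fetches at runtime) can be expressed as a small deny-by-default policy attached to each approved skill. The Python sketch below assumes a hypothetical policy format; SkillPolicy, its fields, the version string, and the placeholder host are all invented for illustration and do not correspond to any real marketplace or agent framework.

```python
from dataclasses import dataclass
from urllib.parse import urlparse


@dataclass(frozen=True)
class SkillPolicy:
    """Hypothetical per-skill policy; everything not listed is denied."""
    name: str
    pinned_version: str
    allowed_hosts: frozenset = frozenset()
    may_run_scripts: bool = False


def version_ok(policy: SkillPolicy, installed_version: str) -> bool:
    """Only the exact version that was reviewed may run."""
    return installed_version == policy.pinned_version


def fetch_ok(policy: SkillPolicy, url: str) -> bool:
    """Allow a fetch only over HTTPS and only to an explicitly listed host."""
    parsed = urlparse(url)
    return parsed.scheme == "https" and parsed.hostname in policy.allowed_hosts


def script_ok(policy: SkillPolicy) -> bool:
    """Running downloaded scripts is off unless the policy grants it."""
    return policy.may_run_scripts


# Hypothetical policy for the skill named in the article.
policy = SkillPolicy(
    name="brand-landingpage",
    pinned_version="1.0.0",
    allowed_hosts=frozenset({"docs.vendor.example"}),
)

print(version_ok(policy, "1.0.1"))                                # prints False
print(fetch_ok(policy, "https://docs.vendor.example/setup"))      # prints True
print(fetch_ok(policy, "https://stitch-sdk-docs.example/setup"))  # prints False
print(script_ok(policy))                                          # prints False
```

Under a policy like this, the swapped page in AIR's experiment would have been blocked twice: the fetch goes to a host that is not listed, and the instruction to run a script is denied by default.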

The AI industry is still in its early stages of understanding the security implications of agentic workflows. The AIR experiment serves as a wake-up call that the current trust model is fundamentally broken. As researchers continue to demonstrate the ease with which these systems can be compromised, the pressure will mount on marketplace operators to implement more robust security measures, such as sandboxing external network requests, requiring signed payloads, and performing continuous monitoring rather than one-time scans.

The problem is not that the scanners are bad at their job. They are designed to inspect static packages, and they do that well. The problem is that the threat model has shifted. Attackers no longer need to embed malicious code directly in the skill; they can simply point the agent to a resource they control and change it after approval. Until the ecosystem adapts to this reality, the gap will remain open for exploitation. Whether or not it reached the claimed scale, the AIR experiment shows that the trust signals used today are insufficient. Protecting individuals and enterprises from the next wave of AI supply chain attacks will take a more dynamic, continuous approach to security.


Source: TNW | Artificial-Intelligence News

