● PRISM - Cybersecurity

The AI Stack Got Poisoned: LiteLLM Supply Chain Attack Hit 47,000 Developers in 46 Minutes

By PRISM Bureau | March 26, 2026 | Cybersecurity / AI Infrastructure

A three-stage malware campaign hit LiteLLM - one of the most widely used AI model routing libraries - through a poisoned PyPI package. SSH keys, cloud credentials, Kubernetes clusters. All of it in under an hour.

Supply chain security attack visualization

Supply chain attacks target the infrastructure developers trust most. (Pexels)

On March 24, 2026, at 10:52 UTC, a malicious version of litellm landed on PyPI. Within 46 minutes, it had been downloaded 46,996 times. By the time PyPI quarantined the package, anyone who had run a fresh pip install litellm during that window had already had their SSH keys, cloud credentials, and crypto wallet files silently exfiltrated to an attacker-controlled server.

LiteLLM is not some obscure utility. It is the connective tissue of the AI developer ecosystem - a library that lets applications talk to OpenAI, Anthropic, Google, Mistral, and hundreds of other model providers through a unified API. According to PyPI download statistics, it sees roughly 3.4 million downloads per day. It sits deep inside the dependency trees of 2,337 other packages. When it gets poisoned, the blast radius is not small.

The attack, attributed to a threat actor group known as TeamPCP (also tracked as PCPcat, Persy_PCP, ShellForce, and DeadCatx3), did not start with LiteLLM. It started five days earlier with Trivy - an open source security scanner. That detail matters. Understanding why requires tracing the full chain.

46,996

Malicious Downloads

46 min

Window Open

2,337

Packages At Risk

88%

Had No Version Pin

How You Compromise a Security Scanner to Steal Developer Keys

The attack began weeks before LiteLLM was touched. (Pexels)

The chain of trust in modern software development is long and fragile. Developers trust their CI/CD pipelines. CI/CD pipelines trust GitHub Actions. GitHub Actions trust the repos they pull from. Each link is an attack surface.

TeamPCP identified a pull_request_target workflow vulnerability in the trivy-action GitHub repository. Trivy is a popular open source vulnerability scanner built by Aqua Security, widely used in CI/CD pipelines - including LiteLLM's. The workflow misconfiguration (tracked as GHSA-9p44-j4g5-cfx5) allowed an attacker to open a pull request that would run in the context of the target repository, giving them access to repository secrets including deployment credentials.

In late February 2026, a pull request from an account called "MegaGame10418" exploited this flaw to exfiltrate the aqua-bot GitHub credentials. With those credentials, the attackers could push to the trivy-action repository as if they were the maintainer.

On March 19, they used that access to rewrite Git tags in the trivy-action repository. Version tag v0.69.4 was silently pointed to a malicious release containing the same credential-harvesting payload that would later appear in LiteLLM. Any CI pipeline pinned to @v0.69.4 - including LiteLLM's own pipeline - was now running attacker code.

Four days later, the attackers used the foothold in LiteLLM's pipeline to extract the PyPI publisher credentials for the LiteLLM package maintainer. With those credentials, they could publish new versions of LiteLLM to PyPI directly, as if they were the legitimate maintainer. No code review. No merge request. Straight to the package index.

"The attack started five days before anyone noticed LiteLLM was compromised. The security tool was used as the entry point to compromise the software it was supposed to be scanning." - Snyk Security Research Team, March 24, 2026

On March 23, the attackers registered models.litellm.cloud - a domain name designed to look like legitimate LiteLLM infrastructure - as the exfiltration endpoint. They also prepared infrastructure around checkmarx.zone, mimicking Checkmarx, another security vendor. The entire operation had been months in planning.

Two Versions, Two Attack Vectors, One Objective

LiteLLM 1.82.7 and 1.82.8 used different delivery mechanisms but harvested the same data. (Pexels)

When TeamPCP finally moved on March 24, they published two distinct malicious versions thirteen minutes apart. Each used a different technical mechanism, suggesting the second was an escalation or fallback - deployed specifically because the first had a narrower blast radius.

LiteLLM 1.82.7 injected a payload into proxy_server.py. When the file was imported, it dropped a secondary script p.py that executed the credential-harvesting routine. This version primarily affected users running LiteLLM's proxy server mode - a common deployment for organizations routing multiple AI model calls through a centralized endpoint. If you used LiteLLM only as a direct SDK and never imported litellm.proxy, version 1.82.7 would not have triggered its payload. Exfiltration target: checkmarx.zone/raw.

LiteLLM 1.82.8 was more aggressive. It included a .pth file named litellm_init.pth embedded directly in the Python site-packages directory. The .pth mechanism is a rarely-discussed but highly dangerous feature of Python's startup machinery: any .pth file placed in site-packages containing a single line of Python code will execute automatically on every interpreter startup - before any application code runs, before any imports, before anything the developer controls.

The implication is stark: the payload in 1.82.8 ran during the pip install itself. The act of downloading and installing the package was sufficient to trigger the malware, even if you never ran a single line of your application. CI/CD pipelines that installed litellm in a fresh environment during the 46-minute window were compromised before they executed a single test. Exfiltration target: models.litellm.cloud.

The fork bomb that ultimately exposed the attack was an unintended bug in 1.82.8's design. Because .pth files trigger on every interpreter startup, the payload's use of subprocess.Popen to spawn a child Python process caused a chain reaction: the child also triggered the .pth, spawning another child, ad infinitum. A machine at FutureSearch rapidly filled with 11,000 Python processes, consuming all available RAM. That was the first visible sign something was wrong.

What the Malware Actually Did

The payload operated in three stages: collect, exfiltrate, persist. (Pexels)

Once the fork bomb was identified and traced, FutureSearch researcher Callum McMahon decoded the litellm_init.pth payload - 34,628 bytes, double base64-encoded. What he found was a professionally built, three-stage credential theft and persistence tool.

Stage 1 - Collection. The payload systematically harvested sensitive files from the compromised host. The target list reads like a complete inventory of developer credentials:

SSH private keys and config files (~/.ssh/id_rsa, ~/.ssh/config)
.env files in the current directory and parent directories
AWS credentials (~/.aws/credentials, ~/.aws/config)
GCP Application Default Credentials (~/.config/gcloud/)
Azure CLI token cache (~/.azure/)
Kubernetes configuration files (~/.kube/config)
Database passwords and connection strings
.gitconfig (which often contains embedded tokens)
Shell history files (~/.bash_history, ~/.zsh_history)
Crypto wallet files matching common naming patterns
All current environment variables via os.environ
Cloud metadata from instance metadata service (IMDS) endpoints

Stage 2 - Exfiltration. Collected data was not sent in plaintext. The payload encrypted the harvested files using AES-256-CBC with a randomly generated session key, which was then itself encrypted with a hardcoded 4096-bit RSA public key. The result was bundled into a tar archive and POSTed to the attacker's server. Even if someone intercepted the traffic, they could not read it without the attacker's private RSA key.

Stage 3 - Persistence and Lateral Movement. If the malware detected a Kubernetes service account token - indicating the compromised machine was operating within a Kubernetes cluster - it escalated significantly. The payload read all cluster secrets across all namespaces and attempted to create a privileged alpine:latest pod on every node in kube-system. Each pod mounted the host filesystem and installed a persistent backdoor at /root/.config/sysmon/sysmon.py, maintained by a systemd user service. On non-Kubernetes machines, the same backdoor was installed locally.

For organizations running LiteLLM in Kubernetes environments - a common pattern for teams building AI application infrastructure - the potential blast radius extended from a single compromised developer machine to the entire cluster.

The Disclosure Fight: When the Maintainer Account Spammed Its Own Bug Report

Developer at computer screen debugging code

The attacker used the compromised maintainer account to suppress the initial disclosure. (Pexels)

At 11:48 UTC on March 24 - less than an hour after the malicious packages were uploaded - Callum McMahon at FutureSearch opened a public disclosure issue on the LiteLLM GitHub repository. The issue, #24512, laid out exactly what had been found: the malicious .pth file, the decoded payload, the exfiltration infrastructure.

What happened next was a secondary attack. At 12:44 UTC, the issue was flooded with bot comments - hundreds of automated responses designed to dilute and obscure the disclosure thread. At 13:03 UTC, the issue was closed as "not planned" using what appeared to be the compromised maintainer account. The attacker was actively using their access to the LiteLLM GitHub account to suppress awareness of the very attack they had just carried out.

The suppression attempt failed. By 12:36 UTC, the Hacker News community had picked up the disclosure - the thread ultimately reached 324 points. McMahon updated his disclosure post with confirmation that the issue had been closed by the compromised account. A clean tracking issue (#24518) was opened. The broader developer community was already alerting each other on Reddit's r/LocalLLaMA and r/Python communities.

PyPI quarantined both versions at approximately 13:38 UTC - roughly three hours after the first malicious version was published. The litellm maintainers confirmed at 15:09 UTC that all GitHub, Docker, and PyPI credentials had been rotated and the compromised maintainer accounts moved to new identities. By 15:27 UTC, the compromised versions were deleted and the package was unquarantined.

ATTACK TIMELINE - MARCH 24, 2026 (UTC)

10:39 UTC

LiteLLM 1.82.7 published with proxy_server.py payload. Exfil target: checkmarx.zone/raw

10:52 UTC

LiteLLM 1.82.8 published with .pth file attack. Exfil target: models.litellm.cloud

11:48 UTC

FutureSearch opens public disclosure issue #24512 on GitHub

12:36 UTC

Hacker News thread goes live. Reaches 324 points.

12:44 UTC

Attacker floods disclosure issue with bot comments using compromised maintainer account

13:03 UTC

Issue #24512 closed as "not planned" by compromised account

13:38 UTC

PyPI quarantines both versions - 46 minutes after 1.82.8 upload

15:09 UTC

Maintainers confirm full credential rotation, compromised accounts moved to new identities

15:27 UTC

Compromised versions deleted. Package unquarantined on PyPI.

The Blast Radius: 88 Percent of Dependent Packages Were Exposed

The true blast radius extends beyond direct installs to every package that depends on litellm. (Pexels)

In the aftermath, FutureSearch analysts queried the BigQuery public PyPI dataset to calculate the full scope of the exposure. The numbers are unsettling.

2,337 packages on PyPI list litellm as a dependency. Of those, 88 percent - or 2,054 packages - had version specifications that would have allowed their package resolver to pick up 1.82.7 or 1.82.8 during the attack window. Only 283 packages (12%) were pinned to a safe version or had upper bounds that excluded the compromised versions.

The breakdown of exposure by version constraint type reveals how widespread unpinned dependencies are:

283 packages (12%) - No constraint at all (just litellm). Fully exposed.
1,388 packages (59%) - Lower bound only (>=X). Fully exposed.
383 packages (16%) - Upper-bounded but range includes 1.82.x. Exposed.
74 packages (3%) - Upper-bounded, excludes 1.82.x. Safe.
209 packages (9%) - Pinned to exact safe version (==X.Y.Z). Safe.

The analysis also revealed a significant difference between package managers. The malicious versions were downloaded predominantly via pip, while uv users showed stronger protection because lock files prevented resolvers from picking up newer versions mid-build. 1.82.8 was downloaded six times more than any safe version in the attack window - almost entirely through pip resolvers picking the latest available version.

It is worth emphasizing that the analysis covers direct dependencies only. The true blast radius is larger. If a popular framework that is itself unpinned happens to pull in litellm as a transitive dependency, every package that depends on that framework was also exposed. Transitive dependency analysis at scale remains an unsolved problem in the ecosystem.

The .pth File Attack: A Python Feature Almost Nobody Thinks About

Python's .pth mechanism has existed for decades and is rarely discussed as a security risk. (Pexels)

The technical choice to use a .pth file in 1.82.8 deserves specific attention, because it represents a meaningful escalation in attack sophistication and a vector that is poorly understood even among experienced developers.

Python's site-packages directory supports .pth files as a mechanism for extending the Python path - historically used to make packages available across multiple environments. The Python documentation notes that .pth files can contain executable Python code on lines starting with import, but the practical implication of this is rarely discussed in security contexts: any .pth file in site-packages runs before any application code, before the interpreter is considered "initialized," before anything in the user's control.

The MITRE ATT&CK framework classifies this technique as T1546.018 - "Python Startup Hooks." It is not a new technique. It has been demonstrated in academic security research for years. But the LiteLLM attack is one of the first high-profile real-world deployments of it in a supply chain context, and the scale of the deployment makes it significant.

The irony of the fork bomb is instructive. The attacker chose .pth specifically because it runs during installation - they wanted the payload to execute even in CI/CD environments that never actually ran the application. They got exactly that. But the recursive subprocess spawning bug caused the machine to become unresponsive in a way that was impossible to miss. Had the malware been written more carefully - spawning only a single non-recursive subprocess - it might have silently collected and exfiltrated credentials from tens of thousands of machines without a single alarm going off.

"Every pip install of 1.82.8 executed the payload. The .pth file ran during installation itself. The 23,142 pip installs of 1.82.8 represent 23,142 environments where the malware ran before any application code." - FutureSearch Security Analysis, March 25, 2026

The lesson for developers building CI/CD pipelines is stark: installing a malicious package is now sufficient for compromise. Running it is not required.

TeamPCP and the Broader Supply Chain Campaign

TeamPCP has been active across multiple open source projects. (Pexels)

TeamPCP is not new to the threat intelligence community. The group, tracked under multiple aliases (PCPcat, Persy_PCP, ShellForce, DeadCatx3), has been active across several open source ecosystems. Snyk's research into the LiteLLM attack identified the group as responsible for the prior Trivy compromise and traced the infrastructure connections - the use of checkmarx.zone as an exfiltration domain mirroring the name of a legitimate security vendor, the same credential-harvesting payload structure appearing across multiple attacks.

The Trivy compromise on March 19 was itself preceded by the Checkmarx KICS GitHub Action being compromised on March 23, with C2 domains registered the same day. The sequencing shows deliberate preparation: establish persistence in a widely trusted security tool, use that foothold to compromise CI/CD pipelines of high-value targets, extract package publishing credentials, then deploy malicious package versions timed for maximum download volume.

The timing within the day - uploading at 10:52 UTC on a Monday - is not accidental. That is the peak of the North American developer workday overlap with European working hours. CI/CD pipelines rebuild most frequently at the start of the work week. The attack window was chosen to maximize the number of automated installs that would pull the latest version.

The use of the compromised maintainer account to close the disclosure issue on GitHub is a detail that elevates this beyond a simple credential theft operation. The attackers were actively monitoring for discovery and had a suppression plan ready to execute. This is not script kiddie behavior - this is an organized team with pre-planned incident response on behalf of their own attack.

The Second-Order Effects: Who Actually Got Hit

AI developer working on cloud infrastructure laptop

The LiteLLM attack specifically targeted AI infrastructure teams building on cloud platforms. (Pexels)

The choice of LiteLLM as a target was not random. LiteLLM is disproportionately used by exactly the type of developer who has the most valuable credentials: engineers building AI applications on cloud infrastructure, often with access to expensive API keys, production Kubernetes clusters, and multi-cloud environments.

A developer building an AI application with LiteLLM almost certainly has OpenAI API keys, Anthropic API keys, and possibly Google Vertex or AWS Bedrock credentials configured in their environment. The same machine likely has ~/.aws/credentials with production-level access, SSH keys to development and staging servers, and Kubernetes configs granting cluster-admin access to cloud environments burning significant compute budgets.

The attacker's target list was precisely optimized for this population. Crypto wallets were an added bonus - developers in the AI space disproportionately hold crypto assets. Shell history is uniquely valuable for understanding what internal systems exist and what credentials might be embedded in commands typed at the terminal.

For organizations that had LiteLLM running in Kubernetes environments, the Kubernetes worm component represents the most severe potential impact. A compromised developer machine in a Kubernetes cluster - even one with limited RBAC permissions - can often read service account tokens that grant broader cluster access. The payload's attempt to enumerate all cluster secrets across all namespaces and plant persistent pods in kube-system represents a potential full cluster takeover for any environment where the compromised account had sufficient RBAC permissions.

How many organizations actually experienced full credential exfiltration remains unknown. The 46,996 download count represents environments where the installer ran. Not all of those environments would have had valuable credentials to steal - many downloads occur in clean CI/CD containers with no persistent secrets. But even a single-digit percentage of those installs resulting in successful credential exfiltration from production environments represents a significant data breach by any measure.

What Needs to Change - and What Won't

Security best practices developer workflow

The LiteLLM attack exposes fundamental structural problems with how the Python ecosystem handles trust. (Pexels)

The LiteLLM attack was discovered not because of any security control that was supposed to catch it, but because a developer noticed their laptop becoming unresponsive and traced it to a newly installed Python package. That is not a detection system. That is luck.

PyPI quarantined the package in approximately 46 minutes once the disclosure was posted - but 46,996 downloads had already happened. The package was available for a total of roughly three hours before being yanked. PyPI has improved its incident response capabilities significantly in recent years, but the fundamental problem remains: the barrier to publishing a new version of an existing package is intentionally low, and once a malicious version is published, the window before it is detected and removed is measured in minutes to hours, while the window during which automated systems pull it is nearly instantaneous.

Several structural changes would reduce the attack surface. Lock files are the most immediately actionable: uv.lock, poetry.lock, and pip-compile output prevent resolvers from picking up new versions without explicit developer action. The analysis showed that uv users were significantly more protected during the attack window. This is not a new recommendation, but the LiteLLM attack quantifies exactly how much protection lock files provide.

CI/CD pipeline permissions need hardening. The Trivy compromise worked because the pull_request_target workflow ran in the context of the target repository, giving it access to repository secrets. GitHub's own documentation warns about this exact pattern. The fix is not exotic - it requires explicitly denying secret access in workflows triggered by external pull requests. But many open source projects have not applied these controls.

The .pth attack vector needs broader awareness. Python could theoretically sandox .pth file execution or restrict it to explicit allowlists, but any such change would break legitimate tooling that has relied on this mechanism for years. The more realistic near-term fix is for security scanning tools - including Trivy, which was itself compromised in this attack - to flag unexpected .pth files in installed packages.

Package signing and provenance attestation, which PyPI has been rolling out incrementally, would not have prevented this attack. The attacker had legitimate publisher credentials. Signed packages from a compromised account are still malicious packages. The harder problem is detecting when publisher credentials themselves have been stolen.

What will likely happen: some developers will pin their dependencies more carefully for a few months. Security teams at large organizations will do incident response reviews. The PyPI ecosystem will continue to grow more complex and interdependent. The next supply chain attack on a widely used AI infrastructure package will not announce itself with a fork bomb.

IF YOU INSTALLED LITELLM ON OR AFTER MARCH 24, 2026

Check for compromise immediately:

Run pip show litellm - if version is 1.82.7 or 1.82.8, you were affected
Check for the malicious .pth file: find ~/.cache/uv -name "litellm_init.pth"
Check for persistence: ls -la ~/.config/sysmon/sysmon.py
In Kubernetes: kubectl get pods -n kube-system | grep node-setup
Rotate ALL credentials on affected machines: SSH keys, AWS/GCP/Azure creds, API keys, database passwords, crypto wallet keys
Audit Kubernetes cluster secrets for unauthorized access if running in a cluster

The Irony That Won't Leave You

Trivy is a security scanner. Its job is to find vulnerabilities in software. Its CI/CD pipeline was compromised to attack a security-conscious developer community building AI infrastructure. The attacker used a security tool to defeat security - and the disclosure issue on GitHub was closed by the very account it was supposed to be reporting to.

LiteLLM is a library that manages API keys for AI models. Its job is to handle credentials securely across providers. The attack specifically targeted those credentials - the OpenAI keys, the Anthropic keys, the cloud provider tokens - because that is exactly what a developer using LiteLLM would have configured.

The AI development ecosystem has grown at a pace that has outrun its security infrastructure. New libraries appear weekly. Dependency trees deepen faster than anyone audits them. GitHub Actions workflows are forked and deployed without careful review of the permissions they request. This is not unique to AI development - it is the condition of modern open source software at scale - but the AI ecosystem's particular combination of valuable credentials, cloud infrastructure, and rapid adoption makes it a high-value target for exactly the kind of patient, multi-stage supply chain operation that TeamPCP ran against LiteLLM.

The 46-minute window is the headline number. The real number is however many organizations rotated all their credentials after checking whether they were affected - versus however many organizations simply updated to 1.82.9 and moved on. That second group still has compromised machines. They just don't know it yet.

Get BLACKWIRE reports first.

Breaking news, investigations, and analysis - straight to your phone.

Join @blackwirenews on Telegram