What is malicious code? Examples, how it spreads, and how to stop it

The Chainguard Team
AppSec, Software Supply Chain
Key Takeaways
  • Malicious code arrives through trusted channels — signed packages, verified registries, compromised maintainers — not obvious intrusions.

  • Scanners miss threats with no CVE, obfuscated payloads, or binary-source mismatches — structural prevention is required upstream.

  • Remove tools like curl and wget from containers, enforce least privilege, and replace individual trust decisions with enforced policy.

  • Chainguard rebuilds every package from verified source, ships near-zero CVE images, and enforces supply chain policy before code reaches developers.

Malicious code: what it is, how it spreads, and how to stop it

Modern cyberattacks don't need to break in anymore; like an old-world vampire, they're simply invited inside. They arrive signed by legitimate maintainers, published through verified registries, and wrapped in code that passes review, because at every layer the tools meant to catch them are designed to trust exactly what the attacker delivered.

In early 2024, a Microsoft engineer noticed that SSH logins on a test system were taking a few hundred milliseconds longer than they should. That anomaly turned out to be CVE-2024-3094, a backdoor quietly inserted into XZ Utils, a compression library bundled with nearly every Linux distribution. The attacker had spent two years contributing useful code to the project before earning the maintainer access needed to plant it. It passed code review, arrived through a trusted channel, and was only caught because one engineer noticed something felt off during unrelated work. The earlier SolarWinds Sunburst cyberattack ran a similar playbook on a larger scale, distributing malicious code to roughly 18,000 organizations through a poisoned build pipeline. Supply chain compromises like SolarWinds are why the White House issued Executive Order 14028 in 2021, which calls for detailed manifests of software components, known as SBOMs, across federal software procurement.

The industry's default response is to "shift left" by scanning for every potential cyber threat earlier in the development cycle. Chainguard's position is that scanning is downstream of the real problem. The alternative is to start left: control the supply chain itself so that verified artifacts arrive before vulnerabilities can appear. This article covers what malicious code is, how it operates, and what prevention against malicious code attacks actually looks like.

Let's define malicious code

Malicious code is code intentionally written or modified to cause harm. A few distinctions matter here.

Malware refers to complete, standalone malicious software applications, such as ransomware or spyware. Malicious code is broader: it can be a package, a function, a dependency, or a single string inserted into an otherwise legitimate codebase. The XZ Utils backdoor was not a separate application. It was a few hundred lines embedded in a widely trusted open-source library.

Poorly written code creates vulnerabilities too, but those are accidents. Malicious code is deliberate, placed by hackers, cybercriminals, or threat actors specifically to create a pathway into systems that would otherwise be out of reach. A well-placed dependency in a popular package can create vulnerabilities across thousands of downstream applications at once.

What are some examples of malicious code?

XZ Utils (2024): A threat actor spent roughly two years contributing legitimate code to XZ Utils before gaining maintainer access and inserting a backdoor targeting OpenSSH on systemd-based Linux systems. It was discovered almost by accident when a Microsoft engineer noticed unusually high CPU usage and slower SSH logins during benchmarking.

LiteLLM/Trivy (2026): In March 2026, two compromised versions of LiteLLM (1.82.7 and 1.82.8) were uploaded directly to PyPI, bypassing the project's official CI/CD workflows entirely. LiteLLM is a widely used LLM framework that integrates with models hosted on platforms like Hugging Face, making it a high-value target: a credential stealer embedded in it can reach many sensitive environments, including those with access to model training datasets and cloud infrastructure. The malicious payload was designed to harvest environment variables, SSH keys, cloud provider credentials, Kubernetes tokens, and database passwords, then exfiltrate them to an attacker-controlled domain mimicking an official LiteLLM URL. The compromise originated from a separate supply chain attack on Trivy, in which stolen credentials were reportedly used to gain unauthorized access to the LiteLLM publishing pipeline. One supply chain breach cascaded into another.

Malicious Axios versions on npm: Chainguard researchers identified malicious Axios versions published to npm that mimic the popular HTTP client. Chainguard customers were protected because Chainguard Libraries rebuilds packages from verified source rather than mirroring the public registry.

The pattern is consistent across all three incidents: the threat arrived through a trusted channel, stayed quiet during development, and activated only in production.

Types of malicious code

Beyond the supply chain, though, malicious code takes many forms. Here's a reference for the categories worth knowing:

| Type | How It Spreads / Acts | Primary Goal | Key Characteristic |
| --- | --- | --- | --- |
| Virus | Attaches to a legitimate file or program | Corrupt data, delete files, crash systems | Generally requires human action (like opening an email attachment) to spread |
| Worm | Self-replicating across a network | Spread to as many nodes as possible | Does not require human interaction to move between computer systems |
| Trojan Horse | Disguises itself as legitimate software | Create a backdoor for attackers | Relies on social engineering or phishing to trick users into installing it |
| Ransomware | Encrypts files or locks the system | Extort payment for a decryption key | Often threatens to leak sensitive data if the ransom isn't paid |
| Spyware | Runs silently in the background | Steal sensitive information and login credentials | Designed to stay undetected as long as possible |
| Adware | Delivers unwanted advertisements | Generate revenue through forced views or clicks | Often delivers pop-ups and is bundled with free software |
| Rootkit (Unauthorized Access) | Gains root access to the operating system | Hide other malware, maintain long-term access | Designed to be invisible to antivirus software |
| Keylogger | Records keystrokes made by the user | Capture login credentials, intercept private messages | Can be hardware-based (USB) or software-based (malicious script) |
| Logic Bomb | Lies dormant until a specific trigger is met | Sabotage or data destruction | Usually planted by an insider; triggered by a date or system event |

How does malicious code get into systems?

Understanding attack vectors matters because most conventional security tools are designed to catch threats after they've already arrived. Supply chain attacks succeed by exploiting channels that those tools inherently trust.

Typosquatting and dependency confusion. Attackers publish packages with names nearly identical to popular ones, or trick build systems into pulling a malicious public package instead of an intended internal one. Both attacks succeed because package managers resolve dependencies automatically, often without human review.
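To make the typosquatting half of this concrete, here is a minimal sketch of a lookalike-name check. It assumes a team-maintained list of intended dependencies; `KNOWN_PACKAGES`, `find_lookalikes`, and the 0.85 similarity threshold are all illustrative choices, not part of any registry or Chainguard tooling.

```python
from difflib import SequenceMatcher

# Hypothetical allowlist of packages the team actually intends to depend on.
KNOWN_PACKAGES = ["requests", "numpy", "litellm", "urllib3"]


def find_lookalikes(candidate, known=KNOWN_PACKAGES, threshold=0.85):
    """Return known names that `candidate` closely resembles without matching exactly."""
    name = candidate.lower().replace("_", "-")
    if name in known:
        return []  # exact match: this is the intended package
    return [
        k for k in known
        if SequenceMatcher(None, name, k).ratio() >= threshold
    ]


print(find_lookalikes("requets"))   # ['requests']
print(find_lookalikes("requests"))  # []
```

A check like this only catches near-misses against names you already know. It does nothing for dependency confusion, where the malicious package has exactly the expected name and the problem is which registry it resolves from.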

Compromised maintainer accounts. When a trusted maintainer's account is compromised (as in the LiteLLM case, where credentials stolen via the Trivy attack were used to publish malicious packages directly to PyPI), the resulting packages bypass authentication checks and appear through legitimate channels. This often starts with a compromised GitHub account or stolen publishing credentials. A policy engine that trusts the registry has no basis to flag them.

Living off the land. Once threat actors gain access to a container or build environment, they typically don't bring their own tools. They use whatever's already there: curl, wget, python, and standard utilities present in most base images. From there, targets include databases, internal APIs, and secrets stores. This makes malicious scripts harder to distinguish from legitimate activity, since they use legitimate software.

Alert fatigue. Standard images often ship with hundreds of known vulnerabilities. Security teams become desensitized to the constant stream of "High" and "Critical" findings, allowing a real malicious injection to hide behind CVE noise.

Chainguard's build-from-source approach addresses each of these directly. Chainguard Libraries rebuilds every package in an SLSA Level 2-compliant environment, so tampered binaries never reach a developer. Chainguard Containers are minimal by design: when curl and wget are absent, attackers cannot use them. And because Chainguard Containers ship with near-zero CVEs, any real anomalies stand out rather than get buried.

Why scanners can't solve this problem alone

Scanning is not useless. But it's worth being clear about what scanning actually means, because there's a gap most teams don't realize exists.

The most fundamental issue is the binary-source mismatch problem. When a published binary (the actual software artifact you install and run) doesn't match its source repository, a source-code scanner will never find the injected payload, because the payload exists only in the compiled binary. Most security tools don't verify that the binary and source code are consistent because they assume the registry can be trusted.
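One way to picture the missing check is a digest comparison between the artifact a registry serves and an artifact rebuilt locally from the tagged source. The sketch below assumes the build is bit-for-bit reproducible, which many real builds are not; the function names are hypothetical, and this is not a description of how Chainguard Factory works internally.

```python
import hashlib
from pathlib import Path


def sha256_of(path):
    """Stream a file through SHA-256 so large artifacts do not need to fit in memory."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_rebuild(published_artifact, rebuilt_artifact):
    """True only if the registry's artifact is byte-identical to a local rebuild from source."""
    return sha256_of(published_artifact) == sha256_of(rebuilt_artifact)
```

When builds embed timestamps or other nondeterministic data, a mismatch is a prompt to investigate rather than proof of tampering.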

The second issue: malicious packages that are never assigned a CVE pass through every scanner and patcher completely undetected. The malicious LiteLLM versions were live on PyPI for roughly 40 minutes before being quarantined. No CVE was issued. Any team that pulled the package during that window had nothing for a scanner to match against.

Third, payloads encoded as base64 strings or buried inside post-install hooks are designed to defeat static analysis. Scanners work by matching known patterns. Novel obfuscation, by definition, doesn't match any of them.
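The limits of pattern matching are easier to see with a toy detector in hand. The sketch below flags long base64 runs whose decoded bytes contain a few telltale markers. The 40-character cutoff and the marker list are arbitrary assumptions for illustration, and any attacker who splits, double-encodes, or encrypts the payload walks straight past it.

```python
import base64
import binascii
import re

# Runs of 40+ base64 characters; the length cutoff is an arbitrary assumption.
BASE64_RUN = re.compile(r"[A-Za-z0-9+/]{40,}={0,2}")
# Illustrative markers only; real payloads vary widely.
SUSPICIOUS = (b"exec(", b"eval(", b"subprocess", b"os.system", b"curl ", b"wget ")


def flag_encoded_payloads(source_text):
    """Return decoded base64 blobs in `source_text` that contain a suspicious marker."""
    findings = []
    for match in BASE64_RUN.finditer(source_text):
        blob = match.group(0)
        try:
            decoded = base64.b64decode(blob, validate=True)
        except (binascii.Error, ValueError):
            continue  # looked like base64 but does not decode cleanly
        if any(marker in decoded for marker in SUSPICIOUS):
            findings.append(decoded)
    return findings
```

This is exactly the kind of known-pattern check the paragraph above describes: useful against lazy obfuscation, blind to anything novel.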

Scanners like Snyk, Wiz, and Grype provide real value for real-time CVE visibility. Policy engines like Sonatype Firewall and JFrog Curation add useful governance. Patchers like Seal Security and HeroDevs address known issues reactively. None of them verifies that a binary matches its source. Chainguard Factory rebuilds every package from a verified source before it reaches any developer, eliminating binary-source mismatches as a structural matter rather than trying to detect them after the fact.

The deeper problem: public registries run on inherited trust

Most cybersecurity discussions focus on the malicious code itself. The harder problem is the distribution layer that delivers it. PyPI, npm, and Maven Central all operate on a trust-by-default model: anyone can publish, binaries aren't verified against source, and maintainer accounts can be transferred or compromised without any formal review process. AI-driven development is making this worse: as AI tools generate more code and pull in more dependencies at speed, the attack surface for supply chain security — including the AI supply chain — expands faster than most risk management frameworks are built to handle.

Over 512,000 malicious packages were discovered in the past year alone, roughly one every minute, a 2.5x year-over-year increase. Critically dangerous packages jumped approximately 4x quarter over quarter. A policy engine layered on PyPI still pulls from PyPI. A scanner running on an npm package still trusts that the package is what it claims to be. The registry's provenance problem does not disappear because governance has been added on top.

Chainguard Repository adds policy enforcement on top of a verified distribution model. Policy enforcement here means rules applied at the point of distribution: configurable cooldown periods that block 47% of malicious packages before they can be installed, CVE blocking that prevents packages above a defined severity threshold from resolving, and license controls that flag or reject packages that don't meet organizational requirements. Developers pull from a verified endpoint. Rogue packages from public registries are simply not invited in.
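The three rule types described here (cooldown, CVE severity, license) can be sketched as a small policy function. Everything below is a hypothetical illustration: the `PackageRelease` and `Policy` types, the 7-day cooldown, the CVSS 7.0 threshold, and the license list are assumptions, not Chainguard Repository's actual configuration model.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class PackageRelease:
    name: str
    version: str
    published_at: datetime
    max_cvss: float  # highest CVSS score among known CVEs, 0.0 if none
    license: str


@dataclass
class Policy:
    cooldown: timedelta = timedelta(days=7)
    cvss_block_threshold: float = 7.0
    allowed_licenses: tuple = ("MIT", "Apache-2.0", "BSD-3-Clause")


def evaluate(release, policy, now=None):
    """Return a list of reasons to block; an empty list means the release may resolve."""
    now = now or datetime.now(timezone.utc)
    reasons = []
    if now - release.published_at < policy.cooldown:
        reasons.append("cooldown")
    if release.max_cvss >= policy.cvss_block_threshold:
        reasons.append("cve-severity")
    if release.license not in policy.allowed_licenses:
        reasons.append("license")
    return reasons
```

Under this sketch, a release published 40 minutes ago with no CVE and a permissive license still returns `["cooldown"]`, which is the case a CVE-matching scanner has nothing to say about.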

Practical steps for protecting against malicious code

Adopt a minimalist container strategy

When tools like curl, wget, or python are absent from a container, malicious scripts that depend on them are dead on arrival. Chainguard Containers provide 1,300+ minimal, zero-CVE images rebuilt from source daily, with 97.6% fewer vulnerabilities than industry alternatives and a contractual SLA for patching.
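A quick way to see how much an image offers an attacker is to check which of these tools resolve on the search path. This is a rough sketch meant to run inside the container under test. `RISKY_TOOLS` and `present_tools` are illustrative names, and the check itself needs a Python interpreter, so on a truly minimal image you would run an equivalent check from outside.

```python
import shutil

# Tools the article names as commonly abused; extend as needed.
RISKY_TOOLS = ("curl", "wget", "python", "python3")


def present_tools(tools=RISKY_TOOLS, path=None):
    """Return the subset of `tools` that resolve to an executable on `path` (default: $PATH)."""
    return [tool for tool in tools if shutil.which(tool, path=path) is not None]


if __name__ == "__main__":
    found = present_tools()
    print("tools available to an attacker:", found or "none")
```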

Implement automated vulnerability management

The time between CVE disclosure and patch deployment is exactly when attackers move. Chainguard OS and the Chainguard Factory automate this: every package is tracked upstream, rebuilt on updates, and tested before distribution. Anduril went from an unmanageable CVE backlog to zero known vulnerabilities, freeing engineers to focus on building features.

Verify your source with SBOMs and signatures

Code that looks safe is not the same as code proven to be tamper-free. Every Chainguard artifact ships with Sigstore signatures, SLSA Level 2 provenance, and a complete SBOM. Chainguard Repository enforces this at the distribution layer and integrates with JFrog Artifactory, Sonatype Nexus, and Cloudsmith without workflow disruption.
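As a small illustration of what an SBOM makes checkable, the sketch below compares a set of installed packages against the components declared in a CycloneDX-style JSON document (a top-level `components` array with `name` and `version` fields). The function name and the `installed` mapping are hypothetical. The sketch does not verify any signature, which would be done with Sigstore tooling before trusting the SBOM at all.

```python
import json


def unlisted_packages(sbom_json, installed):
    """Return installed (name, version) pairs that the SBOM does not declare.

    `sbom_json` is a CycloneDX-style JSON string; `installed` maps name -> version.
    """
    sbom = json.loads(sbom_json)
    declared = {
        (component.get("name", "").lower(), component.get("version"))
        for component in sbom.get("components", [])
    }
    return sorted(
        (name, version)
        for name, version in installed.items()
        if (name.lower(), version) not in declared
    )
```

Anything this returns is software running in the environment that the manifest does not account for, which is a reasonable place to start asking questions.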

Enforce the principle of least privilege

Applications should not run as root unless absolutely necessary. When malicious code executes as a non-privileged user inside a read-only file system, its ability to establish persistence or perform lateral movement is severely constrained. Scope permissions at the application, container, and infrastructure levels.
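A startup guard is one lightweight way to catch a container that has drifted away from these settings. The following is a minimal sketch for POSIX systems. `privilege_problems` is a hypothetical helper, and probing a single directory with `os.access` is only a rough stand-in for confirming a read-only root file system.

```python
import os


def privilege_problems(euid=None, probe_dir="/"):
    """Return reasons this process looks over-privileged; an empty list means it looks constrained."""
    euid = os.geteuid() if euid is None else euid  # os.geteuid is POSIX-only
    problems = []
    if euid == 0:
        problems.append("running as root")
    if os.access(probe_dir, os.W_OK):
        problems.append(f"{probe_dir} is writable")
    return problems


if __name__ == "__main__":
    issues = privilege_problems()
    if issues:
        raise SystemExit("refusing to start: " + "; ".join(issues))
```

Failing fast at startup turns a silent misconfiguration into a visible deployment error, though the real enforcement still belongs in the container runtime and orchestrator settings.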

Replace individual trust decisions with policy

Security training matters, but it doesn't scale. Individual developers making trust decisions on every upstream dependency is not a sustainable application security model, and it's increasingly untenable as AI tools accelerate how quickly code is written and dependencies are pulled in. Chainguard vets every package before it reaches the pipeline. Chainguard Actions extends this to CI/CD with secure-by-default workflows that prevent tag hijacking, dependency confusion, and secret exfiltration. Switching requires a 20-character configuration change.

What end-to-end supply chain verification actually looks like

Traditional workflow. A developer runs pip install and pulls from PyPI. A scanner runs post-install. If nothing is flagged, the package ships. The binary is never verified against the source.

Verified workflow with Chainguard. Teams point their package manager at Chainguard Repository instead of PyPI directly — a one-time configuration change. From that point on, every pip install resolves through the verified distribution layer rather than the public registry. Chainguard Factory, an AI-powered build system, detects upstream changes, triggers a clean rebuild from a verified source, runs compatibility tests, signs the artifact with Sigstore, generates an SBOM, and distributes it through Chainguard Repository with policy enforcement applied. The developer runs pip install and gets a verified package. Same command, entirely different provenance.

Each layer of the Chainguard stack maps to a specific part of this problem: Chainguard Libraries for dependency security (98%+ of ecosystem malware prevented, drop-in compatible with pip, npm, and mvn), Chainguard Containers for minimal images, Chainguard OS and Chainguard Factory for the build system, Chainguard Repository for policy enforcement, and Chainguard Actions for CI/CD security.

Anduril reduced its CVE backlog to zero. Appian achieved FedRAMP ATO in months rather than its original estimate of over a year. GitGuardian achieved a 100% reduction in container CVEs. The vampire doesn't get invited in if you control who's at the door.

Get a demo of Chainguard Libraries today to see how Chainguard prevents malicious code from entering your software supply chain.
