I triaged this vulnerability from Astral's side, so I wanted to make a clarificatory point that's also present in the advisory[1]: parser differentials can be extremely bad, but the risk to the Python ecosystem in this particular case is moderated by the fact that tar is only used with source distributions, which already possess arbitrary code execution at resolution/install time by design.
In other words: this is an obfuscation vector within Python packaging, but it doesn't grant the attacker a novel privilege.
(This doesn't detract from the overall severity of the bug itself: there are plenty of ecosystems and contexts where this is a serious issue, and there's no easy way to assert that they aren't affected by the bug in this family of async-tar packages. Edera has done an excellent job of highlighting this, and I thank them for their disclosure!)
Package management systems are scary before packages are abandoned too. Your production infrastructure is trusting some random developer/s to both do the right thing and not get hacked.
That’s not to say OSS cannot be trusted, but it certainly makes trusting smaller projects and packages scary.
There are plenty of open source things from Google and Microsoft that have been abandoned too, so you'd need to evaluate the project independently of the sponsor.
This doesn't apply to closed source things because you wouldn't be able to use them in the first place.
Sorry, I should have clarified that I was referring to language-based systems (cargo, pip, npm, etc). But you do raise a good point: it’s less about the concept of package management and more about curation and central security guarantees / policies / procedures. In theory, the RHEL package management system could have similar problems to cargo or npm, but it is much better funded and thus better managed.
In practice, not principle. Virtually every non-trivial upstream package in debian/fedora/arch/whatever has at least a handful of distro-specific patches. Sometimes they're just configuration, sometimes they're distro-maintained security fixes, etc...
But people exercise those features regularly and distros are not shy about maintaining software. It's a very different world from "We Just Ship What They Give Us" in npm/cargo/etc...
I really hate it when various packages expect users to add their custom repo. Especially for something where I don’t care about updates.
Feels like every little thing should be in its own docker container with limited filesystem access. Of course that is a whole lot of trouble…
The dependency trees in cargo/pip also greatly bother me.
VS Code extensions are also an underappreciated risk. Some turd makes a “starter pack” for rust/python/etc with a great set of common extensions… plus a few that nobody has heard of… Over time, they reach 50k-100k downloads and start to appear legit… Excellent way to exfiltrate trade secrets!!!
I think that would be pretty disruptive, and would break some assumptions around crate integrity that are deeply held.
My understanding is that the left-pad incident is not directly analogous, since it involved restoring a deleted package rather than modifying an extant package.
To the best of my knowledge, nobody has ever seriously claimed that Rust (or any other general purpose programming language) can fully prevent logic errors.
Rust's advantage is that it can prevent logic errors from becoming memory safety vulnerabilities (and separately, its type system makes some - but not all - classes of logic errors more difficult to introduce).
This doesn't appear to be a memory safety bug. It's a data handling error, and the "RCE" in question is that the tar code can be fooled by a malicious tarball into writing files with arbitrary permissions at arbitrary paths (which is... actually something all tarballs can do, so I'm not really following why this is being treated as severe).
But to your point: yes, it's a good example about how security bugs live at all layers of the stack and that being checked against memory corruption does nothing to prevent you from writing bugs in the semantic space.
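On the "arbitrary paths" point, here is a minimal sketch using only Python's standard library. It shows that the tar format itself happily stores a member name containing `../`, so it is the extractor's job to reject it. The helper `escapes_destination` and the `/tmp/unpack` destination are hypothetical names made up for illustration; this is not how tokio-tar or any particular tool does its check.

```python
import io
import os
import tarfile


def escapes_destination(member_name, dest):
    """Return True if extracting member_name under dest would land outside dest."""
    dest_root = os.path.realpath(dest)
    target = os.path.realpath(os.path.join(dest_root, member_name))
    return os.path.commonpath([dest_root, target]) != dest_root


# Build a tiny archive in memory: one ordinary member, one traversal attempt.
buf = io.BytesIO()
with tarfile.open(fileobj=buf, mode="w") as tar:
    for name in ("pkg/setup.py", "../../etc/cron.d/evil"):
        info = tarfile.TarInfo(name=name)
        info.mode = 0o777  # the header can also carry arbitrary permission bits
        tar.addfile(info)  # zero-length regular file

# Read it back and flag members that would escape the (hypothetical) destination.
buf.seek(0)
with tarfile.open(fileobj=buf, mode="r:") as tar:
    flagged = [m.name for m in tar.getmembers()
               if escapes_destination(m.name, "/tmp/unpack")]

print(flagged)  # ['../../etc/cron.d/evil']
```

So "any tarball can ask for this" is true at the format level; whether the file actually gets written there depends on the checks the extractor applies.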
I personally suspect it's an effect of the over-reliance on the package manager approach to software development that Rust and a few other languages use, which itself is an unintended consequence of a well-designed library import system.
In languages where importing a library is hard, libraries tend to grow quite large. Large libraries have larger backing and more established development and security protocols. When OpenCV, TinyUSB, NumPy, or NimBLE start to struggle, it's easier to notice, and companies relying on them may step up to fork, maintain, or fund their continued use.
In languages where importing and creating a library is easy, we see small atomic packages for small utility programs rather than large libraries. This spreads the software supply chain wider, across smaller teams of maintainers. If the same amount of code is fractured over 50 small libraries maintained by 1-3 people each, the likelihood of one or two becoming abandoned grows.
I've been a bit wary of the dependency and package manager approach more modern languages use. It trades convenience for some pretty scary supply-chain attacks.
Newer languages have made packaging and importing dependencies significantly easier, but have done this while increasing coupling and making switching dependencies harder. This results in brittle dependency trees.
Using a private package manager, intermixing private and public, and substituting arbitrary dependencies with compatible alternatives, i.e. modularity, should be easy. Only then does solving the problem become easy.
What we used to have is a big ball of mud. Modern languages made it easier to decompose a system into components. But what we really want is to easily decompose a system into modular components, which is as yet unsolved.
I suspect it's because a lot of Rust library projects are just clones of a project (usually in C) translated to Rust. This is a good way to learn Rust. However, after you learn it, why maintain it? There isn't really much incentive. Add to that the politics of the Rust ecosystem and it causes churn in the dev population.
I suspect there are other reasons too. There is a cost to fad languages being used. Replicating the ecosystem of libraries around a language is a huge job. It's rare that a language ever gets an ecosystem of the same size and quality as, say, C or Java. But the fans of the language will try. This leads to a lot of ported projects and a small number of devs maintaining a huge number of projects. That's a recipe for abandonware. I suspect there are a lot of student projects too, which are likely to end up the same way.
I think the null hypothesis would be that they’re no more common in Rust, but that Rust’s low-friction packaging ecosystem makes them more apparent to you than they would be in C or C++.
(Think about the last time you checked whether the stack of GNU libraries on your Linux desktop were actively maintained. I don’t think anybody thinks about it too hard, because the ecosystem discourages thinking about it!)
Compared to C, Rust makes it easy to push stuff to a place where people will find it. That's the major difference. With C, people push their half-finished projects to GitHub where they drown in GitHub's poor discovery; in Rust they push their half-finished stuff to crates.io where people have more than half a chance to actually find it via a casual search.
3. rust is widely used by a lot of people, including juniors who don't know (yet) that it can be quite a pain to maintain a package and that it comes with some responsibility
4. so small hobby projects now can very easily become widely used dependencies as people looked at them and found them to have decent quality
5. currently "flat" package structure (i.e. no prefixes/grouping by org/project); there have been discussions about improving on it for a long time, but it's not quite here yet. This matters as e.g. all "official" tokio packages are named tokio-*, but that's also true for most 3rd party packages "specifically made for tokio". So `tokio-tar` is what you would expect the official "tokio" tar package to be named _if there were one_.
---
now the core problem of many unmaintained packages isn't that rust specific
it's just that rust is currently a common go-to for young developers not yet burned by maintaining a package, and it's super easy to publish
on the other hand some of the previous "popular early-career go-to languages" either didn't have a single "official" repo (Java) or packaging was/is quite a pain (python). Though you can find a lot of unmaintained packages in npm too, it's just so much easier to write clean, decent-looking code in rust that it's more likely that you use one in rust than in JS.
Optimistically because the component was considered self-contained, and done?
If you build things with wires, diodes, multiplexers, breakers, fuses and keyed connectors there's less maintenance needed than if you try and build a system entirely out of transistors and manually applied insulators.
I haven't looked at the package itself, but was it built on top of the C libraries with like, bindgen?
e: a glance suggests that's not the case, but perhaps they were ported naively by simply cloning the structure without looking at what it was implementing? that's definitely the path of least resistance for this type of thing. On top of that the spec itself is apparently in POSIX, some parts of which are, well, spotty compared to RFCs
1. Radical new paradigm that critiques and disregards most of the traditional infrastructure.
2. Completely FOSS, barely any salaried devs, if any they are donation based.
3. Culture for code "reuse" instead of actually coding. Everyone wants it in their own flavour (we have tar, but I kinda want async-oop-tar)
4. Cognitive dissonance between 3 and 1, rusties don't want to succumb and use a standard tar library because of performance (self inflicted performance hit from creating an incompatible ecosystem) or pride (we need a version written in rust). All of this to download software that is probably written in C and from another ecosystem anyways. (An encoding/compression is a signature and tarballs are signature CLinux)
> rusties don't want to succumb and use a standard tar library because of performance (self inflicted performance hit from creating an incompatible ecosystem) or pride (we need a version written in rust).
Pure BS. If I wrote something in Rust rather than a binding, it was because using C libs (which are often Linux-based) on all Tier 1 platforms is about as smooth a process as swimming in shards of glass.
Hi! Could you elaborate on the first attack scenario?
> Target: Python package managers using tokio-tar (e.g., uv). An attacker uploads a malicious package to PyPI. The package's outer TAR contains a legitimate pyproject.toml, but the hidden inner TAR contains a malicious one that hijacks the build backend. During package installation, the malicious config overwrites the legitimate one, leading to RCE on developer machines and CI systems.
It seems to imply that you’re already installing a package uploaded by a malicious entity. Is the vulnerable workflow something like “you manually download the package archive, unpack it with system tar, audit all the files and then run uv install, which will see different files”?
Someone could release a malicious package that looks okay to a scanner tool but behaves differently when installed using uv, allowing attackers to disguise executable code.
In addition, for OCI images, it is possible to produce an OCI image that can overwrite layers in the tar file, or modify the index. This could be done in a way that is undetectable by the processor of the OCI image. Similar attacks can be done for tools that download libraries, binaries, or source code using the vulnerable parser, making a tar file that when inspected looks fine but when processed by a vulnerable tool, behaves differently.
It is possible to exploit this bug by crafting a file that has tar contents without a header, thus making it hard to detect even with recursive archives.
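To make "looks fine when inspected, behaves differently when processed" concrete, here is a rough sketch in Python of the general class of bug. It does not reproduce the tokio-tar code path (see the advisory for the real mechanism). It only assumes that a parser which skips the wrong number of content bytes ends up reading file content as if it were headers. The `walk` function and its `zero_size_for` parameter are hypothetical, and forcing one entry's size to 0 is just a stand-in for the miscount.

```python
import io
import tarfile

BLOCK = 512  # tar headers and content are laid out in 512-byte blocks


def make_tar(files):
    """Build an uncompressed ustar archive in memory from {name: bytes}."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.USTAR_FORMAT) as tar:
        for name, data in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def walk(blob, zero_size_for=None):
    """List entry names by walking headers and skipping content by header size.

    zero_size_for simulates the bug: that entry's content is not skipped.
    """
    names, offset = [], 0
    while offset + BLOCK <= len(blob):
        header = blob[offset:offset + BLOCK]
        if header == bytes(BLOCK):  # an all-zero block marks end of archive
            break
        name = header[0:100].split(b"\0", 1)[0].decode()
        # Size is an octal ASCII field at bytes 124..135 of the header.
        size = int(header[124:136].rstrip(b"\0 ").decode() or "0", 8)
        if name == zero_size_for:
            size = 0  # simulated miscount
        names.append(name)
        offset += BLOCK + -(-size // BLOCK) * BLOCK  # content rounded up to blocks
    return names


inner = make_tar({"pyproject.toml": b"# hidden inner copy\n"})
outer = make_tar({"pyproject.toml": b"# visible outer copy\n", "payload.bin": inner})

with tarfile.open(fileobj=io.BytesIO(outer)) as tar:
    stdlib_view = tar.getnames()  # what a scanner built on the stdlib would see

honest = walk(outer)
desynced = walk(outer, zero_size_for="payload.bin")

print(honest)    # ['pyproject.toml', 'payload.bin']
print(desynced)  # ['pyproject.toml', 'payload.bin', 'pyproject.toml']
```

The second `pyproject.toml` in the desynced listing is the one stored inside `payload.bin`. A tool that extracts entries in order would write it over the first, which matches the overwrite described in the attack scenario.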
Since this came up specifically for `uv` (i.e. since the Python ecosystem relies on source distributions packaged as .tar.gz): has the Python standard library implementation (which is used by pip) been checked for a similar vulnerability?
[1]: https://github.com/astral-sh/tokio-tar/security/advisories/G...