Alert: 15-year-old Python tarfile flaw lurks in 'over 350,000' code projects

At least 350,000 open source projects are believed to be potentially vulnerable to exploitation via a Python module flaw that has remained unfixed for 15 years. On Tuesday, security firm Trellix said its threat researchers had encountered a vulnerability in Python's tarfile module, which provides a way to read and write …

  1. DeathSquid
    WTF?

    You could claim classic tar(1) has exactly the same security hole. Or GNU tar with the -P option.

    1. Joe W Silver badge

      surely no one would be stupid enough to run it as root?

      ok.

      yeah.

      right.

    2. DS999 Silver badge
      Facepalm

      If you extract an archive from an untrusted source as root, while using the -P option without knowing what it does, then you deserve everything that happens to you.

      1. Graham Cobb Silver badge

        -P is an option for writing tar files, not for reading them according to the tar manpage on my system.

        1. Graham Cobb Silver badge

          However, the manpage lies... -P is used for extracting as well.

        2. VoiceOfTruth

          This is why you should use a proper operating system, like FreeBSD: "By default, absolute pathnames (those that begin with a / character) have the leading slash removed both when creating archives and extracting from them."

          1. Anonymous Coward

            What....? Explain. In fact, the article doesn't explain how the exploit works at all, just that ".." is used somehow.

            The usage of .. doesn't magically traverse to root, so how this exploit works is probably down to how the specific Python module parses paths, not how the system interprets paths or how any other binary tool behaves, as the first post suggests.

            1. VoiceOfTruth

              A couple of ways to create a "bad" tarball:

              1. Have absolute paths - /etc/passwd /.bashrc or whatever. When the tarball is extracted it will overwrite those files. Absolute paths are usually seen as bad practice, but they do have their uses (particularly for backups and restores).

              2. Use relative paths which go up to / eventually (../../../../../../../../../ and you can have dozens of these) and then overwrite something like /etc/passwd or /root/.bashrc or whatever.

              To cause this overwriting a few things need to happen.

              A. You need a "bad" tarball. This is easy to create.

              B. You need to be root or have root privileges.

              C. You, with your root privileges, need to be fool enough to extract this untrusted tarball as root.

              Some people are complaining that point C is too easy to do and the system, or rather the python tarfile module, should not allow it (particularly the extracting part). It's a reasonable point - I gave a quote from the FreeBSD tar man page that to make use of absolute paths you need to at least specify an option to do so - it is not the default behaviour. However those same complainers seem to be overlooking the fact that a root user has extracted an untrusted tar file in the first place.
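
For anyone who wants to see point A concretely, here is a minimal sketch of how little it takes to build such a tarball with Python's own tarfile module. It works entirely in memory and never extracts anything. The helper names and the member names are made up for illustration.

```python
import io
import tarfile


def build_demo_tarball(names):
    """Return the bytes of an in-memory tar archive with one small file per name."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name in names:
            payload = b"demo\n"
            info = tarfile.TarInfo(name=name)  # the name is stored exactly as given
            info.size = len(payload)
            tar.addfile(info, io.BytesIO(payload))
    return buf.getvalue()


def list_names(archive_bytes):
    """List member names without extracting anything."""
    with tarfile.open(fileobj=io.BytesIO(archive_bytes), mode="r") as tar:
        return tar.getnames()


# Hypothetical member names covering both tricks: an absolute path and a "../" climb.
demo_names = ["/tmp/demo_abs.txt", "../../../../tmp/demo_rel.txt", "safe/ok.txt"]
print(list_names(build_demo_tarball(demo_names)))
```

The names come back exactly as they went in, so whatever later extracts the archive is the only thing standing between those paths and the filesystem.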

              1. Anonymous Coward

                @VoiceOfTruth

                Alex?

                Thought you were in jail for five years.

            2. that one in the corner Silver badge

              Put in a lot of ../../.. and you'll eventually ascend to root.

              So, having a ridiculously deep directory tree is actually a security feature! Well, until the bad guys put in a ludicrously large number of ../

            3. VoiceOfTruth

              I meant to add: I'm not sure why your question was downvoted. I don't mind people not knowing and not understanding. Don't be discouraged from posting due to down votes.

              1. Anonymous Coward

                This isn't about sitting at a shell prompt and 'cd ../../../../../../../boot' or what you can or cannot do outside of this "tarfile", this is about the module's tarfile.extract(). If "tarfile" simply passes the string to the OS, what is the point of the library versus simply using subprocess.run() for both tar and mkdir?

                While off topic, there's also the question of why isn't "tarfile" reading the paths for validity regardless? Shoot, .tar doesn't have any built in parity types, not even 16bit CRC, so why assume the paths are correct at all?

                It could be argued that path checking will make the extraction process painfully slow on low memory systems that can't cache but, blindly extracting any archive without looking should be a non-default option. Or more simply, protections should already be in place (to be optionally bypassed).
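
The kind of path checking being asked for could look roughly like the sketch below. This is an assumption about one reasonable approach, not what the tarfile module does: `safe_extract` is a hypothetical helper that validates every member before writing anything, and then writes only regular files and directories itself.

```python
import os
import tarfile


def safe_extract(tar, dest):
    """Extract regular files and directories from `tar`, refusing anything
    that would land outside `dest`. Links and device nodes are skipped."""
    dest_root = os.path.realpath(dest)
    members = tar.getmembers()

    # First pass: validate every path before writing anything.
    for member in members:
        target = os.path.realpath(os.path.join(dest_root, member.name))
        if os.path.commonpath([dest_root, target]) != dest_root:
            raise ValueError(f"member escapes destination: {member.name!r}")

    # Second pass: write the data ourselves rather than trusting the names.
    for member in members:
        target = os.path.realpath(os.path.join(dest_root, member.name))
        if member.isdir():
            os.makedirs(target, exist_ok=True)
        elif member.isreg():
            os.makedirs(os.path.dirname(target), exist_ok=True)
            with tar.extractfile(member) as src, open(target, "wb") as out:
                out.write(src.read())
        # Anything else (symlinks, hard links, devices) is deliberately skipped.
```

The check is one `realpath` and one comparison per member, so the cost is small next to the actual file writes.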

            4. John Robson Silver badge

              from / the path ../ is just /, so ../../../../../../../../../../../../../../../../../../ is almost certain to be the root of the fs when you extract a tar.

              So you just prefix etc/passwd or etc/shadow with that and robert is the brother of one of your parents.

            5. Doctor Syntax Silver badge

              It seems a bit garbled. I read it as saying that if you back up through enough parent directories you will end up at / but that ignores what happens if there are too many .. levels. Will it stop there or throw an error and do nothing? The latter would mean that malicious use would require either a lucky guess or steering the user to a specific level in the hierarchy in the first place.

              The real problem here is running an untar out of immediate user control. It's best to run tar -t or use the equivalent functionality in a GUI archive manager such as Ark. That way you can be sure you know what might be affected.

              1. VoiceOfTruth

                -> what happens if there are too many .. levels. Will it stop there or throw an error and do nothing

                The root inode is special. You can't go above it. If you do 'ls -lid / /. /..' you will see they have the same inode number (on FreeBSD, macOS, etc.). I don't know what happens on Windows.
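
The "extra .. levels are harmless" behaviour can be illustrated lexically in Python. This sketch uses `posixpath`, which only manipulates strings and never touches the filesystem, so it ignores symlinks. The working directory and member name are hypothetical.

```python
import posixpath


def resolve_lexically(base_dir, member_name):
    """Join and normalise POSIX-style, without touching the filesystem."""
    return posixpath.normpath(posixpath.join(base_dir, member_name))


# A working directory three levels deep, with far more ".." than needed.
print(resolve_lexically("/home/user/x", "../" * 20 + "etc/passwd"))  # /etc/passwd
```

So the attacker does not need to guess the depth of the working directory; overshooting lands in the same place.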

                1. Michael Wojcik Silver badge

                  Same thing on Windows – that is to say, on NTFS and FAT32 and other filesystems normally supported by Windows.

                  Using .. for path traversal up to root is an ancient technique; it was widely used in exploits in the previous century. Kind of surprised it's not well-known to most Reg readers. And, yes, this is a problem if you're running the SUS (what succeeded POSIX) tar command or similar with excess privileges, since by default it honors .. in path components.

                  It's sufficiently well-known that when I wrote a package installer for one of our products around the turn of the century, the specification for the unpacker was that it would discard any paths that weren't in or below the current directory. (The package directory itself was created empty as part of the installation process, so tricks like creating symlinks within it weren't available to attackers who didn't already have a better foothold in the system.)

    3. herman Silver badge
      Devil

      Nobody’s problem

      Fortunately nobody uses tar anymore, so only nobodies are vulnerable.

    4. bombastic bob Silver badge
      Devil

      I usually use the '-t' option to test tarballs before extracting, usually to see if it has a top level directory or is more of a "tar bomb" i.e. no top level directory (meaning I have to change directories before extracting).

      Maybe a quick utility could be written to use 'tar -t' to scan for files with ".." in the path, then flag them, much as a malware scanner would.
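
A rough sketch of such a scanner, using Python's tarfile to list names instead of shelling out to 'tar -t'. Both function names are invented here. The first flags absolute paths and ".." components; the second returns the set of top-level entries, so anything other than a single directory name suggests a "tar bomb".

```python
import tarfile
from pathlib import PurePosixPath


def suspicious_names(tar):
    """Return member names that are absolute or contain a '..' component."""
    flagged = []
    for name in tar.getnames():
        path = PurePosixPath(name)
        if path.is_absolute() or ".." in path.parts:
            flagged.append(name)
    return flagged


def top_level_entries(tar):
    """Return the set of first path components across all members."""
    tops = set()
    for name in tar.getnames():
        parts = PurePosixPath(name).parts
        if parts:
            tops.add(parts[0])
    return tops


# Usage, with a hypothetical file name:
#   with tarfile.open("download.tar") as tar:
#       print(suspicious_names(tar), top_level_entries(tar))
```

This only inspects names; it says nothing about symlinks inside the archive, which need a separate check.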

  2. Kevin McMurtrie Silver badge

    Zip too

    Lots of unzip utilities and code libraries have the same vulnerability. The path is an opaque string to be interpreted by the local filesystem.

    It's amazing that they're not hacked all the time.

    1. Richard 12 Silver badge

      Re: Zip too

      Many (most?) of the popular zip libraries fixed these things at least five or six years ago.

      I remember when QuaZip fixed theirs - it was still on Sourceforge, so ages ago.

      Path traversal attacks are a well-known logic flaw, and no, you can't fix it by saying "user needs to check all the paths before extracting"

      That's abrogating your responsibility as a library maintainer. Bad Gustäbel, no cookie.

      1. Flocke Kroes Silver badge

        Re: PFYs today

        Only five or six years ago? Malicious archives and naive tools were a hazard when I was a PFY. Now that my beard is grey, checking an archive's content before extracting is an ingrained habit. Back then there were unwashed illiterates who put some files in the archive to extract to the current working directory instead of creating a new directory to put everything in.

        Next you will be telling me people run 'make install' as root instead of 'make install DESTDIR=/var/tmp/sandbox' from an account that cannot access their home directory.

        1. Duke of Source

          Re: PFYs today

          The illiterates still exist. Take Shopware for example: having no IPv6 address, publishing PHP crapware and of course having the release ZIP file extract in the current directory.

      2. VoiceOfTruth

        Re: Zip too

        If you extract a tar file from an untrusted source, whose fault is that? You are an example of somebody who should not have root privileges. You are not a member of the wheel group. Goodbye.

        1. Richard 12 Silver badge
          Mushroom

          Re: Zip too

          I've cleaned up the mess many times after people like you believed themselves smart enough to dance through a minefield.

          It is the job of a software engineer to remove and defuse as many landmines as is reasonably practicable.

          If you refuse, then you have no place writing software or scripting.

  3. DrXym

    Zip slip

    That's the slang name for this class of vulnerability. Probably affects any software that accepts archived files - zips, rars, tars, 7z etc. and then extracts them somewhere without checking if files can escape out of the target directory via either a ../../ style trick or a soft link. Soft links are potentially a more insidious issue to deal with correctly.
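
To make the soft link half of this concrete, here is a small lexical check for link members whose target points outside the extraction directory. It is a sketch under simplifying assumptions: `link_escapes` and the `/__dest__` stand-in root are made up, and it does not catch chains where a later member is written through a link created by an earlier one.

```python
import posixpath
import tarfile

_ROOT = "/__dest__"  # stand-in for the extraction directory (purely lexical)


def link_escapes(member):
    """Return True if `member` is a link whose target resolves outside the root."""
    if member.issym():
        # Symlink targets are relative to the directory holding the link.
        base = posixpath.dirname(member.name)
    elif member.islnk():
        # Hard-link targets are named relative to the archive root.
        base = ""
    else:
        return False
    target = posixpath.normpath(posixpath.join(_ROOT, base, member.linkname))
    return target != _ROOT and not target.startswith(_ROOT + "/")
```

Refusing links altogether is simpler still when the archive has no legitimate need for them.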

    1. Michael Wojcik Silver badge

      Re: Zip slip

      Yes, though when the phrase "zip slip" was coined, path traversal in archive extraction was already an old technique.

      CVE-2001-1267 is a ".." path-traversal vulnerability in GNU tar. It might have been spurred by this BUGTRAQ post.

      I think Snyk coined the term "Zip Slip" in 2018.

  4. Anonymous Coward

    Performing general file operations as admin!! Really does not matter what file it is? Tar or other compiled library.

    Can we please STOP spooking the management teams and generating impact!!!

    1. the spectacularly refined chap

      Often it is a requirement to ensure file ownership and permissions are set correctly.

      And no, it's not a bug, this is defined, documented behaviour. If the user is unaware of the potential consequences it's a simple PEBKAC error.

      1. Anonymous Coward

        This is why security is a discipline separate from coding

        "The software does this really dangerous thing to anybody who is even slightly less than completely diligent all the time but it's not a bug because it's documented" is an attitude that should have died out decades ago.

        "Don't open archives if you don't trust the source" is equally unhelpful. Look at how many attacks involve compromising or impersonating a trusted source. (My spam folder currently gets at least one email a day purporting to be from a known contact and asking me to open an archive of alleged photos from "that thing last week"...).

        We've got a dangerous security risk, and we can either fix it in (a) one library or (b) 350,000 individual projects, assuming those projects are being actively maintained and the maintainer is made aware of the problem. Do the math.

        Sadly this attitude seems to show no sign of declining, which is why security needs to remain a discipline distinct from coding.

        1. the spectacularly refined chap

          Re: This is why security is a discipline separate from coding

          "The software does this really dangerous thing to anybody who is even slightly less than completely diligent all the time but it's not a bug because it's documented" is an attitude that should have died out decades ago.

          Alternatively don't do it as root.

          Remember, with great power comes great responsibility... If you don't like that, put your stabilisers back on and let the big boys deal with it. It's documented behaviour, desirable or even essential in many cases.

          1. Richard 12 Silver badge

            Re: This is why security is a discipline separate from coding

            That only protects the OS. Arguably the easiest thing to fix.

            It doesn't stop user data from getting trashed.

        2. Henry Wertz 1 Gold badge

          Re: This is why security is a discipline separate from coding

          Agreed 100%. Imagine the outrage if the response to one of those many PHP directory traversal flaws found years ago had been "Welp, better be careful with that!" with no fix.

          In the modern era, I really don't expect a tar utility to be able to traverse above the current working directory (...or the directory you specify the output to go to with the appropriate tar flag..), and I doubt the users of the Python method expect this either.

          I mean, you could still set up development environments like C of 30-40 years ago -- no input validation, C-style strings with no explicit size and no implicit size on string functions (allowing for easy buffer overflows), no compiler warnings, and so on. Just put in the docs "You better check those strings! Please don't let your buffers overflow!", just put warnings in the docs to not do things instead of the compiler warnings, because after all you should read the docs and not do that. Most can see how silly that sounds, and I think this "well, the tar thing is documented so we're good" is just as silly.

          The Pythonic fix here would be to fix this, and (if there's any legitimate use case where someone relies on "../" in their tars) put in a flag to retain the old behavior.
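
For reference, newer Python releases did add something along these lines: PEP 706 introduced extraction filters, exposed as a `filter` argument on `extractall()` and `extract()`. The sketch below assumes an interpreter that has them and gives up otherwise. `extract_untrusted` is a made-up wrapper name.

```python
import tarfile


def extract_untrusted(tar, dest):
    """Extract with the standard library's 'data' filter where it exists."""
    if not hasattr(tarfile, "data_filter"):
        raise RuntimeError("this Python's tarfile has no extraction filters")
    # filter="data" strips leading slashes and refuses members or links
    # that would end up outside `dest`.
    tar.extractall(dest, filter="data")
```

On interpreters with the feature, a member such as "../evil.txt" makes the call raise a `tarfile.FilterError` subclass instead of writing the file.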

  5. Phil O'Sophical Silver badge

    No need for path games

    If you just generate a tarfile containing /etc/passwd and find an admin stupid enough to open it as root you'll have the same issue.

    The problem lies with the admin, not the tool.

    1. OhForF' Silver badge

      Re: No need for path games

      I agree, when extracting a tarball from a not 100% trustworthy source you have to know it can overwrite anything accessible to the user running it.

      If you need it to only access stuff inside a directory other than / you can use chroot.

      Assuming python (or tar or whatever) will behave like it was in a chrooted environment is just a wrong assumption and not a python problem.

      While that ../ path squashing might be unexpected and considered weird by some it shouldn't be a security problem.

    2. herman Silver badge

      Re: No need for path games

      If the admin is a tool, then the problem lies with the tool.

  6. Ace2 Silver badge

    I was going to ask, “So is this a security issue or not,” but the sentiment in the comments is pretty clear that it’s a molehill.

    Brings to mind this Techdirt post from a few days ago: https://www.techdirt.com/2022/09/16/mudges-testimony-shows-he-was-acting-as-an-activist-not-an-executive/

    A few jobs ago I was plagued by an IT security outfit with no sense of cost-benefit, and the current $JOB seems to be heading that way too.

    1. Brewster's Angle Grinder Silver badge

      Tell me a use case for trying to extract absolute paths or paths in parent directories. As a common courtesy, unarchivers should block this unless you swear on your mother's grave it's what you want.

      Think less of /etc/passwd and more of messing with .bashrc to run scripts that download malware etc...

      1. Mike Pellatt

        Exactly. What sort of techniques do you think APTs use to achieve P?

        1. VoiceOfTruth

          They get people who don't know what they are doing to extract files that they should not. These same people would execute rm -rf / if it was in a script, and execute it as root if that is what it said to do.

          1. petef

            It is not always that obvious. I had a real instance of that happening some years ago (the resulting system restore involved fifteen 5¼" floppies). A colleague had asked me to release my storage on their machine. I deleted my home directory but then modified my home to be / so that I could still log in. I informed the machine owner that I had cleared my disk usage. Unfortunately they then opted to remove my user account. Part of that procedure was to remove the user's home directory. Tears ensued.

            I raised an issue with Sun who at that stage had become the owner of Interactive UNIX. They declined to put protections in place. I wonder what became of them?

      2. VoiceOfTruth

        You are still extracting an untrusted file. Why would you do that? The same can be said of some crappy shell script which you downloaded, and you executed that without checking it. The same can be said, and with far stronger words, to those people extracting untrusted files as root. Why would they do that?

        I have used absolute paths when I want to know absolutely that is where the files will be extracted to. I have to be careful doing it. I take responsibility for that. I can work round that and do 'cd' and whatever if I want to.

        1. Brewster's Angle Grinder Silver badge

          I download tarballs off the internet all the time. It's part of being a dev. If I unzip them in ~/x I expect them to stay there. That's part of the contract with any archiver.

          Checking these files meet that contract is exactly the sort of thing computers excel at. It would be very easy for me, a human, to miss some dots in the middle of the path string, especially if they are hidden by some Unicode shenanigans or terminal escape sequences.
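
On the "hidden dots" point: a listing tool can at least make escape sequences and lookalike characters visible before a human reads the names. A tiny sketch using Python's built-in `ascii()`; the helper name and the hostile file name are hypothetical.

```python
def printable_name(name):
    """Render a member name with control and non-ASCII characters escaped."""
    return ascii(name)


# An escape sequence that erases the current line in many terminals,
# followed by a "../" climb the reader would otherwise never see.
print(printable_name("docs/\x1b[2K../../.bashrc"))
```

That does not replace an automated check; it just stops the terminal from helping the attacker.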

          1. VoiceOfTruth

            -> If I unzip them in ~/x I expect them to stay there

            Congratulations. You are not a member of the wheel group.

            1. Richard 12 Silver badge

              You aren't perfect

              Sooner or later, you'll forget to check through absolutely every path in a tarball containing tens of thousands of files, and lose something important.

              And you probably won't even realise it happened until weeks later.

              1. Michael Wojcik Silver badge

                Re: You aren't perfect

                Or more generally: We have decades of overwhelming evidence that expecting perfect vigilance from users does not work and was a monumentally stupid idea to begin with. And we have ample evidence that technically sophisticated users are, on average, as bad or worse than unsophisticated ones at maintaining good security posture.

                The commentators posting some form of "it's the user's responsibility" are the problem. That's how we got to the situation we're in. Systems should default to secure, and to minimizing the violation of expectations.

                You want an option for your extractor to traverse above the current directory? Fine, make it an option. Don't make it the default.

            2. claimed Silver badge

              No, but might be a member of the BYOD group, so let's ensure it's as hard as possible for everyone to shoot themselves in the foot, and we'll all still be better off, even if all the benefit we see is not being leaned on by a limper.

              IMO Unintended consequences are problems, let's make sure the default case doesn't have any, and additional effort is required to point rifles at feet.

              Personally and professionally I wouldn't run an extract as root, but for many it's only a 'sudo' away

        2. Doctor Syntax Silver badge

          Yes, if it's an incoming archive from any source, whether or not you believe it to be trusted, treat it as untrusted and look where it's going to extract to. At the very least it saves you from having to tidy up a directory if the archive's path doesn't start with a daughter directory - or even if it has a daughter directory with the same name as an existing one.

          1. VoiceOfTruth

            Doctor Syntax has hit the nail on the head. I did my time learning UNIX. Colleagues would send around fork bombs, butt ugly tarballs which exploded all over the place. Not fun to clean up, but also a good way to learn not to just blindly extract files or run scripts.

            1. Michael Wojcik Silver badge

              "I suffered, and so should everyone else. Damn the consequences."

          2. Michael Wojcik Silver badge

            This is good advice. It is in no way a substitute for software that defaults to maximizing security and minimizing surprise.

  7. rob_marion

    Redhat's assessment

    From the Redhat team (for your consideration):

    "Red Hat is aware of this issue and is tracking it via the following bug: https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=263261 The Red Hat Security Response Team has rated this issue as having low security impact, a future update may address this flaw."

    - https://nvd.nist.gov/vuln/detail/CVE-2007-4559

    Still, I think it should be fixed.

  8. Lorribot

    The attitude of some of the posters herein is exactly the reason why Linux will never be suitable as an OS for all, as newbies are often met by this sort of attitude when trying to find information.

    People range from "plain stupid" through "ambivalent" to "know enough to be dangerous" and then on up to the posters' levels of skill, caring and knowledge.

    Unfortunately 90% of the world falls into the first three categories, and these people need to be protected from themselves. Saying stupid people get what they deserve is at best condescending and shameful and at worst sneering and smug; it helps no one.

    Every compromised device has the potential to be used against you and the services you use, to destroy your life or business. It is us against them, so why do we fight so much amongst ourselves?

    Will you still say they are stupid when your lights go out, bank collapses or the hospital is closed just when you need it because you were busy being smug?

  9. Claptrap314 Silver badge

    Torn feelings...

    I realize that it's the 20's, so no one is allowed to take a position that isn't 100% on one side or another, but here I am.

    I still remember 15 years ago when I was experimenting with a multidrive system and having CentOS blow up because the .spec files did the "../../../whatnot" garbage. This. Is. POSIX. I can set up a mount however I deem appropriate, and YOU DO NOT KNOW where ".." goes.

    I don't care that POSIX requires tar to support '..'. On a POSIX system, '..' is not well-defined between systems. Tarballs that use '..' are at best fragile. For that reason alone, use of tar requires care.

    --

    But these "researchers" are being more obnoxious than tar is. Unless you carefully examine the use of the library, you cannot know if it is actually "Insecure" or not. Are there really 350,000 open source packages out there using this library that are intended to run as root? I seriously doubt it. So the security implications are nowhere near what this group is implying.

    --

    Of course, root or not, directory traversal is a problem. If the proposal were to add a "safe mode" that prevented root or .. components, that would be great. But we're unfortunately 30 years too late to change the default behavior.
