Credit-card-stealing, backdoored packages found in Python's PyPI library hub

Malicious libraries capable of lifting credit card numbers and opening backdoors on infected machines have been found in PyPI, the official third-party software repository for Python. That's according to the JFrog security research team, which documented its findings here at the end of last month. A package dubbed noblesse, …

  1. Snake Silver badge

    Coming back to haunt the code supply chain?

    We found out, thanks to the Uni of Minnesota ordeal, that F/OSS supply chains could be compromised thanks to trust through blind acceptance of source point.

    So now it's being found in many more places but a rational discussion of a solution still hasn't been forwarded because it hurt developers' feelings that they were being "experimented" on. Whilst the users remain the ultimate guinea pigs.

    1. Anonymous Coward

      Re: Coming back to haunt the code supply chain?

      Also because it's demolishing two beliefs of FOSS activists:

      1) Open source code can't be bad because "many eyes" peruse it - actually people use it without looking at the code, and many users won't even be able to understand what some code does.

      2) The "trust model" can scale at will. Actually the "trust model" of open source doesn't scale horizontally. When you start to have millions of libraries someone will be able to slip in malicious code, and won't be caught for a while. In the meantime someone will use those libraries. When caught, they can just simply change account and start again.

  2. vincent himpe

    But.. you have the sourcecode right ?

    first : Apparently some people did read it and found this issue, kudos to them !

    The bigger question is : why did it not get read BEFORE the commit. It makes me wonder if / why there is no gatekeeper. This distributed review process does not work. Banking on the users to review the libraries is a utopia. 99.9% never read the source packages, they got enough work with their own development , let alone inspecting and understanding the inner workings of a library. That's why you get a library in the first place ! if you have to spend time going through the entire library's code, you may as well build your own library. And every time there is an update you can start over.

    There is no responsibility for OSS. Any time there is an issue there is a bunch of handwaving and a statement along the lines of "it's open source, you can read it". Basically use at own peril.

    Because of the widespread usage of libraries, and the lack of centralized vetting, this is a tremendous honeypot for miscreants.

    Of course i will get flamed for this, as it exposes a big painpoint. Yes we got the source, but in reality, who reads it ?

    1. Anonymous Coward

      Re: But.. you have the sourcecode right ?

      You seem to be making the assumption (or implying) that closed source is better in this regard; there is no evidence that it is. Actually, having open source code means potentially more eyeballs on the code, not less.

      As an example of poor coding, Microsoft added the user's Downloads directory to its 'Clean Up' tool in Windows 10 1809 (the Downloads directory option has now been removed again in a subsequent release).

      There was no warning to users that the Downloads directory had been added to the list of options on running the new version of the tool (nor of the fact that, once this 'error' was realised, it had been removed again in a subsequent release). Many people run this tool, using the 'Select all' option, thinking it only deleted redundant system files, not user downloaded files. Both are based on trust, trusting those employees not to add malicious code, and for managers to make sure code is checked.

      As a coding decision, this proved just as destructive, and anyone (a senior developer) with any knowledge of personal user space v OS temporary files, should have flagged this change immediately, as potentially very destructive and dangerous, in terms of unintentionally* deleting personal files.

      * You could argue it resulted in a 'positive' for Microsoft, because it made the case for using One Drive Cloud Storage, but let's not get into conspiracies.

      i.e. Closed source is not better by design, and no one should make that assumption. With closed source, the processes to spot malicious code are just hidden from us, and they clearly don't always work either.

      1. Phil O'Sophical Silver badge

        Re: But.. you have the sourcecode right ?

        Closed source is not better by design

        It is certainly no freer from bugs than OSS code, but the opportunities to insert malicious code are fewer simply because the group of people who can contribute is smaller, and closed. It also has more accountability: the reputation of the company that supplies it is on the line for any problems, which should make them more careful about checking it than an amorphous "community". As always, where it's "everyone's" responsibility to do something that invariably means that it's no-one's responsibility.

        1. unimaginative
          Linux

          Re: But.. you have the sourcecode right ?

          You can have open source repositories with a small and closed group of contributors and gatekeepers - like OpenBSD.

          People prefer convenience to security - otherwise we would be all using OpenBSD, running our web browsers with JS turned off, etc.

      2. vincent himpe

        Re: But.. you have the sourcecode right ?

        i implied nothing ! Having the sourcecode is good. I am merely highlighting the issue that 99.9% of the users work in the following mode : i need to do xyz , is there a library ? yes ! , download , study API and figure out how to use it to solve their problem. They don't look inside. In an ideal world they should not need to. That is the principle of a code library. A chunk of reusable, vetted code, maintained by a gatekeeper. I am a librarian (for a corporate CAD library). Every day i have to fight off the hordes moaning why it takes so long to 'release' something to the library and why can't we change xyz, it would fit better for their application. it's only a small change... Well , it would, but it would also break hundreds of existing designs ! if it is a mistake : we will fix it. If it is an inconvenience for you: live with it, everyone else using it is happy. Every request leaves a revision trail that is fully documented about what was changed and why. Only two people can alter the library. Parts go through a vetting process where their status is elevated from Design , to prototype to production. Only when a prototype has been built and verified to be correct does it get released. The end result is a library of parts you can trust to follow their specifications. That doesn't mean they are usable for your application, it only means they follow the specification. A new revision means there was an alteration to the specification.

        The library behaves like a WORM (write once, read many): any alteration is logged and cannot be removed (you can undo it, but you cannot remove the previous one). It leaves a trail and the data is there to inspect. No committing something, then removing it and posting an updated version. You can only post newer versions.
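To make the write-once idea concrete, here is a minimal sketch of an append-only revision log. It is not the CAD library described above: the class name `AppendOnlyLog`, the part names, and the use of a SHA-256 hash chain to detect tampering are all assumptions made for illustration.

```python
import hashlib
import json


class AppendOnlyLog:
    """Hypothetical revision log: entries can be added, never edited or removed."""

    def __init__(self):
        self._entries = []

    @staticmethod
    def _digest(prev, change):
        # Each digest covers the previous digest, chaining the history together.
        payload = json.dumps({"prev": prev, "change": change}, sort_keys=True)
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()

    def append(self, change):
        """Record a change; an 'undo' is just another appended entry."""
        prev = self._entries[-1]["digest"] if self._entries else ""
        digest = self._digest(prev, change)
        self._entries.append({"prev": prev, "change": change, "digest": digest})
        return digest

    def entries(self):
        # Return copies so callers cannot mutate the stored history.
        return [dict(entry) for entry in self._entries]

    def verify(self):
        """Return True if no stored entry has been altered or dropped mid-chain."""
        prev = ""
        for entry in self._entries:
            if entry["prev"] != prev:
                return False
            if self._digest(prev, entry["change"]) != entry["digest"]:
                return False
            prev = entry["digest"]
        return True
```

Undoing a change means appending a new entry that reverses it, so the earlier revision stays visible in the trail.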

        The current model does not work. You can state that the end responsibility is the end user's, but that too is bogus. You cannot expect every developer to understand every line of code in the application he is working on.

        The whole intent of a library is to reuse and speed up development. For that , the library needs to be trusted. And that is the pinch point. Too many people can fuddle with the libraries. Each library should have a closed group of maintainers and only they can post updates.

        This lack of gatekeeping , combined with the popularity of library xyz makes them enormous honeypots. slip in some malicious stuff and let it spread. Nasties used to be small binaries dropped into a working machine, now they are lines of source code hidden in public libraries.

    2. Anonymous Coward

      Re: But.. you have the sourcecode right ?

      Banking on the users to review the libraries is a utopia

      The company I used to work for had a strict policy for OSS:

      - You had to identify a specific version you wanted to use, downloaded from a designated location on a given date. If it was compilable code you had to build it, no downloading of binaries.

      - It had to be recent, have an active community for support, and no known security issues that would affect the use case.

      - The security lead for the product had to sign off (in writing) that it was acceptable.

      Only then could you use that specific version, and you had to ship that exact version with your product.

      Any product found to be downloading OSS 'on-the-fly' would be pulled until remediated, and if the development team survived the resultant bollocking they would be monitored extremely closely for their next one.
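One piece of a policy like this, checking that the archive you build from is the exact one that was signed off, can be sketched as follows. The function names are hypothetical, and the pinned digest is assumed to have been recorded at approval time; this is an illustration, not the process that company actually used.

```python
import hashlib
import hmac


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_pinned_digest(path, pinned_hex):
    """Compare a downloaded archive against the digest recorded at sign-off."""
    return hmac.compare_digest(sha256_of_file(path), pinned_hex.lower())
```

For Python packages specifically, pip's hash-checking mode (`pip install --require-hashes -r requirements.txt`) enforces the same idea at install time.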

      1. Anonymous Coward

        Re: But.. you have the sourcecode right ?

        Rinse and repeat for every fix in the library? Those processes are nice, but also usually mean you're going to use outdated libraries after a while... how do you handle any security vulnerability found in the approved code? How long does it take to deliver a patched version of your application?

        Most industries expect the supply chain performs most of the certification of their products...

        1. Anonymous Coward

          Re: But.. you have the sourcecode right ?

          you're going to use outdated libraries after a while

          True, but if they are updated with each product release they still meet the needs of the product. Trying to always have the latest of everything is a security nightmare, and not what our customers wanted. Stability was far more important to them than unnecessary bells & whistles.

          how do you handle any security vulnerability found in the approved code?

          Always a problem, and why they insisted that the product had to have an active community. Preferred option was to incorporate a fixed version in a patch or later release. For critical issues it might mean an internally-developed fix until the next community release could incorporate it.

          How long does it take to deliver a patched version of your application?

          Regular releases were likely quarterly, an emergency fix could be out in a week, less for cloud-based deployments.

          Most industries expect the supply chain performs most of the certification of their products...

          They shouldn't just expect it, it should be written into the SLA, but I know few OSS 'products' that will accept that, so "trust, but verify" is the best approach.

      2. A.P. Veening Silver badge

        Re: But.. you have the sourcecode right ?

        If it was compilable code you had to build it, no downloading of binaries.

        That assumes a reliable compiler, which is sometimes a bit problematical.

    3. unimaginative
      Linux

      Re: But.. you have the sourcecode right ?

      Who wrings their hands and says you should read the source code?

      The point is that people can read the source code, and someone should read it. Even more, people should check what they are installing - a lot of issues come from automatic installation of a tree of dependencies.

      It's very rare for this to happen with things like Linux repositories and similar because packages can only be created by trusted maintainers (not anyone who registers an account), and that means the dependencies are also only available if packaged by someone from that trusted pool. When was the last time someone got malicious code into Debian or Red Hat official repos? Or OpenBSD?

      Using proprietary software will not help because it now incorporates vast amounts of open source.

      Obligatory XKCD: https://xkcd.com/2347/

      You are right insofar as someone should be checking. The language repos are too large and too focused on having as much available as possible to do that.

      What developers can do is to minimise dependencies (so do not use a library for something you could implement yourself - leftpad), use only trusted dependencies, and check what their indirect dependencies are (there are tools to do this).
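As a rough illustration of checking indirect dependencies, the sketch below walks the requirements declared by packages installed in the current environment, using the standard library's `importlib.metadata`. The function names are hypothetical, and the result over-approximates, because requirements guarded by extras or environment markers are included as well.

```python
import re
from importlib import metadata


def normalize(name):
    """Normalize a project name as described in PEP 503."""
    return re.sub(r"[-_.]+", "-", name).lower()


def installed_dependency_graph():
    """Map each installed distribution to the project names it requires."""
    graph = {}
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if not name:
            continue
        deps = set()
        for requirement in dist.requires or []:
            # Keep only the leading project name; drop versions, extras, markers.
            match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", requirement)
            if match:
                deps.add(normalize(match.group(0)))
        graph[normalize(name)] = deps
    return graph


def transitive_dependencies(graph, root):
    """Return every project reachable from root, direct or indirect."""
    seen = set()
    stack = [normalize(root)]
    while stack:
        current = stack.pop()
        for dep in graph.get(current, ()):
            if dep not in seen:
                seen.add(dep)
                stack.append(dep)
    return seen
```

Calling `transitive_dependencies(installed_dependency_graph(), "some-package")` lists everything that one install pulled in, which is usually longer than people expect.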

      I also think there are some bad practices among developers. For example, it is regarded as best practice with python virtualenvs to install everything in the env and block access to system packages (not use --system-site-packages). I prefer to use system packages where possible and get the automatic updates from the OS and scrutiny from OS package maintainers.
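For anyone wanting to try the system-packages approach, here is a small sketch. It uses the standard library's `venv` module rather than the virtualenv tool mentioned above (an assumption made to keep the example self-contained); its `system_site_packages` parameter corresponds to the `--system-site-packages` flag. The helper names are hypothetical.

```python
import venv
from pathlib import Path


def create_env_with_system_packages(env_dir):
    """Create a virtual environment that can also see system site-packages."""
    builder = venv.EnvBuilder(system_site_packages=True, with_pip=False)
    builder.create(env_dir)
    return Path(env_dir) / "pyvenv.cfg"


def system_packages_enabled(cfg_path):
    """Read pyvenv.cfg and report whether system site-packages are visible."""
    for line in Path(cfg_path).read_text().splitlines():
        key, _, value = line.partition("=")
        if key.strip() == "include-system-site-packages":
            return value.strip().lower() == "true"
    return False
```

The second helper is handy for auditing an existing environment: it reads the setting that `venv` writes into `pyvenv.cfg`.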

      1. Anonymous Coward

        Re: But.. you have the sourcecode right ?

        You are right insofar as someone should be checking.

        If your name is on the final product, and your reputation is on the line, someone must be checking. Anything else is negligent.

      2. Anonymous Coward

        Re: But.. you have the sourcecode right ?

        scrutiny from OS package maintainers.

        You think the OS suppliers perform scrutiny? Maybe for the stuff the OS uses, but anything else which is provided for the convenience of developers is just passed straight through. The lawyers will check the licenses to make sure you can't be sued, but there's rarely any technical oversight (I know, I've been there).

      3. vincent himpe

        Re: But.. you have the sourcecode right ?

        The point is that people CAN read the sourcecode, but DON'T. Concept of a library : a chunk of code ready for reuse. When you go get a book from the library , do you check nobody has inserted some additional pages ? changed a paragraph or two ? No, because it's hard to do that in a book. Should be the same with code libraries. Keep it open and accessible and readable but lock it for 'write'.

        1. Version 1.0 Silver badge

          Re: But.. you have the sourcecode right ?

          I think that a lot of people assume that open source is reliable because someone somewhere must have checked the code so they just move on and apply it because it's free without spending time verifying it themselves.

          Reading all the comments here makes me think of updating an old Brendan Behan quote from the days before software; "The big difference between sex software for money and sex software for free is that sex software for money usually costs a lot less."

  3. Lil Endian Silver badge
    Stop

    Application Overreach

    "...and browser-stored credit card numbers..."

    @Browser Scope Writers: stop getting your thin client to do more than be a thin client.

    We can educate users to not use these features, but it'd be way better if applications remained within their original scope, or at least near it. Eggs 'n' baskets, baby!

    1. Yet Another Anonymous coward Silver badge

      Re: Application Overreach

      Or even better we could have a payments system that didn't rely on the secrecy of a 16 digit (10 of which differ) number that we give out to every waiter / gas station / store clerk

      1. vincent himpe

        Re: Application Overreach

        Something like a physical thing ? like bits of paper with pictures of dead presidents (or other people) and a few watermarks hidden . If i hand one of those to you , you can't steal more of mine as they have no relationship to each other. Maybe someone should invent one-time-use payment tokens.

        1. Yet Another Anonymous coward Silver badge

          Re: Application Overreach

          I think that idea may have currency, I'll take notes

          1. Lil Endian Silver badge

            Re: Application Overreach

            That's a sterling idea, to coin a phrase. I'd buy that, much cachet gained by you!

      2. Michael Wojcik Silver badge

        Re: Application Overreach

        the secrecy of a 16 digit (10 of which differ) number that we give out to every waiter / gas station / store clerk

        Virtual credit cards, like those issued by privacy.com, avoid this problem. One number per merchant, locked to that merchant, and with other limits set by the user: single use, maximum per transaction, maximum per time period, etc. Push notifications every time the card is used.

        But since they're virtual they're online-only (unless you're up for manufacturing your own physical cards), and there are still some payment processors who won't accept them (due to not following the standards properly).

        privacy.com makes its money off transaction fees, so there's no additional cost to the consumer over using a bank-issued card.

        I have no relationship with privacy.com except as a customer.

  4. Brewster's Angle Grinder Silver badge
    Pint

    Kudos to those doing the hard graft necessary to find downright malicious code rather than taking the easy option of running a static analyser over the code and grabbing some headlines with the large number of results.

    It's the difference between finding blood and proving it's human blood that belonged to the suspect or victim.

  5. katrinab Silver badge
    Alert

    Snyk Advisor tells me there's "no known security issues" with this package. Fortunately I've never relied on their advice before, and I won't be doing so in future.

  6. coddachubb

    Mmm, how to minimise such dependencies when they want it all, yesterday.

    Back to the good old days, a pukka Linux distro and more use of native command line toolz?

    The truth is that IT is now so complex, no-one really knows what they have any more.
