Boffins rate npm and PyPI package security and it's not good

The Open Source Security Foundation (OpenSSF), as its name plainly states, aims to help make open source software more secure, but improvements flowing from its efforts are hard to find. Computer scientists at North Carolina State University have put one of its tools to the test by evaluating software package registries npm …

  1. thames

    Not Impressed.

    I can't comment on npm, because I haven't used it, but I do have projects on PyPI. I had a look at the paper, and it's pretty clear that the "problems" listed are not in PyPI, but rather in GitHub.

    To start with, they don't actually look at PyPI except to get a list of projects which they then look for on GitHub. There is no link between PyPI and GitHub. You can have packages in PyPI without having a GitHub account or any code in GitHub. They are two completely independent things.

    Their scorecard is entirely based on the assumption that you do everything through GitHub and use all of its workflow features. If you use GitHub just as a place to publish code for the public, then you will get a low score. If you use all the GitHub bells and whistles and use them the right way, then you get a high score.

    In other words, part of the score is based on result, and part of it is based on "process". And by "process" they only mean whether your process is conducted in GitHub rather than somewhere else.

    A good example is "maintained". If a project doesn't get at least one commit per week to GitHub, then it is marked down. There's no reason why that should be a valid criterion. The project may not be unmaintained. It may simply be stable and isn't getting updates because there isn't anything wrong which needs fixing. Or you could be working away on new features, but GitHub is just where you publish the source code as opposed to the place where you actually work from.

    This is why there are so many projects which score highly in terms of not having anything wrong, but most seem to have low scores in terms of making use of GitHub's automated work processes.

    I have a GitHub account and I have packages in PyPI. Part of my work process is to push code to GitHub for source publishing and to upload packages to PyPI for users. I have my own testing and QA processes which I run on my own hardware as I have no intention of locking myself into GitHub. It's just a convenient place to host the source code for anyone who wants it. For some time I have been planning to also push source to another Git repo aside from GitHub to reduce my dependency on them, but I simply haven't got around to it yet.

    Overall, I'm not impressed with the report.

    P.S. "Standard" security mode (the most relaxed standard setting) in Firefox seems to give The Register fits and results in a page not found error. I can only post on this site by fiddling with the security settings and manually turning off tracking protection. I've no problems anywhere else. El Reg should get a "fail" on the testing and maintaining scorecard.

    1. Anonymous Coward

      Re: Not Impressed.

      I second the riparian gentleman's post above. Indeed, the whole thing reads more like a GitHub (Microsoft) advertorial than anything remotely academic.

      There's also one glaring fault: one of the aspects of security is availability. If your supposedly FOSS code depends on GitHub (Microsoft) proprietary features, a) it's not really FOSS is it, and b) if GitHub (Microsoft) ceases to be available (generally or to you specifically) then your project is stopped in its tracks.

      Using those proprietary forges might be OK for small, trivial stuff that's easily replaced but it's asking for trouble if used for anything critical. The Linux kernel project have got this right.

    2. Charlie Clark Silver badge

      Re: Not Impressed.

      Yep, my "critical" library isn't on GitHub at all so practices of any kind won't be picked up. As things stand, we've just discovered a bug in CPython 3.11 through a test race condition and the library has been added to Google's fuzzing project, though I don't have time to deal with the reports.

      I also hate some of the suggestions such as pinning: this is great for deployments but a PITA for development. You're worried about upstream vulnerabilities? Do what any good BSD sys admin would do and run your own repository.

      All toolchains are susceptible both to code and infrastructure vulnerabilities, the open source ones are just more visible. Python code has, traditionally, a pretty good record on code vulnerabilities. This is due in no small part to the things that some people hate: an emphasis on code that is easier to read because of the forced whitespace; and focussing on quality and readability over execution speed.

      Code reviews are great when you're working on something but otherwise they're a bit of a box-ticking exercise. A good testing strategy, with high coverage rates, will generally tell you where to start looking for things to improve.

      1. Anonymous Coward

        Re: Not Impressed.

        You were going OK until you said:

        > code that is easier to read because of the forced whitespace;

        That's a bit of a subjective statement, to say the least.

        > and focussing on quality and readability over execution speed.

        Well, focusing on Python execution speed is a bit like focusing on a train's off-roading abilities.

        1. Gene Cash Silver badge

          Re: Not Impressed.

          > That's a bit of a subjective statement, to say the least.

          Awright then mate, I want you to try to maintain:

          1 a piece of code where the entire thing is not indented at all.

          2 a piece

          ...............of

          ...code

          with

          ............................................................random

          ........................................................................indents

          (and no, you're not allowed to run it through a formatter for various reasons)

          (edit: and yes, this wouldn't be accepted in open source, but this was the shite I had to put up with at work)

          1. Charlie Clark Silver badge

            Re: Not Impressed.

            Especially when it's someone else's code.

          2. Anonymous Coward

            Re: Not Impressed.

            > I want you to try to maintain:

            Bad code is just bad code, and clarity of exposition is an aspect of what makes code "good" or "bad".

            But Python effectively insists on having me express my instructions in iambic pentameter. I'm more of a Joyce type, you know.

        2. Charlie Clark Silver badge

          Re: Not Impressed.

          As I said, some people hate the whitespace rules. But I don't consider their advantage to be subjective when it comes to reading code: the blocks can be identified without needing to read the code. It's edge detection versus feature recognition.

          Execution speed: often you're calling a C or C++ (or even Fortran) library with minimal overhead so worrying about the number of function calls or other micro-optimisations at the cost of readability is a recipe for disaster. Python is still slow when it comes to memory allocation but there are generally easy ways to mitigate that.

    3. Doctor Syntax Silver badge

      Re: Not Impressed.

      Apart from anything else it disregards the possibilities of cybersquatting such as was reported here recently. If someone pushes a deliberately malicious piece of software onto a repository as a new project then they'll happily tick all the boxes claiming to follow best practices. Don't expect someone being dishonest in intent to be honest in method.

      The first check needed is "Are there gatekeepers?" It seems not.

    4. Paul Kinsler

      Re: You can have packages in PyPI without having a Github account or any code in Github

      If you read the article more carefully, for example paragraph 9, it would seem that the researchers say that they are aware of those sorts of issue.

      1. that one in the corner Silver badge

        Re: If you read the article more carefully...

        Yes, the researchers are aware of the problems: and I've been reading all of the criticisms in the comments as agreeing with the researchers and going into detail about how these problems manifest themselves.

        The problem is not that the researchers are unaware but that the OpenSSF don't seem to be aware of these problems (they do admit that their scorecards only work for GitHub, but don't allow for any of the cases where GitHub is used but isn't the be-all and end-all). Yet OpenSSF are apparently[1] the only people who are trying to provide a way to examine package security.

        I don't want to give the researchers a free pass, however: the preprint abstract talks about confirming the applicability of these Scorecards, instead of examining their applicability, and the discussions and conclusions both uncritically assume (come very close to stating) that the OpenSSF product is Absolutely The Bee's Knees.

        [1] If you just rely on what the OpenSSF say, in the GitHub repo for their Scorecards and their distinctly corporate website. The preprint does admit that other offerings have been made in this area.

        Hmm, one thing that I like to see in any open source related site/repo is the section that lists other projects with similar/related goals, especially when comparisons are given. Shows a bit of rigour and knowledge of the field - shame OpenSSF don't have one.

  2. iron Silver badge

    No mention of StepSecurity? StepSecurity are working to secure the open source supply chain by securing GitHub Actions workflows. Their Secure Workflows and Harden Runner tools are quick and easy to use and fix many of the issues mentioned in the article. They also work hand-in-hand with OpenSSF's scorecards.

    I'm not affiliated with StepSecurity, just a happy user of their tools.

    1. that one in the corner Silver badge

      Re: No mention of StepSecurity?

      The preprint paper is restricting itself to "community efforts", whilst StepSecurity (disclaimer: I'd not heard of them before this) appears to be a company still starting up: the FAQ refers to "early adopters" and says "All of our tooling and SaaS services are currently free".

      So StepSecurity is simply outside the purview of the report and hence the article.

      Now, whether this means that the report is too exclusionary to be useful to the general open source consuming population is another matter.

  3. Anonymous Coward

    link to nothing interesting

    > An attacker could abuse a vulnerable package, for example, by crafting a malicious GitHub issue title that injects code and opens a reverse shell connection.

    The link doesn't lead to a report of the hack, or its fix, but to the GitHub manual page for issues. A bit of a letdown.
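
    For anyone wondering what that class of attack looks like, here is a minimal sketch of the general idea. It is not the actual workflow the article refers to. It assumes a POSIX shell is available, and the helper names (build_unsafe_command, build_safe_command, run) are made up for illustration. The untrusted title stands in for any attacker-controlled text that gets pasted into a shell line.

```python
import shlex
import subprocess


def build_unsafe_command(issue_title: str) -> str:
    # Vulnerable pattern: untrusted text is pasted straight into a shell line,
    # so a double quote in the title can close the string and start new commands.
    return f'echo "New issue: {issue_title}"'


def build_safe_command(issue_title: str) -> str:
    # Safer pattern: shlex.quote makes the shell treat the whole text as one
    # literal argument, whatever characters it contains.
    return "echo " + shlex.quote(f"New issue: {issue_title}")


def run(command: str) -> str:
    # Hypothetical helper: run the command line through the shell and return stdout.
    result = subprocess.run(
        command, shell=True, capture_output=True, text=True, check=True
    )
    return result.stdout


if __name__ == "__main__":
    # Assumed attacker-controlled title; a real attack would do worse than echo.
    malicious_title = 'x"; echo INJECTED; echo "'
    print(run(build_unsafe_command(malicious_title)))
    print(run(build_safe_command(malicious_title)))
```

    With the unsafe version the shell runs the attacker's "echo INJECTED" as a separate command. With the quoted version the title is just printed as text.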

  4. DomDF

    It's considered very bad practice to have pinned dependencies in libraries on PyPI. It's fine for applications, but PyPI isn't geared towards distributing them. Why then is pinning being recommended here?
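
    To make the library-versus-application distinction concrete, here is a rough sketch. It assumes a simplified notion of "pinned": an exact == version with no wildcard. The helper is_pinned and the example requirement lists are hypothetical, and the regex is not a full PEP 508 parser.

```python
import re

# Simplified assumption: "pinned" means name, optional extras, then an exact
# "==" version with no wildcard, optionally followed by an environment marker.
_PINNED = re.compile(
    r"^\s*[A-Za-z0-9._-]+(\[[^\]]*\])?\s*==\s*[^*\s,;]+\s*(;.*)?$"
)


def is_pinned(requirement: str) -> bool:
    """Return True if the requirement string is an exact version pin."""
    return bool(_PINNED.match(requirement))


def pinned_requirements(requirements):
    """Return the subset of requirement strings that are exact pins."""
    return [req for req in requirements if is_pinned(req)]


# Hypothetical examples: a library declares compatible ranges so that it can
# coexist with other packages, while an application pins for repeatable deploys.
library_deps = ["requests>=2.28,<3", "attrs>=22.1"]
application_deps = ["requests==2.31.0", "attrs==23.1.0"]

if __name__ == "__main__":
    print("library pins:", pinned_requirements(library_deps))
    print("application pins:", pinned_requirements(application_deps))
```

    A check like this would flag nothing in the library's list and everything in the application's list, which is the outcome you would want from each kind of project.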
