About half of Python libraries in PyPI may have security issues, boffins say

Boffins in Finland have scanned the open-source software libraries in the Python Package Index, better known as PyPI, for security issues and said they found that nearly half contain problematic or potentially exploitable code. In a research paper distributed via ArXiv, Jukka Ruohonen, Kalle Hjerppe, and Kalle Rindell from the …

  1. elsergiovolador Silver badge

    Snake

    To be fair, having a snake roaming freely in production is not exactly safe...

    1. Snake Silver badge

      You rang?

      I'm pretty safe. I only bite when asked :-p

      1. Plest Silver badge
        Happy

        Re: You rang?

        Anyone else read that with Snake from The Simpsons voice in mind?!

        1. O RLY

          Re: You rang?

          I was picturing Snake Plissken from Escape from New York, myself.

          1. Anonymous Coward

            Re: You rang?

            I thought you'd be taller.

  2. unimaginative
    FAIL

    Not security issues

    Bad code patterns are not vulnerabilities.

    Take using pass or continue as a catch-all in except. It's mostly bad for non-security reasons (silent failures are not visible to the user and are difficult to debug).
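A minimal sketch of the pattern being discussed, assuming a hypothetical port-parsing helper (the function names and the default of 8080 are illustrative, not taken from the paper or from Bandit). The first version swallows every error silently; the second catches only the expected exceptions and logs what happened, which addresses the debugging complaint rather than any particular vulnerability.

```python
import logging

logger = logging.getLogger(__name__)


def parse_port_silent(raw):
    """The flagged pattern: any failure vanishes without a trace."""
    try:
        return int(raw)
    except Exception:
        pass  # the caller gets None and no hint of why


def parse_port_logged(raw, default=8080):
    """Catch only the expected errors and record the fallback."""
    try:
        return int(raw)
    except (TypeError, ValueError):
        logger.warning("invalid port %r, falling back to %d", raw, default)
        return default
```

With bad input, `parse_port_silent("abc")` quietly returns `None`, while `parse_port_logged("abc")` returns the default and leaves a warning in the log.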

    They have used a tool that finds potential issues, not vulnerabilities. They have no indication of how many issues have been reviewed by the developers who decided they were fine. For example, I use Django mark_safe quite often. It's essential to allow some things such as rich text editing in a Django-based CMS (which is why it exists), and is absolutely fine to use on trusted (sometimes because it has been processed to be safe) input. Similarly, hardcoded SQL may be using trusted or sanitised sources (very easy to escape your inputs with most libraries).

    What would be interesting would be to see the numbers for issues that are high confidence and at least medium severity. Even better if we could see whether more popular packages are better (i.e. does many eyeballs work or are people selective in what they use).

    Interesting that Google's code is so bad! (If the article is not corrected yet, the package is unofficial, but the code is Google's).

    1. Brewster's Angle Grinder Silver badge

      Re: Not security issues

      DO NOT escape your inputs. USE PLACEHOLDERS.

      1. unimaginative
        Thumb Up

        Re: Not security issues

        Correct. What I meant to say was use placeholders and the library will deal with it for you.
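A small sketch of the placeholder point, using the standard library's sqlite3 module and an in-memory database so it runs anywhere. The table, the rows and the hostile string are made up for illustration. Note that sqlite3 uses `?` as its placeholder, whereas psycopg2 uses `%s`; the principle is the same.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, role TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [("alice", "admin"), ("bob", "user")],
)

hostile = "nobody' OR '1'='1"

# String formatting: the input becomes part of the SQL text itself,
# so the injected OR clause matches every row.
unsafe_sql = "SELECT name FROM users WHERE name = '%s'" % hostile
unsafe_rows = conn.execute(unsafe_sql).fetchall()

# Placeholder: the input is passed as a value, never parsed as SQL,
# so it is compared literally and matches nothing.
safe_rows = conn.execute(
    "SELECT name FROM users WHERE name = ?", (hostile,)
).fetchall()

print(len(unsafe_rows), len(safe_rows))
```

The formatted query returns both users; the placeholder query returns none, because no user is literally named `nobody' OR '1'='1`.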

        1. Charlie Clark Silver badge

          Re: Not security issues

          Placeholders pass the responsibility of sanitising the input to the database…

          1. unimaginative

            Re: Not security issues

            Libraries do some of the work. For example, psycopg2 (and AFAIK all APIs sticking to the Python standard) will quote a string value for you, whereas Postgres syntax requires quotes when inserting a character type. It's one less thing to worry about/get wrong. Probably more a convenience that may prevent a bug (most likely one you would spot in development) than a security issue, but still one less thing to worry about/get wrong.

            1. Charlie Clark Silver badge

              Re: Not security issues

              I thought that psycopg2 uses the Postgres quoting API? Not to do so is reinventing the wheel, but it may be just adding some convenience on top. Sanitising inputs is hard™

              1. unimaginative

                Re: Not security issues

                It does quoting in the literal sense, i.e. adding quotation marks around a string:

                https://www.psycopg.org/docs/usage.html#query-parameters

                I think you are right regarding escaping the string: it uses a Postgres library intended for use in the client rather than sending anything to the server.

                I assume the server then does further work when using the values where the placeholders are, but I have no real idea of what is going on there. I am afraid I use RDBMSes as magic black boxes and know very little about internals.

  3. DomDF

    False positives

    Doesn't bandit have a terrible false positive rate? I stopped using it because there was too much noise.

    Is the article in press with a reputable journal, or just on the preprint server?

    1. pixl97

      Re: False positives

      I ran a test on a different SAST scanner, which picked up 16 issues in pbcore, where the paper said they picked up over 1000. Yes, I would say there are tons of FPs, or they are looking at coding practices rather than actual security issues.

    2. Michael Wojcik Silver badge

      Re: False positives

      The researchers analyzed the Bandit output; they didn't just report the numbers.

      The preprint is right there for you to read it.

  4. Brewster's Angle Grinder Silver badge

    I'm not a python user. But I really do get narked off by these "we ran a static analyser and it took offence at your use of language features" reports.

  5. Version 1.0 Silver badge

    See cure, Itty-bitty

    Python was created to be accurate and easy to use, I don't think that security was ever a factor in the design or applications written in Python. Essentially don't use Python to write code that processes logins or credit cards - this is just like criticizing someone for digging up potatoes with a fork, "That's risky, don't use a fork because you'll stab your toes ... I'll get my shotgun so that we can dig them up and social-distance ..."

    1. Charlie Clark Silver badge

      Re: See cure, Itty-bitty

      Python's design is pretty good and generally encourages safe programming. For example, global, eval and exec exist because they are very occasionally required, but basically the advice is YAGNI (you ain't gonna need it). Lots of code handling sensitive data, including credit card details, is written in Python and running safely, AFAIK.
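As an illustration of why eval is rarely needed, here is a hedged sketch using the standard library's `ast.literal_eval`, which parses Python literals without executing code. The helper name `parse_literal` is hypothetical, and returning `None` on failure is just one possible choice.

```python
import ast


def parse_literal(text):
    """Parse a Python literal (numbers, strings, lists, dicts, ...) without running code."""
    try:
        return ast.literal_eval(text)
    except (ValueError, SyntaxError):
        # Anything that is not a plain literal, such as a function call, is rejected.
        return None
```

A string such as `"[1, 2, {'a': 3}]"` comes back as a real list, while `"__import__('os').getcwd()"` is rejected instead of being run, which is what eval would have done.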

  6. pixl97

    Magic Quadrant

    So I have access to a SAST scanner and decided to scan 'pbcore', one of the packages with over 1000 detected issues in their tests.

    With the scanner and settings I used, I picked up 16 issues. Only one was rated in the most dangerous category, as there is a potential for command injection when launching shell processes. It doesn't seem like these are FPs; maybe in Python programming you're supposed to filter any user input before you get to modules like this. I can't say I'm a Python programmer.

    Are there issues in PyPI libs, yes. But from an initial glance they are not anywhere near what the paper is describing.
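For readers unfamiliar with the command injection risk mentioned above, a minimal sketch of the usual mitigation: pass arguments to `subprocess.run` as a list so that no shell ever parses the user input. This is not pbcore's code; the helper `echo_argument` and the hostile string are invented for illustration, and the child process is simply the current Python interpreter echoing its argument back.

```python
import subprocess
import sys


def echo_argument(user_input):
    """Pass user input as a single argv entry; no shell interprets it."""
    child_code = "import sys; print(sys.argv[1])"
    result = subprocess.run(
        [sys.executable, "-c", child_code, user_input],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


hostile = "report.txt; echo INJECTED"
echoed = echo_argument(hostile)
```

Here the whole hostile string arrives in the child as one argument and is echoed back unchanged. Had the same string been interpolated into a command run with `shell=True`, the shell would have treated `; echo INJECTED` as a second command.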

  7. Paul Johnston
    Joke

    Well at least I'm secure

    No mention of anything amiss in CTAN

  8. Anonymous Coward

    Static analysis found a bug in my code...

    ...once... ...the hundreds of other issues were just the static analysis tool having its own ideas about how to write code. I introduced two new bugs in the mindless issue fixing tedium.
