To be fair, having a snake roaming freely in production is not exactly safe...
Boffins in Finland have scanned the open-source software libraries in the Python Package Index, better known as PyPI, for security issues and said they found that nearly half contain problematic or potentially exploitable code. In a research paper distributed via ArXiv, Jukka Ruohonen, Kalle Hjerppe, and Kalle Rindell from the …
Bad code patterns are not vulnerabilities.
Take using pass or continue as a catch-all in except. It's mostly bad for non-security reasons (silent failures are not visible to the user and are difficult to debug).
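A minimal sketch of the difference, assuming two made-up helpers (`parse_port_silently` and `parse_port_visibly` are illustrative names, not from any package discussed here):

```python
import logging

logger = logging.getLogger(__name__)


def parse_port_silently(raw):
    """Swallows every error: the caller cannot tell that parsing failed."""
    try:
        return int(raw)
    except Exception:
        pass  # the flagged pattern: the failure leaves no trace at all


def parse_port_visibly(raw, default=8080):
    """Catches only the expected error and records that it happened."""
    try:
        return int(raw)
    except ValueError:
        logger.warning("invalid port %r, falling back to %d", raw, default)
        return default
```

The first version quietly returns None for bad input, which is a debugging problem long before it is a security one.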
They have used a tool that finds potential issues, not vulnerabilities. They have no indication of how many issues have been reviewed by the developers who decided they were fine. For example, I use Django mark_safe quite often. It's essential to allow some things such as rich text editing in a Django-based CMS (which is why it exists), and is absolutely fine to use on trusted (sometimes because it has been processed to be safe) input. Similarly, hardcoded SQL may be using trusted or sanitised sources (very easy to escape your inputs with most libraries).
What would be interesting would be to see the numbers for issues that are high confidence and at least medium severity. Even better if we could see whether more popular packages are better (i.e. does many eyeballs work or are people selective in what they use).
Interesting that Google's code is so bad! (In case the article has not been corrected yet: the package is unofficial, but the code is Google's.)
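As a rough illustration of "processed to be safe", here is a sketch using only the standard library. `render_comment` is a hypothetical helper; in a Django project the returned string is the kind of value one might then pass to mark_safe, and this is not meant as Django's own mechanism.

```python
from html import escape


def render_comment(author, body):
    # Escape everything that came from the user first ...
    safe_author = escape(author)
    safe_body = escape(body)
    # ... then wrap it in markup we wrote ourselves.
    return f"<p><b>{safe_author}</b>: {safe_body}</p>"
```

A scanner sees only the "trust this string" call, not the escaping that happened before it, which is why such findings need human review.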
Libraries do some of the work. For example, psycopg2 (and AFAIK all APIs sticking to the Python standard) will quote a string value for you, whereas Postgres syntax requires quotes when inserting a character type. It's probably more a convenience that may prevent a bug (most likely one you would spot in development) than a security issue, but still one less thing to worry about/get wrong.
It does quoting in the literal sense, i.e. adding quotation marks around a string:
I think you are right regarding escaping the string: it uses a Postgres library intended for use in the client rather than sending anything to the server.
I assume the server then does further work when using the values where the placeholders are, but I have no real idea of what is going on there. I am afraid I use RDBMSes as magic black boxes and know very little about internals.
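For anyone who wants to see the placeholder behaviour without a Postgres server, here is a small sketch using the standard library's sqlite3 module as a stand-in. The table and the hostile input are invented for illustration, and sqlite3 writes placeholders as `?` where psycopg2 uses `%s`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT)")
conn.execute("INSERT INTO users VALUES ('alice'), ('bob')")

hostile = "nobody' OR '1'='1"

# String formatting: the input becomes part of the SQL text itself.
unsafe_sql = f"SELECT name FROM users WHERE name = '{hostile}'"
unsafe_rows = conn.execute(unsafe_sql).fetchall()

# Placeholder: the value is handed over separately and treated as data only.
safe_rows = conn.execute(
    "SELECT name FROM users WHERE name = ?", (hostile,)
).fetchall()

print(len(unsafe_rows), len(safe_rows))
```

The formatted query matches every row because the injected OR clause is parsed as SQL; the placeholder query matches none because no user is literally called that.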
Python was created to be accurate and easy to use; I don't think that security was ever a factor in the design or applications written in Python. Essentially don't use Python to write code that processes logins or credit cards - this is just like criticizing someone for digging up potatoes with a fork, "That's risky, don't use a fork because you'll stab your toes ... I'll get my shotgun so that we can dig them up and social-distance ..."
Python's design is pretty good and generally encourages safe programming. For example, global, eval and exec exist because they are very occasionally required, but basically the advice is YAGNI (you ain't gonna need it). Lots of code handling sensitive data, including credit card details, is written in Python and running safely, AFAIK.
So I have access to a SAST scanner and decided to run it on 'pbcore', one with over 1000 detected issues in their tests.
With the scanner and settings I used, I picked up 16 issues. Only one was rated in the most dangerous category, as there is a potential for command injection when launching shell processes. It doesn't seem like these are FPs; maybe in Python programming you're supposed to filter any user input before you get to modules like this. I can't say I'm a Python programmer.
Are there issues in PyPI libs? Yes. But from an initial glance they are not anywhere near what the paper is describing.
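On the command injection finding, a generic sketch of how shell launches are usually made safer in Python. This is not pbcore's code; `run_with_list` and `build_shell_command` are hypothetical helpers, and the child process is just the current interpreter echoing its argument.

```python
import shlex
import subprocess
import sys

# A tiny child program that prints its first argument.
CHILD = "import sys; print(sys.argv[1])"


def run_with_list(user_input):
    # Arguments passed as a list go straight to the program; no shell parses them.
    result = subprocess.run(
        [sys.executable, "-c", CHILD, user_input],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


def build_shell_command(user_input):
    # If a shell really is needed, quote the untrusted part before joining.
    return "echo " + shlex.quote(user_input)
```

With the list form, input such as `hello; echo injected` reaches the child as one literal argument instead of being split into two commands.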