Ghost in the shell script: Boffins reckon they can catch bugs before programs run

Shell scripting may finally get a proper bug-checker. A group of academics has proposed static analysis techniques aimed at improving the correctness and reliability of Unix shell programs. The team argues it's possible to analyze shell scripts ahead of execution, offering developers pre-runtime guarantees more typical of …

  1. Doctor Syntax Silver badge

    Double-guessing the user's intention is so reliable. Maybe I really do mean rm -rfy /

    1. Anonymous Coward
      Anonymous Coward

      Re: rm -rfy / useful?

      The system described is for analyzing shell scripts, not for running commands.

      I seriously question whether anyone would have use for a shell script that contains 'rm -rfy /'

      What could the lines before rm / do that could be usefully done after completion of rm? What could the lines after rm / still do?

      Even running the command is senseless. There are easier, faster, and better ways to clear a system.

      1. geoff61

        Re: rm -rfy / useful?

        There is one, and only one, valid reason for a shell script to pass '/' (or any other pathname that resolves to the root directory) to rm, and that is to test that it reports an error, as required by POSIX (which says "if an operand resolves to the root directory, rm shall write a diagnostic message to standard error and do nothing more with such operands").

    2. geoff61

      What version of rm accepts a -y option? I've never encountered one.

  2. anothercynic Silver badge

    Shellcheck

    There's an amazing GitHub project called shellcheck that already does a lot of this work... I wonder if the researchers were aware of it?

    1. phuzz Silver badge

      Re: Shellcheck

      I have a feeling the researchers weren't interested in existing solutions; instead, they had AI to peddle. From TFA:

      Using large language models to check shell command documentation against actual behavior

      1. Dan 55 Silver badge

        Re: Shellcheck

        Great, we can look forward to correctness when compared against some bash/ksh combined AI slop scripting language.

  3. Paul Herber Silver badge
    Coat

    Shell programming checks - apply Occam's razor.

    I'll clam up now.

  4. Falmari Silver badge
    Joke

    Boffins reckon they can catch bugs before programs run

    So did CrowdStrike; that did not end well. ;)

  5. JimmyPage
    Trollface

    Bash compiler ?

    Surely this is a bag of a fag packet task for "AI" these days ?

    1. Bebu sa Ware

      Re: Bash compiler ?

      "bag of a fag packet task"

      Sometime a typo just makes it golden. :))

      1. Doctor Syntax Silver badge

        Re: Bash compiler ?

        You mean it should have been "back of a fag baguette"?

  6. MiguelC Silver badge
    Coat

    HotOS XX conference?

    I'll wait 10 years for the really hot one

  7. Flocke Kroes Silver badge

    Two easy bash script tests

    1) The first line should be "#! /bin/bash -e". That -e means exit on error. If the author missed it then when the script finds something unexpected it will plow on regardless and do things you do not want. This tells you enough about the author to not run his scripts.

    2) If the script is over 50 lines long it should have been written in a proper high level language.

    1. Dan 55 Silver badge

      Re: Two easy bash script tests

      Nobody expects the Spanish Inquisition. Our three tests are 1) -e, 2) script length, and...

      3) bash -n <filename> - syntax check the script

      1. Henry 8

        Re: Two easy bash script tests

        4) set -u, to make using an undeclared variable an error (good for catching typos like FILENAME=$(foo) ; rm -f $FILENME)
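
        For illustration, a minimal sketch of what that buys you (the mktemp call just stands in for the $(foo) above):

        ```
        #!/bin/bash
        set -u                 # expanding an unset variable is now a fatal error

        FILENAME=$(mktemp)     # stand-in for whatever $(foo) was doing
        rm -f $FILENME         # typo: with set -u, bash aborts here with "FILENME: unbound variable";
                               # without it, the variable expands to nothing and rm -f runs with no
                               # operands, quietly doing nothing and hiding the bug
        ```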

        1. An_Old_Dog Silver badge

          Example

          Worse:

          DIRNAME=TMP

          rm -rf /${DIRNAM}

          Yes, there are (multiple) things programmers can do to prevent the bad results from this type of error. Despite that, someone at Valve made this sort of error with ~/ in a script that was pushed to all online Steam users some years back, wiping out all the affected users' stats and saved games.

          (Luckily for me, my Steam-running games PC was powered off that week.)

    2. Spamfast

      Re: Two easy bash script tests

      I agree that the exit-on-error & error-on-undefined flags should always be set either via the shebang or the set command at the top of the file.

      If using bash, then set -o pipefail is also a good idea.
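
      A minimal sketch of what pipefail changes - nothing here but false and true, so it is safe to paste:

      ```
      #!/bin/bash
      set -e

      false | true                 # without pipefail the pipeline's status is that of the
      echo "still here: $?"        # last command (0), so set -e does not trigger

      set -o pipefail
      false | true                 # now the pipeline reports the rightmost non-zero status (1),
      echo "never reached"         # so set -e aborts the script on the line above
      ```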

      But the first line should be #!/bin/sh -e not #!/bin/bash -e unless the script is actually using bash features like shopt -s lastpipe, arrays, local variables etc.

      Using bash for shell scripts that only need basic POSIX shell features wastes a lot of resources, especially if said script is going to be run repeatedly from cron or via other background triggering.

      On 64-bit Raspbian for example:-

      ```
      $ ls -l /bin/{ba,da,}sh
      -rwxr-xr-x 1 root root 1346480 Mar 29  2024 /bin/bash
      -rwxr-xr-x 1 root root  133640 Jan  5  2023 /bin/dash
      lrwxrwxrwx 1 root root       4 Jan  5  2023 /bin/sh -> dash
      ```

      1. ChoHag Silver badge

        Re: Two easy bash script tests

        ```
        $ ls -l /bin/bash
        ls: /bin/bash: No such file or directory
        ```

        If you must use bash features it should be launched with #!/usr/bin/env bash. /usr/bin/env is almost standardised.

        Moreover set -e is overrated and unreliable. If a command can fail the script should check for the failure explicitly.
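
        For what it's worth, a minimal sketch of that explicit style (the file names and backup destination are invented):

        ```
        #!/bin/sh
        # check each failure-prone step instead of relying on set -e

        src=/etc/example.conf         # hypothetical input
        dest=/var/backups/example     # hypothetical destination

        if ! mkdir -p "$dest"; then
            echo "cannot create $dest" >&2
            exit 1
        fi

        if ! cp "$src" "$dest/"; then
            echo "copy of $src failed, nothing was changed" >&2
            exit 1
        fi
        ```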

        Unix (which is what's really being programmed here; the shell language is a distraction) is user friendly: it's just picky about who its friends are.

        1. malfeasance

          Re: Two easy bash script tests

          I always go for bash first because there is always some utility that is quicker to write in bash than in some other, higher-level language.

          - It's always : #!/usr/bin/env bash

          - set -euo pipefail (or set -eo pipefail if the script is not destructive), but if not -u then we do...

          - local varname=${some_other_varname:?undefined variable, abort}

          Because the :? is a very useful bash variable expansion...
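
          A small sketch of it in action (the variable names are invented for the example):

          ```
          #!/usr/bin/env bash
          set -euo pipefail

          # aborts immediately, printing that message, if BUILD_DIR is unset or empty
          build_dir=${BUILD_DIR:?BUILD_DIR is not set, aborting}

          clean() {
              # same guard for a function argument
              local target=${1:?clean needs a directory argument}
              rm -rf -- "${target:?}"/tmp   # the extra :? stops an empty value turning this into rm -rf /tmp
          }
          ```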

          You can do some awful awful things with a combination of tools like yq + jq and still store your "configuration" in semi-readable YAML.

        2. Spamfast

          Re: Two easy bash script tests

          I know it works and is the more portable thing to do but #!/usr/bin/env bash seems illogical from a purely abstract point of view.

          Avoiding an absolute path for invoking one program via the absolute path to another seems daft!

          Anyway, back on topic, some other ways of improving code quality are always double-quoting substitutions (completely, e.g. cmd1 "$(cmd2 "${var}")"), never using backticks for command substitutions, and using xargs or mapfile instead of command substitution.
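
          A short sketch of those habits side by side (the directory is just an example):

          ```
          #!/bin/bash
          set -euo pipefail

          dir="/var/log"                                   # example directory

          # double-quote every substitution, all the way down
          count="$(find "${dir}" -maxdepth 1 -type f | wc -l)"
          printf 'files in %s: %s\n' "${dir}" "${count}"

          # mapfile reads lines into an array without the word-splitting
          # surprises of: for f in $(find ...)
          mapfile -t entries < <(find "${dir}" -maxdepth 1 -type f)
          printf 'first entry (if any): %s\n' "${entries[0]:-none}"
          ```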

          As someone said, installing shellcheck to lint all your scripts is a good place to start and will pick up these and many other potential gotchas.

        3. Spamfast

          Re: Two easy bash script tests

          Moreover set -e is overrated and unreliable. If a command can fail the script should check for the failure explicitly.

          I agree that if the failure of a command would lead to the script leaving things in an inconsistent state, then it should be explicitly invoked via if or by checking $?.

          But checking the result of every external command, shell built-in and shell function that might fail is exhausting when a simple abort of the script - maybe with a trap EXIT to clean up - is less error prone; in any case, set -e will act as a last-resort assertion.
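
          A rough sketch of that pattern - set -e does the aborting, the trap does the cleanup (the temp-dir workflow is invented):

          ```
          #!/bin/bash
          set -euo pipefail

          workdir=$(mktemp -d)              # hypothetical scratch area

          cleanup() {
              rm -rf -- "$workdir"          # runs on normal exit and when set -e aborts
          }
          trap cleanup EXIT

          # any command failing from here on aborts the script,
          # and the trap still tidies up the scratch directory
          cp /etc/hostname "$workdir/"
          sort "$workdir/hostname" > "$workdir/sorted"
          echo "done"
          ```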

        4. rwessman

          Re: Two easy bash script tests

          Using env to find an executable is dangerous and should be avoided. If someone deliberately or maliciously created another bash, bad things might happen.

      2. Anonymous Coward
        Anonymous Coward

        Re: Two easy bash script tests

        > I agree that the exit-on-error & error-on-undefined flags should always be set

        There's a magic flag, "exit on error"? Where?!

        Wow, no more bugs, ever! What a cool thing! oh wait. You mean, exit on status code? Der... isn't a status code.. status? sigh. You're one of "those". Those ones who require the design paradigm, ` || true` to get commands with status to work. How joyful it is to encounter someone who goes, "Oh, I tried to remove a file that doesn't exist. That's an error. Ok, I'll `rm -f`. No error, win~!" Heh.

        http://mywiki.wooledge.org/BashFAQ/105

        - Why doesn't set -e (or set -o errexit, or trap ERR) do what I expected?

        > Once upon a time, a man with a dirty lab coat and long, uncombed hair showed up at the town police station, demanding to see the chief of police. "I've done it!" he exclaimed. "I've built the perfect criminal-catching robot!"

        > The police chief was skeptical, but decided that it might be worth the time to see what the man had invented. ...

    3. Doctor Syntax Silver badge

      Re: Two easy bash script tests

      "If the script is over 50 lines long it should have been written in a proper high level language."

      There used to be a mantra which ran something like:

      Never do in C what you could do in awk

      Never do in awk what you could do in sed

      Never do in sed what you could do in tr

      Never do in tr what you could do in shell

      "Autres temps, autres mœurs"

    4. Anonymous Coward
      Anonymous Coward

      Re: Two easy bash script tests

      set -e

      lol

      ```
      { lpinfo -v || true; } |
      while read tpe nam; do
          echo "$(lpstat -p "$nam" || true) $nam"
      done |
      { grep -v '^enabled' || true; } |
      while read en nam; do
          lpadmin -x "$nam" || true
      done
      ```

      List all printers -- if there are none, don't exit the script. For each printer, get its status, and print it out -- but if we can't read that printer's status, don't exit the script. Grep for the disabled printers, and if there are none, don't exit the script. For each disabled printer, remove it -- but if we can't remove *this* one, don't exit the script. So Clear! So easy to read! And we get the benefit of the "exit-if-bug" flag! (After this script, we'll go through re-adding standard printers, but... the comment length limit is insufficient for all the " || true" required to represent this.)

      Whew. Such nonsense. But hey, `set -e` great! (On the once-upon-a-never when it's appropriate to use it.)

      In environments where it's mandatory to use `set -e`, Line 1 of your script should be, `#!/usr/bin/env bash`; line 2 of your script should be, `set -e`; and line four or five should be, `set +e`.

  8. DarkwavePunk

    Shell

    I take some kind of morbid pride in the horrendous shell scripts that litter the landscape of many a company I've worked for over the past 30 years. If your eyes don't bleed, you've done it wrong.

    1. Doctor Syntax Silver badge

      Re: Shell

      Was it you who had cron run a shell script to use tcl to use vi to write the night's backup script?

      Yes, I've seen that, or something as close to that as I can remember. No, I didn't sort that one out - couldn't find a long enough barge pole to touch it.

      1. DarkwavePunk

        Re: Shell

        Erm... Pretty sure it was SED not VI. TCL confirmed. Crimes against humanity, definitely.

  9. Peter Gathercole Silver badge

    Really?

    I find the very concept of checking a shell script to be on a hiding to nothing.

    The issue is that very little of what is written in a shell script is actually written in shell commands. A huge amount is just herding other (external) commands to get a task done, passing data through pipelines of non-shell tools to achieve the result.

    As such, I like to think of the shell (pretty much any variant since the UNIX Edition 6 shell, which predates the Bourne shell) as more of a harness for holding things together, with some programming structures to make life easier, rather than a fully specified programming language in its own right. In addition, it has handling of wild-card and variable substitution that makes it very suitable for tasks that would be very difficult to code in a more formal language.

    There are two reasons I think this. A while ago, some article or other issued a challenge to write some shell to do something. I can't remember exactly what it was, but it was date related. The majority of solutions were posted as shell scripts, but most of the actual processing was not done using shell built-ins. They were shell scripts calling other tools like cal or date, and using awk, sed and various other tools to process the output to produce the required result.

    I actually tried to write a version that was pure shell. And it was very difficult. I was working in ksh88, but I also wrote versions in ksh93. This is not what shells were written to do!

    The second reason is that I have recently been reviewing some Python which is being used to run Ansible playbooks. This often involves quite a few hoops to run the external commands required for automation tasks, and then to process the results in Python to work out how well it worked or whether it failed. I look at these programs, then imagine what I would do in a shell, and find that quite often it would be much, much easier to do it as a series of commands in something like a pipeline, run from a shell, rather than trying to do it in Python. For my own tasks that I have been told I need to write, I'm really thinking of writing shell scripts together with a deploy method in Ansible, and playbooks that just call the shell scripts. It would be much easier to write, and (IMHO) will be easier to maintain, although it's obviously subverting the whole reasoning of Ansible.

    In conclusion, what I think I'm trying to say is that the way shell and shell scripts are used means they should not be treated as a formal language, and as such they are very difficult to subject to automated code review.

    1. AdamWill

      Re: Really?

      The problem with this is you can absolutely cause gigantic security issues while you're doing the herding, notably with variable expansion.
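
      To make it concrete, a harmless sketch with an invented hostile value - imagine the printf had been rm:

      ```
      #!/bin/bash

      # attacker-influenced value, e.g. a user-supplied file name
      target='junk.txt *'

      printf '<%s>\n' $target      # unquoted: word-split, then the * glob-expands, so every
                                   # file in the directory becomes a separate argument
                                   # (shellcheck flags this as SC2086)

      printf '<%s>\n' "$target"    # quoted: exactly one literal argument
      ```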

      You certainly cannot *fully* check everything a shell script does by statically analyzing it, but you absolutely *should* be checking *as much as you can* of what it does with an analyzer. Like shellcheck, which another commenter pointed out has been around for years. I've seen a definite encouraging trend in the last few years towards incorporating shellcheck in CI workflows, which is nice.

  10. Anonymous Coward
    Anonymous Coward

    rc

    I recall the Plan 9 shell, rc, was a more modern design intended to address some of the shortcomings of the Bourne and Korn shells ([t]csh wasn't used for serious scripting by anyone who valued their sanity), and bash either didn't exist or was in its early days.

    Years ago I had rc running on an HP-UX workstation with an X11 port of the window manager 9wm(?), which as I recall was sort of OK.

    I know I knocked out a colossal quantity of shell script, mostly to glue unrelated applications together.

    Awk and Perl always seemed to have an impedance mismatch with the OS and file system, which just made shell seem an easier choice.

    Until the ascendance of Linux distributions, the variety of Unix meant that nothing much beyond Bourne shell could be assumed. Even awk was old awk, Perl was ancient or not present, and there was definitely no Python by default. The init script on HP-UX 10.20 that configured (multiple) network interfaces and aliases was an impenetrable wonder of Bourne shell gymnastics.

    Part of the problem, I suspect, is that users would prefer their scripting language to be the same as their interactive shell, which probably presents irreconcilable design goals.

    1. Peter Gathercole Silver badge

      Re: rc

      To tell you the truth, for all I mistrust it, I suspect that something like PowerShell will end up being the way forward. But in using this, you need an OS whose components can and do understand complex data objects.

      I actually quite liked the VM/CMS implementation of REXX as a command processing language, but the OS/2 and AIX implementations were not complete, so never really gained any headway.

      One of the biggest problems (and past strengths as well) is that UNIX-like commands are designed to work on streams of bytes, often arranged as lines. This made perfect sense when OSes were simpler than they have to be now, and commands were written to be run as interactive commands. But modern problems are often much more complex than can be represented by just a stream of bytes.

      If an OS's command set was re-implemented to use some form of object passing, this would make much of the convoluted shell scripting that has been required in the past redundant.

      But it would no longer really be a UNIX-like OS!

      Boy, am I glad I'm retiring shortly.

      1. chasil

        powershell...

        > PowerShell will end up being the way forward

        There is zero possibility of this happening in embedded or otherwise resource-constrained environments.

        The original Korn shell was able to compile in Xenix running on an 80286, with a maximum text segment size of 64k.

      2. Denarius

        Re: rc

        So Peter, you don't have a copy of "Unix Shell Objects"?

        As for date munging, ksh93 has some undocumented time and date built-ins which are simpler than spawning another process etc.

        The shell script forensics done over the years has usually been required by shell expansion errors. The worst was the non-realisation that .* matches .. - a nonrecursive cleanup became recursive down the file tree. Good luck with that shell script code checker. ksh93 -D can help with seeing what some "$VAR" values produce.
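
        For anyone not yet bitten, a harmless way to see that expansion in bash (run in a scratch directory):

        ```
        $ cd "$(mktemp -d)" && touch .hidden visible
        $ echo .*
        . .. .hidden
        $ echo *
        visible
        ```

        The .* pattern sweeps in . and .. along with the dotfiles; modern rm is required to refuse those two operands, but everything else the glob drags in still gets removed.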

        1. Peter Gathercole Silver badge

          Re: rc

          But the problem there is "undocumented".

          That exercise was a long time ago and I don't remember the exact details. I was just playing around really to see what could be done.

          I had a look at shellcheck after it was mentioned in a previous comment. It looks interesting, and I may run some of the scripts that I've written over the years through it to see what comes out, but at least some of the checks appear to be a bit more like checking style and preferred use, rather than actual errors.

          I've not looked at "Unix Shell Objects". My current scripting needs are relatively simple, and I probably won't be in a position to use such information again in my remaining time in the industry. Besides, Amazon UK is showing a 6-7 month delivery time!

  11. martinusher Silver badge

    There are alternatives

    Bash is one of those things that's probably too capable for its own good. I switch to something like Perl or TCL when the script starts to get to more than half a page or so. It's less taxing on the brain. The user doesn't know the difference, since the scripts are invoked using '#!' rather than being a command line argument.

    ...and yes, I know it's possible to write impenetrable Perl code. But it's not mandatory.

  12. Jou (Mxyzptlk) Silver badge

    Tackle the old "ASCII string interpretation" problem...

    Half a century ago, when 64 K was big for a multi-user system, the "everything is text to parse" rule was fine.

    Everything in the script is a string / unordered-chaos byte stream too, even the input and output of the "bc" command. (Sorry, there is no English Wikipedia link for this?)

    That is what changed with PowerShell (or the ".NET shell"): object oriented right from the start, with a large number of datatypes as objects with properties (and methods), instead of:

    "Parse string, size of file is the fifth entry, separated by one of more spaces or tabs from the other things", let alone the modification time display. The latter was later fixed with "ls --time-style=long-iso" and "ls --time-style=full-iso", but in the end it is still the unix-string-only thing.

    PS: My only gripe is that PowerShell does not offer optional pure byte-stream pipes, with a fast and small-buffered implementation. So for ffmpeg.exe video decoding, and piping into the av1 encoder, I end up calling "cmd.exe" within PowerShell. Otherwise PowerShell tries to get the full "object" of the ffmpeg output into memory, making 64 GB RAM not enough for even smaller videos. I heard / read that it has been implemented in PowerShell 7.4, but I stick to 5.1 for most of my stuff for "the same from Windows 7 / Server 2008 R2 up to the newest Windows 11 / Server 2025" reasons. Have to try it some day, when I've grown up...

  13. JLV Silver badge

    Good of people to mention Shellcheck, but... AFAIK it is not intended to work with zsh (there is an open issue @ https://github.com/koalaman/shellcheck/issues/809).

    Now, I have a lot of bash-y looking (and, to be honest, not all that well written) zsh scripts, intended to support development rather than administration. Shellcheck mostly seems to manage to flag stuff despite it being zsh.

    But I suspect it would perform less well if I were to write more idiomatic zsh.

    And, well, if anything might benefit from AI analysis, it might be something as "syntactically open" as linting shell scripts. It need not be 100% right - and it may never be sufficient for production scripts. But if its false positive rate is low enough, it could be a worthwhile addition for more casual work.

    p.s. Learned a lot from "You Suck at Programming" @ YouTube series covering bash pitfalls.
