CLI-beautifying ANSI escape sequences can also make your log files a security threat

Spend much time working in a command-line terminal and you're likely to have at least a passing familiarity with ANSI escape sequences. Those are the codes that can add color and other highlights to text, among performing other tasks, making your screen a little more easily readable. Unbeknownst to some, these sequences, if …

  1. karlkarl Silver badge

    > Those are the codes that can add color and other highlights to text, among performing other tasks, making your screen a little more easily readable.

    Personally, I find all the naff colours detract from readability. Typically I use TERM=vt100 just to try to avoid it all.

    1. blackcat Silver badge

      I can see where it might be useful if you see a different coloured line of text whiz by. Sometimes it is very unhelpful.

      I work with machines that produce very large amounts of harmful light so when in the lab we have to wear special goggles that block out pretty much everything below yellow in the light spectrum.

      One bit of software I was working with prints all the error messages to the screen in RED. So you can't f-ing read them!!

    2. Pete 2 Silver badge

      FTFY

      > making your screen a little more easily readable

      making your screen look like it's on acid.

      The very first thing I do when installing a new Linux distro is to turn off the colourising of the ls command.

    3. Anonymous Anti-ANC South African Coward Bronze badge

      Especially if some output is listed in dark blue... it is not much readable by us old farts.

      *grumbles*

      My eyes aren't what they once were...

  2. Anonymous Coward

    Seriously, who doesn't just load log files into a boring old programmer's editor

    that can just deal with whatever you throw at it, binary to plain 7 bit ASCII?

    Just do all your searches, display as hex/octal/binary etc etc using the same tools as you would any other it-ought-to-be-just-text-but-possibly-isn't data?

    Log files, random crap downloaded from the Internet - works safely with the lot. Why cat or head or tail when you can just scroll up and down, with or without word-wrap, splitting the view and displaying multiple sections at once to compare...

    1. Dan 55 Silver badge

      Re: Seriously, who doesn't just load log files into a boring old programmer's editor

      An editor which only understood plain 7-bit ASCII would mess up iso-8859 and utf-8 text and punctuate readable text with nonsense every so often where there is an escape sequence. Also you couldn't grep it.

      You need a command which properly interprets escape sequences and removes them, like piping through ansi2txt would.
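
A minimal sketch of the "interpret and remove" idea, in Python. This is not how ansi2txt itself is implemented; the regular expression and the name `strip_ansi` are assumptions for illustration, and the pattern only covers CSI sequences, OSC sequences, and simple two-character escapes.

```python
import re

# Hypothetical pattern: CSI (ESC [ ... final byte), OSC (ESC ] ... BEL or ESC \),
# and two-character ESC sequences. Real terminals accept more than this.
ANSI_RE = re.compile(
    r"\x1b\[[0-?]*[ -/]*[@-~]"               # CSI, e.g. ESC[31m or ESC[2J
    r"|\x1b\][^\x07\x1b]*(?:\x07|\x1b\\)"    # OSC, e.g. ESC]0;title BEL
    r"|\x1b[@-Z\\-_]"                        # other two-character sequences
)


def strip_ansi(text: str) -> str:
    """Return text with the escape sequences matched by ANSI_RE removed."""
    return ANSI_RE.sub("", text)


if __name__ == "__main__":
    print(strip_ansi("\x1b[31mERROR\x1b[0m disk full"))
```

Because whole sequences are removed rather than just the ESC byte, the result stays greppable and no stray `[31m` fragments are left behind.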

      1. Anonymous Coward

        Re: Seriously, who doesn't just load log files into a boring old programmer's editor

        > An editor which only understood plain 7-bit ASCII

        >> display as hex/octal/binary etc etc

        UTF-8, UCS-16, ShiftJIS all fall under "etc" in my book :-)

        > like piping through ansi2txt would.

        The whole point of having a *halfway decent* text editor is that it has loads and loads of display filters so that you don't pipe data through anything, you just select the filter - and still keep *all* of the original buffer contents unmodified. Oh, and a good grep function preferably works over the displayed/filtered results, but switching to a text/language mode works as well.

        A really, really good editor lets you mark sections of the buffer and select the display filter on each (e.g. use a syntax highlighter for this bit), but I haven't seen that in yonks, sigh. Not without using, say, a Markdown display mode and hoping the markup is in place. But I digress; this is about untrusted files.

        That way you can not only find the suspicious entry but you can place the cursor there, switch to, say, hex display (or, even better, "show ESC codes") and immediately spot the naughty sequence they hoped would take over your terminal.
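
The "jump to the suspicious entry and switch to hex" step could be approximated outside an editor along these lines. It is only a sketch: `find_escapes` and its `context` parameter are made-up names, and it looks for nothing cleverer than the raw ESC byte.

```python
def find_escapes(data: bytes, context: int = 8):
    """Yield (offset, hex dump) for every ESC byte (0x1b) found in data.

    The hex dump covers the ESC byte and up to context - 1 bytes after it.
    The original buffer is never modified.
    """
    start = 0
    while True:
        pos = data.find(b"\x1b", start)
        if pos == -1:
            return
        yield pos, data[pos:pos + context].hex(" ")
        start = pos + 1


if __name__ == "__main__":
    for offset, dump in find_escapes(b"user=\x1b[2Jbob"):
        print(f"offset {offset}: {dump}")
```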

    2. claimed Bronze badge

      Re: Seriously, who doesn't just load log files into a boring old programmer's editor

      Or ‘cat -v’, right?

    3. An_Old_Dog Silver badge

      Arrgh

      Most system logfiles are never "complete"; while the computer is running, they are continually having new information appended to them. Dealing with such files is what streams are all about. Appropriate programs for dealing with streams are more (and its variants), sed, awk, perl and such.

      Add your own colorizing if you want, but keep ANSI codes out of the logfiles themselves.

  3. An_Old_Dog Silver badge
    Joke

    Lowest-Common-Denominator Escape-Sequence Auto-Filter:

    The model ASR-33 Teletype. It doesn't understand ANSI sequences, so it just prints them. Print your log files to a real TTY! (Then go take a shower, cook a meal, and write a book, because at 110 baud, your logfile printout will take plenty of time to finish.)

  4. Gene Cash Silver badge

    "spend a couple days sanitizing your logs"

    If you have to spend a couple days running a script to remove everything that isn't alphanumeric, whitespace, and the standard set of punctuation and symbols, you've got bigger problems than dealing with hackers.
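
For scale, the allowlist described here fits in a few lines of Python. A rough sketch, assuming ASCII-only logs; `ALLOWED` and `keep_allowed` are illustrative names, and anything outside the set (including non-ASCII text) is simply dropped.

```python
import string

# Assumed allowlist: ASCII letters, digits, standard punctuation, and
# space, tab and newline as the permitted whitespace.
ALLOWED = set(string.ascii_letters + string.digits + string.punctuation + " \t\n")


def keep_allowed(text: str) -> str:
    """Drop every character that is not in the allowlist."""
    return "".join(ch for ch in text if ch in ALLOWED)


if __name__ == "__main__":
    print(keep_allowed("login \x1b[31mroot\x1b[0m\x07"))
```

Note that this removes the ESC byte but leaves the rest of each sequence behind as harmless text such as `[31m`, which makes the log inert but not pretty.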

    1. Peter Gathercole Silver badge

      Re: "spend a couple days sanitizing your logs"

      Especially when you have "cat -vt" to do most of the hard work for you.
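
For anyone without cat to hand, the notation that "cat -vt" produces can be approximated as below. This is a sketch of the display convention only (caret notation for control bytes, `M-` for bytes with the high bit set), written from memory rather than taken from any cat source; the name `cat_vt` is an assumption.

```python
def cat_vt(data: bytes) -> str:
    """Render bytes roughly the way `cat -vt` would display them."""
    out = []
    for byte in data:
        prefix = ""
        if byte >= 0x80:
            # High-bit bytes are shown as M- followed by the low 7 bits.
            prefix = "M-"
            byte -= 0x80
        if byte == 0x0A and not prefix:
            out.append("\n")                      # real newlines pass through
        elif byte < 0x20:
            out.append(prefix + "^" + chr(byte + 0x40))  # ESC -> ^[, TAB -> ^I
        elif byte == 0x7F:
            out.append(prefix + "^?")             # DEL
        else:
            out.append(prefix + chr(byte))
    return "".join(out)


if __name__ == "__main__":
    print(cat_vt(b"ok\x1b[31m\tred\r\n"), end="")
```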

  5. heyrick Silver badge

    Surely this is a display issue? Wouldn't you want your logs to record everything and not miss stuff (next week - sanitising your logs can be abused to hide data!)?

    Seems to me the problem isn't recording the escapes, it's that the viewing of the log files seems happy to stuff all the information directly to the screen and not, say, replace escape with ^[ or the like.

    It's the viewer that needs to be sanitised, not the data itself.

    1. Peter Gathercole Silver badge

      @heyrick

      As you can see from my other comments, I don't like embedded formatting sequences in the error messages in the first place. Log files should be clean of formatting information, so that you can post process the logs using scripts and other tools that expect plain text.

      The number of times I've nearly missed important error messages merely because there is an embedded carriage return with no line feed (hint, it overwrites what has already been printed with what follows the carriage return) in an error log is too many for comfort.
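
To see why a bare carriage return hides text, here is a small simulation of what a terminal ends up showing for one line. It is a sketch that models only CR and ordinary characters; `as_displayed` is an illustrative name, and real terminals also handle backspace, tabs, wrapping and so on.

```python
def as_displayed(line: str) -> str:
    """Simulate a terminal line where CR moves the cursor back to column 0."""
    cells = []
    col = 0
    for ch in line:
        if ch == "\r":
            col = 0                  # cursor returns to the start, nothing is erased
        else:
            if col < len(cells):
                cells[col] = ch      # overwrite what was already printed
            else:
                cells.append(ch)
            col += 1
    return "".join(cells)


if __name__ == "__main__":
    raw = "ERROR: disk failure\rOK: all systems nominal"
    print(as_displayed(raw))
```

Comparing `as_displayed(line)` with the line minus its CR characters is one cheap way to flag entries where something has been overwritten.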

      If you want to scan a log file and colourize the output based on regular expressions, then fine. That's up to you. But don't make it the default, especially if you're making a guess about what the output device is.
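
Colourizing at viewing time, rather than in the log itself, might look something like this. The rules, the SGR codes chosen, and the `enabled` switch are all assumptions for the sake of the example; the point is only that the stored text stays plain and colour is off unless asked for.

```python
import re

# Hypothetical rules: (pattern, SGR colour code). 31 is red, 33 is yellow.
RULES = [
    (re.compile(r"\bERROR\b"), "31"),
    (re.compile(r"\bWARN(?:ING)?\b"), "33"),
]


def colourize(line: str, enabled: bool = False) -> str:
    """Wrap matches in SGR colour codes, but only when explicitly enabled."""
    if not enabled:
        return line
    for pattern, sgr in RULES:
        line = pattern.sub(
            lambda m, sgr=sgr: f"\x1b[{sgr}m{m.group(0)}\x1b[0m", line
        )
    return line


if __name__ == "__main__":
    print(colourize("ERROR: disk full", enabled=True))
```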

      1. heyrick Silver badge

        Re: @heyrick

        "especially if you're making a guess about what the output device is"

        That's the thing. The program doing the outputting should know and tailor itself accordingly. Plus newlines should be swallowed and replaced by whatever is appropriate. Many systems use LF only, but terminals expect CRLF, so the smart thing would be to translate LF to the correct CRLF, and render CR on its own as text (like |M or something). As you say, missing stuff because of broken newlines isn't right.
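
The newline handling suggested here is short enough to sketch. The function name `for_terminal` and the default `^M` marker are assumptions (the comment suggests `|M` or similar, which can be passed in instead).

```python
def for_terminal(text: str, cr_marker: str = "^M") -> str:
    """Translate LF to CRLF and show any lone CR as visible text."""
    # Fold existing CRLF pairs into LF first so they are not treated as lone CRs.
    text = text.replace("\r\n", "\n")
    # Whatever CR is left stands alone: make it visible instead of acting on it.
    text = text.replace("\r", cr_marker)
    # Finally give the terminal the CRLF line endings it expects.
    return text.replace("\n", "\r\n")


if __name__ == "__main__":
    print(repr(for_terminal("a\nb\rc\r\nd")))
```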

        But it's not the fault of the contents of the logfiles, it's the fault of the output formatter. Really, one should be able to cat the end of the kernel binary without causing the terminal to shit itself, switch to Minitel mode, and beep Colonel Bogey to the speaker...

        1. Anonymous Coward

          Re: @heyrick

          > The program doing the outputting should know and tailor itself accordingly.

          How about insisting on using a remote logger (where "remote" also includes "this host") and making sure that the single log capture daemon does a very good job of refusing weird crap? (OK, you could argue that "the program doing the outputting" is then the daemon and you're covered).

        2. An_Old_Dog Silver badge

          Arrgh #2

          "The program doing the outputting should know and tailor itself accordingly."

          NO! The whole point of plain text is that the output of a program can be easily used by any other program, without the original program having to know anything about any of the programs further down in the processing pipeline.

      2. Anonymous Coward

        Re: @heyrick

        > Log files should be clean of formatting information

        Very true. *Should*.

        But TFA is pointing out that there are Bad People who aren't playing by your rules and you have to be on your guard against them.

        Which means always using the best possible protection.

  6. Peter Gathercole Silver badge

    Old-timer here!

    Coming from a time before ANSI sequences were ubiquitous, it always bugs me when people throw ANSI sequences anywhere, unless they've checked the terminal type before they do.
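
A check of the kind being asked for could be as small as the sketch below. The function name and the exact policy are assumptions: it treats an unset or "dumb" TERM as "no colour", honours the NO_COLOR environment convention, and requires the output to be a terminal. It does not consult terminfo, so it says nothing about which sequences the terminal actually understands.

```python
import os
import sys


def should_use_colour(stream=None, environ=None) -> bool:
    """Decide whether emitting colour sequences to stream is reasonable."""
    stream = sys.stdout if stream is None else stream
    environ = os.environ if environ is None else environ
    if environ.get("NO_COLOR"):
        return False                      # user has opted out
    if environ.get("TERM", "") in ("", "dumb"):
        return False                      # unknown or incapable terminal type
    isatty = getattr(stream, "isatty", None)
    return bool(isatty and isatty())      # never colour pipes or files


if __name__ == "__main__":
    print(should_use_colour())
```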

    I particularly dislike the general colourization of 'ls' output on Linux systems, as I generally do not run with black-on-white, or white-on-black terminals. I generally set the background colour to indicate which systems I am on, and also when I am running with escalated privileges. Seeing red-on-dark red, or cyan-on-blue really bugs me!

    The application writers saying "don't use escape sequences" is an absolutely stupid thing to say! Unless you are going to have everything written as a graphics application, you need some escape sequences for cursor addressing and, dammit, clearing the screen. How do you think things like editors are supposed to work without some escape sequences?

    I tend to look at log files using something like "cat -vt | pg" (on UNIX systems), which displays non-printing characters as a printable sequence.

    But I am aware of the problems. I remember looking at one VT220 compatible terminal that allowed you to re-programme the characters sent by the function keys, and even trigger them remotely, and another hack was to re-program the terminal ident sequence and then trigger that!

    1. druck Silver badge

      Re: Old-timer here!

      I find colour in ls, grep and in particular the way htop does numbers to be absolutely invaluable. It takes me straight to the items of interest without having to plough through a sea of monochrome text. I could not manage without it.

      1. Peter Gathercole Silver badge

        Re: Old-timer here!

        I can see your point, but you obviously like your black-on-white or white-on-black terminal sessions, and are not colour-blind. Try setting the background to another colour, and then see how useful coloured output is.

        One of the problems is that the colourized versions of ls and other commands directly generate ANSI sequences that are hard-coded (or at least coded in environment variables), rather than using libcurses and terminfo to get the correct sequences for the type of terminal you are on. Oh, and I don't even know whether the colours are standardized in the ANSI coding. I think they are just referred to as colour 1, colour 2 etc. The colours on a default DOS console running ansi.sys were definitely different from a VT241. Oh, I know you can change both the colour and the codes to generate the colours, but it's a pain to do this on every bloody Linux system I log on to.

        It's almost as if Linux has become merely a personal system OS, not true to its multi-user, multi-access roots.

        1. sh4dow

          Re: Old-timer here!

          If you do this regularly, you could just write a script to set up all the different colors for a new system automatically...

          1. Mike Pellatt

            Re: Old-timer here!

            I could, but why should I?

            ls should have consistent display across versions. If you want colour, have a flag to turn that on, then expectations don't have to change.

            Yep, I know that ship's well and truly sailed.

            PS I've been using .*nix so long that grep having grown a -R flag was a comparatively recent discovery.

        2. J.G.Harston Silver badge

          Re: Old-timer here!

          Colours 0-7 are standardised as the RGB colours. Black, Red, Green...etc.
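
Spelled out, the numbering works as a 3-bit RGB value: bit 0 is red, bit 1 is green, bit 2 is blue, and the SGR foreground code is 30 plus the index (40 plus the index for background). A short sketch, with illustrative function names; the exact shade shown for each index is still up to the terminal.

```python
# Index order of the eight standard colours.
COLOURS = ["black", "red", "green", "yellow", "blue", "magenta", "cyan", "white"]


def sgr_foreground(name: str) -> str:
    """Return the SGR sequence selecting the named foreground colour."""
    return f"\x1b[{30 + COLOURS.index(name)}m"


def sgr_background(name: str) -> str:
    """Return the SGR sequence selecting the named background colour."""
    return f"\x1b[{40 + COLOURS.index(name)}m"


def rgb_bits(index: int):
    """Return (red, green, blue) as booleans for a colour index 0-7."""
    return (bool(index & 1), bool(index & 2), bool(index & 4))


if __name__ == "__main__":
    for i, name in enumerate(COLOURS):
        print(i, name, rgb_bits(i), repr(sgr_foreground(name)))
```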

    2. John Brown (no body) Silver badge

      Re: Old-timer here!

      "But I am aware of the problems. I remember looking at one VT220 compatible terminal that allowed you to re-programme the characters sent by the function keys, and even trigger them remotely, and another hack was to re-program the terminal ident sequence and then trigger that!"

      That was possible in MSDOS too with ANSI.SYS loaded by CONFIG.SYS.

      Re-programmed keys to do something else was just part of the "cool things you can do" with it. No idea if it's still there in cmd.com etc with Windows, but you could map a string to a key too so in theory TYPEing a log file with user submitted data in it could remap a key press to a dangerous command string with an embedded ENTER at the end.

      1. An_Old_Dog Silver badge

        Re: Old-timer here!

        Redefining [Enter] as "format c: /y ^M" was a common malware trick. The third-party driver NANSI.SYS did everything ANSI.SYS did, except redefining keys.

      2. Anonymous Anti-ANC South African Coward Bronze badge

        Re: Old-timer here!

        Hah, I did the same in OS/2 - added a bit of color to my OS/2 command prompts. Just nothing too fancy, just changed the text color to green instead of white.

  7. Pete 2 Silver badge

    Far too vague to be useful

    > some tool along that chain may accept and follow any ANSI escape sequences included in that input stream, so if an attacker can manage to get some carefully crafted codes embedded in a log file – such as in a profile name or some submitted feedback – you could end up with a mangled or manipulated view of your IT situation. We can imagine some buffer overflow-style bugs could be exploited, too, if present.

    All this amounts to is someone saying that if an unspecified piece of software contains an unspecified bug, it *might* be possible for someone, somewhere, to do something they shouldn't.

    Which pretty much sums up the entire raison d'etre of software security. But without actually helping at all.

  8. Bebu Silver badge
    Big Brother

    Makes sense to me.

    《Here's why you should clean those codes out of input data before logging it.》

    Strikes me that if you are logging something expected, i.e. informational, you should use unadorned text and use text representations for non-text values, e.g. '\e' for ASCII ESC, or HTML entities, or whatever.

    If it's something unexpected or suspicious, this would be even more the case.

    Just chucking the error output from an application or service at syslog (or whatever) is never going to end well.
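
One way to apply the "text representation" idea at the point of logging is sketched below using Python's standard logging module. The escaping scheme (`\e` for ESC, `\xNN` for other control characters, doubled backslashes so the result is unambiguous) and the class name are assumptions, not anything syslog or any particular service does.

```python
import logging


def escape_controls(text: str) -> str:
    """Replace control characters with printable text representations."""
    out = []
    for ch in text:
        code = ord(ch)
        if ch == "\\":
            out.append("\\\\")                 # keep the encoding reversible
        elif ch == "\x1b":
            out.append("\\e")                  # ASCII ESC
        elif code < 0x20 or code == 0x7F or 0x80 <= code <= 0x9F:
            out.append(f"\\x{code:02x}")       # other C0/C1 controls and DEL
        else:
            out.append(ch)
    return "".join(out)


class EscapeControlsFilter(logging.Filter):
    """Logging filter that escapes control characters in each record."""

    def filter(self, record: logging.LogRecord) -> bool:
        record.msg = escape_controls(record.getMessage())
        record.args = None                     # message is already formatted
        return True


if __name__ == "__main__":
    handler = logging.StreamHandler()
    handler.addFilter(EscapeControlsFilter())
    log = logging.getLogger("demo")
    log.addHandler(handler)
    log.warning("profile name: %s", "\x1b[2Jbob")
```

Nothing is thrown away with this approach: the suspicious bytes are still recorded, just in a form that no terminal will act on.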

    1. Mike Pellatt

      Re: Makes sense to me.

      <fx> trots off to see what systemd does.

  9. tiggity Silver badge

    Why would I sanitize a log?

    If I sanitized logs I could be missing signs of attacks (or just user error in text that might need a bit of an educational chat)

    I just use a text editor (with ability to view in hex if needed) - to inspect logs - hassle free

  10. martinusher Silver badge

    Another nonsense

    First rule of software is that you can throw any random input at it and it won't break. This person is suggesting that we have to alter our data to avoid worrying buggy software -- that's a nonsense. It's also a bit strange that something that's been around for decades and has always worked now has versions with significant bugs in them. (Let me guess -- 'cat' has been replaced by something written in Javascript that emulates a terminal window using a web browser......)
