Sitting comfortably? Then it's probably time to patch, as critical flaw uncovered in npm's netmask package

The widely used npm library netmask has a networking vulnerability arising from how it parses IP addresses with a leading zero, leaving an estimated 278,000 projects at risk. Researchers Victor Viale, Sick Codes, Kelly Kaoudis, John Jackson, and Nick Sahler have disclosed a digital nasty, tracked as CVE-2021-28918, in the …

  1. Anonymous Coward
    Anonymous Coward

    Not NPM again!

    Wasn't it NPM that also had a problem a year or two back? Some core package maintainer got told to take down a package that did some string manipulation; they took it down and it killed thousands of project builds all over the world. Six hours later the main NPM repo owners had to reinstate the iffy package, simply 'cos its absence caused so much chaos when builds were pulling it.

    Never a huge fan of JS nor its bastard offspring Node; think I'll stick with Go, thanks!

    1. LosD

      Re: Not NPM again!

      It has nothing to do with NPM; it's a package with a problem. Do you really think that third-party Go packages have fewer issues?

      1. Muppet Boss
        WTF?

        Re: Not NPM again!

        The problem with left-pad had everything to do with NPM: from the lack of NPM namespacing, to the lack of local caching, to the NPM-sanctioned cybersquatting, to the ability of the maintainer to unpublish the package and break dependencies, to the horrible NPM handling of the situation. The only upside is that Kik Interactive, who started all this, got their karma in full.

        Go packages are locally cached, and always were.

        1. LosD

          Re: Not NPM again!

          Sure, what I meant was that _this_ (netmask) issue has absolutely nothing to do with NPM. Left-pad is pretty effing irrelevant to this article.

          1. Muppet Boss
            Thumb Up

            Re: Not NPM again!

            Thanks for clarifying, upvoted. But it's telling of NPM's reputation that people tend to blame it first.

            1. Michael Wojcik Silver badge

              Re: Not NPM again!

              Yes. The larger problem is that public package repositories are a nightmare, and far too much software now depends on them. Software development has always been good at creating our own worst problems (unmaintainable code, unsafe languages and libraries, lack of rigor, etc); this is a fine example.

    2. sitta_europea Silver badge

      Re: Not NPM again!

      "Wasn't it NPM that also had a problem a year or two back?..."

      Which one?

      https://www.theregister.com/2020/07/03/lodash_library_npm_vulnerability/

      https://www.theregister.com/2019/12/13/npm_path_traversal_bug/

      https://www.theregister.com/2019/07/15/purescripts_npm_installer/

      https://www.theregister.com/2019/06/07/komodo_npm_wallets/

  2. Anonymous Coward
    Anonymous Coward

    Why were people dragging in an external dependency (npm package) for some trivial crap like that?

    What a shite culture so many of these javascript communities have. These language based package managers (NPM, PIP, crates.io, CPAN, etc) seem to breed incompetence.

    Developers should be responsible and of course depend on a library but *only* if it will save days of work. If it is only because you are too lazy to type 10 lines then you basically need to improve your work ethic.

    1. LosD

      Yeah, because people aren't much more likely to make an error like that themselves... Now all those that DO use netmask don't have that issue any more (if they actually update their packages), but many of those that don't will not be protected.

      The only real factor here is how quickly the package maintainers respond. And in this case it was quickly.

      1. Anonymous Coward
        Anonymous Coward

        100% of the people had that issue. If they had implemented it themselves, perhaps 10% would have had that issue.

        Also, do you expect large projects to just do an 'npm update' and push to production? Hah, no, this problem is now known and yet will still reside on ~30% of production servers for another year.

        An important ~10 lines of code well saved ;)

        1. LosD

          You can bet your ass that most do simple string parsing, and have exactly the same issue.

          1. Michael Wojcik Silver badge

            And in this case, note that what netmask was doing was interpreting the text representation of octets in the natural manner – as base 10. The problem is that browsers have traditionally used, and still use, either C library functions which implement C's radix syntax, or emulations of it.

            There is absolutely no good reason to interpret the "0127" in the article's example as octal. C's octal notation was handy when it was being used on PDP-7s and PDP-11s; it's occasionally useful now in some UNIX contexts, like the umask() system call (if you have some reason not to use the symbolic constants). It's idiotic to permit it in HTTP contexts.

            But many browsers will happily interpret http://0x7f000001 as http://127.0.0.1. I don't remember offhand whether the HTTP URI standard requires this (or even which RFC currently governs that; even though I subscribe to both the IETF and W3C's URI mailing lists, I no longer can keep the standards straight without digging through them). But it's dumb and dangerous and should stop.

            That said, the right thing to do in netmask would have been to reject it. The Postel Interoperability Principle has long lost its usefulness. We're much better off being conservative in what we accept, because we can't guarantee the semantics of edge cases will be the same throughout the stack.
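            The divergence described above can be sketched in a few lines of JavaScript. This is an illustration of the two readings, not the netmask package's actual code:

            ```javascript
            // The same octet string, read two ways.
            const octet = "0127";

            // Plain base-10, the "natural" reading: 127.
            const asDecimal = parseInt(octet, 10);

            // C-style radix rules: a leading zero means octal, so 0127 = 87.
            const asOctal = /^0./.test(octet) ? parseInt(octet, 8) : parseInt(octet, 10);

            console.log(asDecimal, asOctal); // 127 87
            ```

            So "0127.0.0.1" looks like loopback to one layer of the stack and like 87.0.0.1 to another – exactly the kind of edge-case semantic mismatch being warned about here.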

            1. Claptrap314 Silver badge

              It took WAY too much effort to parse the article simply because I had no expectation that octets would treat a leading 0 as meaning that what follows is in octal.

              Quick question: How would you parse 011.011.011.011? I would expect it to be decimal because I've got it in my head that some systems (old Windows? I don't know!) require three digits.

              I could be completely wrong of course, and THAT is why I rely on a library to handle such things. In fact I did not even know that 1.2.3 was a valid IPv4 address until I grabbed the python library for a test project.

              So, yes. If I'm faced with identifying, finding, reading, interpreting, and implementing some RFC, I'm going to instead look for a library with a decent reputation and use it.

              IF, (and I do mean IF) I happen to observe something weird (like accepting 1.2.3 as a valid IPv4 address), I'll check around and see if that's correct.

              But I'm probably NOT going to trust an npm. There is WAY too much bad mojo in that space.
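              For what it's worth, the classic BSD inet_aton rules (a sketch of the traditional behaviour as I understand it, not the netmask package's API) answer both questions: leading-zero components are octal, and in short forms the last component fills all remaining bytes:

              ```javascript
              // Sketch of classic inet_aton-style parsing (assumption: this
              // mirrors traditional BSD behaviour; check your own platform).
              function classicParse(s) {
                const parse = p =>
                  /^0[xX]/.test(p) ? parseInt(p, 16) :  // 0x prefix: hex
                  /^0./.test(p)    ? parseInt(p, 8)  :  // leading zero: octal
                                     parseInt(p, 10);
                const parts = s.split(".").map(parse);
                if (parts.length < 1 || parts.length > 4 || parts.some(Number.isNaN)) return null;
                // The last component fills every byte not covered by the earlier ones.
                let addr = parts[parts.length - 1] >>> 0;
                const shifts = [24, 16, 8];
                for (let i = 0; i < parts.length - 1; i++) addr |= parts[i] << shifts[i];
                addr >>>= 0;
                return [addr >>> 24, (addr >>> 16) & 255, (addr >>> 8) & 255, addr & 255].join(".");
              }

              classicParse("011.011.011.011"); // "9.9.9.9" — octal, not decimal
              classicParse("1.2.3");           // "1.2.0.3" — 3 fills the last 16 bits
              ```

              Under those rules 011.011.011.011 is 9.9.9.9, not 11.11.11.11 – which is rather the point of relying on a well-tested library instead of guessing.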

      2. sysconfig

        To be honest, in large projects you'll probably want to use standard libraries as much as you can, because a lot of homegrown stuff will sooner or later reach the "nobody present knows how it works, nobody dares touch it" sort of maintenance category. Or legacy. Bottom line is, a lot of broken and insecure code will stick around once the developers have left, whereas standard libraries usually have a lifetime beyond the contractor's or FTE's term.

        Sure, crap happens either way, and it's not uncommon to benefit from the power of hindsight and point fingers then.

        There's no 100% secure and bugfree software beyond "hello world". Personally I'd go with something that will (or is most likely to) receive future updates.

        1. Muppet Boss
          Thumb Up

          Javascript standard library? hahahaha...

          Well, there seems to be some serious attempt to write one now, currently at v0.0.93 ...hahahaha...

          1. sysconfig
            Coat

            Javascript standard library? hahahaha...

            Ha fair enough! I actually replied to the wrong post in the thread before. My comment was aimed at the broader picture, that is homemade code vs using libraries, not JavaScript, which, well... no, I'll refrain from opening a huge can of worms.

      3. sinsi

        "The only real factor here is how quickly the package maintainers respond. And in this case it was quickly."

        But it took almost a decade to find. Lots of code written in the last decade...

        1. Nick Ryan Silver badge

          I strongly suspect that this particular flaw is not remotely easily detectable using automated code scanning tools.

    2. 1947293

      Parsing things is often less trivial than you would hope. If you wrote your ten-line parser off the top of your head, would it parse “192.168.510” correctly? Is it a good use of your time, and thousands of other developers’ time, to find, read, and implement the same specification correctly? What are the consequences if there are hundreds of differently half-baked implementations in the wild? I’m not suggesting that parsing IP addresses is hard but there is a reason libraries exist.
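      For the curious: under the classic numbers-and-dots rules (an assumption about what "correctly" means here – this is how BSD inet_aton traditionally behaves), "192.168.510" is a valid address, with the last component supplying the final 16 bits:

      ```javascript
      // "192.168.510": 510 = 0x01FE, so the last component supplies the
      // bytes 0x01 and 0xFE, giving 192.168.1.254 under classic rules.
      const last = 510;
      const addr = (192 << 24 | 168 << 16 | last) >>> 0;
      const dotted = [addr >>> 24, (addr >>> 16) & 255, (addr >>> 8) & 255, addr & 255].join(".");
      console.log(dotted); // "192.168.1.254"
      ```

      A ten-line split-on-dots parser written off the top of your head is unlikely to get that case right, or even to know it exists.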

      1. karlkarl Silver badge

        Perhaps I am simply a god among men but splitting a string based on '.' characters is something that would probably take me less time to write than opening up a web browser (especially a modern one) to find a suitable library / dependency and working out the API / function name.

        1. Phil O'Sophical Silver badge
          FAIL

          splitting a string based on '.' characters is something that would probably take me less time to write than opening up a web browser

          Which would seem to be exactly the same assumption made by the author of this package, who also failed completely to recognize that this isn't a simple case of splitting a string based on dots, but is a case of parsing an IPv4 address. The components of an IPv4 address are not simple strings nor necessarily decimal numbers.

          1. Michael Wojcik Silver badge

            Exactly. What you want is a proper lexer and parser that strictly accepts only well-formed inputs, and then you want to constrain "well-formed inputs" to reasonable use cases.

            I happen to have written such a lexer/parser, which also accepts CIDR netmasks, hostnames, FQDNs, and IPv6 addresses, all with optional wildcards. It was written from the ground up, in C (which wouldn't have been my first choice in an ideal world, but it's part of an existing code base). It wasn't trivial and I wouldn't want to bet that it's completely bulletproof. The code's been tested and reviewed, the token expressions and grammar productions all look sensible and correct ... but getting this right for every case is not trivial.
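            A minimal JavaScript analogue of that strict approach (a sketch under the "four plain decimal octets only" assumption – not the C lexer/parser described above, and deliberately rejecting everything else):

            ```javascript
            // Strictly accept only four dotted decimal octets, 0-255,
            // with no leading zeros — reject everything else.
            function strictParseIPv4(s) {
              const m = /^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})$/.exec(s);
              if (!m) return null;
              const parts = m.slice(1);
              if (parts.some(p => p.length > 1 && p[0] === "0")) return null; // "012" etc.
              const octets = parts.map(Number);
              return octets.every(n => n <= 255) ? octets : null;
            }

            strictParseIPv4("127.0.0.1");   // [127, 0, 0, 1]
            strictParseIPv4("012.0.0.1");   // null — leading zero rejected
            strictParseIPv4("192.168.510"); // null — not four octets
            ```

            Being conservative in what you accept means ambiguous forms like "012.0.0.1" never reach the point where two layers of the stack can disagree about them.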

    3. Michael Wojcik Silver badge

      Often it's transitive dependencies. Marketing wants a fancy-ass SPA web UI that's all fucking async XHR. Developers need to use a framework because implementing that in a reasonably browser-independent way (even though all of this is ostensibly standardized – but of course browser manufacturers are not great about standards-compliance, many users run old browsers, and WHATWG's "living specification" stupidity means it's a moving target) to meet all the requirements exceeds available resources.

      So they use a framework. And that pulls in a bunch of third-party packages. And those each pull in a bunch of fourth-party packages. And soon you not only have hundreds of packages, some of them are just different versions of the same package, and it's all a ghastly nightmare.

  3. Jim Mitchell
    Thumb Down

    This isn't a comment on the package or npm, but using a leading 0 to indicate a number is octal seems like a really bad idea.

    1. LosD

      While that may be true, it is also very standard across a lot of (most?) programming languages.

      1. Jellied Eel Silver badge

        While that may be true, it is also very standard across a lot of (most?) programming languages.

        So that may explain sucky VoIP and other IP telephony packages. To err is human; to treat a leading zero as octal probably means those calls get routed to a tentacled one* instead of the STD you were looking for. Also, netmask sounds confusing if it doesn't actually calculate netmasks, or calculate them correctly, given a useful route needs both the 4 octets of an IP address and 4 more for its netmask.

        *May explain hentai, and octet/octal confusion is why I a)take extra care calculating netmasks/inverse netmasks and b) avoid programming.

      2. Chris Gray 1
        Facepalm

        ancient history

        I believe the leading-zero-for-octal convention came from early DEC assemblers. Since the first C compiler was for DEC machines, and it generated assembler output, having C use the same convention was the obvious choice.

        In my programming languages (weird hobby), I've used 0b => binary, 0o => octal, 0x => hexadecimal, with no leading zero defaulting to decimal. And sometimes 0t => decimal. I like things explicit.

        Some early assemblers/languages also used tags at the *end* of numbers to indicate the base. So, you could have 13ah. I'm guessing that that was done rather than a leading "h" so that numbers and identifiers were easily distinguished. See early Fortran, I believe.
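        Incidentally, JavaScript itself ended up at the same explicit-prefix scheme: 0b/0o/0x literals are standard since ES2015, and the legacy leading-zero octal form is a syntax error in strict mode. A quick illustration:

        ```javascript
        "use strict";
        // Explicit radix prefixes, as in the 0b/0o/0x scheme described above.
        const bin = 0b1101; // 13
        const oct = 0o17;   // 15
        const hex = 0x1f;   // 31
        // const bad = 017; // SyntaxError in strict mode: legacy octal literal
        console.log(bin, oct, hex); // 13 15 31
        ```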

        1. Jim Mitchell

          Re: ancient history

          Yes, leading 0 means octal for constants within certain programming languages. Extending this to user-input data, where the user might be totally unaware of this, seems to be the issue.

        2. Michael Wojcik Silver badge

          Re: ancient history

          Specifically the PDP-7, and then the PDP-11. The original C for the PDP-7 was written in assembler, so it's not that surprising that it adopted a PDP-7-assembler convention.

          The PDP-7 used octal notation because of its 18-bit addressing, which made groups of 3 bits more sensible than groups of 4. Then of course UNIX picked it up for things like permissions bits.

          The convention is unfortunate, though.

    2. Gene Cash Silver badge

      Yes. It busts my chops every so often in Python, which also observes this convention.

    3. Anonymous Coward
      Anonymous Coward

      I think it's a comment on someone not reading the relevant part of the specification, but just writing the code because it was obvious how it should work.

      1. Phil O'Sophical Silver badge

        And then not testing it.

      2. Michael Wojcik Silver badge

        Which specification would that be, in this case?

        I don't know what netmask is meant to do, so I'm not sure which specifications actually apply to it. And I'm not going to trawl through the RFCs to see if the current URI specification actually says "when the URI is a URL and the host portion of the authority production contains a numeric component that begins with a leading 0, you have to interpret it as octal" (or anything to that effect, such as citing various functions in the C standard library). But I can't offhand think of an obvious specification which applies in this case.

        Again, I think this is really the fault of the browser manufacturers, or possibly of the authors of the current URI specification. We should have disallowed radix processing in the authority production years ago, and insisted on canonical representation of IP addresses. The current behavior of browsers is idiotic.

        1. Jamie Jones Silver badge

          Not just browsers, I was surprised to find out, so I guess we can't really blame them:

          21:15 (2) "~" jamie@catflap% telnet 077.4.4.4

          Trying 63.4.4.4...

          ^C

  4. Alan Brown Silver badge

    ancient history

    I got shouted down by Bind groupies in comp.risks for pointing this stuff out in 1996

    I really can't believe it's still there 25 years later

    1. sitta_europea Silver badge

      Re: ancient history

      "I got shouted down by Bind groupies in comp.risks for poiinting this stuff out in 1996..."

      And apparently you've neither forgotten nor forgiven. Do you know my wife?

    2. Michael Wojcik Silver badge

      Re: ancient history

      Surely anyone who reads RISKS shouldn't be surprised when anything is still present a quarter-century later.

      I recall discussing the CBC padding-oracle issue in sci.crypt in the early 1990s. I pointed out a vulnerability in Outlook on Vuln-Dev in 2000. Morris showed the world that buffer overflows were exploitable in 1988, and Levy showed how easy they were to exploit in 1996. All of those are still around.

      The sad truth is that most things just don't get much better. The industry doesn't learn from its mistakes. Mostly it simply finds ways to push the problems around.
