Microsoft's attempts to harden Kerberos authentication broke it on Windows Servers

Microsoft is rolling out fixes for problems with the Kerberos network authentication protocol on Windows Server after it was broken by November Patch Tuesday updates. As we reported last week, updates released November 8 or later that were installed on Windows Server with the Domain Controller duties of managing network and …

  1. Anonymous Coward
    Anonymous Coward

    JFC! do they actually test anything these days

    Or just come up with some shit and push it out to customers. Glad I ditched the never guaranteed to do the same thing twice OS after XP.

    1. Zippy´s Sausage Factory

      Re: JFC! do they actually test anything these days

      They test it if it's going to be going live on Azure. On-premises seems to be a case of "if you don't like being our guinea pigs, you know where Azure is"

    2. steviebuk Silver badge

      Re: JFC! do they actually test anything these days

      Came to say exactly that "do they actually test anything these days"

  2. Trixr

    Word to the wise

    Also time to change your KRBTGT password (twice, after allowing replication time to all DCs or 10 hours to be safe (krb token expiry interval)) if it hasn't been done since Server 2008 DCs were introduced in the environment. Helpful hint: the WhenCreated date of the "Read-only Domain Controllers" group tells you when. Otherwise KRBTGT won't have AES crypto keys.

    The domain needs to be at least the 2008 DFL first. The safest method is to use the New-KrbtgtKeys.ps1 script by Microsoft, available on Github, since it does a single-item replication to all DCs that finishes in seconds in our environment.

    Any other accounts (i.e. service accounts) with passwords that haven't been changed since then need to be rotated too. Yes, I got an unfortunate surprise when I queried our domains for the number of active accounts in that boat.
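For anyone wanting to script that check, here is a minimal sketch in Python. It assumes you have already exported `sAMAccountName` and `pwdLastSet` for your accounts (the export step is not shown) and that the cutoff is the WhenCreated date of the "Read-only Domain Controllers" group mentioned above. The function names and sample accounts are hypothetical; `pwdLastSet` is stored as a Windows FILETIME, i.e. 100-nanosecond intervals since 1601-01-01 UTC.

```python
from datetime import datetime, timedelta, timezone

# Windows FILETIME counts 100-nanosecond intervals since 1601-01-01 UTC.
FILETIME_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)


def filetime_to_datetime(filetime: int) -> datetime:
    """Convert a FILETIME value (e.g. pwdLastSet) to an aware UTC datetime."""
    return FILETIME_EPOCH + timedelta(microseconds=filetime // 10)


def datetime_to_filetime(moment: datetime) -> int:
    """Inverse conversion, used here only to build sample data."""
    return ((moment - FILETIME_EPOCH) // timedelta(microseconds=1)) * 10


def stale_accounts(accounts, cutoff: datetime):
    """Return names of accounts whose password was last set before cutoff.

    Assumption: pwdLastSet == 0 (change required at next logon) is flagged
    too, so that a human reviews it.
    """
    stale = []
    for account in accounts:
        pwd_last_set = int(account["pwdLastSet"])
        if pwd_last_set == 0 or filetime_to_datetime(pwd_last_set) < cutoff:
            stale.append(account["sAMAccountName"])
    return stale


if __name__ == "__main__":
    # Hypothetical cutoff and accounts, for illustration only.
    cutoff = datetime(2010, 1, 1, tzinfo=timezone.utc)
    sample = [
        {"sAMAccountName": "svc_old",
         "pwdLastSet": datetime_to_filetime(datetime(2005, 6, 1, tzinfo=timezone.utc))},
        {"sAMAccountName": "svc_new",
         "pwdLastSet": datetime_to_filetime(datetime(2020, 6, 1, tzinfo=timezone.utc))},
    ]
    print(stale_accounts(sample, cutoff))
```

Anything this flags is only a candidate for rotation; whether an account actually lacks AES keys still has to be confirmed on the DC side.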

    The whole mission of these updates is to deprecate RC4, eventually. It'd be nice if MSFT didn't screw up any further updates, like they seem to have with every single Kerb/NTLM update since late last year.

    1. Pascal Monett Silver badge

      Re: Word to the wise

      Yeah, it would be nice, but don't count on it.

      1. Trixr
        Black Helicopters

        Re: Word to the wise

        Don't worry, my actual theory is they're screwing up these updates the same way they "systematically" screwed up Exchange CUs for years, so everyone pretty much gave up and went the EXO route.

  3. DS999 Silver badge
    Trollface

    Sounds like they did half the job of securing it

    They made it 100% secure when used in an authorized manner, so all that's left is how secure it is when used in an unauthorized manner!

  4. Anonymous Coward
    Anonymous Coward

    Embrace ... Extend ...

    ... Extinguish.

    1. 42656e4d203239 Silver badge

      Re: Embrace ... Extend ...

      Indeed - given that, IIRC, M$, in their wisdom, decided that +-5 minutes would be OK for MS Kerberos.... other OSs want much tighter control than that and back in the day had to put in mods to cope with the M$ Kerberos bodge.

      Yup bodge - reason? well that would be because M$ allow their RTC to be affected by system interrupts so could not ensure it would be updated in a timely fashion. Unfortunately this is still true (at least in Windows server 2016) - get a VM or two cranked up to 100% CPU for an extended duration, and your Hyper-V host clock will measurably drift, which can make things "interesting".

      1. Trixr

        Re: Embrace ... Extend ...

        Really. I don't see in the Kerberos v5 spec where it states the 300s clock skew is MS-specific: https://web.mit.edu/kerberos/krb5-1.5/krb5-1.5.4/doc/krb5-admin/Clock-Skew.html. Also, like other OSes, the acceptable skew for Kerberos auth is customisable in Windows, down to 60s.
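To make the skew rule concrete, here is a rough Python sketch of the comparison a KDC or service effectively performs. The 300-second default and the 60-second tightened value come from the comment above; the function name and timestamps are made up for illustration and this is not any vendor's actual implementation.

```python
from datetime import datetime, timedelta, timezone

# Default tolerance discussed above: 300 seconds (5 minutes) either way.
DEFAULT_MAX_SKEW = timedelta(seconds=300)


def within_skew(client_time: datetime, server_time: datetime,
                max_skew: timedelta = DEFAULT_MAX_SKEW) -> bool:
    """Return True if the two clocks differ by no more than max_skew."""
    return abs(client_time - server_time) <= max_skew


if __name__ == "__main__":
    # Hypothetical timestamps.
    server = datetime(2022, 11, 17, 12, 0, 0, tzinfo=timezone.utc)
    drifted = server + timedelta(seconds=301)
    print(within_skew(drifted, server))
    print(within_skew(drifted, server, max_skew=timedelta(seconds=600)))
```

The check is symmetric, so a client running slow is rejected just like one running fast.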

        And I don't care what OS you're running; you're not configuring a Kerberos environment correctly if you're not using an NTP service to coordinate your system times. As MSFT doco specifically states should be done when setting up an AD environment, complete with a time service embedded into all Active Directory implementations that all clients will synchronise to by default.

        Up until Win 6.3, the acceptable time drift on clients was not defined. In practice, it was up to a couple of seconds (you could install third-party higher-precision NTP clients). More recent versions of Windows allow you to specify max time drift from 1s down to 1ms (the latter with expected constraints - no more than 4 hops and 0.1ms network latency between client and stratum 4 or higher time source, avg daily CPU time <80%)

        In any case, if you have Windows machines in a domain hierarchy getting time from VM hosts, you've done it wrong. I totally accept that Windows/AD is not perfect, but if you're going to snipe at it, at least get your facts correct.

  5. Pascal Monett Silver badge

    Another brilliant demonstration of Borkzilla testing procedures

    Oh, and "engineers were trying to resolve the problem".

    How nice of them to worry after the fact.

    1. Version 1.0 Silver badge

      Re: Another brilliant demonstration of Borkzilla testing procedures

      The computing environment is extremely complex these days: so many updates make changes to "fix" problems, and while they will normally "work" in the environment, there's the potential for new issues caused by interactions between the "fixed" components and other features. So this is normal, and just needs to be fixed by "engineers trying to resolve the problem". Resolving a problem may work, but can sometimes result in a different broplem (sic).

      1. The Velveteen Hangnail

        Re: Another brilliant demonstration of Borkzilla testing procedures

        I would be more sympathetic if this situation wasn't entirely of Microsoft's own making.

        They overcomplicate everything, introduce all new protocols, languages and tech stacks as often as Google introduces new messaging systems, and needlessly over-integrate systems that should never have been integrated, so they can drive more lock-in.

        And us morons keep handing them ever more cash so they can keep doing what they're doing cause we prefer the devil we know.

        1. Anonymous Coward
          Anonymous Coward

          Re: Another brilliant demonstration of Borkzilla testing procedures

          Well, this selection of morons left that boat quite a while back - no more MS software.

          Now it would be ungraceful for us to laugh at all those victims but, bwaha, hahaha, snort, it's, hahaha, haha..hard to control. Hihi. Sorry.

          :)

      2. Trixr

        Re: Another brilliant demonstration of Borkzilla testing procedures

        No, I'm sorry, Kerberos is a core technology in the AD stack. While MSFT making moves to tighten up their entire auth structure is commendable, it is only in the last year - in fact, since November last year - that they have released so many updates that have screwed up one or other component of AD auth (NTLM/Kerberos - LDAP stuff has been fine).

        This is *entirely* due to whatever testing model they have implemented most recently, which I understand is essentially relying on the Insider program for "regression testing".

        I expect some pain while we ensure that legacy accounts get remediated prior to force-deprecating RC4, and the documented staged approach to do that via these updates is fine. But you cannot break core services simply because you can't be bothered regression-testing - it is not a new concept, and nor is it an obsolete one.

        AD security has always been highly complex and has undergone significant changes in the last 20 years, mostly without too much disruption. But this past year has been more like the mid-90s days of bad NT4 service packs. It's really not acceptable, unless you subscribe to the view that this is a deliberate ploy to force everyone into cloud, which frankly seems increasingly likely to me.

        Of course, debacles like these mean that our clients are even more reluctant to do even basic timely updates, let alone migrate if it makes sense for their workloads (which, frankly, it doesn't always).

  6. 43300 Silver badge

    Having broken it, why are they making it extra hassle to install the update? I.e. you have to do it manually and it isn't pushed out through any of the update channels (WSUS, etc).

    Also in the past week, we have the latest version of Outlook Android refusing to work on certain models of Samsung phone (MS being exceptionally unhelpful and closing cases - forum posts indicate that this is the response to other people raising it too). Still not found a workable solution to this yet so having to buy a different model of phone for urgent issue requests, while we have a pile of one of the affected models sitting in the cupboard!

    Plus, separately, authenticator randomly stopping working for some users, and the only solution I've found so far is to delete the authentication methods from the user's Azure account and get them to re-register the phone for authentication.

    1. Pirate Dave Silver badge

      No idea if there's any relation, but hasn't MS very recently started forcefully disabling basic authentication on all of their online services? It wouldn't be surprising at all if MS found out - after the fact - that internal components of their systems were still reliant on basic auth in some way and were getting borked.

      Or I could just be spouting nonsense...

      1. Anonymous Coward
        Anonymous Coward

        They do.

        As far as I can tell, the aim is to pretend being secure by forcing you to entirely rely on their non-RFC mechanisms, mechanisms for which there is no third party assurance they haven't "accidentally" left themselves a neat little backdoor..

        Yes, I don't trust them. Neither should you.

        1. Pirate Dave Silver badge

          Trust and Microsoft can't even be honestly put on the same Venn diagram.

  7. The Velveteen Hangnail

    Worse than failure

    This isn't the first time they screwed up critical infrastructure software, and it won't be the last.

    This reinforces update hesitancy because people cannot risk losing access, and forces them to risk compromise.

    If this only happened occasionally, I could accept it was an honest mistake. But Microsoft does this routinely, which means these mistakes are entirely intentional and Microsoft is specifically trying to undermine computer security around the world.

    1. Anonymous Coward
      Anonymous Coward

      Re: Worse than failure

      I wouldn't say it's intentional (but it should have been prevented from being shipped) because Microsoft have enough bad coders that getting a whole review team made up of the dregs isn't unlikely.

    2. Anonymous Coward
      Anonymous Coward

      Re: Worse than failure

      I don't think they're deliberately trying to undermine computer security around the world, that's more an(other) unintended side effect.

      They're just pumping stock by showing investors that their customers are so locked in they can be abused and exposed without any notable consequences for Microsoft.

      It's a demonstrator.

    3. Trixr

      Re: Worse than failure

      We don't patch production till 2 weeks after patch release, and I'm personally very glad for it. I had to push the fixes semi-manually through Dev/Test, but we've got them ready to go in our prod SCCM deployments.

  8. theOtherJT

    Extend...

    The Privileged Attribute Certificate (PAC) is an extension to Kerberos tickets that contains useful information about a user’s privileges. This information is added to Kerberos tickets by a domain controller when a user authenticates within an Active Directory domain. When users use their Kerberos tickets to authenticate to other systems, the PAC can be read and used to determine their level of privileges without reaching out to the domain controller to query for that information (more on that to follow).

    Now, admittedly I'm not exactly an expert in the inner workings of kerberos, but this bloody thing has caused me no end of trouble and looks like another example of MS screwing with something for their convenience without considering the consequences.

    The last time there was a bug in the PAC, we got a samba update that "fixed" it by breaking all non MS kerberos instances, because it put in extra PAC checks and hard-required them to pass for authz to take place.... which of course they didn't. Because we're not running AD. And MIT Kerberos doesn't have the PAC attached to it to start with.

    1. stiine Silver badge

      Re: Extend...

      That's because Kerberos tickets have a lifetime that can extend beyond the lifetime of a user's group membership. That's why you query the DC/LDAP server to verify permissions.

      1. Anonymous Coward
        Windows

        Re: Extend...

        Kerberos doesn't have a notion of "groups"; it is only an authentication mechanism. Authorisation is a different thing.

        Kerb tickets are granted on application to the KDC and have a lifetime decreed by a policy (config file or GP or whatevs). This bunch of shysters: https://techcommunity.microsoft.com/t5/core-infrastructure-and-security/decrypting-the-selection-of-supported-kerberos-encryption-types/ba-p/1628797 decree 10 hours for TGT.

        Since I patched my DCs at work with a 1.5GB "Cumulative Update" a lot of kerb related auth snags have vanished. I use Arch (btw) on my personal laptop and PC and so does my wife. At the moment my hatred for MS knows nearly no bounds for wasting a lot more of my time.

        Back in the day the same fuckwits used to make me jump through hoops involving getting autoexec.bat and himem.sys to get enough memory available to do useful stuff.

        How do we go about suing for a lot of wasted time from a bloody monster 8)

    2. Michael Wojcik Silver badge

      Re: Extend...

      Yeah. I understand the motive behind this – Kerberos makes the TGS (Ticket-Granting Service) the hub for all ticket requests aside from requesting a TGT, so TGSes become bottlenecks. An extension which lets some services avoid the TGS interaction makes scaling easier and can improve performance.

      But, of course, messing with a security protocol is an excellent way to break some part of security, and that includes availability.

  9. OldCrow 1975

    Microsoft makes Linux look good

    You have to increase your staff to support Microsoft servers and desktops.

    Linux has a far superior product.

    1. Anonymous Coward
      Anonymous Coward

      Re: Microsoft makes Linux look good

      Heck, even Apple's stuff is better.

      It's a good thing Microsoft spends so much time and money wining and dining those who can't tell a decent cost of ownership study from a screwdriver or they would be out on their ear already. If they spent half the amount of money they spend on *cough* "marketing" *cough* on security there would not be any problems, but why change the habits of literally decades?

      1. Anonymous Coward
        Anonymous Coward

        Re: Microsoft makes Linux look good

        Generously assuming it's fixable without starting from scratch..

  10. Nick Ryan Silver badge
    Flame

    It's a part of Microsoft's Azure/Microsoft 365 only stance...

    The bloody error messages, when they are eventually tracked down, state that the Microsoft Azure connected/enabled device is not able to authenticate. Except neither the client nor the server system are in any way Azure enabled and the Microsoft reporting command "dsregcmd /status" confirms this.

    In other words, it's yet more Microsoft shite that was only laughably tested against their own Azure/Microsoft 365 systems and nothing else at all.

    1. Trixr

      Re: It's a part of Microsoft's Azure/Microsoft 365 only stance...

      We didn't see any Azure-related errors in our environments - we got the known error Event ID 14 in System log on DCs that had been patched without the fix yet and a problem account:

      "While processing an AS request for target service krbtgt, the account [ACCOUNT] did not have a suitable key for generating a Kerberos ticket (the missing key has an ID of 1). The requested etypes : 18 17 23 24 -135 3. The accounts available etypes : 23 18 17. Changing or resetting the password of [ACCOUNT] will generate a proper key."

      This was a service account where the "This account supports AES-256 encryption" option had been set. So I unchecked the option on the account and made sure that msDS-SupportedEncryptionTypes was set to 0. That fixed the problem for that account while we applied fixes to the DCs.
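If you want to read those numbers without looking them up each time, here is a small Python sketch that decodes the `msDS-SupportedEncryptionTypes` bitmask and names the etype IDs quoted in the event text. The bit values and names follow the published Kerberos etype registry and Microsoft's documented flag layout as I understand them; the function names are my own, and Microsoft-specific IDs such as -135 are deliberately left unnamed rather than guessed.

```python
# Bit flags for msDS-SupportedEncryptionTypes (lowest five bits only).
ENC_TYPE_BITS = {
    0x01: "DES-CBC-CRC",
    0x02: "DES-CBC-MD5",
    0x04: "RC4-HMAC",
    0x08: "AES128-CTS-HMAC-SHA1-96",
    0x10: "AES256-CTS-HMAC-SHA1-96",
}

# Registered Kerberos etype numbers that appear in the event message.
ETYPE_NAMES = {
    3: "des-cbc-md5",
    17: "aes128-cts-hmac-sha1-96",
    18: "aes256-cts-hmac-sha1-96",
    23: "rc4-hmac",
    24: "rc4-hmac-exp",
}


def decode_supported_enc_types(value: int):
    """List the encryption types whose bits are set in the attribute value."""
    return [name for bit, name in ENC_TYPE_BITS.items() if value & bit]


def describe_etypes(etype_ids):
    """Map etype IDs to names; IDs not in the table are reported as-is."""
    return [ETYPE_NAMES.get(i, f"etype {i} (not in this table)") for i in etype_ids]


if __name__ == "__main__":
    print(decode_supported_enc_types(0x10))  # the AES-256-only checkbox case
    print(decode_supported_enc_types(0))     # nothing explicitly advertised
    print(describe_etypes([18, 17, 23, 24, -135, 3]))
```

A value of 0 decodes to an empty list, meaning the account advertises nothing explicitly.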

      If you're getting Event 42 errors, that needs the KRBTGT password to be reset: https://dirteam.com/sander/2022/11/09/knowledgebase-you-experience-errors-with-event-id-42-and-source-kdcsvc-on-domain-controllers/

      And don't worry, I too feel like this is part of MSFT's ongoing campaign to stealth-deprecate on-prem AD, but I wish they'd just accept that in the real world, some of us don't want to put ALL our eggs in the cloud basket. Especially if you're running critical public infrastructure.

      This kind of thing makes it more difficult for IT teams to convince the nervous that no, we DO need to apply security updates regularly, and yes, we DO need to update these 20-year-old systems before they fall over, since no-one is supporting either the hardware or OS.

  11. Anonymous Coward
    Anonymous Coward

    Wasn't this patched about a week ago? What next? An article about how Notepad++ got an update for 22H2 a week ago last Tuesday?

    1. 43300 Silver badge

      Maybe it was, but MS is so bad at communicating with its customers that many are likely unaware still!

      1. Anonymous Coward
        Anonymous Coward

        I'm not really sure what communication you would be expecting? It's clearly posted on the Release Health status page here : https://learn.microsoft.com/en-us/windows/release-health/status-windows-server-2022

        Genuinely, what more are you looking for?

        1. 43300 Silver badge

          They send out notification emails about all sorts of trivial shit, but for things which actually matter you normally have to actively look for it!
