A bit puzzled
Maybe I'm missing something, but if a development team has used this to find and squash memory leaks (yes, I know that's a big 'if') how is it then an attack vector for the bad guys?
In 2018, chip designer Arm introduced a hardware security feature called Memory Tagging Extensions (MTE) as a defense against memory safety bugs. But it may not be as effective as first hoped. Implemented and supported last year in Google's Pixel 8 and Pixel 8 Pro phones and previously in Linux, MTE aims to help detect memory …
You aren't just a bit puzzled. You simply don't have a clue.
First, MTE isn't used for memory leak detection. It's used for access control to memory. The leaks the article talks about are information leaks concerning the tags.
Second, the tags are not an attack vector at all. Without MTE, zero effort would be required. This article is saying that it is relatively easy to get around the MTE access control.
What I find interesting is that 4 bits are considered enough to act as a key.
I'm sure that a key violation will probably trigger an event that terminates the process attempting to access the memory with the wrong key, but 4 bits is only 16 states. It's pretty trivial to try all 16 values, and that assumes there are no other ways of checking the value of the key without triggering an access key violation.
Even if access were inadvertent, 16 values will not eliminate use-after-free type problems, just reduce them 16-fold: there will be a 1-in-16 chance that a random value matches. And if one value is reserved for free memory, that drops to 1 in 15!
With a 64 bit address space I’d be happy to throw 10 bits at the tag. That brings the odds of a bad pointer continuing to work down to 1 in a million. There is probably a hardware cost due to comparing the tag bits on access that makes this prohibitive.
And I’d want to protect blocks of all sizes not just 16 bytes and over.
Tagged architectures aren’t a new thing, always wondered why no one did this before.
I remember trying to stash some state in the lower bits of a pointer. Modern C compilers make this extremely onerous to achieve — as well they should.
Oops, 10 bits only gets you a thousand combinations. You’d need 20 bits for a million. Still leaves 44 for the actual address space which is more than plenty.
I think actually the chances of a bad access are the number of combinations squared since you have the tag for the (possibly dangling) pointer and the tag for the memory block. So I was right the first time…
I think actually the chances of a bad access are the number of combinations squared since you have the tag for the (possibly dangling) pointer and the tag for the memory block.
Don't quite follow. I would have thought there was a 1/1024 probability of the pointer's key and memory's tag coinciding by chance alone and 1023/1024 probability of the mismatch being detected. Assuming all key/tag values are used.
Even throwing Bayes at it I still get this, but the little grey cells aren't quite what they used to be. :(
Surely hiding bits like this in an address is asking for trouble as memory address space grows.
Didn't the Mac get into a big mess because they hid a few bits in 680x0 addresses? When memory grew, or they switched to processors that used all the bits, those old programs were broken.
4-bit memory tags go back (at least) to the IBM System/360. On that architecture, prior to the existence of virtual memory, the running process was assigned a key which allowed it to access its own memory but not the memory of any of the other processes that might be present. Changing the key was a supervisor function, so the process couldn't escape the confines imposed by the operating system.
However, this particular ARM feature is not intended to protect one process from another: virtual memory features can be used for that purpose. It's mainly intended to catch bugs, such as buffer overflows and use-after-free. The idea is that memory allocators assign a random tag to each allocation and that tag forms part of the 64-bit address returned by the allocator. If the program writes beyond the end of the allocation, or into a different allocation (or a freed block - memory in the free pool has a reserved tag not used in allocations) an exception may be raised.
4 bits are arguably enough for this as it's a pragmatic debugging aid rather than a security feature as such. It can be enabled in synchronous mode (you get a precise indication of the location of the fault) or asynchronous (you only get to know the fault occurred, but not exactly at which instruction), with the former potentially incurring a performance penalty, so it's not something you would want to depend on.
While it's true that, if you have rogue code running in your address space, MTE isn't going to do much to prevent the creation of potentially-valid pointers to data, I don't see that that was ever the goal.
You could of course, with the aid of the operating system, have a more elaborate system where each allocation was mapped to a different page with a randomly-assigned virtual address. That would be more secure at the expense of wasting large amounts of memory and TLB space. This is intended to be a lightweight feature that can run (mostly) transparently in user space.
I think some of the confusion might arise from the way ARM themselves make several references to security in their MTE blurb (e.g. https://community.arm.com/arm-community-blogs/b/architectures-and-processors-blog/posts/enhancing-memory-safety) - whilst it's clear from a full readthrough of what they're actually saying that it merely aids the developers in identifying certain types of memory access errors which may be misused to defeat device security, a skim read just picking out a few tasty keywords may well give the impression that MTE is rather more than it really is...
So whilst it's potentially beneficial to raise awareness of just how non-secure MTE is, in case someone was mistakenly hoping to utilise it as part of their security measures, it also feels a bit unfair on ARM by giving the impression they've tried and spectacularly failed to introduce a genuine security measure into their devices.
When the s360 was designed, multi-tasking OSs were actually almost non-existent in the IBM world.
If I go back in my memory far enough, I believe that the keys were originally implemented for the first implementations of IBM's Virtual Machine (VM) hypervisor on 360 (I was given an introduction to the history on an Amdahl UTS internals course I took in about 1987, where they had a section on the 370 EA and XA architectures and where they came from). I had previously seen this in operation (although I didn't know the internals of it) when NUMAC decommissioned their s360/64 and ran an OS/360 partition on the s370 alongside MTS as the main partition while I was a student there.
As such, the setting of the key was definitely a supervisor instruction, and was intended for OS segregation.
I believe that Amdahl's Multiple Domain Facility (essentially a hard-coded VM system) used them for the same purpose, which was fine as you would never have more than 16 domains running on the same system. So they enforced VM segregation, not process segregation.
I admit that the normal virtual address space separation in almost all modern systems is actually the main protection keeping one process's address space separate from another's, and from the kernel address space, but what we're talking about here are flaws that allow you to side-step those protections, so the correct operation of these things is not the issue.
Early (and smaller) versions of System/360 could schedule only one job at a time (PCP - Primary Control Program), but over time the mainstays were OS/360 MFT (Multiprogramming with a Fixed number of Tasks) for smaller machines and OS/360 MVT (Multiprogramming with a Variable number of Tasks) for larger machines. MFT and MVT both used storage keys for protection and indeed my handle is the error code generated when a program attempted to address memory with a different key. The key size limited the total number of programs that could be loaded simultaneously, but for all practical purposes the limiting factor would be the total size of the very expensive memory.
However, as you say, this is something different, and though any form of side-channel information leak is undesirable it doesn't seem to me that this is a big deal: malicious code can possibly subvert a debugging mechanism to introduce undetected bugs, but there was never any claim as far as I can tell that the mechanism would trap all bugs anyway or be a defence against modified code. There's a separate feature - ARM pointer authentication - that's supposed to stop malicious code producing valid pointers.
So basically Spectre, again.
If you're going to speculatively execute something, you cannot do that in any way that differs from the way it would happen for real, and you also have to "roll back" to the point where no trace is left of the speculative execution if that branch isn't actually taken.
Not quite as I understand it. While the flaw is similar to Spectre in how it works, Spectre could directly steal data from RAM. This flaw requires you to be running malicious code already, which you could then use to defeat the MTE security/bug-catching system. However, all the other existing protections in the OS and app still need to be broken.
This doesn't blow the security of applications on Arm wide open; it means that MTE can be bypassed in the right conditions.
????????
Surely that statement is incorrect and misinforming and disinformation/corrupted information and a misleading intelligence thread?
Is it not more likely perfectly correct to say/ask ....If MTE can be bypassed in the right conditions, the security of applications on Arm can be blown wide open?
And ..... that be the same situation discovered and enjoyed by all speculative security executors/Dread Red Team penetrations testers/0day vulnerability exploit LOVE bug hunters in all modern processors performing certain sensitive operations with inequitable collateral consequences and perversely tempting opportunities the result with further paths available for following/exploring/exploiting.
And .... there's more, and it gets worse, or better, depending upon one's preeminent and predominant disposition.
Free Market Traders of Speculative Executions .... "the practice of performing certain operations on modern processors before they're needed and either using the results, if required by the program's path, or tossing them, if the program takes a different path" ..... realise that persistently ignored events/media hosting and media hosted operations are the life-giving ingredients of future paths to be ploughed and planted with the needs and seeds for effective essential successful provisional growth ....... for evolution in a radical and fundamentally revolutionary novel direction.
And ..... deny it as much as you like, but the greater threat/treat/opportunity/vehicle for such speculative executive order type operations, and readily available for easy expansion and simply complex exploit in more than just programs for microprocessor units, is human readable/digestible/understandable/leading ..... and, believe it or believe it not, almightily unavoidable.