How Apple's M1 uses high-bandwidth memory to run like the clappers

Apple last week set the cat among Intel's pigeons with the launch of its first PCs incorporating silicon designed in-house. The company claims its M1 Arm chip delivers up to 3.5x faster CPU performance, up to 6x faster GPU performance, up to 15x faster machine learning, and up to 2x longer battery life than previous-generation …

  1. Doctor Syntax Silver badge

    How long can it be before this approach gets extended with a memory controller for off-SIP memory so that the high bandwidth on-SIP memory is just another layer of cache? One year? Two? In the lab now?

    1. Brewster's Angle Grinder Silver badge

      I was going to say something similar. Conventional DRAM is still faster than NAND, so it makes sense to start virtual memory there and offload to disk if needed.

    2. Malcolm 1

      Isn't this what AMD are already doing with their "Infinity Cache"? It's currently only 128MB on their latest GPUs, but you could easily see how that could be expanded (manufacturing tech permitting).

      Various of their previous GPUs have featured HBM but I gather it was prohibitively expensive and now only appears on their datacenter/workstation products.

      1. Wayland Bronze badge

        Malcolm, yes: chiplets, Infinity Fabric and HBM are common to TSMC and AMD. Also interesting is that Nvidia not only designed their own ARM chip but now own Arm Holdings. What have Intel got?

        1. W.S.Gosset Silver badge
          Happy

          >What have Intel got?

          Most of the market.

          same as Blackberry circa 2000...

        2. 0verl0rd

          Wayland,

          Nvidia don't own ARM until the regulators approve! https://www.cnbc.com/2020/12/02/arm-ceo-simon-segars-expects-regulators-to-look-closely-at-nvidia-deal.html

          https://www.lightreading.com/asia/arm-regulators-to-take-good-look-at-nvidia-sale/d/d-id/765858

    3. Anonymous Coward
      Anonymous Coward

      HBM is a DDR memory technology with the associated high latency, so it's not suited for an L4 cache. But such an external memory controller could well be useful for Optane-type memory holding the operating system's page file.

      Bandwidth and latency for an HBM2e pseudo-channel are roughly comparable with DDR5-5600 (32-bit wide), so you can think of an HBM stack as having 8x the performance of a DDR5 DIMM (though it has 16 effectively independent pseudo-channels). Currently HBM devices are available in up to 8GB capacity, but like all things DRAM that will inevitably increase with time.
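      That 8x figure can be sanity-checked with some quick arithmetic. These are the comment's round figures, not datasheet values, and the 32-bit pseudo-channel width is the commenter's assumption:

      ```python
      # Rough arithmetic only: compare one HBM2e stack (16 pseudo-channels,
      # each assumed ~DDR5-5600 at 32 bits wide) against one 64-bit DDR5 DIMM.
      MT_PER_S = 5.6e9                           # DDR5-5600: 5.6 G transfers/s
      pseudo_channel_bw = MT_PER_S * (32 // 8)   # ~22.4 GB/s per pseudo-channel
      stack_bw = 16 * pseudo_channel_bw          # ~358 GB/s per HBM stack
      dimm_bw = MT_PER_S * (64 // 8)             # ~44.8 GB/s per DDR5 DIMM

      print(f"stack ~{stack_bw/1e9:.0f} GB/s, {stack_bw/dimm_bw:.0f}x a DIMM")
      # -> stack ~358 GB/s, 8x a DIMM
      ```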

  2. LucasNorth

    Expandability? When has a Mac ever had that?

    1. Anonymous Coward
      Anonymous Coward

      2010 Mac Pro

      1. TimMaher Silver badge
        Windows

        Mac pro

        My 2010 was a genuine “Trigger’s broom”. Third video card, second set of full RAM, a better-than-Apple RAID controller driving 12TB across four drives in a 0/1 configuration, hybrid system drive in the second DVD slot... etc.

        Recently moved on up to a 2012 as I needed High Sierra support without using a VM.

        1. Snapper

          Re: Mac pro

          Put ThunderBolt in mine!

        2. Wayland Bronze badge

          Re: Mac pro

          I love it when people do those type of upgrades.

      2. schermer

        And before that the PowerPC Macs (tower G5). I had one of those. At the moment I have a Mac Pro 2009, fully upgraded (bought new in 2009). Since I am now a pensioner it is just overkill, so I bought this new Mac Mini M1 (fully specced): it will be delivered next week and will be more than enough for my current needs.

    2. JDX Gold badge

      You can install extra RAM in many Macs, and sometimes replace the disk drive too.

    3. Jason Hindle Silver badge

      It's been a while

      This Mac user would certainly appreciate a simple box you put additional things into (or replace existing things with better/working things) :-/.

      1. Dan 55 Silver badge

        Re: It's been a while

        NUCs are quite good as a Hackintosh'd Mac Mini as Apple barely strays from Intel's reference design.

        1. s2bu

          Re: It's been a while

          NUCs might be a good Hackintosh system, but Apple's systems are NOTHING like Intel's reference designs AT ALL. They're very very different.

  3. tsf

    Sounds like a sound plan to ensure those pesky customers don't avoid the Apple Tax by buying more memory later, even if their use case changes and demands it, or they simply realise that what they thought would be sufficient actually isn't.

    Better over-spec at the start just in case $$$

    1. JDX Gold badge

      Except it also has a substantial benefit from the engineering side. Did you even bother reading the article?

    2. Charlie Clark Silver badge

      You haven't been able to upgrade anything in the notebooks for more than five years now. It is annoying. However, apart from being able to run more and more VMs at once, RAM use on macOS has been reasonably constant for the last 10 years or so.

      Who knows, maybe Apple will let people swap the SoC in a year or so from now. For a price, of course. But in the meantime there's no denying that they have a fairly compelling value proposition: improved performance and significantly improved battery life. That said, I certainly won't be switching to Big Sur until I know what the restrictions are and when they've really fixed all the bugs. I've skipped versions in the past when it was clear they were too buggy.

      1. Dave 126 Silver badge

        >Who knows, maybe Apple will let people swap the SoC in a year or so from now.

        Mac users needing an upgrade tend to sell their existing machine and buy a new one. They don't normally depreciate that quickly (though we'll see what these M1 Macs do to the resale value of recent Intel Macs.)

        Generally Apple want you to over-spec; it makes it easier to introduce new features to a larger pool of capable machines. For example, Apple's cheapest, thus baseline, machines now have graphics capabilities better than entry-level discrete GPUs. Soon all Macs will have, at minimum, a fair bit of GPU prowess, and developers can work with that.

    3. Wayland Bronze badge

      It's actually a very justified reason for integrating the memory in the same device. I'll have some HBM with my CPU.

      What I'd like to see from AMD is HBM in their CPUs with DDR used as virtual memory in some way. I'm pretty sure if you've got 16GB in the chip that will handle most things but with external DDR you won't run out.

  4. Whoopsie

    Glad to see Apple continuing their 'innovation' by copying what Silicon Graphics did in their workstations, what, 25 years ago?

    1. zootle

      Or Sun did with the SPARC modules (mainly CPU+SRAM) in the 90s.

    2. Dave K Silver badge

      SGI machines never used a SiP design that limited expandability, however; all of them used memory modules and could have their RAM configuration modified at will. I presume you're referring to the unified memory architecture used on the O2, and I believe on some of their Intel Visual Workstations. If so, there are similarities here, but the execution is quite different.

  5. Steve Davies 3 Silver badge

    Unified Memory

    Isn't that just a new name for 'Shared Memory'?

    I'm sure that a lot of readers of this esteemed site remember that.

    You know, when the graphics card used a great chunk of the CPU memory because the graphics card makers were cheapskates.

    Things improved when graphics cards started to get really fast RAM.

    At least Apple have made the memory bandwidth really, really large. IMHO, that's where a good deal of the performance comes from.

    This CPU will give a few other chip designers a lot to think about.

    But hey... Apple can't innovate can they? (sic)

    1. Anonymous Coward
      Anonymous Coward

      Re: Unified Memory

      The thing is, HBM stacks have 16 effectively independent data paths (each > 20 GB/s for HBM2e) so the system architect can assign some to the CPU and some to the GPU. It looks like that design has 2 stacks, so 32x 20GB/s should be enough to go round.

      (By the way, yes there are applications that will saturate the bandwidth of 2 HBM stacks though you will find them running on GPGPUs for HPC).

      Edit: by the way, something like this was done by Intel with the truly weird and wonderful Kaby Lake G - Intel CPU, AMD GPU, and HBM all in the same package. There is an important difference though: in that product the HBM was only attached to the GPU, and the CPU retained a traditional external memory interface.

      1. DS999 Silver badge

        Re: Unified Memory

        Except the LPDDR4x Apple is using has nothing to do with HBM memory as used on Nvidia Titan GPUs. Not sure why the article author has confused the two, they are nothing alike.

    2. Anonymous Coward
      Anonymous Coward

      Re: Unified Memory

      Unified memory, I believe, presents the same addressing scheme to every user, unlike shared memory, where a user can allocate from the pool but the allocation cannot be moved, only copied and freed.

      That is the difference between shared and unified memory. Unified is a superset of shared: all users can theoretically see all the memory, and a handle alone can move data across cores/users.

      This gives a significant performance advantage for offload and heterogeneous compute loads.

      1. Mark Honman

        Re: Unified Memory

        Hmm, that's a pretty convincing explanation of the unified vs shared memory difference. I've always struggled to understand the difference between OpenCL's USM and SVM; maybe this is the key...

        If I get this right, the difference is that in a heterogeneous shared-memory scenario, the memory appears at different locations in each device's address space, so pointers are not translatable. For example, if the CPU builds a linked list in shared memory, the accelerator cannot just dereference the pointers.
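        The linked-list problem can be sketched in a few lines. This is a toy illustration in plain Python (not any real driver or OpenCL API): nodes store absolute pointers, which only work for a device that maps the buffer at the same base address; a device with a different mapping cannot follow them without translation.

        ```python
        # Toy sketch: a linked list serialised into a buffer with ABSOLUTE
        # pointers. It only works for a device whose mapping base matches the
        # one used at build time (the unified-memory guarantee); a device
        # mapping the buffer elsewhere (classic shared memory) cannot follow
        # the pointers.
        import struct

        def build_list(values, base):
            """Pack nodes as (value, next_pointer) pairs, 16 bytes each."""
            buf = bytearray(len(values) * 16)
            for i, v in enumerate(values):
                nxt = base + (i + 1) * 16 if i + 1 < len(values) else 0
                struct.pack_into("<qq", buf, i * 16, v, nxt)
            return buf

        def walk(buf, base, head):
            """Follow absolute pointers relative to this mapping's base."""
            out, addr = [], head
            while addr:
                v, addr = struct.unpack_from("<qq", buf, addr - base)
                out.append(v)
            return out

        cpu_base = 0x1000                      # where the CPU maps the buffer
        buf = build_list([10, 20, 30], cpu_base)
        print(walk(buf, cpu_base, cpu_base))   # CPU view: [10, 20, 30]

        # An accelerator that maps the same buffer at, say, 0x8000 cannot
        # dereference those stored 0x1000-based pointers without translation:
        # walk(buf, 0x8000, 0x8000) would index outside the buffer and fail.
        ```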

  6. JDX Gold badge

    So what is the neural engine for?

    Apple are making a big deal about ML capabilities on the M1 chippery but what use does this have in a commercial laptop used for gaming/music production/video editing/general use?

    Is it used for everyday purposes I'm simply not aware of? I can't imagine Apple would bother with it for no reason but other than possibly Siri, I'm struggling to think why this is a core part of the system.

    Anyone got a good answer?

    1. Anonymous Coward
      Anonymous Coward

      Re: So what is the neural engine for?

      Photo touch-up and editing algorithms use inference a lot. Edge detection, contour finding and face detection come to mind. These can leverage the NPU, just like GPUs do for graphics.

      Others are applications that customise per user - such as to your usage patterns. These can use inference as well.

      I'd imagine games could use it for some aspects of game play. This is a tiny subset of examples.

      Just like graphics/GPU, the CPU could do it, but it is far more efficient and faster with an NPU.

    2. StrangerHereMyself Bronze badge

      Re: So what is the neural engine for?

      Siri speech recognition. The part where you say: "Hey Siri" is being processed locally on your phone, not in a datacenter like the rest of the "conversation."

      1. NeilPost Silver badge

        Re: So what is the neural engine for?

        So is Siri less dim-witted when run on an M1 Mac now, then?

        Better than Cortana, probably worse than Alexa.

    3. Wayland Bronze badge

      Re: So what is the neural engine for?

      Developers who use software- or GPU-based neural processing can take advantage of a hardware-based version. This converting of software to hardware has been going on since the first microprocessor. Today's buzzwords AI and Machine Learning are fancy terms for a brute-force way of problem solving, and hardware helps a lot.

      1. JDX Gold badge

        Re: So what is the neural engine for?

        So are there standard APIs for this? It's not an area I dabble in, so possibly I've missed this becoming mainstream in the way we have OpenGL, OpenMP, etc.

        1. Michael Wojcik Silver badge

          Re: So what is the neural engine for?

          There are various popular APIs. Apple has its own Core ML.

          The ML hammer can be used on a wide variety of nails. Whether a given application is a wise use of ML technology is another question, of course, but it shouldn't be hard for application architects to find uses for that ANN core.

  7. This post has been deleted by its author

  8. StrangerHereMyself Bronze badge

    Performance tricks

    I believe that had the DRAM been stored off-chip the M1's performance numbers would've been a lot less flattering.

    But chances are AMD and Intel will stump up something similar soon; before Q3 2021 is my guess.

    1. Anonymous Coward
      Anonymous Coward

      Re: Performance tricks

      "The SoC has access to 16GB of unified memory. This uses 4266 MT/s LPDDR4X SDRAM (synchronous DRAM) and is mounted with the SoC using a system-in-package (SiP) design. A SoC is built from a single semiconductor die whereas a SiP connects two or more semiconductor dies."

      The DRAM is not on the same chip. Or die.

      1. StrangerHereMyself Bronze badge

        Re: Performance tricks

        The DRAM is physically very close to the CPU which ensures speedy access. This has an enormous impact on performance.

        1. Anonymous Coward
          Anonymous Coward

          Re: Performance tricks

          But that is not what you posted. You were saying it's on the same die. Or chip. It isn't.

          1. StrangerHereMyself Bronze badge

            Re: Performance tricks

            Yes, I was wrong. From early descriptions I gathered that it was either located in the same package or on the same die.

            That would've explained why the memory wasn't expandable. It seems they've opted for putting the DRAM *very* close to the SoC to increase bandwidth.

  9. Detective Emil

    Sowing confusion

    Apple really isn't helping by calling this high-bandwidth, low-latency memory because, despite it being LPDDR4X, people are likely to confuse it with High Bandwidth Memory, a JEDEC standard. Indeed, a poor translation on Apple's Finnish store (since corrected) actually said "High Bandwidth Memory".

    On the "unified memory is old hat" theme, yes indeed: there's a now-expired Apple patent concerning it from 1996.

    1. Dave 126 Silver badge

      Re: Sowing confusion

      Yep, it's not HBM. Confirmed by Andrei over at Anandtech:

      https://www.anandtech.com/comments/16226/apple-silicon-m1-a14-deep-dive/728721

      However, what Apple have done is design a microarchitecture that retains and releases NSObjects (an operation used a lot by OSX applications) five times quicker than Intel chips can.

      "Native code running on Apple Silicon is not 5 times faster than on Intel, generally, nor is Intel software running under Rosetta on Apple Silicon twice as fast as on Intel. But retaining and releasing NSObjects is so common on MacOS (and iOS), that making it 5 times faster on Apple Silicon than on Intel has profound implications on everything from performance to battery life."

      https://daringfireball.net/2020/11/the_m1_macs

  10. Pascal Monett Silver badge

    So now Apple has made its entire RAM space shared

    I seem to recall, not so long ago, that Intel had a big problem with its CPU architecture that could allow programs to access kernel memory, and many people were all in a tizzy about it.

    Now, Apple brings a system-on-a-chip that shares all its memory space.

    Is there no problem with that ?

    1. DS999 Silver badge

      Re: So now Apple has made its entire RAM space shared

      CPUs have always relied on memory protection to separate kernel and user pages. Unified memory used to be the norm, it only started getting separated because giving GPUs their own DRAM increases their performance since accessing via the PCIe bus is slow.

    2. Anonymous Coward
      Anonymous Coward

      Re: So now Apple has made its entire RAM space shared

      Not sure.

      In principle, in a PC, a PCIe device that can bus-master can access any region of memory it wishes via DMA transfers; the CPU is none the wiser. This was a problem with early Thunderbolt on the Mac (PCIe down a wire). We're quite content with that situation now that Thunderbolt has been improved, so there's probably not too much to worry about with this architecture. There's no real security difference I can see between devices being able to directly address memory and being able to freely DMA to/from it.

      But I suppose the point is that, whilst Spectre was all about code running on the CPU being able to mess around with caches and learn about other memory content, now it's other things too (GPU code, the neural engine). Worse, the fact that an attempt at something like Spectre might be distributed between all three instead of just the CPU could make it very hard to spot in advance.

    3. Wayland Bronze badge

      Re: So now Apple has made its entire RAM space shared

      If we compare running a task on a CPU to a GPU, we see that the CPU has all sorts of fancy rings and layers protecting processes from each other, whereas a GPU is just raw power in a parallel format.

      I'm not sure how sophisticated an ARM core is these days, but it would need all those fancy x86 things in order to replace x86 on the desktop and in the server.

      1. DS999 Silver badge

        Re: So now Apple has made its entire RAM space shared

        No it doesn't. x86 doesn't even use all those rings itself under Windows or Linux; it uses standard memory protection, which both x86 and ARM support. Don't believe the lie that ARM is somehow only suitable for phones and can't run desktops or servers (it has been running in servers at places like Amazon for a couple of years now).

        And never mind that both iOS and Android are far more complex operating systems than any desktop or server was running not that long ago.

  11. Jason Hindle Silver badge

    Long term, I think we will see expansion options

    With good memory management, perhaps we could expect little or no noticeable performance decrease where there is a mix of integrated and DIMM memory. That said, it is starting to look like what memory you have goes further on these new Macs (various demos of users maxing out the 8GB MacBooks; lots of vigorous debate around this).

    The MacBook Air 2020 M1 sitting behind me (humble base model I've just bought)? Erm, runs like the clappers. Office under Rosetta? About 20 seconds when launching each application for the first time. Near as damnit instant thereafter (suggesting a one-time translation step). So, clappers for Office also. I'm a big believer in the capitalist principle that competition is good. This is that massive (and much needed) boot up Intel's arse.

    1. phuzz Silver badge

      Re: Long term, I think we will see expansion options

      This is that massive (and much needed) boot up Intel's arse.

      Especially with AMD firmly taking the high end performance crown from them with Zen3, this is probably not a fun time for Intel.

      As you say, competition is good. Last time Intel were getting clobbered by AMD (Athlon 64 vs Pentium 4), Intel went back to the drawing board and came back with the Core architecture, so hopefully they'll take this opportunity to do the same.

      1. O RLY

        Re: Long term, I think we will see expansion options

        Don't forget Intel was ALSO paying Dell $1 billion/year not to sell AMD-based systems at that time. Dell, the company, and Dell, the man, each paid small fines. Dell, the company, restated earnings for several years, their CEO resigned and several CFOs cycled through to clean up the mess. Hopefully Intel isn't cheating this time.

  12. martinusher Silver badge

    Great Block Diagram

    I like the block diagram that illustrates the construction of this processor. Very informative. Apparently there are boxes connected through a box called 'fabric'. I shall remember this technique when I do my next design.

    Reading between the lines it looks like the key design points are that the memory is physically close to the processing elements which allows the memory to be synchronized with the processor clocks. There's also no mention of (L1) cache, suggesting that the memory is effectively the cache. So at the price of some trickery in the memory controller design (maybe just alternating access between main processors and GPUs?) you get rid of all the overhead of managing cache misses.

    (Shared memory is as old as the hills and then some. I got lumbered with a university project in the mid-1970s that rehabilitated a GPU that was designed to share memory with the main processor. I had to make it standalone and provide it with an interface to a different system. Logically straightforward but very tedious, because the technology used was prehistoric (it was "transistorized", though). The GPU was from an old English Electric computer; it was really very well designed for something that old.)

    1. Pascal Monett Silver badge
      Trollface

      Re: it was really very well designed for somthing that old

      Of course it was really well designed. You're talking about an era when Boeing employed redundancy, when computer engineers were actual engineers and when said engineers knew what was going on electrically in their designs.

      And they tested their designs before selling them.

    2. Kristian Walsh

      Re: Great Block Diagram

      This design doesn't replace cache; it just adds a really low-overhead method of accessing the general system RAM pool.

      The L1/L2 caches are still present as normal, although the big cores have more than the small ones, which confuses some benchmarking software. Total L1 seems to be 192KB instruction + 128KB data, and Apple itself says there's 12MB of L2 cache. Big cores have twice as much L2 cache as small ones, but it's unclear how L1 is allocated to each core.

  13. Anonymous Coward
    Anonymous Coward

    Apple leading the way once more

    I'm sure the haters will manage to pick holes but these stats simply annihilate the opposition.

    M1 Arm chip delivers up to 3.5x faster CPU performance, up to 6x faster GPU performance, up to 15x faster machine learning, and up to 2x longer battery life than previous-generation Macs, which use Intel x86 CPUs.

    Expect Windoze boxes to follow suit in short order.

    1. DS999 Silver badge

      Re: Apple leading the way once more

      Those stats (which are mostly hyperbolic anyway) have almost nothing to do with Apple putting LPDDR4x on-package, despite what this article claims. Heck, you can buy DDR4-4266 DIMMs, though they are not a JEDEC standard and are used primarily by overclockers.

      Given that Apple controls all their own hardware now, nothing would stop them from designing systems to use DDR4-4266 DIMMs and getting the exact same performance in a more "traditional" system. Actually, you'd get BETTER performance, since the LPDDR standards trade a bit of latency for power savings (Intel and AMD systems have lower memory latency than Apple's M1 Macs). That wasn't possible when they used Intel CPUs, since that clock rate is not officially supported, but Apple can officially support whatever they choose now.

      1. Anonymous Coward
        Anonymous Coward

        Re: Apple leading the way once more

          That's all just technobabble. The bottom line is you cannot post a link to a Windows laptop with anything approaching the performance of the new MacBook Air for £999/$999.

        While that remains true, Apple wins.

        1. Sorry that handle is already taken. Silver badge

          Re: Apple leading the way once more

          Not planning to wait for independent test results?

          1. Handy Plough

            Re: Apple leading the way once more

            I see you actually mean: "I don't like the myriad independent test results that have appeared online since this processor became available, so I'm waiting for ones that fit my world view"...

    2. Nate Amsden

      Re: Apple leading the way once more

      Several folks seem to think this performance will be possible on Windows anytime soon. MS partnered with Qualcomm for their ARM stuff, and it seems weak by comparison; Qualcomm's ARM datacenter chips went nowhere as well. The trend of higher-performing mobile processors from Apple vs Android has been going on for a long time. While there are others making ARM chips for mobile, the general opinion seems to be that Qualcomm is by far the best/fastest when it comes to Android.

      Things would be totally different if Apple had any history of licensing their chip designs or even agreeing to sell their chips to other companies but they have no interest in doing so(no signs of that changing). Also not as if MS (or google) can encourage Apple financially given Apple has so much money in the bank.

      Apple has certainly accomplished some amazing stuff by vertically integrating all of this, really good work. I'm certainly not their target market so won't be using this myself but for many people it will be good.

      Will be interesting to see how this affects market share in these segments; I'm guessing Apple will pick up quite a bit vs Windows. Lots of folks touted OS X as a great, easy-to-use OS; add to that this new processor and the speed/battery savings it brings, and it's pretty amazing.

      If anything, this obviously won't inspire significant fear from Qualcomm or other ARM vendors, because of Apple's locked-in ecosystem: they can't sell into iOS/OS X, and vice versa. Just look at the progress of processors in the wearable space for comparison. I have read Apple has made quite a bit of progress there over the years, while many others either got out of the space or let their designs sit for years without improvements.

      Since MS can't go to Apple to buy chips, they are sort of stuck. Same for Google. Sure MS or Google could design their own chips like Apple but it would take many years before they are viable like this (assuming they ever get to that point before being killed off).

      1. Anonymous Coward
        Thumb Up

        Re: Apple leading the way once more

        ^ an excellent write-up. Thank you.

      2. Kristian Walsh

        Re: Apple leading the way once more

        That argument assumes that Apple has a runaway lead in SoC performance on Mobile, which is not true.

        Apple's mobile SoCs tend to score very well on single-core benchmarks, but fall back into the pack when you look at multi-core scores. For example, Apple's A14 is the leading SoC on single-core benchmarks, but the Kirin 9000 beats it on multi-core tests, and early Snapdragon 875 results also show a significant lead in multi-core benchmarks over Apple.

        Basically, Microsoft, Google, HP and Dell do have options if they want to pursue ARM for the desktop, and the performance boost of directly attached RAM that Apple has used in the M1 is not a new idea and can be adapted to existing systems. It sacrifices any chance of upgrading the RAM, though, which could limit its attractiveness in the Windows market, where enterprise IT procurement policies have a lot of power over what gets sold.

        1. DS999 Silver badge

          Re: Apple leading the way once more

          The other SoCs only beat it in multi core because they have twice as many big cores. The situations where you are maxing out all cores on a phone are pretty rare, so those multithread scores are mostly for bragging rights. If Apple thought that mattered they'd put a couple more big cores in the iPhone SoCs, but they only do that in the iPad Pro because that's where it will actually matter.

        2. Wayland Bronze badge

          Re: Apple leading the way once more

          If the SoC is socketed then a RAM upgrade could be like upgrading the CPU. Virtual memory has been around for decades, so if the external DDR were presented as a RAM drive it could be used as super-fast swap space. A slight change in the architecture, but the OS won't notice.

      3. Aussie Doc Bronze badge
        Pint

        Re: Apple leading the way once more

        Good job. On the house ------------------------------>

    3. Anonymous Coward
      Anonymous Coward

      Re: Apple leading the way once more

      "M1 Arm chip delivers up to 3.5x faster CPU performance, up to 6x faster GPU performance, up to 15x faster machine learning, and up to 2x longer battery life than previous-generation Macs, which use Intel x86 CPUs."

      3.5x faster than a 3-year-old Intel CPU that was neither the fastest nor the most power-efficient option.

      Given Intel's current CPU production woes, it was likely the right choice by Apple, but I would still expect a bumpy ride over the next year for ARM users as usability/performance issues are ironed out.

      On the Windows side (or other non-OSX OS), users are less constrained by Apple's CPU choices, so the performance advantages are less apparent.

  14. Dave 126 Silver badge

    It's not HMD!

    This idea that the M1 uses HMD came from a mistranslation of Apple's Finnish website. This has been confirmed by Anandtech.

    What Apple have done is design a microarchitecture that is very fast at common operations in the macOS programming framework, according to John Gruber: like 6.5 nanoseconds, compared to 30 nanoseconds on Intel.

    1. Dave 126 Silver badge

      Re: It's not HMD!

      Sorry. That's meant to read "it's not HBM!"

      More coffee...

  15. Dave 126 Silver badge

    >So what is the neural engine for?

    If no machines have a neural engine, no devs will bother developing for it. Apple have the view that if they include a new, unused type of hardware in a popular machine without the user specifically choosing it, devs will develop for it and the end user will see the benefit.

    This has worked for Apple before. Most Mac users didn't use FireWire, but that it was once included in all Macs opened the doors to ideas like the iPod (MK I was FireWire because USB 1 wasn't fast enough for it).

    1. Snapper

      "If we build it, they will come!"

    2. ThomH Silver badge

      Right; it's exposed to all developers as Core ML and for now is slowly creeping into image, video and audio editors. Pixelmator Pro jumps on it for image processing, for example. It's not clear to me that there's more here than you'd get with a modern dedicated GPU on any other computer though so you're probably just looking at Apple optimising to do the task as best as can be done within the confines of a mobile SoC.

      I'd be completely out of my depth trying to say anything beyond that.

  16. Dominic Sweetman

    At a few GHz, CPUs are memory-limited. Once you add on-chip caches you can make the CPU faster, until cache misses dominate the workload. A 2GHz wide-issue CPU with cunning insides can probably perform 3-4 instructions per nanosecond. In a classical PC, a DRAM access must go off-chip, off-module, through a connector, across the tracks and through a DRAM interface. That's going to take perhaps 80-100 nanoseconds; it could be longer, and the more expandable the memory, the longer it will take. That represents about 250-400 instructions you didn't execute because you were waiting for memory. Big caches lower the miss rate, but only to the low single digits of a percent. Fast CPUs running one thread (most laptops have one impatient user waiting for one thing to happen) spend the great majority of their time waiting for memory.
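    The arithmetic above checks out in a couple of lines, using the comment's own round numbers rather than measurements:

    ```python
    # Instructions forgone per off-chip DRAM access, per the comment's figures.
    ipc_per_ns = (3, 4)     # a 2 GHz wide-issue core: 3-4 instructions per ns
    latency_ns = (80, 100)  # off-chip DRAM round trip

    low = ipc_per_ns[0] * latency_ns[0]
    high = ipc_per_ns[1] * latency_ns[1]
    print(f"{low}-{high} instructions stalled per miss")
    # -> 240-400 instructions stalled per miss
    ```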

    Meanwhile, Moore's law continues to work for memory density. A laptop with 16GB of memory sounds pretty usable, and should get you a big win in performance and battery use. A trade-off well worth making.

    1. Wayland Bronze badge

      Yes, fitting 16GB in the same device makes it fast and is probably enough. However, for tasks benefiting from more RAM, perhaps DDR could be used as virtual memory. That way you can fit as much as you need, even though your CPU is stuck with 16GB.

  17. Disk0
    Thumb Up

    I for one

    welcome our artificially intelligent SoCs.

    Progress is a beautiful thing. This old ad comes to mind: <https://tinyurl.com/y2cwjryr> ...Now featuring butterfly wings!...

    And I very much like that this is a significantly different architecture.

    We've been stuck with the x86 monoculture for too long.

    I am also looking forward to some form of multicore high-performance Raspberry Pi to power laptops - it can't be long now.

    Let's see who can make the lightest, most efficient and most performant architecture. We will all win, and I want them all.
