Oracle reveals 32-core, 10 BEEELLION-transistor SPARC M7

Oracle has revealed details of its next-generation SPARC CPU, the M7. As John Fowler, Oracle's executive veep of systems, predicted when chatting to The Reg last month, the company took the wraps off the M7 at last week's Hot Chips CPU-fest and filled it with goodies to make Oracle software go faster. Under the hood of the CPU …

  1. Anonymous Coward

    "32 of them in harness might just be dangerously close to Skynet."

    Not a chance. Just introduce it to Johnny Drop Tables.

    1. Tokoloshe

      Though

      that's assuming there's an entity on earth wealthy enough to licence 1,024 cores of Oracle Enterprise software with all the optional trimmings in the first place.

      Dr Evil Ellison gets it for free obviously...

    2. Anonymous Coward

      Presumably they come connected in a ring so that you can fit the boat anchor chain through it.

      I can't believe that people are still buying this stuff. Most places disinvested from Solaris a decade ago.

      1. DougMac

        Oracle just made sure to sift out anybody that wasn't willing to pay them billions and billions of $$.

        Those that are still *heavily* invested in Solaris are still going strong on SPARC/Solaris, they just weeded out the small to mid-sized shops that weren't the kind to give Oracle billions of $$.

  2. SplitBrain

    Nice!

    The most dense CPU ever created, Oracle is doing things with SPARC that Sun alone would not have been capable of, good to see from this former Sun Shiner, never thought I would say that!

    1. Anonymous Coward

      Re: Nice!

      "The most dense CPU ever created"

      It is indeed nice to see, but it doesn't exist yet. Oracle says that it might exist in 2015.

      If they can actually deliver it then yes, it will be quite a chip.

      However, the time scale they're talking about still gives Intel and IBM plenty of time to roll out their own road maps, and one wonders how and when the M7 will compare to those.

      1. Anonymous Coward

        Re: Nice!

        Not really. POWER 8 is just hitting the market now and IBM launches major versions in a three year cycle, so P8 is all you'll get for nearly three years now. As for Intel, they have hardly any focus on anything above two sockets these days - they're too busy fighting off the challenge from low power ARM.

        1. Mad Mike

          Re: Nice!

          Who knows what the wafer size is, but cooling something like that is going to be a challenge. Even with die shrinks and lower voltages, it's going to consume a lot of power and all that heat has to be drawn away somewhere.

          1. Mad Mike

            Re: Nice!

            "Who knows what the wafer size is, but cooling something like that is going to be a challenge. Even with die shrinks and lower voltages, it's going to consume a lot of power and all that heat has to be drawn away somewhere."

            I really wonder sometimes. How does a comment about trying to cool something like this get a thumbs down? Power and cooling is one of the biggest issues processor designers have to face!!

            1. Paul_Murphy

              Re: Nice!

              No idea, but maybe these servers will have liquid cooling as standard or something.

              Which is about time to happen IMHO.

              1. Mad Mike

                Re: Nice!

                Ah, liquid cooling!! Back to the old days.........

                Not sure how much liquid cooling can help though. As the die gets bigger (and even at this density, it's going to be pretty big), it becomes very difficult to get the heat from the center of the chip. I've often wondered whether they'll start producing chips that have cooling channels through them rather than just around (or on top) of them. That would help a lot, but is fraught with difficulties. Might even allow them to cool (as in chill) the chip as well, with a suitable refrigerant.

            2. Roo

              Re: Nice!

              "I really wonder sometimes. How does a comment about trying to cool something like this get a thumbs down?"

              That's easy: The down-voters are ignorant fanbois and shills. They really don't give a toss about the tech, all they care about is burying bad news under a mountain of downvotes. The Itanic fanbois did the same trick, a few architectures got buried as a result, but in the real world the Itanic still ended up as an overpriced, inefficient and underperforming boat anchor. The only winners were the shills who got rich in the process (eg Steve Milunovich), of course none of them actually had to use an Itanic to earn a living...

          2. samlebon2306

            Re: Nice!

            "but cooling something like that is going to be a challenge"

            Maybe they need to submerge it in mineral oil.

        2. PowerMan@thinksis

          Re: Nice!

          Using your explanation of the Power8 roadmap, they began shipping Power8 in June '14. If they start shipping Power8+ (reasonable to expect IBM to stick with the entry level roll-out as they did with Power8) in 18 months that would put them in the Nov / Dec '15 timeframe. Oracle having a 2015 rollout could be January or December. I think the point is valid that they are behind and will be further behind. With regard to Intel - not sure what planet you are on but I see Intel heavily focused on 4 socket servers. Ivy Bridge EP (E5) v2 for the 2 sockets and EX (E7) v2 for 4 sockets and above. I am not fully briefed on Haswell and Broadwell but it's reasonable to expect one of them will deliver a 4 socket solution. It's possible Intel is figuring out enterprise customers don't like rapid change in chipsets that run heavy duty workloads and would rather have a reliable chipset over the latest and greatest every chip release. What you cite is what Intel is battling in general - trying to go after the enterprise space while continuing to own the 2 socket space and defend against ARM in both the 2 socket but also the mobile & portable space. Google using Power, Apple considering a change plus their continued growth with the iPhone / iPad puts pressure on every chip manufacturer.

        3. kkreu

          Re: Nice!

          The Sparc M7 seems nice but if you start to break it down it still falls short of IBM's Power 8 processors. They may have more cores per chip (which can affect software cost) but they have less L3 cache per core. The IBM S824 24-Core server has an L3 bandwidth of 5,407GB/s (~5.28TB/s) and an L2 bandwidth of 4,055GB/s (~3.96TB/s). The S824 is ~230% faster for L3 and ~691% faster for L2.

          They also say nothing about their I/O or memory performance, while the S824 has a total memory bandwidth of 384GB/s and an I/O bandwidth of 192GB/s.
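
          As a rough check on the unit conversions quoted above, here is a minimal Python sketch. It assumes 1 TB = 1024 GB, which is the divisor that reproduces the ~5.28TB/s figure; the function name is made up for illustration.

```python
# Convert the GB/s figures quoted in the comment to TB/s.
# Assumption: 1 TB = 1024 GB (binary units), not stated in the original post.
GB_PER_TB = 1024


def gb_to_tb(gb_per_s: float) -> float:
    """Return a bandwidth in TB/s given a value in GB/s."""
    return gb_per_s / GB_PER_TB


l3_tb = gb_to_tb(5407)  # quoted L3 bandwidth of the S824
l2_tb = gb_to_tb(4055)  # quoted L2 bandwidth of the S824
print(f"L3: {l3_tb:.2f} TB/s, L2: {l2_tb:.2f} TB/s")
```

          The sketch only checks the arithmetic; it says nothing about how the bandwidth figures themselves were measured.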

    2. Jim 59

      Re: Nice!

      Awesome. 10 billion transistors in the headline but the story doesn't repeat that claim - is it true Reg ?

      1. Richard Boyce

        Re: Nice!

        That you can have so many transistors on a chip and reliably get chips that work is mind-boggling. The purity and quality control must be stupendous.

    3. PowerMan@thinksis

      Re: Nice! -- NOT!

      The most dense CPU ever? An easy search on google will show you this is not the most dense CPU ever created - Intel, Nvidia, Azul and more have far higher core density. Oracle isn't doing things Sun was not capable of....moreover, I would argue they are pulling old plays out of the Sun playbook in a desperate attempt to remain relevant. One example is the similarity the M7 has to the Rock processor cancelled in 2009/2010. Rock was a 16 core CPU design made up of 4 clusters of 4 cores each. Very similar to the M7 which has 8 clusters of 4 cores each - coincidence? Hmmm!

      1. Mad Mike

        Re: Nice! -- NOT!

        "One example is the similarity the M7 has to the Rock processor cancelled in 2009/2010. Rock was a 16 core CPU design made up of 4 clusters of 4 cores each. Very similar to the M7 which has 8 clusters of 4 cores each - coincidence? Hmmm!"

        There's not necessarily any issue with using ideas from the past, brought up to speed with the latest technology. However, the design of this chip demonstrates one of the biggest problems for designers these days......interconnects. Any to any interconnects are always going to be best, but as the number of endpoints rises, become impractical. So, interconnect technology is likely to become one of the biggest drivers of processor/core speed. Used to be seen as mostly a problem in big (such as Power, Integrity etc.) servers with many processors, but as core numbers increase, is even becoming a problem between cores.
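
        To put rough numbers on why any-to-any stops being practical, here is a small Python sketch counting point-to-point links for a full mesh versus a simple ring. The endpoint counts (4 cores in a cluster, 8 clusters, 32 cores) come from this thread; the two topologies are textbook simplifications, not a description of the M7's actual on-chip network.

```python
def full_mesh_links(n: int) -> int:
    """Links needed so every endpoint connects directly to every other one."""
    return n * (n - 1) // 2


def ring_links(n: int) -> int:
    """Links in a simple ring, where each endpoint talks to two neighbours."""
    return n if n > 2 else max(n - 1, 0)


for n in (4, 8, 32):
    print(f"{n:>2} endpoints: full mesh {full_mesh_links(n):>3} links, "
          f"ring {ring_links(n):>2} links")
```

        The mesh grows with the square of the endpoint count while the ring grows linearly, which is the trade-off being described: fewer wires, but more hops between distant endpoints.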

        1. Magellan

          Re: Nice! -- NOT!

          Rock was much more than sixteen cores in four core clusters. Originally Rock did not call the cores in the core clusters cores, it referred to the core cluster as a core. The core cluster in Rock was four integer pipelines and one shared floating point pipeline. The four integer pipelines shared an instruction fetch unit and L1 caches. There was a dislike of calling the cluster multiple cores because at that time all CPU cores contained an instruction fetch unit, a dedicated L1 cache, and an FPU. It was only later when marketing decided a high core count suggested advanced engineering that the individual integer pipelines were called cores. This was consistent with the marketing of the various UltraSPARC T processors, which did not have a one-to-one ratio of IUs to FPUs. Rock's advanced features included hidden hardware helper threads to prefetch data (the "Hardware Scout"), the ability to simultaneously run both branches of a code branch ("Execute Ahead"), "reverse hyperthreading" ("Scalable Simultaneous Threading", which turned the four integer pipelines and their paired floating point pipeline into a single virtual core for HPC workloads), and transactional memory.

          Rock's four core clusters shared an L2 cache. There was no on-chip L3 cache.

          Rock had an in-order pipeline, but could execute out of order via Execute Ahead. If I recall, each Rock integer pipeline had four hardware threads, two for executing code (allowing Execute Ahead), and two for the Hardware Scout (to feed the two execution threads). Only two threads were visible to the operating system. Rock was interesting because it used threading to gain ILP.

          It appears of the advanced Rock features, M7 only has transactional memory, although Solaris has used a software based version of scout threading (called Dynamic Helper Threading) since the UltraSPARC IV+ days and this was expanded in the various UltraSPARC T series.

  3. Mad Mike

    Cache size

    Does anybody else think that 64MB of cache seems tiny for 32 cores and 8 threads a core?

    1. Roo

      Re: Cache size

      "Does anybody else think that 64MB of cache seems tiny for 32 cores and 8 threads a core?"

      Totally inadequate at that kind of clock rate, they are banking (sic) on the latency being hidden by threading. It'll be interesting to see how one of those chips stacks up against a Xeon Phi.

      1. Mad Mike

        Re: Cache size

        It's an interesting move, given the way they released T processors before and then reduced core count to produce an M processor. Are they not going to produce M and T versions of this one? If they are, the T version should have something like double the cores!!

        As to the latency being hidden by threading......one of the primary purposes of the caching is to prevent cache thrashing when running lots of threads, so the threading should make it even worse!! The early T chips showed that admirably.

        1. Anonymous Coward

          Re: Cache size

          "Are they not going to produce M and T versions of this one?"

          Seemingly not:

          http://www.enterprisetech.com/2014/08/13/oracle-cranks-cores-32-sparc-m7-chip/

          "Oracle will be discontinuing the T Series chips, Fowler tells EnterpriseTech, and building future Sparc machines on the M7 processors solely."

          1. Anonymous Coward

            Re: Cache size

            The M and T processors are already partly merged with the T5 and M10 servers which now share the T-series processor capabilities. It makes sense for Oracle to merge the designs completely and differentiate with T and M-series servers which focus on different capabilities.

            1. Mad Mike

              Re: Cache size

              If, as said here, they're merging the T and M chips, I wonder if they'll offer sub-capacity offerings with less than 32 cores? Maybe reuse some of the chips with failed cores? Starting a server range at a single processor with 32 cores (and presumably an appropriate cost) is not really that viable and could lose a lot of good business. Unless, of course, they're only interested in people who want single boxes that size and bigger? Maybe push smaller users onto x86. One of the 'benefits' of the smaller T-series servers was that a small one could be purchased quite cheaply. I assume a single processor M7 server won't be that cheap?

            2. fch

              M/T processor vs. systems ... [ was: Re: Cache size ]

              There's T-series/M-series CPUs - which are all Oracle SPARC.

              Then there's T-series systems - which are all Oracle, using Oracle T4/T5 CPUs in the systems of the same name.

              And there's systems colloquially termed "M-Series".

              Of which only the M5/M6 (and M7 to come, unless Oracle chooses to rename the system before launch) are Oracle, and use Oracle SPARC CPUs of the same name.

              The older Mx000 and current M10 systems, though, are designed by Fujitsu and use Fujitsu's SPARC64-series CPUs (in the M10 series, the SPARC64-IX - the "commercial spawn" of the current K Super). At Hot Chips, Fujitsu also presented on the SPARC64-XI - to go into the post-K-Super, and possibly later into (an update of) the M10 series of systems.

              No one quote me on all these names and numbers please - refer to the vendors' marketeeting departments for the canonical incomprehensible advice instead, and to their legal departments for even more incomprehensible guidance on trademark usage.

            3. Casper

              Re: Cache size

              There is no relation between the Fujitsu M10 systems (which have a SPARC64 CPU) and the Sun/Oracle Mx/Tx chips. What changed with the M10 is that it now also uses the sun4v architecture, i.e., a SPARC system with a Hypervisor. This makes the systems look more similar from an admin perspective.

  4. Jim 59

    Multi-core

    Multi-core is great for parallel tasks, obviously. I can encrypt a huge file on my 8 core laptop and the machine doesn't slow down at all, I can happily continue to do other stuff. In single core days it would have reduced the whole machine to a crawl.

    But there is a downside. Many tasks can't be parallelized by present software. For example, that encryption above. It only gets one core, so only gets about 12% of the PC's compute power. In an ideal world, it would take 6 or 7 cores, run mongo-fast, and leave me with 1 or 2 CPUs to read El Reg and play Tetris.
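
    For a feel of the numbers, here is a minimal Python sketch of Amdahl's law, a standard model that the post does not name but which describes the same limit. The parallel fractions are made-up illustration values, not measurements of any real encryption tool.

```python
def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Speedup over one core when only `parallel_fraction` of the work scales."""
    serial = 1.0 - parallel_fraction
    return 1.0 / (serial + parallel_fraction / cores)


CORES = 8
# One fully busy core on an 8-core machine is 1/8 of its compute power.
print(f"One busy core out of {CORES}: {100 / CORES:.1f}% of the machine")

# Hypothetical parallel fractions, giving the job 7 of the 8 cores.
for p in (0.0, 0.5, 0.9, 1.0):
    print(f"parallel fraction {p:.1f}: {amdahl_speedup(p, 7):.2f}x on 7 cores")
```

    A job that is entirely serial gains nothing from the extra cores, which is the single-core encryption case described above.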

    1. Chemist

      Re: Multi-core

      "Multi-core is great for parallel tasks"

      But when it works, oh yes. The quad-core i7 laptop I'm writing this on is bloody great with programs like ffmpeg which will transcode video with all 4(8) cores running at ~85-90% and still is responsive for less demanding jobs. Gets rather hot though !

    2. Alan Brown Silver badge

      Re: Multi-core

      "Many tasks can't be parallelized by present software. For example, that encryption above. It only gets one core, so only gets about 12% of the PC's compute power."

      Allow me to introduce you to my friends

      pigz - http://zlib.net/pigz/

      pbzip2 - http://compression.ca/pbzip2/

      There are other multithreaded archivers but these are the most useful in a *nix house.

      Most 7zip and xz code has multithread support built in.

      1. fch

        Re: Multi-core

        Both gzip and bzip2 parallelize well only for compression - because that's a "blocked" operation, i.e. a fixed-size input block is transformed into a hopefully-(much-)smaller output chunk. The latter are then concatenated into the output stream. Since the output is a stream, there's no "seek index table" at the beginning though, and hence one cannot parallelize the reverse in the same way. You only know where the next block starts once you've done the decompression and know how far the "current" one extends. While one can "offload" some side-tasks, the main decompression job is singlethreaded with both the abovementioned implementations.

        One can, though, obviously compress/decompress multiple streams (files) at the same time. That's what ZFS uses, for example - every data block is compressed separately, and hence compression/decompression on ZFS nicely scales with the number of CPU cores.

        [ moral: better use a compressing filesystem than compress files in a 1980's filesystem ? ]
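
        The "blocked" compression scheme described above can be sketched in Python with the standard bz2 module. This is only an illustration of the idea, not how pigz or pbzip2 are implemented; the block size and function names are assumptions made for the example.

```python
import bz2
from concurrent.futures import ThreadPoolExecutor

BLOCK_SIZE = 256 * 1024  # hypothetical block size, chosen for illustration


def compress_parallel(data: bytes, workers: int = 4) -> bytes:
    """Compress fixed-size blocks independently, then concatenate the streams."""
    blocks = [data[i:i + BLOCK_SIZE] for i in range(0, len(data), BLOCK_SIZE)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() preserves input order, so the output order matches the input.
        return b"".join(pool.map(bz2.compress, blocks))


def decompress_serial(blob: bytes) -> bytes:
    """bz2.decompress walks the concatenated streams one after another."""
    return bz2.decompress(blob)


payload = b"The quick brown fox jumps over the lazy dog. " * 50_000
packed = compress_parallel(payload)
assert decompress_serial(packed) == payload
print(len(payload), "->", len(packed), "bytes")
```

        Note the asymmetry: the compressor knows every block boundary up front, while the decompressor here only finds the next stream by finishing the current one, which is the limitation the post describes.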

        1. Michael Wojcik Silver badge

          Re: Multi-core

          "Both gzip and bzip2 parallelize well only for compression - because that's a "blocked" operation"

          True, but encryption[1] can also be done in parallel blocks, for example using a block cipher and GCM (Galois/Counter Mode) combining.

          "One can, though, obviously compress/decompress multiple streams (files) at the same time"

          Yes, and clearly that's the solution for large archives: build them from multiple compression streams. With many corpora, you can get close to the same overall compression ratio even if you partition the input in various ways, for example interleaving (one stream for each Nth block, for some block size, then interleave the outputs the same way when decompressing). There are other possibilities that can improve compression ratios for typical jobs, even higher than what a good compressor (e.g. PPMd) would if simply run over the entire corpus as a single byte stream.

          [1] I know you went on to talk about decompression, but the OP mentioned encryption in this context.
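
          To show why counter-style modes parallelize, here is a toy Python sketch. It is NOT AES-GCM and is not secure: SHA-256 stands in for the block cipher, there is no authentication tag, and all names are invented for the example. The only point is that each block's keystream depends on nothing but the key, the nonce and the block's counter, so blocks can be handled in any order.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor

BLOCK = 32  # bytes of keystream per counter value (SHA-256 digest size)


def keystream_block(key: bytes, nonce: bytes, counter: int) -> bytes:
    # Toy keystream: SHA-256 stands in for a real block cipher such as AES.
    return hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()


def xor_block(job):
    """XOR one chunk with the keystream for its counter value."""
    key, nonce, counter, chunk = job
    ks = keystream_block(key, nonce, counter)
    return bytes(a ^ b for a, b in zip(chunk, ks))


def ctr_transform(key: bytes, nonce: bytes, data: bytes, workers: int = 4) -> bytes:
    """Encrypts or decrypts: XOR with the keystream is its own inverse."""
    jobs = [(key, nonce, i // BLOCK, data[i:i + BLOCK])
            for i in range(0, len(data), BLOCK)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return b"".join(pool.map(xor_block, jobs))


key, nonce = b"k" * 16, b"n" * 8
message = b"counter mode blocks are independent " * 10
ciphertext = ctr_transform(key, nonce, message)
assert ctr_transform(key, nonce, ciphertext) == message
```

          Pure-Python threads will not make this faster because of the interpreter lock; the sketch is about the independence of the blocks, not about speed.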

      2. Charlie Clark Silver badge

        Re: Multi-core

        "Allow me to introduce you to my friends"

        Although the programs sound nice I'm not sure they were a suitable answer to the original post which was about encryption not compression.

    3. eldakka

      Re: Multi-core

      You are aware that this is a server-oriented processor are you not?

      That you're not likely to care about the performance of a gzip/bzip2/zip whatever?

      What you ARE likely to be doing is running 20-30 JVMs of multi-gigabyte heap sizes each handling 100's if not 1000's of user tasks simultaneously.

      Or running a whacking-great Database on it (it is from ORACLE now) doing 1000's of simultaneous, independent database queries (selects, inserts, etc).

      You don't need to be able to extract instruction-level (or task level) parallelism from a SINGLE process (e.g. transcoding video, compressing) or from single tasks when you are running dozens, hundreds, THOUSANDS of SEPARATE independent processes/tasks simultaneously. As tends to happen on servers, which is what this chip is aimed at.

      1. Mad Mike

        Re: Multi-core

        @eldakka.

        Very true to an extent, but there are plenty of systems which are heavily single (or low numbers of) threaded in existence today. Parallelism is coming more and more, but isn't fully there yet. Also, just because something is parallel doesn't mean it doesn't care about latency and other issues that parallelism can cause. Also, don't forget that some things are naturally parallel, such as OLTP systems. However, other workload is naturally not parallel and trying to turn it parallel causes (in some cases) a very significant overhead. It's getting better all the time, but parallelism isn't the answer to everything and causes its own problems as well.

  5. John Smith 19 Gold badge

    A SPARC thread without Matt Bryant....

    It's just, you know, unexpected.

    Takes a bit of getting used to.

    1. Anonymous Coward

      Re: A SPARC thread without Matt Bryant....

      what happened to him ? Any RIFs at HP/IBM lately ?

      1. PowerMan@thinksis

        Re: A SPARC thread without Matt Bryant....

        LOL - he never worked at IBM afaik - just checked the public IBM directory and don't see him. Plus, if he did he was a bitter employee :) He always claimed to work for a non-vendor customer location.

    2. SplitBrain

      Re: A SPARC thread without Matt Bryant....

      Don't see much of Bryant these days on Unix related matters, the reg (and the world tech press) has sweet F'all to report on HP-UX and itanium these days.

      He appears to espouse his (mostly) right wing views on things frequently on other threads though, so it's not as if he has disappeared.....

  6. Captain Server Pants

    This explains IBM's $3 billion systems invest FUD

    Also, IBM's desperation to unload chip fabs. TSMC is far ahead of IBM. Supposedly Power8 yields are low and they're using 2 chip modules to get to 12 cores per slot. Performance not good because of loss of single chip cache coherence so they went to giant off chip(s) shared L4 cache. Sparc M7 seems like a big step ahead.

    1. PowerMan@thinksis

      Re: This explains IBM's $3 billion systems invest FUD

      Um, not true. Your comment is FUD. Each socket in the current Power8 Scale-out server is packaged with up to 2 x 6 core chip modules. IBM has done this on Power5, Power6 and now with Power8 servers so nothing new. Moreover, what's wrong with it if it performs? Your comment about "Performance not good" is off base. What do you base it on? I would point you to the following benchmarks where you see a 24 core S824 match a 4 socket 60 core Ivy Bridge EX v2 server in SAPS & Users. Outperform in SPECint, SPECfp, SPECjbb and more. Those are just benchmarks, my customers are seeing the performance and more.

      "So they went to giant off chip(s) shared L4 cache." Really? Adding technology and innovating is now a gimmick? By this explanation having L3....L2 and even L1 cache are all gimmicks. Just main memory and cpu for you. Come on, are you a bit jaded by your SPARC love? It's ok to say your Ford SPARC is the best ever but don't lie about my Chevy Power :) Power8's L1 (D+I) are 2X greater than x86 and 4X+2X over SPARC T5. L2 cache is 2X over x86 and 4X over SPARC T5. L3 is 2.5X over x86 and 12X over SPARC T5. Neither x86 or SPARC have L4 while Power8 has 128 MB per socket. This gives Power an advantage to get data closer to the core so it may fit entirely in a lower cache line.

      1. Mad Mike

        Re: This explains IBM's $3 billion systems invest FUD

        As has always been said; it's the whole path you need to consider. Getting cores faster is no good unless you can keep the data coming in faster as well. Faster memory, faster I/O, faster interconnects etc. etc. There's plenty of innovation going on all over the place. Cache sizes are way bigger on Power chips at the moment and they've opened up the architecture as well, inviting other companies to create accelerators and the like that sit directly on processor interconnects etc.

        Oracle are heading down the 'accelerator in silicon' route much faster than others. Not that others haven't done it, but it seems to be a much higher priority drive at Oracle. You can see the attraction. Their hardware is perfectly tuned to their software and gets advantages other hardware can't give. At the same time, they refuse to code their software to use accelerators etc. present in other brands of hardware. All about lock-in, and from Oracle's perspective it is a win-win. However, it is only their version of Sparc they can do it with and their Intel/AMD deployments won't enjoy the same advantages, unless they can persuade Intel/AMD to play ball with them :-)

        As to Itanium.................it rather seems to have fallen off the coupon...

      2. Captain Server Pants

        Re: This explains IBM's $3 billion systems invest FUD

        "Each socket in the current Power8 Scale-out server is package with up to 2 x 6 core chip modules. IBM has done this on Power5, Power6 and now with Power8 servers so nothing new. Moreover, whats wrong with it if it performs? Your comment about "Performance not good" is off base. What do you base it on?"

        You've clearly drunk the IBM Kool Aid. It's entirely possible Power8 is a good step up from P7+ and "your customers" are seeing nice performance improvements. My post said nothing about comparing P8 to prior generation of Power.

        Do you realize 2x6=12? When Power8 was previewed at Hot Chips 25 (last year, 2013) it was presented as a single die 12 core chip. Here's the Reg article:

        http://www.theregister.co.uk/2013/08/27/ibm_power8_server_chip/

        To date no IBM system offers a single die 12 core Power8. When you introduce something as one thing and then release it as another it's known as FUD in the IT business. In retail it's called "bait and switch". Despite your rant.

        Do you realize the 6 core chips in the 2 chip modules (remember 2x6=12) contain inactive cores? Why would this be the case? Maybe IBM fabs cannot manufacture the single part in high enough yields. Will Oracle follow IBM's lead and introduce 2x16 core (with deactivated cores) SPARC chip modules because it's a better solution? Sorry no, it's the opposite.

        1. Mad Mike

          Re: This explains IBM's $3 billion systems invest FUD

          @Captain Server Pants.

          Interestingly, it's not as clear cut as you say, depending on exactly what is working and what is not on the chip. If you only have say half the cores, but all the cache is working, that will help quite a lot. If some of the cache isn't working, that's not so good!! It just depends on what has failed. Not sure why you're picking on IBM for this either as just about everybody does it. AMD, Intel etc. have done it in the past. Indeed, in earlier incarnations of the T chips, Oracle/Sun used to sell processors with less than the normal number of cores. Now, I'm not saying if they were simply deactivated, or failed, I don't know. However, it's likely at least some were failed. At this sort of density, you're always going to get some failures and selling them at the lower end is quite a reasonable way of utilising them.

          As you have said, all the current IBM Power 8 servers use 2 chip modules, but they are all the lower end servers and this has been the case for years. It's only when you go up the server line that you get the full chips being used. This is for very good reason. Firstly, it uses the 'slightly' faulty chips and also allows manufacturing issues to be ironed out early. That's why they launch the low end first.

          Regardless of the rights and wrongs of how it's done, the proof is in the pudding and the performance per buck. If making it a 6 + 6 gives better performance per buck, then that's just fine.

          P.S.

          I strongly suspect that if Oracle attempt to manufacture the Sparc chip as mentioned, we'll see lower end systems with less than the full core count 'activated'. Attempt to manufacture chips at this density and core count (not to mention accelerators etc.) and some failures will occur. You either throw them away and absorb the cost, or do something else with them in the low end!!
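
          A back-of-envelope Python sketch of why selling parts with deactivated cores is attractive. It assumes each core fails independently with the same probability, which is a deliberate simplification, and the 98% per-core figure is invented for illustration, not a published yield.

```python
from math import comb


def prob_at_least(good_needed: int, cores: int, p_core_ok: float) -> float:
    """P(at least `good_needed` of `cores` work), assuming independent failures."""
    return sum(
        comb(cores, k) * p_core_ok**k * (1 - p_core_ok) ** (cores - k)
        for k in range(good_needed, cores + 1)
    )


P_CORE_OK = 0.98  # hypothetical per-core yield, not a published figure
CORES = 32        # full core count of the chip under discussion

print(f"all {CORES} cores good:   {prob_at_least(CORES, CORES, P_CORE_OK):.3f}")
print(f"at least 28 cores good: {prob_at_least(28, CORES, P_CORE_OK):.3f}")
```

          Under these assumptions a large share of dies that miss the full core count would still be sellable at a lower one, which is the "do something else with them in the low end" option.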

          1. Captain Server Pants

            Re: This explains IBM's $3 billion systems invest FUD

            @Mad Mike

            Yes, I do admit to an extreme characterization here. Your points are all well taken. Many vendors (Oracle/Intel/IBM) use multi chip modules and different activation schemes for capacity on demand and other reasons. It's a totally legitimate practice and they all do it.

            IBM Power and Oracle's per core pricing put the final shade on the Sunshiners. I was never one of them because SPARC wasn't a good investment for a long time. In my earliest days we used Oracle on Sun SPARC, in the early/mid '90's. After that it was HP and IBM (Power 4 and Power 5) until around 2008. Since then I'm in a Microsoft stack Dell server shop. So I'm not religious about any one thing. Oracle makes the best database though SQL Server is equal/close in many ways.

            Bottom line is there are 3 chip manufacturing trains which are competing through 10nm/7nm/5nm: Intel, TSMC, and Samsung. It seems to me IBM needs to get on board one of those trains asap if it wants to keep up.

        2. Freddellmeister

          Re: This explains IBM's $3 billion systems invest FUD

          Captain Server Pants writes:

          "Do you realize the 6 core chips in the 2 chip modules (remember 2x6=12) contain inactive cores? Why would this be the case? Maybe IBM fabs cannot manufacture the single part in high enough yields. Will Oracle follow IBM's lead and introduce 2x16 core (with deactivated cores) SPARC chip modules because it's a better solution? Sorry no, it's the opposite."

          If you check the IBM POWER8 S824 Redbook you notice that all 4 IO buses are wired from each socket. So in fact what might look like a way to use broken chips, the 2x6 dual chip socket design increases the IO performance by 100% compared to a 1x12 core design.

    2. Roo

      Re: This explains IBM's $3 billion systems invest FUD

      "Performance not good because of loss of single chip cache coherence so they went to giant off chip(s) shared L4 cache."

      The POWER8 has 512kbytes of dedicated L2 *per core*. That is backed by a further 96Mbytes of shared L3 on the same die, and up to another 128Mbytes of L4.

      By contrast the M7 has 256kbytes of shared L2 for each 4 cores, and 64Mbytes of shared L3 per die.

      "Sparc M7 seems like a big step ahead."

      The M7 has less cache, and the L2 cache has 4x the number of cores using it. Even if you ignore the L4 cache, the M7's caching scheme is in fact a step backwards for people who value single-thread performance.
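
      Taking the figures quoted above at face value, a quick Python sketch of the per-core arithmetic. The core counts (12 for POWER8, 32 for M7) are the ones used elsewhere in this thread; the dictionary layout and function name are just for the example.

```python
KB = 1024
MB = 1024 * KB

# Cache figures as quoted in the comment; core counts as stated in the thread.
chips = {
    "POWER8": {"cores": 12, "l2_total": 12 * 512 * KB, "l3_total": 96 * MB},
    "M7": {"cores": 32, "l2_total": (32 // 4) * 256 * KB, "l3_total": 64 * MB},
}


def per_core(chip: dict) -> tuple:
    """Return (L2 KB per core, L3 MB per core)."""
    l2_kb = chip["l2_total"] / chip["cores"] / KB
    l3_mb = chip["l3_total"] / chip["cores"] / MB
    return l2_kb, l3_mb


for name, chip in chips.items():
    l2_kb, l3_mb = per_core(chip)
    print(f"{name}: {l2_kb:.0f} KB L2 per core, {l3_mb:.0f} MB L3 per core")
```

      This only divides total capacity by core count; it ignores sharing effects, latency and the L4 mentioned above.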

      1. Mad Mike

        Re: This explains IBM's $3 billion systems invest FUD

        @Roo.

        "Even if you ignore the L4 cache, the M7's caching scheme is in fact a step backwards for people who value single-thread performance."

        Not just single-thread performance, but multi-thread as well. One of the primary uses of large caches is to avoid cache thrashing in the event of many threads (or partitions) hitting the same core over time and causing cache to be constantly refreshed from memory. The greater the multi-threading and the greater the partitioning, the more cache you need.
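
        A toy Python simulation of the thrashing effect described here. It assumes a single fully associative LRU cache and threads that take turns looping over their own private lines, which is far simpler than any real cache hierarchy; every parameter value is made up for the illustration.

```python
from collections import OrderedDict


def hit_rate(threads: int, lines_per_thread: int, cache_lines: int,
             rounds: int = 50) -> float:
    """Round-robin threads over one LRU cache; each thread loops over its own lines."""
    cache = OrderedDict()
    hits = accesses = 0
    for _ in range(rounds):
        for line in range(lines_per_thread):
            for t in range(threads):
                key = (t, line)
                accesses += 1
                if key in cache:
                    hits += 1
                    cache.move_to_end(key)  # mark as most recently used
                else:
                    cache[key] = None
                    if len(cache) > cache_lines:
                        cache.popitem(last=False)  # evict least recently used
    return hits / accesses


for n_threads in (2, 4):
    rate = hit_rate(threads=n_threads, lines_per_thread=4, cache_lines=8)
    print(f"{n_threads} threads sharing 8 lines: hit rate {rate:.2f}")
```

        Once the combined working set no longer fits, this access pattern evicts every line just before it is needed again, so adding threads without adding cache makes things worse rather than hiding latency.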

        1. Roo

          Re: This explains IBM's $3 billion systems invest FUD

          "Not just single-thread performance, but multi-thread as well."

          My gut says you're right, but there have been some pretty stunning massive thread count success stories, like GPUs for instance. They tend to operate well below peak, have relatively tiny cache and suck data through a fat but very long straw, but they dominate the Green500 list nonetheless.

          I still prefer working on machines that can sustain a high percentage of peak performance on a single thread. The Pentium Pro 200 (256kb L2 @ core clock) was a fine example of that style of core, it worked miracles on gnarly dusty deck code. :)

    3. IT Consultant

      Re: This explains IBM's $3 billion systems invest FUD

      This comment is total FUD. Doubt it? Do a bakeoff!

  7. alwarming
    Terminator

    "Oracle has revealed..."

    Surely you meant to say "The Oracle"...

  8. Anonymous Coward
    Anonymous Coward

    It's the SW, stupid..

    Of course it's massive multicore - that's the only way Larry World can stay afloat - by charging out the wazoo to license Oracle on these beasties. BTW, anyone know how much power one of these dissipates?

  9. MadMike

    POWER8 disappoints

    IBM POWER8 is a big disappointment. One POWER8 socket gives 437 SPECint2006, and it gives 342 SPECfp2006:

    http://benchmarkingblog.wordpress.com/2014/04/28/awesome-power8-benchmarks-awesome-dessert/

    The SPARC T5 gives more performance than that, 467 and 436 for one socket:

    https://blogs.oracle.com/BestPerf/entry/20130326_sparc_t5_speccpu2006_rate

    I must say that POWER8 is a big disappointment. After four(?) years of development from IBM, is POWER8 the best they could come up with? POWER8 does not even beat current cpus.

    The SPARC M6 is faster than SPARC T5 and scales up to 32 sockets, and only Fujitsu M10-4S 64 socket SPARC server is faster on SAP benchmarks:

    https://blogs.oracle.com/BestPerf/entry/20140327_m6_32_sap_sd

    And now this SPARC M7 cpu gives not 10-20% better performance than M6 (as Intel gives every generation), no, it gives 3-4x better performance than one SPARC M6.

    HP is out from the high end market with their dead Itanium. IBM is out from the high end big margin market with their not-so-good POWER8. And if POWER8 only scales to 16 sockets maximum with 16TB RAM, then IBM has no way of competing against a fully equipped SPARC M7 server with 32 sockets, 1024 cores, 8,192 threads and 64TB RAM:

    http://www.enterprisetech.com/2014/07/28/ibm-forging-bigger-power8-systems-adding-fpga-acceleration/

    "...Depending on how many customers are hitting the performance ceiling on the Power 795, IBM could skip putting out a 32-socket Power8 machine and just got with the 16-socket machine with 16 TB of memory...." - as explained by the IBM die hard fan Timothy Prickett Morgan. And look at the IBM road map. It seems a bit empty? What is there after POWER8? Nothing? What comes? Next year when the SPARC M7 32 socket server arrives, IBM must surely have released their largest server too. Will it only be 16 sockets?

    But I am a bit disappointed in Oracle too. Sure, Oracle has delivered all six cpus early or on time during these five years, but Oracle talked about a 16,384-thread server with 64 TB RAM and 64 sockets in 2015. Clearly this SPARC M7 server is the one Oracle talks about BUT it only scales to 32 sockets. Fujitsu has 64 socket servers, not Oracle. But I hope Oracle will up the ante and release a 96-socket SPARC server with Bixby. Then there is no competition left.

    As Oracle says "twice the performance every generation" in contrast to IBM or Intel, I am very interested in the SPARC M8 server somewhere in 2016 or 2017. How can Oracle double up the SPARC M7? Oracle has doubled up every generation (for instance T5 has twice the cores and twice the sockets as T4, and T4 is twice as fast as T3, and M6 has twice the cores of M5, etc). How can Oracle possibly release a new cpu in 2016-2017 which is, again, twice as fast as M7??? Will it have 64 cores??? Or double the clock speed? Or have 20 billion transistors??? Somewhere it must stop, not even Oracle can keep up this breakneck pace. If you double performance every generation (every year), then it will not take long before you have the fastest servers. And with Oracle databases finely tuned to these high end killer servers, what is left of the competition? Oracle databases are using hardware accelerators in SPARC, and refuse to use other hardware accelerators from IBM or HP. Oracle databases will scream on these big M7 servers.

    The Tx series have been migrated into the Mx series, which is good. The Tx servers were low-end to mid-range. Only the Mx servers are left from Oracle. Fujitsu creates the SPARC64 cpus, which are also fastest for HPC and number crunching. The Fujitsu M10-4S 64 socket server is quite wicked too. Just check it out. Mainframe class reliability. As are the M7 servers.

    1. This post has been deleted by its author

    2. Mad Mike

      Re: POWER8 disappoints

      Interesting. Why is someone using a userid almost identical to mine and posting stuff like this? Presumably, trying to pass themselves off as me. Of course, what the post fails to identify is that benchmarks are one thing, but real life performance is another. Yes, lots of companies like to compete on benchmarks (although only recently with Sun/Oracle), but they are really artificial. It's also interesting that the poster believes making the biggest (as in processors/cores etc.) is all that matters. The vast majority of the market simply doesn't want servers of this size, so it's largely irrelevant.

      1. MadMike

        Re: POWER8 disappoints

        @Mad Mike

        "...It's also interesting that the poster believes making the biggest (as in processors/cores etc.) is all that matters. The vast majority of the market simply doesn't want servers of this size, so it's largely irrelevant..."

        This is funny how IBM goes on when Oracle beats IBM:

        http://blogs.wsj.com/digits/2013/03/27/ibm-fires-back-at-oracle-after-server-attacks/

        “This was a frozen-in-time discussion,” Parris said in an interview Wednesday. “It was like 2002–not at all in tune with the market today.”...Companies today, Parris argued, have different priorities than the raw speed of chips. They are much more concerned about issues like “availability”–resistance to break-downs–and security and cost-effective utilization of servers than the kinds of performance numbers Ellison throws out....Not that Parris is conceding that Oracle’s new hardware is actually faster."

        And now you say that the market doesn't want huge servers that only Oracle can manufacture. Well, superior performance means you get wicked small servers too. If you have a small 8-socket M7 server (256 cores, 2048 threads, 16TB RAM), it will surpass the biggest baddest IBM P795 (256 cores, 1024 threads, 16TB RAM) for a fraction of the IBM price - why would you spend much more money, more space, and more wattage being an IBM customer? And the P795 does not have the many hardware accelerators that M7 has. So tell me, why would anyone go for a hugely expensive large P795 server with 32 sockets, when you can get a small 8-socket M7 server?

        BTW, the SPARC M6 is faster than the POWER7+ cpu. And one SPARC M7 is 3-4x faster than the SPARC M6. This means an 8-socket M7 server will equal a 24-32 socket M6 server. Or, an 8-socket M7 server will almost equal a 32-socket POWER7 P795 server. Why would anyone want a huge P795 server, when they can get a small 8-socket M7 server? Tell me.

        1. Freddellmeister

          Re: POWER8 disappoints

          Madmike

          "Why would anyone want a huge P795 server, when they can get a small 8-socket M7 server? Tell me."

          Because you use the term faster incorrectly. Your definition of faster is not yielding shorter response times. A lot of people actually have to open the branches in the morning and cannot wait indefinitely for EOD processing to finish. Telcos need to rate and bill transactions before raw data is discarded.

          Couple the slow processing of SPARC with its anemic virtualization capabilities, and these customers can do a lot more with POWER than with SPARC M7, regardless of your definition of faster. And it will require far fewer Oracle licenses.

        2. Mad Mike

          Re: POWER8 disappoints

          @MadMike

          I have done comparisons between Oracle Mx and Tx chips and IBM Power x chips. I've also looked at the server design, resilience etc. Been doing it for a while. Oracle chips do well in benchmarks, but it simply doesn't work through to reality, except for some limited applications. If you want to run huge instances (one of the reasons why Oracle push Solaris containers rather than LDOMs) doing a single workload (such as a huge BI machine), you might be able to make use of the performance. However, if you want to do more normal workload (such as OLTP), using a lot of partitions (or LDOMs), performance falls away rapidly. Cache size is one reason, but there are others. Using a lot of LDOMs, containers (to a lesser extent) and really working the threads up (again to a lesser extent) causes cache thrashing as the cache simply isn't big enough for the speed of the cores. Power chips have far fewer problems in this area and have much larger cache sizes, which is one of the reasons.

          If you want to run large numbers of partitions (or LDOMs) or generally anything that switches between threads etc. a lot, the Power chips do much better in real life. Yes, the benchmark figures are good, but they don't translate into real life performance under many circumstances. We run Power servers with VP to PP ratios of up to 10 to 1 with really good performance. Tx and Mx chips simply won't do this. It's been a well known problem since the beginning of these chip lines. The Mx chips lost cores in order to increase the cache amount for each core specifically to try and address this problem. And it worked, to a point.

          The M7 chip design looks more like a T7 design, primarily due to the very large core count and low cache quantities per core. If it had been launched as a T7, I would not have been surprised at all and would have expected to see an M7 launched slightly later with fewer cores and bigger cache per core. But that isn't what Oracle have done.

          Also, have a look at the server designs and their resilience etc. You find an interesting story. I was very surprised to find out some time ago that a T3-2 server would DELIBERATELY reboot itself if a socket failed!! That's not resilience. This was to reconfigure all the I/O onto the remaining socket. However, if you design the implementation correctly, all I/O would be mirrored across the two sockets anyway, so I/O would be maintained. It actually rather seems like Oracle are putting resilience more into their software stack and not their hardware. In the event of a hardware fault, they expect to lose the hardware and simply fail over to another instance through software.

          The above is one solution to a problem, but it should always be borne in mind that software is normally the least reliable part of the stack and therefore deliberately using that for resilience is arguably not the best. As a last resort, fine, but hardware surviving faults is a good starting point first.

        3. Roo

          Re: POWER8 disappoints

          "BTW, the SPARC M6 is faster than the POWER7+ cpu"

          Depends how you measure it. Oracle have failed to provide CINT2006 & CFP2006 single thread results for 3 years and counting now, however they do provide the rate figures (spec.org explains the difference between the two types of benchmark in plain English on their website).

          The lowest common denominator between recent (ie: <2 years old) SPARC & POWER SPEC results seems to be the 16 core rate figures. Both boxes look to be of a similar physical size too. :)

          SPARC T5-1B int 489, fp 369 (Oct 2013 & Apr 2013)

          IBM Power 730 Express (4.2 GHz, 16 core, SLES) int 852, fp 575 (Feb 2013)

          Power7+ delivers 70% more int and 50% more fp in those 16 core 2U boxes... IMO the main reason for people to run a SPARC is that they can't run their binaries on something else, the performance argument just doesn't stack up, and it hasn't done for at least a decade.
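
          For reference, the percentages can be recomputed from the two result lines quoted above. This is a minimal sketch that only does the arithmetic on the figures as posted; it says nothing about how comparable the two submissions are.

```python
# SPEC rate figures exactly as quoted in the comment above.
t5_1b = {"int": 489, "fp": 369}   # SPARC T5-1B
p730 = {"int": 852, "fp": 575}    # IBM Power 730 Express

def percent_more(a, b):
    """Return how many percent larger a is than b."""
    return (a / b - 1) * 100

for metric in ("int", "fp"):
    gain = percent_more(p730[metric], t5_1b[metric])
    print(f"{metric}: Power 730 scores {gain:.0f}% higher than T5-1B")
```

          The exact ratios come out a little above the rounded 70% and 50% in the comment.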

          1. MadMike

            Re: POWER8 disappoints

            @Roo

            "...SPARC T5-1B int 489, fp 369 (Oct 2013 & Apr 2013)

            IBM Power 730 Express (4.2 GHz, 16 core, SLES) int 852, fp 575 (Feb 2013)

            Power7+ delivers 70% more int and 50% more fp in those 16 core 2U boxes... IMO the main reason for people to run a SPARC is that they can't run their binaries on something else, the performance argument just doesn't stack up, and it hasn't done for at least a decade...."

            Yes, but now you are comparing two sockets of POWER7+ vs one socket of SPARC T5. No wonder that two POWER7+ cpus beat one SPARC T5 cpu. Some would say that to be able to discern which cpu is the fastest, you need to compare one cpu to another cpu. Not two cpus to one cpu, nor three cpus to one cpu. Imagine someone compared ten SPARC T5 cpus to one POWER7+ and concluded that the SPARC T5 cpu is ten times faster?? Wouldn't that be a bit weird? Faulty logic, right?

            What are we discussing, which cpu is the fastest, or which core is the fastest? I thought we discussed SPARC cpus vs POWER cpus? Are you shifting discussion from cpus to cores and then back again ("one core is faster, therefore the cpu is faster"?). In my old antique car, the piston happens to move faster than the Ferrari F40 piston - therefore I conclude that my car is faster than the Ferrari F40. In my pocket I happen to have more money than you have in your pocket, therefore I conclude I also have more money on the bank. This is sound and correct logic to an IBMer, yes?

            How about this, four SPARC T2+ running at 1.6GHz matches fourteen (14) POWER6 cpus running at 4.7GHz in official SIEBEL v8 benchmarks. Because you randomly talk about cores, I might as well randomly talk about... GHz. You need 65.8 GHz in total POWER6 cpu to match 5.6 GHz of SPARC T2+. Because the SPARC T2+ is 11.75x more efficient when we talk about GHz, I conclude that the SPARC T2+ is also 11.75x faster than the POWER6. How about them apples? Sounds good to you?

            What's wrong with IBMers? This faulty logic that IBMers display is very common. I discussed with one IBMer here, and he said something like "one POWER6 core is faster than the Intel Xeon on linpack, therefore the entire POWER6 cpu is faster" - when I showed linpack benchmarks that Xeon was twice as fast as the POWER6 cpu. He insisted that the POWER6 cpu was faster, even though it got half the linpack score. "The core is faster, therefore the cpu is faster! It is true!". IBMers need to study some and learn about logic and reasoning, that's for sure. This is not the first time I have had such a discussion with an IBMer. "The core is faster, therefore the cpu is faster!"

            1. PowerMan@thinksis

              Re: POWER8 disappoints "Impostor" wants it both ways

              Impostor "MadMike" pretending to be "Mad Mike" wants it both ways. 8, 16 and 32 sockets are the way to get the greatest performance. Clearly the *sum* is the strength and not the parts that make it up, because when his single socket T5-1B with 16 cores is compared to another 16 core server, the excuse is that it is 1 socket and not two. A 32 socket server is the best not because it has 1024 powerful cores but because it has 32 sockets x 32 cores and its aggregated results. What "MadMike" doesn't want to admit but is obvious even if he won't acknowledge it (because the facts speak for themselves) is that a SPARC server with a 16 core chip is inefficient. He says you have to compare one CPU to another CPU to get a fair comparison - seriously, all you mention are benchmark results for a T5-8 or a M6-32 with 8 and 32 sockets respectively then compare that to a S824 with 2 sockets or a Power7 780 with 8 sockets or fewer depending on the benchmark. This is one reason why sockets influence performance and why per core performance is the common denominator.

              What you fail to grasp is that Power isn't always 2X, 5X, 10X (or more) faster than the competition. When talking about current generation processors, Power is not 10X greater than any of them. It is approximately 2X faster than Ivy Bridge and more over SPARC. It is however, 10X, 20X, even 50X more efficient. Thanks to the mainframe efficient hypervisor, how it dispatches and schedules workloads onto the resources it delivers performance and throughput. It's not about 8, 12, 16 or 32 cores per chip but the performance of each core. If those cores are located in different sockets, then the performance between sockets and so on. It's not the sum of all of those cores that is impressive but being able to allocate the cores needed for each workload.

              You've lost it on your rant of Power6 vs T2. Not worth responding to. I can't speak for IBMers or the one you reference. Have them contact me and I'll set 'em straight. What I will say is this though. Even if Power servers (ie cores) were slower than the competition, because of the efficiency afforded by PowerVM which exploits the Power Hypervisor it delivers greater results with a better TCO. Your comment about "The core is faster, therefore the cpu is faster" is only relevant if you were comparing the same size chips - 8 vs 8 or 16 vs 16. But, it is absolutely fair to compare a 2 socket 8 core/socket vs a 1 socket 16 core/socket server. At that point the comparison is on cores and server.

              1. MadMike

                Re: POWER8 disappoints "Impostor" wants it both ways

                To the IBM employee "PowerMan@thinksis":

                "...[I] says you have to compare one CPU to another CPU to get a fair comparison - seriously, all you mention are benchmark results for a T5-8 or a M6-32 with 8 and 32 sockets respectively then compare that to a S824 with 2 sockets or a Power7 780 with 8 sockets or fewer depending on the benchmark..."

                Que? I have never compared a 32 socket SPARC server to a 8-socket POWER server, or anything similar. In my benchmarks I always normalize and compare socket to socket. If I post benchmarks of 2-socket SPARC to 1-socket POWER or similar - I always normalize and divide the SPARC result by two. And THEN compare socket to socket.

                We are discussing which cpu is the fastest, aren't we? Why do you compare two POWER7 cpus vs one SPARC T5 cpu - and conclude the POWER7 is faster? You must normalize the benchmarks, do you not understand how to compare cpu to cpu?

                And BTW, I am not attacking you. You started calling me FUDer and employed by Oracle and what not. But in reality, it is you that FUDs (makes a lot of false comparisons to prove that POWER7 is faster, such as comparing two POWER7 vs one SPARC) and you are also working at IBM. I would not be surprised if IBM pays you to write this FUD. This is great, you attack me and ask me why I attack you? And then you post a lot of FUD, and ask me why I FUD? etc.

                You are welcome to post any benchmarks that show one POWER7 cpu is faster than one old SPARC T5. Remember, we are discussing cpu vs cpu, not socket vs socket, nor GHz vs GHz, or what not.

                Regarding SPARC M7, IBM has no chance, there are no winning IBM benchmarks. Show us a large IBM server beating a large SPARC server, if you can. M7 is 3-4x faster than SPARC M6 (which is already very fast), so surely it beats POWER7, POWER7+ and POWER8 and POWER9. If there ever will be a POWER9. As IBM has explained, AIX will be killed in favor of Linux. You better brush up your Linux skills. And it does not make sense to manufacture nextgen POWERxx servers, when everybody else is better (x86 is faster than POWER8, SPARC is many times faster, ARM is more power efficient, etc). First IBM will kill off AIX, then POWER.

                1. PowerMan@thinksis

                  Re: POWER8 disappoints "Impostor" wants it both ways

                  You are hopeless. Check out my two blogs at powertheenterprise.wordpress.com. I don't want to keep going back and forth as it bores the other readers not to mention me. For the record, I was kidding you about attacking me. I thought I made that clear in how I wrote it. Lastly, I encourage you to call the "Brett Murphy" who is at IBM. Look up his phone number at the IBM directory then call and leave him tons of vm. Others can find me at www.thinksis.com moonlighting from my day job :)

                  1. MadMike

                    Re: POWER8 disappoints "Impostor" wants it both ways

                    No, it is YOU who is like a frog without legs: hopeless.

                    ;)

      2. PowerMan@thinksis

        Re: POWER8 disappoints

        @MadMike seems to be nearly identical to the post of "kebabbert" at EnterpriseTech's article. Read his comment and my response. He is spewing nearly the same FUD there. He is a sharpshooter who shoots his mouth off with talking points just like Phil Dunn of Oracle does. http://www.enterprisetech.com/2014/08/13/oracle-cranks-cores-32-sparc-m7-chip/#comment-229928

    3. Roo
      Windows

      Re: POWER8 disappoints

      "IBM POWER8 is a big disappointment. One POWER8 socket gives 437 SPECint2006, and it gives 342 SPECfp2006:

      http://benchmarkingblog.wordpress.com/2014/04/28/awesome-power8-benchmarks-awesome-dessert/

      The SPARC T5 gives more performance than that, 467 and 436 for one socket:

      https://blogs.oracle.com/BestPerf/entry/20130326_sparc_t5_speccpu2006_rate"

      How bizarre, it looks like you are comparing base to rate figures, totally different benchmarks. All you have shown is that a single POWER8 *core* can get within spitting distance of a T5 running flat out with all cores blazing.

      1. MadMike

        Re: POWER8 disappoints

        @Roo

        "...How bizarre, it looks like you are comparing base to rate figures, totally different benchmarks. All you have shown is that a single POWER8 *core* can get within spitting distance of a T5 running flat out with all cores blazing...."

        I didn't get this. Care to explain a bit more? How can one POWER8 core match one SPARC T5 socket?

        https://blogs.oracle.com/BestPerf/entry/20130326_sparc_t5_speccpu2006_rate

        If I redo my numbers, look at the SPECint_rate2006, for the "Base" column, it says 3490 for 8-socket SPARC T5. This translates to 3490/8 = 436 SPECint_rate2006 for one socket.

        SPECfp_rate2006, for the "Base" column, it says 2770 for 8-socket SPARC T5. This translates to 2770/8 = 346 SPECfp_rate2006 for one socket.

        In both cases, 436 SPECint_rate2006 for SPARC T5 matches the value of 437 for POWER8. And 346 SPECfp_rate2006 matches the value of 342 for POWER8. Ergo, the old SPARC T5 and the new POWER8 have identical performance on SPEC2006 benchmarks if I pick the worst case numbers. In best case or worst case, the POWER8 core is not as fast as one SPARC T5 socket. So I don't understand your claim. Can you explain a bit more?

        I don't understand why IBM released the POWER8 when it doesn't even beat existing old cpus. Big failure from IBM. Again.

        And rumours say that POWER8 has scalability problems and cannot go above 16 sockets. Which makes it as fast as a P795, but in a smaller footprint. What is IBM thinking? Is IBM planning to kill off AIX after POWER8? Where is the POWER9 on the roadmap? It is not mentioned. The future for POWER and AIX looks grim:

        http://news.cnet.com/2100-1001-982512.html

        1. Roo
          Windows

          Re: POWER8 disappoints

          "It didnt get this. Care to explain a bit more? How can one POWER8 core match one SPARC T5 socket?"

          SPECfp & SPECint are *different* benchmarks from the *rate ones. One targets single thread performance the other multi-thread. It's the Apples & Oranges scenario again.

        2. PowerMan@thinksis

          Re: POWER8 disappoints

          Spewing more talking points and FUD. You aren't a "fanboi" as that implies loyalty to technology that may or may not be the best. You are dazed and confused...actually I would say you work for Oracle and are simply defending Larry's honor, or you work in their marketing department. Maybe you are the new guy and this is part of your hazing - "Get out there and get your butt kicked while saying this and this and don't forget this!".

          You can't be taken seriously when everything you say is an obvious attempt at disparaging competitive platforms with no substantial data. "Rumours say ..." - whatever!

          1. MadMike

            Re: POWER8 disappoints

            "...Oracle achieves world records and impressive numbers by one means and one means ONLY. They produce servers with excessive sockets and cores then aggregate the values for the given product and claim superiority. That is weak engineering, disingenuous marketing with diluted value to the customer. M6-32 with 32 sockets & 384 cores deliver 793,930 SAPS or 2,067 SAPS per core. P7 795 with 32 sockets and 256 cores deliver 688,630 SAPS or 2,690 SAPS per core. With these two, which delivers the greatest performance per core?..."

            Who cares about cores? We are discussing which cpu or server is the fastest. And who has the higher score? Which server is more powerful? Which cpu is fastest? Sure, if we are going to discuss "who has the fastest core" - but we are not. We are discussing which cpu is best. Which server is most powerful.

            .

            "....Spewing more talking points and FUD. You aren't a "fanboi" as that implies loyalty to technology that may or may not be the best. You are dazed and confused...actually I would say you work for Oracle and simply defending the Larry's honor or you work in their marketing department. Maybe you are the new guy and this is part of your hazing - "Get our there and get your butt kicked while saying this and this and don't forget this!"...."

            Again, I have never worked at Oracle nor Sun. I have always worked in Finance, and right now I am a researcher in algorithmic trading. I tried to get my huge company to buy more Solaris and SPARC, but failed. But you on the other hand, work at IBM, right? Are you getting paid to write this erroneous stuff?

            .

            "...You can't be taken seriously when everything you say is an obvious attempt at disparaging competitive platforms with no substantial data. "Rumours say ..." - whatever!..."

            I always post links as you can see, I would never post claims without links as you do. Just read the link I posted regarding "rumours say". A mathematician needs to prove his claims, and I do that with links. Hardly anything I say is my own conclusion, I just reiterate and post official benchmarks and other links that other people have written. You, on the other hand, make up a lot of weird stuff without any links. Your talk about "one core is faster, therefore the cpu is faster" - how on earth can you arrive at such a wrong conclusion? Have you not learned to reason and think critically at your university? Obviously not, as you display such crippled logic.

            1. PowerMan@thinksis

              Re: POWER8 disappoints

              No, "you" and the misleading SPARC sellers like to talk about who has the highest score while not focusing on how many cores and other resources it takes to achieve that number. I think it is great to be in 2nd place with Power servers on various benchmarks as it almost always shows a Power server with 2X fewer cores with a better score for those cores than the competitive server with 2 - 4X more cores (implying lower performance for those cores).

              Where are your links in this response? Check out my twitter "@PowerMan_SIS" and you will see I provide all kinds of quotes, pictures, charts, etc. These forum comment sections don't let us post anything but text which limits providing comprehensive sources.

              I do not understand why you have to attack me personally. I am just trying to have a discussion with you while informing the other readers.........................I'm just messing with you! :)

    4. Roo
      Windows

      Re: POWER8 disappoints

      "And look at the IBM road map. It seems a bit empty? What is there after POWER8? Nothing?"

      Good question. I suspect a fair amount of their power budget is expended in driving those massive I/O and memory bandwidth numbers, and it's a brutal game of diminishing returns... However IBM also punt massively scalable beasts like BlueGene/Q that deliver very close to peak performance - with decent power efficiency (3.7GF/W). POWER8 being opened offers the possibility of convergence on something like a BlueGene style building block with SoC customization (see POWER A2). POWER's future looks a lot more useful to people who want to run code faster and cheaper than Larry's boat wrecks.

    5. PowerMan@thinksis

      Re: POWER8 disappoints

      This is a repeat of what I posted at http://www.enterprisetech.com/2014/08/13/oracle-cranks-cores-32-sparc-m7-chip/#comment-229928.

      You made a lot of assumptions with nothing to back up any of your statements. Making the claim does not substantiate the claim.

      Oracle achieves world records and impressive numbers by one means and one means ONLY. They produce servers with excessive sockets and cores then aggregate the values for the given product and claim superiority. That is weak engineering, disingenuous marketing with diluted value to the customer.

      First proof point – Oracle sells software, primarily by core (yes, other means are available). Thus, it stands to reason that performance per core is crucial. Look at the SAP benchmark where there are Power7 795, Power8 S824 and the M6-32 results.

      M6-32 with 32 sockets & 384 cores delivers 793,930 SAPS or 2,067 SAPS per core. P7 795 with 32 sockets and 256 cores delivers 688,630 SAPS or 2,690 SAPS per core. With these two, which delivers the greatest performance per core? The entry level Power8 S824 server with 2 sockets and 24 cores delivers 115,870 SAPS or 4,827 SAPS per core. For additional comparison purposes, here is the 8 socket 128 core T5-8 that delivers 220,950 SAPS or 1,726 SAPS per core.

      Just about every SAP landscape that I see has a total SAPS requirement, but that isn’t a singular value. Also, the ability of the server to utilize the full SAP value of that platform is also critical – this is dependent on the ability of the hypervisor, server technology and OS to be able to drive utilization that is useful. What this means is that a Power8 24 core server is quite likely to be able to deliver 115K SAPS. An x86 server is not likely to be able to deliver its full SAPS. The weak hypervisor used in M6 & T5, along with the less capable SPARC chipset which includes the less efficient CMT (threads), makes it less likely for these servers to be driven as high as Power. I’ll hold off on critiquing Domains and LDOMs pending your desire to engage in that discussion as I look forward to that.

      Looking at the per core results of M6-32, I deliver more per core with the 4 year old P7 795 server, and IBM’s entry level 2 socket 24 core S824 delivers 2.8X higher performance per core over the “Solves World Hunger” Oracle SPARC T5-8 and 2.3X over the M6-32 “King of the Hill” server. To put this in perspective, if we normalize the performance of the S824 with its 24 cores, that is equivalent to approximately 67 T5-8 SPARC cores or 56 x M6-32 SPARC cores.
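
      The per-core arithmetic in this comment can be reproduced from the quoted SAPS totals. The sketch below uses only the figures posted above; the T5-8 core count is taken as 128 (8 sockets x 16 cores), which is the count the quoted 1,726 SAPS per core implies, and the 2.8X ratio lines up with the T5-8 per-core number.

```python
# (cores, SAPS) as quoted in the comment.
# ASSUMPTION: T5-8 has 128 cores, as implied by 220,950 / 1,726.
results = {
    "M6-32": (384, 793_930),
    "P7 795": (256, 688_630),
    "S824": (24, 115_870),
    "T5-8": (128, 220_950),
}

# SAPS per core for each server.
per_core = {name: saps / cores for name, (cores, saps) in results.items()}
for name, value in per_core.items():
    print(f"{name:7s} {value:8.0f} SAPS per core")

# Per-core ratio of the S824 against the two SPARC systems.
ratio_t5 = per_core["S824"] / per_core["T5-8"]
ratio_m6 = per_core["S824"] / per_core["M6-32"]

# How many SPARC cores the 24 S824 cores are worth at that ratio.
s824_cores = results["S824"][0]
print(f"S824 vs T5-8:  {ratio_t5:.1f}x, ~{s824_cores * ratio_t5:.0f} T5-8 cores")
print(f"S824 vs M6-32: {ratio_m6:.1f}x, ~{s824_cores * ratio_m6:.0f} M6-32 cores")
```

      This only normalizes published totals by core count; it does not account for differences in clock speed, threads per core or benchmark configuration.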

      If you are a SPARC customer then you have fallen for their smoke and mirrors marketing that performance increase comes by doubling the cores. That simply doubles your software costs. If you work for Oracle then I expect you to sing from the Larry song book :) What is impressive with IBM’s Power8 technology when you see that it is 2X the performance over Power7 and x86 is that they are doing that on a per core basis. Using the Oracle sizing method we claim we have 3X the performance over Power7 and x86 but that is because we are going from 8 cores in Power7 to 12 cores in Power8. That means nothing unless you pay for software licenses by the socket; otherwise performance per core and the # of cores per socket are very important.

      Hopefully you see the error of your ways. I am an IT business partner, Architect and evangelist on Power technologies. I also worked at Sun for 10 years, where I was a cluster and storage specialist (also StorageACE), and was an instructor in the Army on SunOS/Solaris servers. I say this to say that I am a fan of Solaris. I love and miss Sun – great people and culture. Plus, I have experience on the platform. I am not speaking here as a seller trying to persuade you or the reader that I’m right because of talking points. I continue to challenge Oracle to a live, face to face technical debate that we can record and publish for all to see and hear. We can both have a whiteboard to aid our discussions. We should have a panel of independent industry experts who can award points for winning each topic, because we know there are some people (i.e. you) who will never accept defeat even if proven wrong beyond the shadow of a doubt.

  10. Freddellmeister

    M7 is a rebadged T7!

    There is no M7. To hide the fact that M7 development was cancelled after the current M5/M6 fiasco, Oracle renamed the T7 to M7 (and suitably dropped the T7). More interestingly, this will allow the same server designs to be used across the line, and the M5/M6 monster can be taken out the back. Oracle can finally fire the old high end Sun design team that did not produce a single viable server design for 10 years after the E25k. One product (M5) in 10 years is not acceptable by any standard.

    Look at the core strength, core count and pitiful cache size: it is clear that it will not be able to hold a candle to Power 8. Compression? Well, POWER7+ and POWER8 already have HW assisted compression/decompression.

    Unmatched 1024 core scalability? Well IBM already announced the "TrueNorth" 4096 core chip.

    1. MadMike

      Re: M7 is a rebadged T7!

      Well, the POWER8 is 2x as fast as POWER7.

      SPARC M6 is faster than POWER7, and this SPARC M7 is 3-4x faster than M6. Ergo, M7 should be at least twice as fast as POWER8.

      Show us benchmarks instead of talking about details. "Where is the money" - prove your point with hard facts benchmarks instead of your opinions and wishes.

      1. PowerMan@thinksis

        Re: M7 is a rebadged T7!

        Twice as fast based on....what? The M6 has 2 benchmarks that I can find; SAP & EBS. Power8 E870/E880 has been announced for 15 days and has results for SPECint/fp (rate+base), SPECjbb2013, and SAP S&D 2-tier.

        Let's be clear, as you continually espouse this lie. SPARC is not faster than Power. SPARC has larger servers than Power, which produce larger results. You then market that as being faster. Since most software licenses by the core, especially at Oracle where you work, it really doesn't matter whether a server has 1024 cores; what matters is whether each core is the strongest it can be. This is why SPARC benchmarks are few and sporadic. When there are results they are cherry picked (not a bad thing btw) and often in benchmarks that the competition doesn't use or recognize, or that are "Oracle Internal". Further, Oracle marketing makes its usual overstatements and misstatements (i.e. lies), such as 4 T2+ processors beating 14 Power6 processors, when the T2+ processor is a socket and the 14 Power6 processors are cores. Thus it is really 32 vs 14. Mistake? Possibly, if there weren't many, many more examples like this.

        For the public reader: the E870 is producing 996 Users & 5451 SAPS per core. An M6-32 delivers 365 Users & 2067 SAPS per core. Btw, the 640 core Fujitsu M10-4S is just 239 Users & 1319 SAPS per core. What did you say the M7 will be? How is the M6 faster than a Power7? Oh, I didn't show it, did I? Here it is for a 780: 594 Users & 3247 SAPS per core. Hmm, the M6 isn't faster than Power7 and isn't even close enough to suck the exhaust of the Power8 machine, it is that far behind.
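
        For anyone who wants to check the ratios implied by those per core figures, here is a minimal sketch. It uses only the numbers quoted in the comment above; the dictionary layout and the helper name are illustrative, and the figures themselves have not been verified against the published SAP results.

```python
# Per-core SAP figures as quoted in the comment: (users, SAPS).
# These are the poster's numbers, not independently verified.
per_core = {
    "Power8 E870": (996, 5451),
    "Power7 780": (594, 3247),
    "SPARC M6-32": (365, 2067),
    "Fujitsu M10-4S": (239, 1319),
}

def saps_ratio(system_a: str, system_b: str) -> float:
    """Per-core SAPS of system_a divided by per-core SAPS of system_b."""
    return per_core[system_a][1] / per_core[system_b][1]

# Compare every system against the M6-32 baseline.
for name in per_core:
    print(f"{name}: {saps_ratio(name, 'SPARC M6-32'):.2f}x the M6-32 per core")
```

        On these figures the E870 comes out at roughly 2.6x the M6-32 per core and the Power7 780 at roughly 1.6x, which is the gap the comment is pointing at.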

        Solaris on SPARC is not bad. It's solid. If a customer wants it, fine. However, if a customer wants the maximum performance, availability, reliability, flexibility, security and virtualization features that control software license costs, with the lowest cost of ownership, they will choose Power. It's not competitive, it's not inflammatory, it's not provocative, it's a fact. I am happy to have a public debate against Oracle to prove all of this. Bring your best DE, PE, SE or whoever and get it on. We will record it for public airing and have an independent panel consisting of a customer, an academic and an analyst to arbitrate.
