Boffin stacks 16 PS3s to simulate black hole collisions

When most of us arrived home with our newly purchased PS3, we couldn't wait to start annihilating aliens in Resistance: Fall of Man or kicking butt kung fu-style in Virtua Fighter 5. Not astrophysicist Gaurav Khanna - he used his to build a supercomputer. Khanna now owns a total of 16 PS3 consoles, all linked together to …

COMMENTS

  1. Brian Miller
    Linux

    Love that Cell chip

    Now if only I could get it without the game console packaging.

  2. Jon
    Thumb Up

    But the WII is so much better*

    *not

  3. Sarah Baucom

    Misleading

    It's a bit misleading to say that the PS3 is faster than the fastest desktop machines. It really depends on what you are doing. The PS3 is very fast at certain things, but it is also limited by only 256MB of RAM, and slow double-precision floating point operations (it's very fast at single-precision, but other CPUs can be much faster at double). This cluster is twice the size of the one at my school, but we had the first academic PS3 cluster over a year ago :) - http://moss.csc.ncsu.edu/~mueller/cluster/ps3/coe.html

  4. Anonymous Coward
    Anonymous Coward

    He should see

    If he can get it added to the folding at home app on the PS3.

  5. E

    Excellent

    That is better and smarter use of money than a wall of x86_64 blade servers.

  6. Alexander Franceschi
    Black Helicopters

    And Saddam couldn't get a PS2

    If people remember when the Playstation 2 was released, exports to Iraq were deemed forbidden. Allegedly, the idea was the powerful Playstation 2 would be used to build a powerful Iraqi supercomputer! I'll even quote a Register article, http://www.theregister.co.uk/2000/12/19/iraq_buys_4000_playstation_2s/

    Surely, we must limit Playstation 3 exports to Iran!

  7. Jason Togneri
    Joke

    And the best bit

    When they're finished their project for the evening, they can knock off and have a bit of a play!

  8. Anonymous Coward
    Thumb Down

    Well, what else can you do with a PS3...

    ...when the games are so crap?

    Sony certainly have plenty of them going spare, maybe they should donate them to research?

  9. Anonymous Coward
    Flame

    @ David Corbett

    Go f**k yourself, troll

  10. Anonymous Coward
    Anonymous Coward

    Future headline

    After the earlier comment about the Wii, I'm just looking forward to this headline:

    "Boffin stacks 16 Wiis to simulate big crowd of idiots waving their arms about."

    Now that'll be science!

  11. paulc
    Alert

    wow

    I didn't know they were THAT heavy...

  12. Andy Worth

    I'm sure it won't be long......

    .....Before someone comes along and says that an XBox 360 cluster would do the job better. I was actually surprised not to find a comment in the first 5 when I read them.

    And for the price, I doubt he could make a better cluster, at least for the purpose HE needs. Of course it's a theoretical simulation of an event using equations and constants that are no more than guesswork in the first place, so the validity (and usefulness) of the results is questionable to say the least.

    Either way, I wish him luck :)

  13. Joe K

    @sarah

    I'd say it's way faster, thanks to the SPE units, which when programmed correctly act better than a multicore processor, and the PS3 has 7 of them as well as the main CPU.

    Just a few SPEs can raytrace in real time, with NO help from the graphics chip either:

    www.youtube.com/watch?v=oLte5f34ya8

    Once programmers get used to multithreaded game code and farming out performance-enhancing methods to the SPEs, things are really gonna get interesting.

    SPE usage so far in games:

    http://boardsus.playstation.com/playstation/board/message?board.id=ps3&thread.id=2575300

  14. Eponymous Cowherd
    Thumb Up

    @Jon

    Congratulations on winning the dumb-ass comment of the week award.

  15. Dirk Vandenheuvel
    Thumb Down

    PR

    "Overall, a single PS3 performs better than the highest-end desktops available and compares to as many as 25 nodes of an IBM Blue Gene supercomputer"

    ROFL... I smell bullshit.

  16. Anonymous Coward
    Coat

    @ Jon

    Quote: "But the WII is so much better, not"

    Oh deary me, you appear to be stuck in the 1990's. Much like the gameplay on the PS3.

    Here's your jacket, I believe it's the one with "Backstreet Boys: I want them THAT way!" written on the back.

  17. Monkey
    Joke

    But you know it doesn't count as...

    ... A proper blu-ray player!

  18. Oscar
    Go

    ... imagine if everyone did this ...

    Ok ... so Sony sell the PS3 at a loss .. right? So what would happen if EVERYONE bought a PS3, did this with it and didn't buy any games?

    /me dreams

  19. Anonymous Coward
    Anonymous Coward

    But this is what the PS3 is designed for, not games.

    The Cell processor simply isn't a very good architecture when it comes to gaming, Sony merely used it in the PS3 to bring its costs down for other more profitable areas (like selling high end research machines).

    It has too many cores that are too specific in task to be usable for the constantly changing data of computer games, as users input commands via the controller it's impossible to predict what needs processing next and hence which SPU should handle it, as such a lot of resources are wasted deciding where data has to go and even then some or possibly even all SPUs may not be able to handle such type of data processing as is required at that point in time and will end up unused for that period of time anyway.

    What the Cell is good for though is this type of project, where you just have a LOT of data that needs crunching. It's perfect for data that is static and isn't going to change much, if at all, so that you can divide the data up before you start processing, dish it out to the SPUs and let it churn through them in its own time, without worrying about them ever being idle or about deciding what data needs to go where until the data is all done and processed.

    The fact the PS3 only has a GeForce 7800 equivalent graphics card and pushed Bluray so heavily, coupled with having a Cell processor rather than something better suited to gaming like a CPU with 2 to 4 generic cores is evidence enough that Sony weren't too bothered about gaming with the PS3 and that their main goal was to make hardware designed for their other business paths cheaper by using consumers as a tool to bring prices down.

    Whilst the PS3 has the most expensive hardware out there, when it comes to applying hardware to gaming the PS3 only sits somewhere roughly between the 360 and the Wii power wise but sits on top for scientific application like this.

    This isn't to say it matters; the PS2 was pretty low spec last gen in comparison but it still came out on top. However, Sony's arrogance in believing consumers would act as a tool to bring Cell costs down and Bluray mainstream may have cost their console market dearly, even if it's helped Bluray and Cell, judging by the relatively low software and last place hardware sales right now.

  20. HFoster
    Linux

    Yay Linux

    <insert Linux/FOSS fanboy commentary here>

    But seriously, this is the kind of inventiveness that needs greater publicity in this world of End Users. I'd be interested to see PS3/COTS hardware put to use in more and more "hardcore" computing projects. Powered by Linux, of course.

  21. James

    oh dear....

    David Corbett...it was only a matter of time before the mindless fanboys came out.

    @Sarah Baucom - It's true that the double-precision performance is an order of magnitude slower, but it's still cheaper to buy a blade rack or several PS3s to match the performance than it is to buy a super-performing cluster.

    And memory isn't such an issue. The whole idea behind parallel computing is to break one big task into smaller sub-tasks. This also means breaking up the memory used into smaller fragments. Ideally, you should be loading only the data you need at a certain point before kicking out the results over the network to free up the memory before continuing.

    Or, you could use more than 256MB of memory space, but it would be paged to disk anyhow under Linux. Of course, if your school has had a PS3 cluster for a year now, you'd know all about that stuff.
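
    As a rough illustration of the "load only what you need" idea above, here is a minimal Python sketch. The chunk size and the per-chunk work (a sum of squares) are made up for the example; it is not the code the cluster actually runs.

```python
from typing import Iterator


def chunks(total_items: int, chunk_size: int) -> Iterator[range]:
    """Yield index ranges small enough to fit in one node's memory."""
    for start in range(0, total_items, chunk_size):
        yield range(start, min(start + chunk_size, total_items))


def process_chunk(indices: range) -> int:
    """Hypothetical per-chunk work: here, just a sum of squares."""
    return sum(i * i for i in indices)


def run(total_items: int, chunk_size: int) -> int:
    # Only one chunk is live at a time; each partial result is folded in
    # (on a real cluster it would be sent over the network) and then dropped.
    total = 0
    for indices in chunks(total_items, chunk_size):
        total += process_chunk(indices)
    return total


if __name__ == "__main__":
    print(run(1_000_000, 65_536))
```

    The working set stays at one chunk no matter how big the whole problem is, which is why a small per-node RAM limit need not be fatal for this kind of job.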

  22. Anonymous Coward
    Anonymous Coward

    @Sarah Baucom

    I think it's fairly obvious from the article that he means 'performs better than the highest-end desktops <for this type of calculation>'. Basically, he's using the graphics processor to do maths (which is what graphics processors do very well).

  23. Anonymous Coward
    Stop

    @ David Corbett

    Well done on being the first person to turn this thread into something negative. I have to ask why you bothered to read the article, did you just see PS3 in the title and rub your hands together gleefully? Grow up.

  24. Steve
    Thumb Up

    Cell chip

    If cell chips are so good for certain types of calculation, why can't we get them on a riser card and stick it in a PCI-E slot.

    Surely it could be used for in game physics engines, and of course academics could make use of it without having to bastardise games consoles.

    Does a PS3 contain just one cell? And the 16 machines in that cluster are communicating over gigabit ethernet.

    A single PC with two PCI-E x16 slots could probably have that many cells stuck inside. It might need an upgrade to its cooling solution, but they'd get higher intercommunication bandwidth, lower latency and a bigger pool of memory (shared) to access.

  25. Anon Koward
    Coat

    If they did this on the Wii..

    They could use the controllers to simulate black holes slamming into each other or perhaps have Mario try and escape from the gravity well!

    /my coat's the one made up of strings...

  26. Matt Bucknall

    @Brian Miller

    http://www-03.ibm.com/systems/bladecenter/hardware/servers/qs21/

  27. Mark
    Paris Hilton

    re: Misleading

    "but it is also limited by only 256MB of RAM"

    The PS3 has 512MB of RAM. Whilst it's split between 256MB for the Cell and 256MB for the RSX, both can use the other's memory pools. In a non-graphical Linux environment, you should be able to use most of the RSX's memory too.

    Paris Hilton Loves Wii, because all her friends have one.

  28. James

    RE:But this is what the PS3 is designed for, not games.

    I have to wonder, Mr Anonymous Coward, do you actually have any experience in programming games? Because if you did, and had experience in programming the PS3, you would know that the problem you've described just doesn't happen.

    The PPU is what is used to schedule jobs on the SPUs, and it's this bit that is a little more suited to overly complex decision making. In fact, the SPUs themselves can also decide what piece of data to process and HOW to process it, since SPU code is really just data too. It's not like the PS2 where you could only feed data to VU0 and VU1 with no real logic control beyond basic loop-branching.

    Just because they don't have a deep and costly branch prediction scheme doesn't mean that they're inefficient at conditional code.

    A branch miss on an SPU only costs around 6 cycles. Compare this to something like the oh-so-wonderful general purpose processors such as the modern-day Pentium class, where misses can range from just a few cycles to a hundred.

    Right...I have a bug in my SPU code to fix.

  29. Anonymous Coward
    Gates Halo

    SOH bypass?

    @ James & the AC

    James: "David Corbett...it was only a matter of time before the mindless fanboys came out. "

    AC: "Well done on being the first person to turn this thread into something negative.."

    Actually I think that 'accolade' can go to Jon, with comment #2.

    .

    @ the other AC

    AC: "Go f**k yourself, troll"

    1/10, must try harder.

  30. Nick

    RE: But this is what the PS3 is designed for, not games

    "The Cell processor simply isn't a very good architecture when it comes to gaming" - Really? My day job would argue otherwise, but I'm sure you have good reasons...

    "It has too many cores that are too specific in task to be usable for the constantly changing data of computer games" - You can never have too many cores now that the Gigahertz era of CPU design is over. SPEs are not really that specific, they are very good at the sorts of things a game engine needs - physics, scene management, culling, in fact a lot of things that can aid the RSX, in contrast to a PC graphics card which tries to do everything to relieve the general purpose CPU. Data in computer games usually consists of predefined audio, 3D models, textures, etc, it doesn't change constantly.

    "as users input commands via the controller it's impossible to predict what needs processing next" - Everything. For example a racing game, your controller input affects the steering angle, which needs to be fed into the physics engine. The physics engine is running regardless, it's just a different input.

    "and hence which SPU should handle it, as such a lot of resources are wasted deciding where data has to go" - SPUs are identical, a good scheduler will route work to where there is idle time. The time taken for this is negligible and anyway you handle this on the PPE, which is what it's designed for.

    "and even then some or possibly even all SPUs may not be able to handle such type of data processing as is required at that point in time and will end up unused for that period of time anyway." - You've written the SPU code and compiled it, so by definition both you and the compiler know it will run on the SPU. If you find yourself in this situation, you've broken your compiler.

    I've never seen a post that displays such complete misunderstanding in pretty much every phrase. It is obvious you have never programmed either for the Cell, or worked on any published game, given your complete lack of knowledge about both, so I have to wonder why you have gone to such lengths to post what you have. Presumably you are trying to make yourself feel better about purchasing a rival system? Why don't you just go and play it?
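
    For anyone who wants the scheduling point above in concrete form (identical workers, work routed to whichever one is idle), here is a minimal Python sketch. Threads stand in for SPUs, the worker count of 6 and the "square a number" job are made up, and it uses a pull-style work queue rather than a PPE pushing jobs out, so treat it as an analogy and not as Cell code.

```python
import queue
import threading

NUM_WORKERS = 6  # assumption: stand-ins for the identical SPUs


def run_jobs(jobs):
    """Hand jobs to whichever worker is idle; return results in job order."""
    work = queue.Queue()
    results = [None] * len(jobs)
    for item in enumerate(jobs):
        work.put(item)

    def worker():
        while True:
            try:
                index, value = work.get_nowait()
            except queue.Empty:
                return  # nothing left to do, so this worker goes idle
            results[index] = value * value  # hypothetical job: square a number

    threads = [threading.Thread(target=worker) for _ in range(NUM_WORKERS)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    print(run_jobs(list(range(10))))
```

    Because every worker runs the same code, nothing has to decide "which SPU can handle this"; a free worker simply takes the next job.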

  31. Steve

    @ Andy Worth

    "Of course it's a theoretical simulation of an event using equations and constants that are no more than guesswork in the first place, so the validity (and usefulness) of the results is questionable to say the least."

    I hardly think that general relativity theory counts as "no more than guesswork". If you want experimental confirmation of its predictions, try sending a signal to a satellite without accounting for the frequency shift.

  32. Joe Cooper

    RSX Memory

    "The PS3 has 512MB of RAM, whilst it's split between 256MB for the Cell, and 256 for the RSX both can use the other's memory pools. In a non graphical Linux enviroment, you should be able to use most of the RSX's memory too."

    That's just not how it works. The other memory pool is ~governed by~ the RSX, and needs to be accessed via the video driver or some graphics API.

    Of course, if NVidia drivers are available for it, I suppose you could write an OpenGL based swap FS driver... But I don't know if a kernel module would be able to access OpenGL APIs... Never done that sort of thing.

    So practically the thing is limited to 256 megs of RAM no matter what. That makes it useless for desktop work: Typical desktop Linux is a pain in the ass with 256 megs of RAM.

    But when it comes to single precision floating point ops, the cell processor is an epic badass.

    (And just so ya'll know, that is extreeeemely valuable for gaming.)

  33. Gary McCabe
    Happy

    Nonsense! The Wii can do this.

    My Wii can simulate black holes, and in a graphically superior manner as well. Only the other night, there it was- with Mario running about and trying not to fall in.

    :

    gary

  34. Anonymous Coward
    Anonymous Coward

    What you are forgetting

    What those that write games for the PS3 are forgetting is that the Linux environment Sony let you use is controlled by their hypervisor technology, which somewhat restricts what hardware resources you can use.

    For a start, you can't access the RSX graphics system at all. Graphics can only be done through a framebuffer driver. When there was a hack, new Sony firmware swiftly closed it. I assume Sony don't want any games avoiding paying Sony their cut by running in Linux. That also means you really are stuck with 256MB of RAM. Still, for massively parallelised number crunching like this, that's probably not too big a problem.

    So the only real difference between a normal cluster node and using the PS3 is the Cell chip and its SPEs, which can certainly have their advantages.

    As a games platform it's definitely the most powerful. But then who owns a console for its specs? You buy it to play games and that's where at present it certainly deserves some criticism.

  35. Stan

    Trolls...

    ...funny, they've got nothing in the article to troll at, so they're making troll noises at each other instead. Maybe it's some kind of troll mating call.

    Good luck to the guy with his project, and if he gets the time how about adding a few inputs to the big sums for an intergalactic black hole billiards sim :)

    Got to go and read up on the cell again, that is one serious MF of a processor. Around the same time the PS3 was released IBM had a server on offer with the cell processor for something like 3000 squids, not sure if and how much now though. If anyone dares to suggest the 360 can challenge the PS3 for big sums power they can suck IBM's big, fat blue one ;)

    cheers

  36. Roland
    Coat

    Re: imagine if...

    <quote>

    Ok ... so Sony sell the PS3 at a loss .. right? So what would happen if EVERYONE bought a PS3, did this with it and didn't buy any games?

    /me dreams

    </quote>

    We could write some software for this problem and run it on this cluster?

    ;-)

    Nah, don't need a coat, got my portable patio heater with me.

  37. Andrew
    Alert

    If only...

    This sort of thing wouldn't happen if only someone would actually release a decent game for the system.

  38. Iain

    @Mark

    While the main Cell SPUs can indeed communicate with the RSX to read from the latter's 256MB pool, it's PAINFULLY slow to do so - 16Mbps. No, that's not a typo; it would be faster to use swap on the hard drive.

    There was a whole bunch of scaremongering around launch about this meaning the PS3 was 'broken', but in reality it's just not something that you ever do in practice. Also, as the AC notes, it's something only game coders are allowed to do, as the hypervisor for the Linux environment blocks off that particular pool completely.

    Gravity simulations, like game physics calculations and rendering, involve a large number of iterative calculations on reasonably small datasets, and so are ideally suited to the Cell architecture.

    Which is all a bit of a pity, as it's just _so_ tempting to make a joke relating to the fact that a PS3 is even heavier than the original XBox, and the rest of my post is rather boringly sensible now.

  39. Anonymous Coward
    Anonymous Coward

    An XBox 360 cluster would do the job better.

    So there.

  40. Iain

    @Andrew

    There are several games on the PS3 that qualify as 'Decent'. There may, or may not, be anything that appeals to you personally, but the suggestion that none of them are good on the semi-objective criteria that reviews use is silly.

  41. Anonymous Coward
    Anonymous Coward

    Plainly anal

    Who TF needs to simulate a black hole? Game of Life is what it's about.

  42. Tim Lake

    @ Andrew

    Call Of Duty 4, Burnout Paradise, Gran Turismo Prologue, Uncharted, Singstar, Resistance.

    These are some of the decent games you have clearly missed. Maybe they were on a shelf too high for your little troll arms

  43. E

    @Steve

    No PCIe Cell cards because the vendors are milking it for every dollar they can when used in a computation market. I've pursued companies that sell the beastie in rack mount boxes for a PCIe card product, but all I get back is "Oh, we have this lovely development system for you to buy - it's only $15000", or "No, you have to buy a blade system! It's the only way!"

    For a product that was launched with much hype about network- and ubiquitous computing the product lines are a bit of a disappointment.

  44. Anonymous Coward
    Joke

    RE: An XBox 360 cluster would do the job better

    ...well it would be more fun, you could get your mates round to bet on which one would RROD first. With 16 in the same stack you'd be pretty much guaranteed to have one die before you could get the beers in...

  45. Webster Phreaky
    Jobs Horns

    20 Stacked iPhones to equal 1 giant Etch-o-sketch ...

    and one old-fashioned crank-up tele.

  46. Anonymous Coward
    Alert

    @Andrew

    This sort of thing would happen with 360s if they didn't keep dying.

  47. Anonymous Coward
    Anonymous Coward

    Mmmmmmmmmmmmmmm, I'm lovin' it

    Now that's smart.

  48. W. Anderson

    story on PS3 cluster

    The key consideration noticeably missing from those who advocate using the Nintendo Wii or Microsoft Xbox 360 (other than in jest) is the use of GNU/Linux - which is fully suitable for such a project, but totally unavailable on either alternative platform, and certainly infinitely more powerful and scalable than any OS from Microsoft - period.

  49. Tom

    @ Steve,

    You show me a riser kit with the bus spec required. There'd be nothing that could stop the overflow from the bottleneck produced, save a stupidly humungous L2 cache, but then you'd still be waiting for that to clear.

  50. Daniel B.
    Boffin

    Cell

    Ah ... I just hope this takes off well enough so we can get Cell-based servers ... or even better, Cell-based PCs and finally break away from the bloody x86 curse.

    Shame on Apple for ditching PPC and killing the last mainstream non-Intel desktops!! At least the PS3 can be made to work like that ;)

  51. Brian
    Coat

    xbox 360's XNA kit

    Technically, couldn't you write your simulation code (presumably with the cluster/sync code) in XNA? Oh wait, they don't allow network code in XNA. MS screwed the pooch again.

  52. Anonymous Coward
    Happy

    Greedy..

    Ignoring all the "good idea/bad idea" flameboi comments, I think this approach is just greedy. 16 PS3s? Please can I have one and the boffins can cope with a 1/16 reduction in performance?

  53. Highlander

    @ the usual suspects

    Oh look, the usual suspects showed up to turn this into a console war thread. Guys, the 'war' is over. All three consoles are here and have good games. Anyone peddling the myth that the PS3 doesn't have good games is living in a world of their own making.

    @Sarah, Cell BE is very fast on single precision and is still faster than any desktop chip for Double precision. IIRC each Cell at about 3.2GHz hits in the region of 250GFLOPS single precision and 25GFLOPS double precision. Yes, the SPEs are optimized for SP work. However, apart from the most recent quad core (and multiple execution units per core) x86 processors I'm not aware of any existing x86 that comes close to Cell floating point performance. There is a revision of the Cell that further optimizes floating point math vastly improving the performance in double precision work.

    @the AC who continues the blather about the PS3 suffering because Cell wasn't designed for gaming. You sir are a fool. Cell BE was designed primarily for the game console market. It was a custom design job by Sony/Toshiba/IBM as a consortium, specifically for the PS3. It so happens that the high performance architecture they created is also suited to HPC. The RSX is not the GeForce 7800, it's a derivation from that architecture. There is additional hardware onboard as well as design revisions specifically targeted at the needs of a game console architecture.

    You're utterly wrong to state that something with two or four generic cores would be better suited to gaming. It would be easier on day 1 to use, but it would most certainly not be better suited.

    @Iain

    Sony specifically wanted to segregate the memory of Cell and the memory of the RSX. Cell can write through RSX to the GPU memory far faster than the 16Mbps read performance. There aren't many occasions when you are going to want the CPU to read directly from the GPUs memory. The only time when this becomes an issue is when you try to make the PS3 become something it's not. How many times have we ever heard of PC applications where the CPU is reading the GPU's memory directly?

  54. J
    Linux

    @Joe Cooper

    "Typical desktop Linux is a pain in the ass with 256 megs of RAM."

    Probably not "typical desktop Linux" at all, eh? I would suggest the boffin in question did NOT just download the latest Ubuntu copy and just use that as is with eye candy and all that. You know, I use a Beowulf Linux cluster here at the lab, and all that has is console access... And consequently does not need to run OpenOffice or Firefox either.

    Too bad my type of scientific app would not do with so little RAM (genome assembly and molecular DB searches can be quite hungry in that area). Otherwise, it would be interesting to justify a bunch of PS3s in a project's budget...

  55. PaulM

    The PS4 will have 30 times the supercomputer performance of the PS3

    I attended a public talk by a Playstation 3 (PS3) games developer in which he said that future IBM Cell processors would contain hundreds of Synergistic Processing Elements (SPEs). He did this in order to emphasise the importance of using as many SPEs as possible when writing very high performance games.

    It is a reasonable assumption that the PS4 will contain hundreds of SPEs. In PS3 Linux only 6 SPEs are available to the user. If I assume that the PS4's Cell Processor will have 200 SPEs then this would imply that the PS4 will perform scientific calculations around 30 times faster than the PS3.

    Note that I am assuming that the PS4's Cell processor will be clocked at the same frequency as that in the PS3 to ensure that PS3 games will run on the PS4.
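
    The "around 30 times" figure is just the ratio of SPE counts. A quick Python check, assuming perfectly linear scaling at equal clocks (which real code rarely achieves) and taking the 200-SPE figure as the guess it is:

```python
PS3_LINUX_SPES = 6      # from the comment: SPEs available under PS3 Linux
ASSUMED_PS4_SPES = 200  # the comment's assumption, not a published spec


def ideal_speedup(new_units: int, old_units: int) -> float:
    """Speed-up if throughput scales linearly with SPE count at equal clocks."""
    return new_units / old_units


print(round(ideal_speedup(ASSUMED_PS4_SPES, PS3_LINUX_SPES), 1))  # 33.3
```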

  56. Highlander

    @J

    Talk to IBM about their Cell based blades....they *don't* come with a 256MB memory limit.

    Absolutely agree with your comment about typical desktop Linux, but I also have to point out to everyone that Linux was once touted as having an incredibly light memory footprint, what has happened there then? Since when did Linux need more than 256MB to manage a simple desktop for users? Ah, you'll be worried about the frame buffer size? 2MP at 4 bytes per pixel is still only 8MB. Even triple buffered that's 24MB. It's not like you're going to want to run texture intensive games under Linux on your PS3. Honestly, people seem to forget that 2GB per desktop is an obscene amount of RAM for a simple desktop. Only in Microsoft's world does it require 1GB to run the OS and a second GB to ensure you have sufficient RAM to run your standard office applications. Windows plus Office used to run very well on systems with 4MB of system RAM and maybe 1MB of video memory. Have we lost the plot that much that we can't run a word processor in less than a quarter gig of RAM?

    Sheesh!
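
    The frame buffer sums above are easy to check. A small Python sketch, assuming "2MP" means a 1920x1080 display at 4 bytes per pixel:

```python
MIB = 1024 * 1024


def framebuffer_bytes(width, height, bytes_per_pixel=4, buffers=1):
    """Memory needed for `buffers` full-screen frame buffers."""
    return width * height * bytes_per_pixel * buffers


single = framebuffer_bytes(1920, 1080)
triple = framebuffer_bytes(1920, 1080, buffers=3)
print(f"{single / MIB:.1f} MiB single, {triple / MIB:.1f} MiB triple")
# prints "7.9 MiB single, 23.7 MiB triple"
```

    So roughly 8MB and 24MB, a small slice of 256MB either way.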

  57. Anonymous Coward
    Linux

    @ Steve [RE: Cell Chip]

    "If cell chips are so good for certain types of calculation, why can't we get them on a riser card and stick it in a PCI-E slot."

    http://www.linuxdevices.com/news/NS6832279023.html

    ..or they could have just used one of these -

    http://www.linuxdevices.com/news/NS3591350722.html

    (but it was probably cheaper to use the PS3's!)

  58. Michael H.F. Wilkinson

    The key is NOT the processor

    I think linking 16 PS3s is a neat trick in this application (maybe we can stick some PS3s in our next budget ;-) ), but several commentators are missing an important point.

    The key is the fact that his code runs EXTREMELY well in parallel, with little communication overhead; otherwise the Gigabit Ethernet would completely kill the performance. I would be very interested in the speed-up he achieves with respect to running on a single PS3. I work on various parallel computing problems, and would LOVE to see 16 Cell processors configured in shared memory formation (with crossbar switch or some other fast interconnect); then my code might really run neatly. As it is I will stick with multiple dual core Opterons (Barcelona, where are you?), or the Nehalem type Xeons.
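
    To put the communication-overhead point in numbers, here is a toy Python model. The formula and every figure in it are illustrative assumptions, not measurements from the PS3 cluster.

```python
def speedup(nodes, compute_s, comm_s_per_node):
    """Speed-up over a single node for a toy cost model:

    time(N) = compute_s / N + comm_s_per_node * (N - 1)

    compute_s is the single-node run time; comm_s_per_node is the extra
    time spent talking to each additional node over the interconnect.
    """
    parallel_time = compute_s / nodes + comm_s_per_node * (nodes - 1)
    return compute_s / parallel_time


if __name__ == "__main__":
    for comm in (0.0, 0.05, 0.5):
        print(f"comm={comm}s/node -> {speedup(16, 100.0, comm):.1f}x on 16 nodes")
```

    With no communication cost the model gives the ideal 16x; as the per-node cost grows the speed-up collapses, which is why code that rarely needs to talk between nodes suits a Gigabit Ethernet cluster.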

  59. Highlander

    @PaulM

    *IF* there is a PS4, and *if* it's based on a new and improved Cell then I would think that the version used will have a 'classic Cell' mode that it can switch to, and the system will downclock if needed.

    If there is a PS4 and backwards compatibility is still a feature on the table, then look for Sony to have substantially upgraded the Cell to allow the box to do real time ray tracing, and the GPU will become a secondary issue. Heck, they might even get away with 4 slightly uprated Cells working together with an RSX for handling the screen. I know that they are talking about a 10 year timescale, but the investment needed to go to yet another custom architecture is astronomical, and there are some serious payoffs for developing the Cell further over the coming 5 years before decisions have to be made.

  60. popper

    optimising for PowerPC/Altivec/Cell

    forgiving the x86-only-centric readers commenting so far, it's clear that they don't understand the PPC/Altivec or the other vector units on the CELL

    it's true the PPC/Altivec/CELL can stand some optimisations when running PPC linux, as the old Mac PPC coders obviously didn't want to undermine the OS X Altivec optimisations inside that OS.

    it strikes me, reading the linked cluster page http://gravity.phy.umassd.edu/ps3.html, that in fact it does appear that he used a generic PPC linux, and didn't bother to look at optimisations outside the compiler.

    apparently he didn't even consider using the PPC coders' choice, 'PPC Gentoo', where all your current PPC Altivec multimedia optimised code is first produced by the likes of lu_zero and the altivec guys at Power dev.

    http://www.powerdeveloper.org/forums/viewtopic.php?t=1082

    take a look at the old school PPC guys' code thread here http://www.powerdeveloper.org/forums/viewtopic.php?t=1426&postdays=0&postorder=asc&start=0

    for a practical tried and tested code base.

    http://www.powerdeveloper.org/forums/viewtopic.php?t=1494

    it's clear that the PPC linux DOES NOT currently use any Altivec vector optimisations (as the x86 linux with its limited MMX vector unit does).

    if you read some of the Altivec threads found at powerdeveloper, you will find some answers/numbers, and the lads are always looking for feedback on the likes of the freevec optimised codebase http://bbrv.blogspot.com/2008/02/freevec-updated.html, and helping the new user/PPC programmer better understand the PPC/CELL and their vector optimisations, which you might find informative and practical.

    look at this chart in the thread above for an indication of a generic memcopy/network speedup for instance.

    -----------

    gunnar:

    Quick update:

    I have looked at the Linux Kernel code a bit.

    It's not difficult to improve the performance on PPC.

    The Linux Kernel has a copy function which is used to copy between kernel and user space.

    As this function copies a lot of data its performance has direct influence on network or filesystem performance.

    Improving the speed of it was actually easy as you can see:

    http://www2.greyhound-data.com/gunnar/glibc/throughput_970.gif

    http://www2.greyhound-data.com/gunnar/glibc/throughput_cell.gif

    Especially on the Cell, improving this function results in noticeable performance improvements for the whole Linux system.

    I'll do some more testing and then publish the patch soon.

    Cheers

    Gunnar

    -----------------------------

    Posted: Wed Nov 14, 2007 4:43 am Post subject: Possible benefits - optimization for PowerPC

    --------------------------------------------------------------------------------

    Hello,

    Almost all Linux applications are developed in C or C++. People often believe that C compilers are good enough to guarantee good performance. This is unfortunately not the case; especially on PowerPC, manual optimization can make a huge difference.

    Here is an example of a memcpy on PowerPC...

    a) Normal C routine working on Byte

    150 MB/sec

    b) Normal C routine working on Long (32bit)

    800 MB/sec

    c) Normal C routine working on quad (64bit)

    1000 MB/sec

    ** This is the best performance that you can achieve by algorithm design, using the C language **

    d) Normal C routine working on quad (64bit) + with two ASM Cache-instruction added.

    1380 MB/sec

    e) ASM routine better optimized for this PPC architecture

    2750 MB/sec

    From 150 MB/sec to 2750 MB/sec is quite a difference.

    As you can see, by using optimized code you can achieve about 18 times better performance!

    Gunnar

    ----------------------------------
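To put the quoted memcpy figures side by side, here is a small sketch that works out the speedup of each variant against the byte-wise baseline and against the step before it. The throughput numbers are taken from Gunnar's post as quoted; the shortened labels and the function name `speedups` are made up for this illustration and are not part of any patch mentioned in the thread.

```python
# Throughput figures (MB/s) as quoted in Gunnar's post; labels are shortened here.
MEMCPY_MB_PER_SEC = {
    "a) C, byte at a time": 150,
    "b) C, 32-bit words": 800,
    "c) C, 64-bit words": 1000,
    "d) C, 64-bit words + 2 cache instructions": 1380,
    "e) hand-tuned PPC assembly": 2750,
}


def speedups(figures):
    """Return (label, speedup vs first entry, speedup vs previous entry) tuples."""
    rows = []
    items = list(figures.items())
    baseline = items[0][1]   # the byte-wise C routine
    previous = baseline
    for label, rate in items:
        rows.append((label, rate / baseline, rate / previous))
        previous = rate
    return rows


if __name__ == "__main__":
    for label, vs_base, vs_prev in speedups(MEMCPY_MB_PER_SEC):
        print(f"{label:45s} {vs_base:5.1f}x vs a)   {vs_prev:4.2f}x vs previous step")
```

On these figures the assembly routine comes out at roughly 18.3 times the byte-wise loop, but only 2.75 times the best plain C version (c), which is arguably the fairer comparison.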

  61. Joe Cooper

    @J

    "Probably not "typical desktop Linux" at all, eh?"

    Of course not!!

    But several people here brought up performance under various circumstances including desktop. I wanted to explain that it's not designed for such a workload, but its fantastically fast floating point abilities make it fantastically well suited to other workloads.

    ---

    The comments that it's not suited for games are just dumb. The PS3 is easily the most powerful of the game consoles, even if its power hasn't been fully realized.

    Note that this is different from saying "The PS3 is the winner". In the last two generations of game consoles, the top spec system sold the worst and the lower specced PSX and PS2 kicked ass on the market.

    The PS3 is reminding me of a lot of systems, none of them winners on the market.

    The Sega Saturn, for example, also had fantastic processing power in a bizarre architecture that would require special investment to take advantage of.

    The Apple Pippin too. It was more PC like, and had the same sort of ill-conceived banana controllers that often go with really expensive systems that do things nobody really cares about. (Sony did dump those banana controllers though.)

    If developers are going to try to target multiple platforms, that will only make it less likely that a bizarre architecture will be utilized effectively.

    You can't really brush this off by saying graphics aren't important because if graphics aren't important, then the Playstation 3 has no competitive advantage against the cheaper to buy, cheaper to develop for Nintendo Wii or X-Box 360.

  62. Iain

    @Highlander

    Yes, I should probably have mentioned that even the official Sony docs suggest that you avoid using that painfully small bandwidth from the CPU to the RSX's memory, and get the RSX to write the data to main memory where the CPU can read it at the normal (very, very quick) speed.

    Although, this is discussing the specific case of trying to use RSX memory when the CPU memory pool is already full - at which point there isn't much space to do such a thing. A small 'paging' area would work, somewhat like the old memory pages on the 128k Spectrum, if you're elderly enough to remember working on that.

    I don't know the guy's application code, but it might well be that 256MB is enough anyway. There are plenty of clever things you can do in that space, without needing any more.

  63. ex-Applenazi
    Heart

    Joe Cooper:

    "If developers are going to try to target multiple platforms, that will only make it less likely that a bizarre architecture will be utilized effectively."

    Lucasarts just announced they'll develop Games for PS3 first and then downport them to X360, didn't you hear?

    http://ps3forums.com/showthread.php?t=124972

    Which is just another sign that the X360 is going down the drain.. Especially now that the HD-wars are over and BD has won, and that RROD-disaster is *still* going on.. Oh, btw: PS3 has just passed the X360 worldwide minus America -> www.vgchartz.com

    Wii passed them long ago, and PS3-Market share keeps growing slowly but steadily! ;-)

    Also, there are finally some really good games already out or coming really soon: Burnout Paradise, MotorStorm, Uncharted, Resistance, UT3, CoD4 (already here), LittleBigPlanet, Killzone 2, GT5 (coming soon)

    Okay, Lair and Heavenly Sword weren't as good as expected, and Assassin's Creed was merely good, but well, it won't kill the platform! ;-)

    And while it may be fun, the Wii isn't included in the regular crossplatform development plans anyway (just not possible, it would "drag down" the graphical quality for the other systems too much!), due to a way different control scheme, target group and hardware that's too weak. Mind you, it's not too weak for fun games, and I'm not saying there won't be great games for it, they'll just be different games than the regular x-platform titles. If that's an advantage or disadvantage remains to be seen, but here this question is irrelevant! ;-)

    "You can't really brush this off by saying graphics aren't important because if graphics aren't important, than the Playstation 3 has no competitive advantage against the cheaper to buy, cheaper to develop for Nintendo Wii or X-Box 360."

    Gee, I keep hearing this about the cheaper-to-develop-for Wii (not X360; when they say this, game studios mean primarily content generation, and that is *exactly* as work-intensive as for PC or PS3!). But can anyone finally please explain to me why Wii-games cost EXACTLY the same as PS3- and X360-Games then? What happened to all that "easier to develop, so games will be cheaper"? Is Nintendo raking in all the extra money?

    It's a bit like this whole "HD-DVD is way cheaper to produce because we can use DVD-manufacturing lines" BS - Where exactly were HD-DVDs cheaper than BDs?

  64. Anonymous Coward
    Stop

    @ ex-Applenazi

    Quote: "But can anyone finally please explain to me why Wii-games cost EXACTLY the same as PS3- and X360-Games then?"

    I dunno where you buy your games, but Wii games tend to be 10-20 EUR cheaper than PS3 or Xbox games.

    Oh, if I were you I'd get a new keyboard. Yours is dropping smilies and exclamation marks all over the place. It makes you look like a bit of a d!ck...

  65. PaulM

    @Highlander

    Highlander believes that if the Cell processor is upgraded in the PS4 then the upgrade will be to clock the Cell processor at a higher frequency and not to increase the number of SPEs. I think that this is unlikely. This is because I understand that doubling the clock frequency of a processor increases the power consumed by a factor of four.

    A far better use of that extra power would be to quadruple the number of transistors in the Cell processor chip. This is the approach that Intel has taken with Core Duo processors, where extra performance is gained by increasing the number of processors on the chip and not by increasing the clock frequency.

    I estimate that quadrupling the number of transistors in the Cell processor chip would increase the number of SPEs from 7/8 to around 40.

    According to Moore's law the number of transistors on a chip doubles every 18 months. It is therefore perfectly possible to imagine a Cell processor with 40 SPEs being available 3 years after the manufacture of the original Cell processor.

    IBM Cell processor documentation does not specify the number of SPEs which indicates that future Cell processors will most probably include more SPEs.
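The arithmetic behind PaulM's comment can be sketched as below. This is only an illustration under loudly stated assumptions: the 18-month doubling period and the "doubling the clock costs four times the power" figure are taken from the comment as written, while the idea that SPE count scales linearly with transistor budget, and all function names, are assumptions made for this sketch.

```python
def transistor_growth(months, doubling_period_months=18):
    """Growth factor under the doubling rule quoted in the comment."""
    return 2 ** (months / doubling_period_months)


def scaled_spe_count(current_spes, months, doubling_period_months=18):
    """ASSUMPTION: the number of SPEs scales linearly with the transistor budget."""
    return current_spes * transistor_growth(months, doubling_period_months)


def relative_power(frequency_ratio, exponent=2):
    """ASSUMPTION: power ~ frequency ** exponent.

    exponent=2 reproduces the comment's claim that doubling the clock
    costs four times the power; it is not a measured figure.
    """
    return frequency_ratio ** exponent


if __name__ == "__main__":
    print("Transistor growth over 3 years:", transistor_growth(36))
    print("SPEs from 8 after 3 years:", scaled_spe_count(8, 36))
    print("SPEs from 7 after 3 years:", scaled_spe_count(7, 36))
    print("Power cost of doubling the clock:", relative_power(2))
```

Under the linear assumption, three years gives a fourfold transistor budget and 28 to 32 SPEs rather than around 40; the comment does not say how the higher figure was derived.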
