I was a little surprised and saddened to read that DARPA has decided to pump US$250m into Cray and US$244m into IBM for them to work on DARPA's next high performance computing initiative - their "Petascale" program.
There was a lot of bidding and vying for the investment and it came down to three final contenders - IBM, Cray and Sun. Now being a Sunny, you can understand my disappointment - had Sun got some of this cash, I may have got a pay rise. That aside, I'm surprised DARPA made their decision based on the ideas pitched.
IBM pitched their idea based on the current Power processors and their GPFS clustered filesystem, but promised the IBM system would run on their upcoming Power7 processor. Ummm, what Power7 processor?
IBM has just jumped onto the bandwagon Sun has been driving for years (opening their processor specs to the world - first SPARC and now OpenSPARC) and released its first collaborative specification, the Power Architecture Platform Reference (PAPR), to Linux developers. What is interesting is that you need to be registered with them to get hold of the specs, and they don't seem to give any details of their so-called upcoming Power7 processor on their Power Architecture Processor Roadmap. So if this Power7 processor exists somewhere, they're keeping schtum about its capabilities, its specs and when it's coming. That is especially odd considering that the agreement with DARPA requires that the architecture be commercially available.
Contrast that with Sun's processor roadmap. Whilst specs aren't available yet either, at least Sun is telling the world what's coming.
Cray's bid was also centred on something that doesn't exist yet, and on something Sun has done before. It was based on a hypothetical product called "Cascade". This will essentially be a blade system that combines Opteron-based servers, FPGAs, vector systems and massively multi-threaded systems in the same chassis, and then uses a software layer to spread the load across the blades to get the best performance.
Whilst Sun doesn't make or use FPGAs (why would they want to? They're slower, less functional and more power hungry than ASICs) or vector systems, they've already combined multiple architectures in a blade chassis (now EOL) and will be releasing a new product shortly (Sun Blade 8000 P) which can hold Opteron and T1 processors. This is something that is happening, not something that "may happen" or that relies on DARPA cash.
Then there's the topic of problems and failures - we all know these do occur.
From a hardware perspective, Sun has had years of experience providing the world with enterprise-class availability features in most of their mid-range to high end systems like the E25k: memory and CPU hot-swap, a redundant system bus and fault tolerant clock-boards, none of which require a reboot. IBM's high end p595 lacks some of this functionality. Credit where credit's due, Cray are good for this - they did sell the E10k to Sun after all.
Then there are software issues - nothing helps with finding software issues and performance problems like DTrace. This has got to be the single best software invention in the last 10 years (and I'm not alone in thinking this) and best of all, it's open source. IBM is free to add it to any operating system they like, just like Apple and FreeBSD have done. Somehow I don't think this will be happening any time soon. How would they explain to their customers that the single best feature they've just added to their OS actually came from their nearest competitor? Even if they did swallow their pride and port DTrace to AIX (hell will freeze over first) or Linux, it's going to cost them: how would they support DTrace? Take whatever OpenSolaris gives them? No way. No customer would choose that over going straight to the source, and IBM knows this.
Whilst I don't know the ins and outs of Sun's tender and the gory details of everyone else's tenders, I can't help but think DARPA has been a bit careless in handing out their cash. I don't think they've really weighed up the pros and cons. It will certainly piss them off if Sun takes what they've done at the Tokyo Institute of Technology even further and beats DARPA to the two petaFLOPS mark. Sun has certainly got its soldiers all standing in a row.