* Posts by PeteMaths

3 publicly visible posts • joined 2 May 2013

When asked 'What's a .CNT file?' there's a polite way to answer

PeteMaths

Re: cd trays

Which they were, but the slot size stayed, being useful for such things as CD drives.

D-Wave promises chip that could search the whole universe

PeteMaths

Re: Lets see it working then

No, quantum computers are only known to do better than classical computers in some quite special situations, and NP-complete problems (such as the travelling salesman problem) are not among them (in fact the belief, although there isn't much evidence for it, is that they don't gain much on such problems). However, some of the problems they are known to do better on are game-changingly important, such as factorisation (read: a working quantum computer blows away much of the cryptographic basis for e-commerce).

What D-Wave promise is not a general purpose quantum computer, though - even if it meets their claims, it is not known to be possible (and probably is not possible) to run Shor's algorithm (the factorisation thing) on it. It does some things which are potentially useful; what's maybe more important is that a different design using the same ideas might do more (assuming it works...). What's interesting is that they try to sidestep decoherence, which is a big problem for the standard academic approach to building general purpose quantum computers. It's not clear whether this is even theoretically possible, but they are going for the try-and-find-out approach rather than trying to work out the theory first.

What seems to be clear is that there is something quantum going on in the current versions: there is definitely entanglement, the problem they want to solve definitely gets solved, and it seems reasonable to believe this solution really comes out of a quantum effect. Which is pretty cool. But there is also (at least for now) quite a lot of reason to believe that this specific quantum effect is something a classical computer can simulate efficiently: there is a known classical algorithm whose performance appears to be similar, though it's hard to be definite, partly because D-Wave hold trade secrets and partly because we don't really know how to look at the workings of the chip even if we wanted to. What is certainly true is that the results aren't anything to write home about in terms of computational power. Unless you're reading this on a cheap mobile phone, you have more computational power in your hands than anything D-Wave has demonstrated. Of course, as they point out, you also have more power (by a much greater factor) than the Colossus machines had, and quantum computing today probably should be compared to that.

Crowd-sourcing interpretation of IBM RAID 5 extension paper

PeteMaths

So the first thing to understand is a simple error-correcting code. If you want to store some binary string, the simplest thing you can do is just store the string. If your storage is faulty, the failure of even one bit loses everything (or at least forces you to do something clever to recover). You could try to get round this by, for example, saving the same data three times. Now any single bit failure can be corrected, since you can take the majority vote on each bit - in fact this way you can correct many bit failures, provided bit number i doesn't fail in more than one of the copies. But in practice, if you think many bits may fail, then it's likely that for some i, bit i does fail in more than one copy and you lose data. Usually you assume bit failures occur independently at random, and you want to be 99.99% sure that you will not lose data (or something similar, in practice with more nines). This save-three-times scheme isn't great protection (the bit failure probability has to be very small or you lose data) and it takes a lot of space. You can do better by being clever: use some 'code' which transforms your binary string into a longer codeword in a more complicated way, with the property that any two codewords are very different - to get from one to the other you have to change a lot of bits, so after a small number of failures you can still recognise the one codeword it must have been.
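To make the save-three-times idea concrete, here is a minimal sketch (my own illustration, not anything from the paper) of a repetition code in Python: every bit is stored three times, and decoding takes the majority vote over each triple, so any single corrupted copy of a bit is corrected.

def encode_repetition(bits):
    # Store each bit three times.
    return [b for b in bits for _ in range(3)]

def decode_repetition(stored):
    # Recover each original bit by majority vote over its three copies.
    out = []
    for i in range(0, len(stored), 3):
        triple = stored[i:i + 3]
        out.append(1 if sum(triple) >= 2 else 0)
    return out

data = [1, 0, 1, 1]
stored = encode_repetition(data)
stored[4] ^= 1  # flip one copy of bit number 1: a single bit failure
assert decode_repetition(stored) == data  # majority vote still recovers the data

Flip two copies of the same bit, though, and the vote goes the wrong way - which is exactly why this scheme needs the bit failure probability to be very small.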

This works if you can find such a set of codewords. But it's practically useless if the only way to find one is a brute-force search: then your computer takes a century (or more) to set up your storage system. So you need some clever construction that can be computed fast; usually this comes from a piece of linear algebra over finite fields. What you want is to minimise both the disc space used (given the failure protection you want) and the time taken to encode and decode. Various sorts of clever constructions are well known these days; that problem is essentially solved.
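As a deliberately tiny example of the kind of construction meant here - my own sketch, not one of the codes from the paper - the classical Hamming(7,4) code comes from linear algebra over the two-element field: it stores 4 data bits as 7 bits and corrects any single bit failure, using far less space than three full copies.

def hamming74_encode(d):
    # d = [d1, d2, d3, d4]; codeword layout is [p1, p2, d1, p3, d2, d3, d4].
    d1, d2, d3, d4 = d
    p1 = d1 ^ d2 ^ d4
    p2 = d1 ^ d3 ^ d4
    p3 = d2 ^ d3 ^ d4
    return [p1, p2, d1, p3, d2, d3, d4]

def hamming74_decode(c):
    # The three parity checks give the 1-based position of a single error (0 = no error).
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]  # covers positions 1, 3, 5, 7
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]  # covers positions 2, 3, 6, 7
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]  # covers positions 4, 5, 6, 7
    pos = s1 + 2 * s2 + 4 * s3
    if pos:
        c = c[:]
        c[pos - 1] ^= 1  # flip the faulty bit back
    return [c[2], c[4], c[5], c[6]]

word = hamming74_encode([1, 0, 1, 1])
word[5] ^= 1  # one bit failure somewhere in storage
assert hamming74_decode(word) == [1, 0, 1, 1]

Both encoding and decoding are a handful of XORs, which is the "can be computed fast" part; the codes actually used in storage (Reed-Solomon and friends) are the same idea scaled up over bigger finite fields.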

Now there is a new problem: your failures are no longer independent. If a disc fails, you lose the entire disc. If a sector fails, the other sectors on the disc are unaffected but that sector is gone. The classic codes don't work well (or at least they aren't known to work well) in this sort of model, because they assume that the failure of lots of bits is unlikely wherever those bits are, and so they may place an entire codeword in one sector of one disc - you lose the sector and the data is gone. You need a code which copes well with this sort of thing. RAID 5 does this to a certain limited extent, but maybe even that is not good enough in (some) applications. Because you put lots of extra requirements on your code, you have to be even more clever in coming up with one. Before this paper, there was RAID 6 (which is a more clever version of the save-multiple-times idea) and some brute-force schemes which showed that such codes can exist but which were practically useless. This paper gives something you could use in practice; it's not too wasteful and the construction is fairly quick. But you would only bother if RAID 5 failures are a real problem for you, and this paper's method only gives you a moderate amount more protection, so it won't help you much. What you would really like is a more general scheme that lets you specify how many discs you can allow to fail before data loss. That doesn't exist yet.
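For reference, the RAID 5 protection being improved on here boils down to one XOR parity strip per stripe; this rough sketch (mine, not the paper's construction) shows why losing one disc is fine and losing a second at the same time is not.

def xor_bytes(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

def make_parity(strips):
    # Parity strip for one stripe: XOR of all the data strips.
    parity = bytes(len(strips[0]))
    for s in strips:
        parity = xor_bytes(parity, s)
    return parity

def rebuild(surviving_strips):
    # Any single missing strip (data or parity) is the XOR of all the others.
    return make_parity(surviving_strips)

# Three data discs plus one parity disc, one stripe shown.
data = [b'\x01\x02', b'\xf0\x0f', b'\xaa\x55']
parity = make_parity(data)

# Disc 1 dies: its strip is rebuilt from the other data strips and the parity.
assert rebuild([data[0], data[2], parity]) == data[1]

Lose two strips of the same stripe and there is no longer enough information to rebuild either of them - which is the gap the paper's extension (and RAID 6 before it) is aimed at.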