* Posts by kdkeyser

3 publicly visible posts • joined 20 Oct 2015

Igneous ARM CPUs: What if they tossed the blindfold?

kdkeyser

Re: Don't think it'll fly

There are 2 classes of erasure codes: systematic and non-systematic.

For systematic codes (e.g. the common Cauchy-Reed-Solomon), the first n chunks are simply the original object chopped into n pieces. Only the additional chunks (e.g. the 8 in the example) are actual "parity" and do not contain raw original data. So in this case, each drive could in fact see part of the original object.
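As a minimal sketch of the systematic layout (using a single XOR parity chunk instead of full Reed-Solomon, with a made-up chunk count), the data chunks are literally slices of the original object:

```python
# Sketch of a *systematic* erasure layout: n data chunks that are literally
# slices of the original object, plus one XOR parity chunk. Real systems use
# Reed-Solomon to get several parity chunks; the "data chunks = raw data"
# principle is the same. Chunk count and payload are made up.

def systematic_encode(obj: bytes, n: int = 4):
    """Split obj into n raw data chunks plus one XOR parity chunk."""
    size = -(-len(obj) // n)                 # ceiling division
    padded = obj.ljust(n * size, b'\0')
    data_chunks = [padded[i * size:(i + 1) * size] for i in range(n)]
    parity = bytes(size)
    for chunk in data_chunks:
        parity = bytes(a ^ b for a, b in zip(parity, chunk))
    return data_chunks, parity

data_chunks, parity = systematic_encode(b"confidential customer record")
# Each data chunk is plain original data -- a drive holding one sees part of it:
print(data_chunks[0])   # b'confide'
# The parity chunk alone reveals nothing directly, but can rebuild any ONE
# missing data chunk by XOR-ing it with the surviving data chunks.
```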

For non-systematic codes, your description is accurate.

Of course, all of this becomes moot if the object is encrypted somewhere higher up the stack.

Erasure coding startup springs forth from Silicon Fjord

kdkeyser

Difficult to make sense of these numbers

"Classic" Reed-Solomon is MDS, i.e. mathematically proven to have the lowest possible storage overhead. So if they claim to do better, they'd better back it up with hard facts, because it would imply a new kind of maths.

Sounds more like they have built a "locally repairable" code, i.e. a code with a slightly higher storage overhead than MDS, but which requires less data to reconstruct a lost chunk. This is not necessarily new (Microsoft, Facebook and Dropbox have been quite open about the fact that they use such codes), but it can be interesting anyway.
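A hypothetical back-of-the-envelope comparison (parameter choices are mine, not the startup's) shows the trade-off: MDS gives the minimum overhead for its fault tolerance, while a locally repairable layout reads far less data on a single-chunk repair.

```python
# Illustrative arithmetic only -- parameters are mine, not the vendor's.
CHUNK_MIB = 256

# MDS code, e.g. Reed-Solomon with k=10 data + m=4 parity chunks:
# tolerates ANY 4 lost chunks at 1.4x overhead (provably the minimum for that),
# but rebuilding a single lost chunk means reading k = 10 surviving chunks.
rs_overhead    = (10 + 4) / 10
rs_repair_read = 10 * CHUNK_MIB

# Locally repairable code, Azure-style shape: 10 data chunks in 2 local groups
# of 5, one local parity per group, plus 2 global parities. Worst-case
# tolerance is a bit weaker than the MDS code above (matching it would need
# extra parity, i.e. slightly higher overhead), but a single lost data chunk
# is rebuilt from its local group alone.
lrc_overhead    = (10 + 2 + 2) / 10
lrc_repair_read = 5 * CHUNK_MIB

print(f"RS(10,4): {rs_overhead:.2f}x capacity, 1-chunk repair reads {rs_repair_read} MiB")
print(f"LRC     : {lrc_overhead:.2f}x capacity, 1-chunk repair reads {lrc_repair_read} MiB")
```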

The figures for recovery speed do not make sense either. Modern implementations of Reed-Solomon do multiple GB/s.

(And LDPC codes become interesting mainly in the error-correcting case, while this seems to be mostly about erasures, i.e. data that is lost but not corrupted.)

Overcoming objections: Objects in storage the object of the exercise

kdkeyser

Amplidata / HGST consistency

The AmpliStor (Amplidata) / Active Archive (HGST) object storage systems offer strong consistency, i.e. when a PUT is confirmed (200 OK), the data/metadata is guaranteed to be written with full durability.

There is nothing intrinsic to the S3 API that prevents a strongly consistent implementation; the consistency model is simply not part of the API and is implementation-specific.
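For example (a sketch only, assuming boto3 and a generic S3-compatible endpoint; the endpoint, credentials and bucket name are placeholders), whether this read-after-write sequence is guaranteed to return the new data is a property of the implementation behind the API, not of the API calls themselves:

```python
# Sketch: whether the GET below is guaranteed to see the data just PUT depends
# entirely on the implementation behind the S3 API, not on the API itself.
# Endpoint, credentials and bucket name are placeholders.
import boto3

s3 = boto3.client(
    "s3",
    endpoint_url="https://objectstore.example.com",  # any S3-compatible store
    aws_access_key_id="ACCESS_KEY",
    aws_secret_access_key="SECRET_KEY",
)

s3.put_object(Bucket="my-bucket", Key="report.pdf", Body=b"...")
# On a strongly consistent implementation, the 200 OK above means the object is
# durably written and this GET must return the new bytes. On an eventually
# consistent one, it may briefly return 404 or stale data.
obj = s3.get_object(Bucket="my-bucket", Key="report.pdf")
print(obj["Body"].read()[:16])
```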

The scope of the consistency is indeed a single object.

BTW, DeepStorage has a whitepaper about AmpliStor performance, so that gives you at least one data point.