* Posts by ifadams

5 publicly visible posts • joined 12 Nov 2014

Great Graph Database Debate: Abandoning the relational model is 'reinventing the wheel'

ifadams

Re: Riiight

A year is optimistic for anything beyond an early POC/beta. I'm not arguing that an RDBMS can't get such functionality, but I think it's a serious mistake to trivialize how hard it is to add functionality to these systems, given the enormous complexity of maintaining consistency at scale. Nor am I convinced that the complexity of shoving it into an RDBMS is worth it.

Google's plan to win the cloud war hinges on its security aspirations

ifadams

No matter how shiny and important somebody on the inside thinks security is, it's an uphill battle to make it the deciding factor for customers. Orgs and businesses generally don't make a cloud shopping decision based on who has the better security features....

Should they? That's a debate worth having. Will they? I am cynical about that....

The History Boys: Object storage ... from the beginning

ifadams
Headmaster

Erm...

Might just be the academic pedant in me nitpicking at terms, but content-addressable storage is generally different from object storage in the storage world I live in, to the point that I almost never hear the terms used interchangeably.

Certain implementations share flavors of one another, but the key is in the name: content. A content-addressable storage system is one where the content (or a portion of it) is hashed to produce an address that serves the dual purposes of implicit deduplication and naming of the data. Object storage, on the other hand, tends towards using the *name* of the object as a way of locating it, by hashing the name and dumping the object on a consistent hashing ring with object servers owning parts of the namespace.
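To make the distinction concrete, here is a minimal Python sketch of the two addressing schemes. The class names (`ContentStore`, `NameRing`), the choice of SHA-256, and the virtual-node count are all assumptions for illustration, not how any particular product does it.

```python
import hashlib
from bisect import bisect_right


class ContentStore:
    """Content-addressable: the address is a hash of the data itself."""

    def __init__(self):
        self.blobs = {}

    def put(self, data: bytes) -> str:
        addr = hashlib.sha256(data).hexdigest()
        # Identical content maps to the same address, so duplicates
        # collapse into one stored copy (implicit deduplication).
        self.blobs.setdefault(addr, data)
        return addr

    def get(self, addr: str) -> bytes:
        return self.blobs[addr]


class NameRing:
    """Object storage style: hash the object's *name* onto a ring of servers."""

    def __init__(self, servers, vnodes=64):
        # Each server owns several points on the ring (hypothetical vnode count).
        self.points = sorted(
            (self._hash(f"{server}#{i}"), server)
            for server in servers
            for i in range(vnodes)
        )
        self.keys = [point for point, _ in self.points]

    @staticmethod
    def _hash(text: str) -> int:
        return int.from_bytes(hashlib.sha256(text.encode()).digest()[:8], "big")

    def locate(self, name: str) -> str:
        # The first ring point clockwise from the name's hash owns the object.
        idx = bisect_right(self.keys, self._hash(name)) % len(self.points)
        return self.points[idx][1]
```

In the first case, writing the same bytes twice yields one address and one stored copy. In the second, the same name always lands on the same server regardless of what the object contains.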

FLASH better than DISK for archiving, say academics. Are they stark, raving mad?

ifadams

Re: More to the story....

Good points, but I think the base point the authors were making still stands. With a low rate of data refresh (even assuming a total overwrite of an entire device every month, which would be unusual in an archival environment, and/or assuming very aggressive checking and data refresh), it will take a *very* long time to reach the max program erase cycles. More anecdotally (can't remember the source, apologies), I'd seen some empirical tests that suggested that the max program erase cycles even in MLC architectures are pretty conservative. Not something to rely upon certainly, but still not hurting the base point of "Hey, we're gonna take a loooooong time to reach that in a write-once/maybe scenario".
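A back-of-the-envelope version of that argument, as a rough Python sketch. The function name and the figure of 3,000 program/erase cycles are assumptions picked purely for illustration, not a number from the paper or from any datasheet, and the model assumes ideal wear leveling.

```python
def years_to_wear_out(pe_cycles, full_overwrites_per_year, write_amplification=1.0):
    """Years until every cell has used up its rated program/erase cycles.

    Assumes ideal wear leveling, so one full-device overwrite costs each
    cell one cycle, scaled by write amplification.
    """
    return pe_cycles / (full_overwrites_per_year * write_amplification)


# Hypothetical rating of 3,000 cycles, whole device rewritten once a month.
print(years_to_wear_out(pe_cycles=3000, full_overwrites_per_year=12))  # 250.0
```

Even if write amplification doubled the real wear, that hypothetical device would still take well over a century to hit its rated limit under the monthly-overwrite assumption.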

ifadams

More to the story....

So this work had its precursor in a tech report which addresses *some* of the concerns folks have above:

http://www.ssrc.ucsc.edu/pub/adams-ssrctr-11-07.html

The idea being that there are a lot of small benefits that add up to flash or other SSD technologies being a better medium for longer-term storage than disk or tape in certain scenarios. It also doesn't assume a completely unplugged drive, but rather a lightly or self managed one that periodically self checks and audits the data, refreshing as necessary.
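For the "self checks and audits" part, something like the following Python sketch is what I have in mind. It is only an illustration: the `scrub` function, the dict-based layout, and the use of a redundant copy for repair are my assumptions, not the mechanism described in the tech report.

```python
import hashlib


def scrub(blocks, checksums, replicas):
    """Audit every block against its recorded checksum and repair mismatches.

    blocks, replicas: dict of block id -> bytes (replicas is a hypothetical
    redundant copy). checksums: dict of block id -> SHA-256 hex digest
    recorded at write time. Returns the ids that had to be rewritten.
    """
    repaired = []
    for block_id, expected in checksums.items():
        if hashlib.sha256(blocks[block_id]).hexdigest() != expected:
            # Refresh the damaged block from the redundant copy.
            blocks[block_id] = replicas[block_id]
            repaired.append(block_id)
    return repaired
```

Run on a slow schedule, a pass like this costs a few reads per block and only writes when something has actually decayed.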

On the write endurance front, if the data isn't written that often it becomes a moot point. Even crappy SSDs have endurances of at least 10s of writes, which with proper wear leveling is a ton of overwriting for a drive in an archival scenario.

Read disturb can be an issue, as repeated reads can have a weak programming effect.

Experimental data is pretty sparse on retention times of flash, but I've seen a few papers claim 10+ under ideal circumstances. But, like I said earlier, the authors really don't assume totally neglected media.