Multiple use-cases, multiple architectures
There are three drivers for long-term retention of digital content:
1) I want to reuse, repurpose, and re-license (e.g. a movie studio)
2) I want to analyze (e.g. Amazon and Facebook)
3) I want to preserve because that's what I do (e.g. the US Library of Congress) or because I promised (e.g. Shutterfly)
Each brings different sensitivities around performance SLAs, cost, and data integrity. For use-cases 1 and 3, tape-based latencies are entirely acceptable as long as sequential throughput is good enough for bulk data operations. Analytics will almost always need more consistent SLAs.
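The per-use-case trade-offs can be captured as plain data. The sketch below is illustrative only: the profile values are assumptions drawn from the discussion above, not vendor specifications, and every name is hypothetical.

```python
# Illustrative mapping of the three retention use-cases to rough
# placement and SLA sensitivities (values are assumptions, not specs).
USE_CASE_PROFILES = {
    "reuse_relicense": {   # movie-studio style reuse/re-licensing
        "asset_tier": "tape",
        "latency_sla": "hours",
        "sequential_throughput": "high",
        "integrity": "critical",
    },
    "analytics": {         # Amazon/Facebook style analysis
        "asset_tier": "disk_or_flash",
        "latency_sla": "seconds",
        "sequential_throughput": "high",
        "integrity": "high",
    },
    "preservation": {      # archival mandate or promise to customers
        "asset_tier": "tape",
        "latency_sla": "hours",
        "sequential_throughput": "moderate",
        "integrity": "critical",
    },
}

def acceptable_tier(use_case: str) -> str:
    """Return the cheapest asset tier whose SLA the use-case tolerates."""
    return USE_CASE_PROFILES[use_case]["asset_tier"]
```

The point of the structure is that only analytics forces the bulk data itself off tape; the other two use-cases tolerate tape latencies for assets.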
In all cases, however, the placement, maintenance, and performance expectations for *metadata* are much more aggressive than the SLA and placement rules for the asset itself. For these sorts of storage solutions, the query has always dominated as the first step in data IO, though to date it has most often been performed against a host database. In the future, storage solutions optimized for these use-cases should recognize that semantic natively, through distributed indexing mechanisms implemented in flash. And although the LTFS file-system may be useful at a component level, it is inadequate for the task at an aggregate solution level.
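The query-first pattern can be sketched minimally: a flash-resident metadata index is consulted before any bulk IO touches the slow tier. Everything here is hypothetical (class names, the tape-location scheme, the `FlashIndex` abstraction); it models the shape of the flow, not any real product's API.

```python
# Hypothetical sketch: metadata queries served from a flash-resident
# index; bulk asset reads go to a slower sequential tier (e.g. tape).
from dataclasses import dataclass, field

@dataclass
class AssetRecord:
    asset_id: str
    tape_location: str   # where the bulk data lives on the slow tier
    tags: frozenset      # searchable metadata, held in the fast tier

@dataclass
class FlashIndex:
    """Stands in for a distributed index implemented in flash."""
    records: dict = field(default_factory=dict)

    def put(self, rec: AssetRecord) -> None:
        self.records[rec.asset_id] = rec

    def query(self, tag: str) -> list:
        # Metadata query: the fast, random-access path.
        return [r for r in self.records.values() if tag in r.tags]

def fetch_locations(index: FlashIndex, tag: str) -> list:
    # Step 1: the query dominates -- resolve everything from the index first.
    hits = index.query(tag)
    # Step 2: only then plan IO against the slow tier, sorted so the
    # tape reads can be issued as one sequential batch.
    return sorted(r.tape_location for r in hits)

idx = FlashIndex()
idx.put(AssetRecord("a1", "tape07/block3", frozenset({"film", "2004"})))
idx.put(AssetRecord("a2", "tape02/block9", frozenset({"film", "1999"})))
locations = fetch_locations(idx, "film")
```

The design choice the sketch highlights is the asymmetry argued above: the index lives on a medium with aggressive latency guarantees, while the assets it points at can sit behind hours-scale tape SLAs.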