* Posts by bkrosnov


The TPC-C/SPC-1 storage benchmarks are screwed. You know what we need?

bkrosnov

IMHO, with very few exceptions, no customer knows the size of their active set or the performance requirements of their application. Almost all ignore latency and look only at high-queue-depth IOPS and MB/s numbers (the storage vendors' fault). Testing with "dd if=/dev/zero" (sequential, all-zero, trivially compressible data) is common, and everyone has their favourite ad-hoc test tool and favourite FIO parameters.
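
To make the latency point concrete, here is a rough sketch (not a benchmark tool) of the kind of measurement I mean: random 4 KiB reads at queue depth 1, reporting latency percentiles rather than only an aggregate IOPS number. The file path and iteration count are made up for illustration.

    # Sketch only: per-IO latency at queue depth 1, not just a throughput figure.
    import os, random, statistics, time

    PATH = "/mnt/testvol/testfile"   # hypothetical test file, pre-filled with real (non-zero) data
    BLOCK = 4096
    IOS = 10000

    fd = os.open(PATH, os.O_RDONLY)
    size = os.fstat(fd).st_size
    lat = []
    for _ in range(IOS):
        off = random.randrange(0, size // BLOCK) * BLOCK
        t0 = time.perf_counter()
        os.pread(fd, BLOCK, off)            # one synchronous 4 KiB random read
        lat.append(time.perf_counter() - t0)
    os.close(fd)

    lat.sort()
    print("median latency: %.3f ms" % (statistics.median(lat) * 1000))
    print("p99 latency:    %.3f ms" % (lat[int(len(lat) * 0.99)] * 1000))
    print("QD1 IOPS:       %.0f" % (IOS / sum(lat)))

Note that this goes through the page cache, which is exactly the next point about the storage stack; a real test would have to decide deliberately whether to bypass it.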

Another aspect that is often ignored is the effect of the complete storage stack. The same workload behaves very differently when run against files on ext4 through the Linux buffer cache than when run directly against a block device. Applications live on top of an actual real-world (guest) OS, so ignoring it in testing is silly.
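
A tiny illustration of the buffer-cache effect, under the assumption that you run it against a file larger than a few hundred MB (the path is hypothetical): read the same file twice through the normal buffered path, and the second pass is typically served from the Linux page cache and looks nothing like the underlying device.

    # Sketch only: cold vs warm buffered read of the same file.
    import time

    PATH = "/mnt/testvol/testfile"   # hypothetical
    CHUNK = 1024 * 1024

    def read_all(path):
        t0 = time.perf_counter()
        total = 0
        with open(path, "rb") as f:
            while True:
                buf = f.read(CHUNK)
                if not buf:
                    break
                total += len(buf)
        return total / (time.perf_counter() - t0) / 1e6  # MB/s

    print("cold read: %.0f MB/s" % read_all(PATH))   # mostly the device
    print("warm read: %.0f MB/s" % read_all(PATH))   # mostly the page cache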

Characterising the workload can go a long way toward designing synthetic tests that emulate it. Example things to measure, which would be very useful for storage system characterisation:

- size of the active set over different time spans, separately for reads and writes (a 1-hour active set is different from a 7-day active set)

- sustained and peak random operations per second, broken down by request size and by direction (read/write)

- I/O depth (number of outstanding requests)

These are all accessible through good-quality storage traces. Traces are not ideal, but they are much better than the current status quo, and they are used far too little.
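
As a sketch of the kind of trace analysis I mean, the following assumes a trace already reduced to "timestamp op offset size" lines (for example, post-processed blktrace/blkparse output; the field layout here is my assumption, not blkparse's native format). It reports the active set per time window, separately for reads and writes.

    # Sketch only: per-window active-set size from a simplified trace on stdin.
    import sys
    from collections import defaultdict

    WINDOW = 3600.0          # active-set window in seconds (1 hour)
    BLOCK = 4096             # granularity for counting touched blocks

    def active_sets(lines):
        # window start -> op ('R'/'W') -> set of touched block numbers
        windows = defaultdict(lambda: {"R": set(), "W": set()})
        for line in lines:
            ts, op, offset, size = line.split()
            ts, offset, size = float(ts), int(offset), int(size)
            w = int(ts // WINDOW) * WINDOW
            first, last = offset // BLOCK, (offset + size - 1) // BLOCK
            windows[w][op].update(range(first, last + 1))
        return windows

    if __name__ == "__main__":
        for start, ops in sorted(active_sets(sys.stdin).items()):
            print("window %10.0f  read set %6.1f GiB  write set %6.1f GiB" % (
                start,
                len(ops["R"]) * BLOCK / 2**30,
                len(ops["W"]) * BLOCK / 2**30))

The same per-window pass can be extended to count IOPS per request size and direction, and to histogram outstanding I/O depth.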

We need unification: one test which we can all trust to represent actual applications. It will definitely need different profiles for different applications and use cases. Still, as a storage vendor, being able to say "our $50k system can do 100 kilo-MySQL-stones and 50 PublicCloudVM-stones" would be extremely helpful. Getting the help of diverse users in defining meaningful units of merit for each profile sounds like a good idea.
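
Purely as a hypothetical sketch of what a named profile could capture (none of these numbers or field names come from any real standard), so that a "MySQL-stone" style score has a precise definition behind it:

    # Hypothetical profile definition, for illustration only.
    MYSQL_OLTP_PROFILE = {
        "block_size_mix": {4096: 0.7, 16384: 0.3},   # fraction of IOs per request size
        "read_fraction": 0.7,                         # remainder are writes
        "active_set_gib": 200,                        # 1-hour active set the test must touch
        "io_depth": 8,                                # outstanding IOs per worker
        "latency_slo_ms": {"p99_read": 2.0, "p99_write": 5.0},
    }
    # A "MySQL-stone" could then be the sustained IOPS a system delivers
    # while staying inside the latency SLO for this profile.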

A test harness blessed by, or even better designed by, Howard would go a long way towards being widely trusted.

Just my 2c.

Cheers,

BK