Want to train a dragon? You'll need 500 million files, 730TB of data, 54,000 CPU cores...

Korev

Re: Files

It's a huge number of files.

In HPC land we'd normally use something like HDF5 to store the images (or other matrices), which are then very fast to access compared to having zillions of files scattered all over the filesystem. Some people also use SQLite for this, and I assume there are other similar tools that'd also work.
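
For anyone wondering what the SQLite route looks like, here's a rough sketch using only Python's standard library. The table name, column names and helper functions (open_store, put_image, get_image) are all made up for illustration, and I have no idea what the project in the article actually does; the point is just that many small blobs can live inside one container file instead of one file each.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open (or create) a single-file store for many small binary blobs.

    The 'images' table and its columns are hypothetical names for this sketch.
    """
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS images ("
        "name TEXT PRIMARY KEY, data BLOB NOT NULL)"
    )
    return conn


def put_image(conn, name, data):
    """Store raw bytes under a name, replacing any existing entry."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR REPLACE INTO images (name, data) VALUES (?, ?)",
            (name, data),
        )


def get_image(conn, name):
    """Return the stored bytes for a name, or None if it is not present."""
    row = conn.execute(
        "SELECT data FROM images WHERE name = ?", (name,)
    ).fetchone()
    return None if row is None else row[0]


if __name__ == "__main__":
    store = open_store()  # in-memory here; pass a file path for a real store
    put_image(store, "frame_0001.png", b"\x89PNG placeholder bytes")
    print(len(get_image(store, "frame_0001.png")))
```

The HDF5 equivalent would be one dataset per image (or one big array) inside a single .h5 file, but that needs a third-party library such as h5py, so I've stuck to the stdlib here.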

