Re: I have _got_ to be missing something basic...
(tl;dr : I think the need to maintain the existing format is probably the killer.)
> I am not a developer, and very definitely not of parallel code
I am a developer, but have only had a few projects involving parallel code. Just enough to be dangerous.
> How do you allocate a continuous stream to an unknown number of cores?
That part is fairly routine, and there are several ways to do it. You can simply pick a number of threads to run. If you pick too many, the OS is quite good at scheduling them so the load spreads evenly across the cores. Still, it's best not to run more threads than the machine has cores, just to avoid the (small) overhead of the extra threads.
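As a minimal sketch of "don't run more threads than cores", in Python: the helper name pick_worker_count is made up for illustration, and os.cpu_count() reports logical cores, which may not be exactly what you want on every machine.

```python
import os

def pick_worker_count(requested=None):
    """Return a thread count capped at the number of cores (hypothetical helper)."""
    cores = os.cpu_count() or 1  # cpu_count() can return None on odd platforms
    if requested is None:
        return cores
    # Never fewer than one worker, never more than the machine has cores
    return max(1, min(requested, cores))
```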
> When decompressing how do you coordinate who is writing which parts of the output stream -- bear in mind this is not a file-based system -- without expensive IPC or locking?
Hmmm... the loop I'm envisioning is:
While data is coming in:
    Do we have (say) half a gigabyte, or whatever our block size is?
        If so, start a new thread or process (probably thread) to compress said half gigabyte.
    Has the thread processing the "oldest" block completed, with all preceding blocks (if any) written?
        If so, start a new thread to output its data.
Creating threads and forking new processes, by the way, are both quite cheap in Linux. The above could be streamlined with a bit of IPC; as written, the only "communication" is a thread signalling completion. Pretty simple as it is, though.
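A rough Python sketch of that loop, under some loud assumptions: zlib stands in for the real compressor, the 4-byte length prefix is a framing invented for the example (it is not the existing format), the block size is far smaller than half a gigabyte, and the main thread does the writing itself instead of starting a separate output thread. All function names are hypothetical.

```python
import io
import zlib
from collections import deque
from concurrent.futures import ThreadPoolExecutor

BLOCK_SIZE = 1 << 20  # assumption: 1 MiB for the demo, not half a gigabyte

def write_block(dst, payload):
    # Made-up framing: 4-byte big-endian length, then the compressed bytes
    dst.write(len(payload).to_bytes(4, "big"))
    dst.write(payload)

def compress_stream(src, dst, workers=4, block_size=BLOCK_SIZE):
    """Compress blocks concurrently, but write them strictly in input order."""
    pending = deque()  # futures, oldest block at the left
    with ThreadPoolExecutor(max_workers=workers) as pool:
        while True:
            block = src.read(block_size)
            if not block:
                break
            pending.append(pool.submit(zlib.compress, block))
            # Only the oldest block may be written. If the queue grows too
            # long, result() blocks on the oldest, which bounds memory use.
            while pending and (pending[0].done() or len(pending) > 2 * workers):
                write_block(dst, pending.popleft().result())
        # Input is exhausted: drain whatever is still in flight, in order
        while pending:
            write_block(dst, pending.popleft().result())

def decompress_stream(src, dst):
    """Sequential reader for the made-up framing above."""
    while True:
        header = src.read(4)
        if not header:
            break
        dst.write(zlib.decompress(src.read(int.from_bytes(header, "big"))))
```

No locks are needed here: the only coordination is asking whether the oldest future is done. As far as I know, CPython's zlib releases the GIL while compressing, so the threads should genuinely overlap, but treat that as something to measure.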
> When compressing, how do you ensure that concurrent threads use the same encoding patterns? Note that all can be assumed to be receiving different data.
Not sure I follow you here. Of course the threads will each be compressing different data; the alternative would be that we've been conveniently handed data that repeats for each block. Different data means a different encoding pattern for each block.
> Also note you can't change the pattern used in existing formats: the format was set 13Y ago.
I'd assumed that the format was block-based. Or, at the very least, that you could say "here's a chunk of data to compress/decompress, and here's another chunk, and..." It isn't. Making it so would require a format change. Not much of one, and there's a 'version' flag that would allow for it. As an admittedly minor side benefit, such a change would mean that if you needed to decompress only part of a file, you could do so.
It would not be completely impossible to do it the way I've described and stay with the existing format. Each block would need some data from the end of the preceding block, because the compression mostly consists of saying "copy N bytes from delta_bytes ago". But it does raise the level of difficulty a fair bit.
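To illustrate the "data from the end of the preceding block" idea, here is a sketch using Python's zlib with raw deflate as a stand-in; it is not the format under discussion. The tail of the previous block is handed to the compressor as a preset dictionary, so "copy N bytes from delta_bytes ago" references can reach back across the block boundary. The function names and the 32 KiB window (deflate's limit) are assumptions of the example.

```python
import zlib

WINDOW = 32 * 1024  # deflate cannot reference data further back than this

def compress_block(block, prev_tail=b""):
    """Compress one block as raw deflate, primed with the previous block's tail."""
    if prev_tail:
        c = zlib.compressobj(level=6, wbits=-15, zdict=prev_tail)
    else:
        c = zlib.compressobj(level=6, wbits=-15)
    return c.compress(block) + c.flush()

def decompress_block(payload, prev_tail=b""):
    """Inverse of compress_block; needs the same tail bytes the compressor saw."""
    if prev_tail:
        d = zlib.decompressobj(wbits=-15, zdict=prev_tail)
    else:
        d = zlib.decompressobj(wbits=-15)
    return d.decompress(payload) + d.flush()

def tail_of(block):
    # The last WINDOW bytes of the *uncompressed* block
    return block[-WINDOW:]
```

When compressing, the tail is just raw input, so every block can still be started independently. When decompressing, block N needs the decompressed end of block N-1 first, which is one place the added difficulty shows up.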