I don't follow "the system must scan remaining data ...". Why would a scan be needed? If the reference count (on the file, block, or whatever unit you've deduped) is going from 1 to 0, you can delete the actual data; otherwise you need to keep it.
I'm similarly stumped by "reclamation is rarely run on a continuous basis on deduplication systems – instead, you either have to wait for the next scheduled process, or manually force it to start." What prevents reclamation from happening as soon as the reference count hits zero?
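To make the question concrete, here's a minimal sketch of inline reclamation, assuming a simple in-memory reference-count table (the class and method names are made up for illustration, not any particular product's design):

```python
class DedupStore:
    """Hypothetical dedup store that reclaims space inline:
    a block's data is freed the moment its count hits zero,
    with no separate scheduled reclamation pass."""

    def __init__(self):
        self.refcounts = {}   # block hash -> reference count
        self.blocks = {}      # block hash -> actual data

    def add_ref(self, block_hash, data=None):
        """Store a new block, or bump the count if already present."""
        if block_hash in self.refcounts:
            self.refcounts[block_hash] += 1
        else:
            self.refcounts[block_hash] = 1
            self.blocks[block_hash] = data

    def release(self, block_hash):
        """Drop one reference; reclaim immediately on 1 -> 0."""
        self.refcounts[block_hash] -= 1
        if self.refcounts[block_hash] == 0:
            del self.refcounts[block_hash]
            del self.blocks[block_hash]   # inline reclamation
```

If a real system defers this instead, presumably it's for some other reason (batching I/O, crash consistency, concurrent writers), which is exactly what I'd like to see explained.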
Regarding the arbitrarily selected early-warning thresholds, wouldn't it be more useful to monitor usage against reference counts? For example, 90% full with reference counts > 100 is a different situation from 90% full with every file at a reference count of 1. In the latter case, targets for deletion will be easier to identify.
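Something like the following could express that check; the 90% threshold and all names here are my own invented placeholders, not anything from an actual monitoring tool:

```python
def capacity_report(used_fraction, refcounts, full_threshold=0.9):
    """Combine fullness with the refcount distribution: deleting a
    refcount-1 block frees real space, deleting a highly shared
    block frees (almost) none."""
    if used_fraction < full_threshold:
        return "ok"
    unique = sum(1 for count in refcounts.values() if count == 1)
    shared = len(refcounts) - unique
    if unique > shared:
        return "near full, but many refcount-1 blocks: easy deletion targets"
    return "near full and heavily shared: deleting files may free little space"
```

The point being that the second alert is far more urgent than the first, even though both stores report the same raw fullness.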