It's amazing it's taken nearly 30 years to get to this point with FPGAs.
People were doing this sort of trick to ease CPU load substantially back in the late 1980s. The Xilinx XC3020 FPGAs of the time were very low density: fewer than 2000 gates at 50 MHz. We had to hand-craft the logic to shoehorn it into the available gates and still get the right timing on the interconnects. The CPU software loaded different FPGA configurations dynamically as part of its normal operation.
It was obvious then that FPGAs on a board in a PC would add a lot of computing power, not just as a field-reprogrammable gate array for updating customised "hardware", but as logic that could be altered dynamically as part of the computation.
FPGA gate density has increased enormously since then, but the barrier seemed to be finding a software language that could define how to offload the algorithmic processing from the CPU software.