kgraft is essentially based on the principles I've described, with some neat hacks on top. It does, indeed, work on a function-by-function basis, by inserting a trampoline at the start of the original.
On top of this it has a mechanism that ensures everyone 'sees' (well, executes) a consistent view, in case the changed behavior of one function is dependent on the changed behavior of another.
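To make the consistency point concrete, here's a rough userspace sketch in C. It is not kgraft's code: the per-task flag, `struct task`, and every function name below are made up for illustration, and I'm assuming a "each task switches over at a safe point" scheme, which is how I understand the idea but not something I've checked against the implementation. The point is only that a task uses the old versions of both functions or the new versions of both, never a mix.

```c
#include <stdbool.h>

/* Hypothetical per-task state: which "world" this task currently runs in. */
struct task {
    bool uses_new_code;
};

/* Old pair: the length includes a 4-byte header, and the consumer strips it. */
static int raw_len_old(int n)   { return n + 4; }
static int payload_old(int len) { return len - 4; }

/* New pair: the header is no longer counted, so the consumer strips nothing. */
static int raw_len_new(int n)   { return n; }
static int payload_new(int len) { return len; }

/* Entry points: pick old or new per task, so one task never mixes the two. */
int raw_len(const struct task *t, int n)
{
    return t->uses_new_code ? raw_len_new(n) : raw_len_old(n);
}

int payload(const struct task *t, int len)
{
    return t->uses_new_code ? payload_new(len) : payload_old(len);
}

/* Called when the task is at a point where switching worlds is safe. */
void reach_safe_point(struct task *t)
{
    t->uses_new_code = true;
}

/* What a task would compute if it switched between the two calls. */
int mixed_view_payload(int n)
{
    return payload_new(raw_len_old(n));
}
```

With a task that starts with `uses_new_code == false`, `payload(&t, raw_len(&t, 10))` gives 10 both before and after `reach_safe_point(&t)`, while `mixed_view_payload(10)` gives 14, which is the kind of breakage the mechanism is there to prevent.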
It doesn't actually replace functions in place, for the obvious reasons, including the one you mentioned: if the new version is bigger than the original, everything after it would have to move, and that can't be done for any function that might have function pointers or other external references to it (interrupt handlers, for example).
See e.g. http://events.linuxfoundation.org/sites/events/files/slides/kGraft.pdf (slide 11):
"-Callers are never patched
-Rather, callee's NOP is replaced by a JMP to the new function
- So a JMP remains forever
- But this takes care of function pointers, including in structures"
So the original stays around, but I think it does free old replacement versions once a later patch supersedes them.
You could presumably remove (well, zero out) the code after the trampoline eventually.
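Here's a minimal userspace model in C of what the slide describes, in case it helps. Everything in it is hypothetical (the names `compute`, `compute_redirect`, `struct ops` and so on are mine), and a pointer check at function entry stands in for the NOP that gets rewritten to a JMP; the real thing patches machine code, which this does not attempt.

```c
#include <stddef.h>

typedef int (*int_fn)(int);

/* Stand-in for the NOP at function entry:
 * NULL means "NOP, fall through", non-NULL means "JMP to the replacement". */
static int_fn compute_redirect = NULL;

/* Original function. Its address never changes, so old pointers stay valid. */
int compute(int x)
{
    if (compute_redirect)           /* models the patched-in JMP */
        return compute_redirect(x);
    return x + 1;                   /* old body, dead while the patch is active */
}

/* Replacement body shipped by the "patch". */
static int compute_fixed(int x)
{
    return x + 2;
}

/* A structure holding a function pointer that was taken before the patch. */
struct ops {
    int_fn fn;
};

static struct ops saved_ops = { compute };

void apply_patch(void)  { compute_redirect = compute_fixed; }
void revert_patch(void) { compute_redirect = NULL; }

/* Calls through the stale pointer; it still lands on the original's entry. */
int call_via_saved_pointer(int x)
{
    return saved_ops.fn(x);
}
```

After `apply_patch()`, both `compute(1)` and `call_via_saved_pointer(1)` return 3 instead of 2, without any caller or stored pointer having been touched. The old body is still sitting there after the redirect, which is the part you could presumably zero out.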
However, if you're going to do this for the entire kernel, after one full "move" you are in exactly the scenario which was discussed here earlier - all function calls just go to a trampoline that calls the actual function. With exactly the same issues - just done a lot more inefficiently.
TL;DR kgraft doesn't actually MOVE code, which is what we're discussing here. It diverts execution. It's meant for small patches, not shuffling the entire kernel around.