Re: I don't get it
It could be that they had not previously enabled Process-Context Identifiers (PCID), or possibly the virtual machine manager allowed it to be virtualised.
PCID is a relatively recent x86-64 addition. The PCID is a tag against the Translation Lookaside Buffer entries that acts as a filter, saying 'this TLB entry belongs to this process'. The hardware will only use a TLB mapping if the tag matches the current process's tag. This allows the TLB to contain mappings for multiple processes or contexts.
Traditionally, on a context switch between processes, the whole TLB had to be flushed: every entry discarded (or marked invalid). That meant that for the first memory accesses the process performed, including fetching the instructions to be executed, the hardware had to walk the page tables to find the mappings from virtual to physical memory, even for something the process had accessed recently, the last time any of its threads ran.
With PCID, the OS doesn't have to flush the TLB on a process switch - only if it's reusing a PCID value from a different process (the tag is 12 bits, so there are only 4096 values to go round). It can selectively flush entries for a process if it's changing that process's address map, using the INVPCID instruction. That is needed when an existing mapping is removed or its permissions change, for example when handling a copy-on-write page fault.
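To make the difference concrete, here is a toy model, not how any real kernel or CPU does it. It assumes an unlimited-size TLB, two processes that take turns, and each touching the same four pages every time it runs. The names ToyTLB and count_walks are made up for the illustration.

```python
class ToyTLB:
    """Toy model of a TLB whose entries can be tagged with a PCID."""

    def __init__(self, use_pcid):
        self.use_pcid = use_pcid
        self.entries = set()   # (tag, virtual_page) pairs currently cached
        self.current = 0       # tag of the running process
        self.walks = 0         # number of page table walks so far

    def switch_to(self, pcid):
        # Without PCID, loading a new address space discards every entry.
        if not self.use_pcid:
            self.entries.clear()
        self.current = pcid

    def access(self, page):
        tag = self.current if self.use_pcid else 0
        if (tag, page) not in self.entries:
            self.walks += 1                  # miss: walk the page tables
            self.entries.add((tag, page))


def count_walks(use_pcid, rounds=3):
    tlb = ToyTLB(use_pcid)
    for _ in range(rounds):
        for pcid in (1, 2):                  # two processes taking turns
            tlb.switch_to(pcid)
            for page in range(4):            # same 4 pages on every run
                tlb.access(page)
    return tlb.walks


print("with PCID:   ", count_walks(True))    # 8
print("without PCID:", count_walks(False))   # 24
```

With tags, each process pays for its four walks once. Without them, it pays again after every switch.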
You can mark pages as global in the x86 architecture, which means that when you switch to a new process context - register CR3 changed to point to a different set of page tables, causing a TLB flush - the TLB entries for those global pages are retained. Since it's common that the incoming thread was already executing in kernel mode - for many workloads the thread is blocked on a kernel operation, not having been pre-empted in user mode - this saves having to walk the page tables to find the kernel code.
However, we're now putting the kernel into a separate address space altogether, so that the processor can't speculatively load from kernel memory while running user code. That causes an address space switch on every user->kernel and kernel->user transition, which itself causes a TLB flush on older hardware or with PCID disabled. So, if the processor doesn't support PCID or it's turned off, the newly-loaded kernel code causes page table walks, then on return to user mode, the hardware has to walk the page tables again for the user code.
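A rough sketch of what that costs per system call, again as a toy model rather than a measurement. The function name and the page counts (four user pages, three kernel pages touched per call) are invented for the illustration, and the TLB is assumed to be big enough to hold everything.

```python
def walks_for_syscalls(n_syscalls, pcid_enabled, user_pages=4, kernel_pages=3):
    """Count page table walks for a process making n_syscalls system calls
    when user and kernel live in separate address spaces."""
    tlb = set()      # (address_space, page) pairs currently cached
    walks = 0

    def touch(space, pages):
        nonlocal walks
        for page in range(pages):
            if (space, page) not in tlb:
                walks += 1               # miss: walk the page tables
                tlb.add((space, page))

    touch("user", user_pages)            # process warms up in user mode
    for _ in range(n_syscalls):
        if not pcid_enabled:
            tlb.clear()                  # user->kernel switch flushes the TLB
        touch("kernel", kernel_pages)
        if not pcid_enabled:
            tlb.clear()                  # kernel->user switch flushes it again
        touch("user", user_pages)
    return walks


print(walks_for_syscalls(10, pcid_enabled=True))    # 7
print(walks_for_syscalls(10, pcid_enabled=False))   # 74
```

In this model the tagged case pays for its walks once, while the untagged case pays for both sets of pages on every call, which is why syscall-heavy workloads are the ones that suffer.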
TL;DR: check that your processor supports PCID and the INVPCID instruction, and that they're enabled.
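On Linux, one quick way to check is to look for the pcid and invpcid flags in /proc/cpuinfo. The helper below is a minimal sketch (has_pcid_support is a name made up here). It only tells you what the CPU, or the hypervisor if you're in a VM, is exposing to the kernel, not whether the kernel is actually making use of it.

```python
def has_pcid_support(cpuinfo_text):
    """Return (pcid, invpcid) booleans from /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        if line.startswith("flags"):
            # Everything after the colon is a space-separated flag list.
            flags = set(line.partition(":")[2].split())
            return "pcid" in flags, "invpcid" in flags
    return False, False


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as f:
            pcid, invpcid = has_pcid_support(f.read())
        print("pcid:", pcid, "invpcid:", invpcid)
    except OSError:
        print("no /proc/cpuinfo here; this check is Linux-specific")
```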