Speaking as a CPU microarchitect: using physical addresses in the vector unit is easier, in that it's cheaper to implement in hardware.
Using virtual addresses (VAs) would mean doing an MMU translation for each vector element as it is accessed. That might slow the vector unit down, or make it significantly bigger in order to run at speed.
But translating is the CORRECT thing to do. Using physical addresses like this has been done before, for the same reasons, but no sensible microarchitect would allow it. The hardware implementers might push for it, or suggest translating only the first address and then running with that, but the only sensible solution is to translate every address that does not fall in the same page as the previous one.
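To make the difference concrete, here is a minimal software sketch of the two approaches. It is only a model, not how any particular vector unit is built: the 4 KiB page size, the dict-based `page_table`, and both function names are assumptions made for illustration.

```python
PAGE_SIZE = 4096  # assumed 4 KiB pages


def translate_per_page(base_va, stride, count, page_table):
    """Translate each element, reusing the last translation within a page.

    page_table is a hypothetical dict: virtual page number -> physical page number.
    Returns (list of physical addresses, number of MMU lookups performed).
    """
    phys = []
    lookups = 0
    last_vpn = None
    last_ppn = None
    for i in range(count):
        va = base_va + i * stride
        vpn, offset = divmod(va, PAGE_SIZE)
        if vpn != last_vpn:
            # Crossed into a different page: a fresh translation is needed.
            last_ppn = page_table[vpn]
            last_vpn = vpn
            lookups += 1
        phys.append(last_ppn * PAGE_SIZE + offset)
    return phys, lookups


def translate_first_only(base_va, stride, count, page_table):
    """The shortcut: translate the first address, then just add the stride."""
    vpn, offset = divmod(base_va, PAGE_SIZE)
    base_pa = page_table[vpn] * PAGE_SIZE + offset
    return [base_pa + i * stride for i in range(count)]
```

With virtual pages 0 and 1 mapped to non-adjacent physical pages, a four-element access that starts 8 bytes before the end of page 0 needs only two lookups in the first version. The shortcut version runs straight off the end of the physical page into memory that was never mapped for that access.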
There is a problem with translating, however: precise exceptions. Say that during a vector operation, accessing the 5th element of the vector causes a page exception. This has to be dealt with by the hardware/software at a privileged level. Using physical addresses sidesteps this problem, but at a high cost in terms of safety.
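As a rough illustration of what a precise exception demands, the sketch below models a vector load that records which element faulted, so that a privileged handler can map the page and the load can resume from that element. Everything here is hypothetical: the `VectorPageFault` class, the `map_page` callback, byte-sized elements, and the 4 KiB page size are assumptions, not a description of any real machine.

```python
PAGE_SIZE = 4096  # assumed 4 KiB pages


class VectorPageFault(Exception):
    """Hypothetical fault record: which element faulted, and at what VA."""

    def __init__(self, element, va):
        super().__init__(f"page fault at element {element}, VA {va:#x}")
        self.element = element
        self.va = va


def vector_load(base_va, count, page_table, memory, dest, start=0):
    """Load byte-sized elements start..count-1 into dest, faulting precisely."""
    for i in range(start, count):
        va = base_va + i
        vpn, offset = divmod(va, PAGE_SIZE)
        if vpn not in page_table:
            # Elements before i are already in dest; element i is untouched.
            raise VectorPageFault(i, va)
        dest[i] = memory.get(page_table[vpn] * PAGE_SIZE + offset, 0)
    return dest


def load_with_handler(base_va, count, page_table, memory, map_page):
    """Model of the privileged side: map the missing page, then resume."""
    dest = [None] * count
    start = 0
    faults = []
    while True:
        try:
            vector_load(base_va, count, page_table, memory, dest, start)
            return dest, faults
        except VectorPageFault as fault:
            faults.append(fault.element)
            vpn = fault.va // PAGE_SIZE
            page_table[vpn] = map_page(vpn)  # assumed OS-level fix-up
            start = fault.element  # resume at the faulting element
```

The hardware cost is in the bookkeeping: the unit has to know exactly which elements completed and be able to restart partway through, which is the state that a physically addressed design never has to keep.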
Such a choice would be acceptable for an embedded processor with no user access to these instructions.
It's a bad choice by the implementers, and it has come back to bite them.