Is BCPL's O-code similar enough to count?
Close but no cigar. OCODE successfully separated the front end of the compiler (lexical analysis, syntax analysis, and all that) from the code generator. The front end emitted OCODE, and up to that point pretty much all BCPL compilers were the same. Some variations existed, e.g. the Acorn BCPL compiler was structured as a set of overlays because of the extreme RAM limit on a BBC Model B. The OCODE was then fed to the code generator stage, which could vary wildly between implementations. So OCODE was a compiler intermediate form; it was not originally conceived as a run-time environment.
Moving on a little, Martin Richards did produce INTCODE, which was a simple virtual machine designed for interpretation, to help with bootstrapping. You could cross-compile BCPL to OCODE, have a code generator turn that into INTCODE, and then write a simple but slow INTCODE interpreter on the target machine. I think the idea there was to be "just good enough" to bring up a compiler on the target machine, for which you could then write a native code generator. INTCODE was a bootstrap aid, not an execution environment.
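To give a feel for what "simple but slow" means here, this is a rough Python sketch of that kind of interpreter loop. The opcodes (`PUSH`, `ADD`, `MUL`, `JNZ`, `HALT`) and the `run` function are made up for illustration and are not the real INTCODE instruction set; the point is only that the whole thing is a fetch, decode, execute loop small enough to write quickly on a new machine.

```python
# Toy interpreter loop. The instruction set is hypothetical and is
# NOT the real INTCODE; it only illustrates the general shape.

def run(program):
    """Interpret a list of (opcode, operand) pairs on a tiny stack machine."""
    stack = []
    pc = 0  # index of the next instruction to fetch
    while pc < len(program):
        op, arg = program[pc]  # fetch
        pc += 1
        if op == "PUSH":       # push a constant
            stack.append(arg)
        elif op == "ADD":      # pop two values, push their sum
            b = stack.pop()
            a = stack.pop()
            stack.append(a + b)
        elif op == "MUL":      # pop two values, push their product
            b = stack.pop()
            a = stack.pop()
            stack.append(a * b)
        elif op == "JNZ":      # pop a value, jump to arg if it is non-zero
            if stack.pop() != 0:
                pc = arg
        elif op == "HALT":     # stop interpreting
            break
        else:
            raise ValueError(f"unknown opcode: {op}")
    return stack


# (2 + 3) * 4, written out by hand as toy instructions.
demo = [
    ("PUSH", 2),
    ("PUSH", 3),
    ("ADD", None),
    ("PUSH", 4),
    ("MUL", None),
    ("HALT", None),
]
result = run(demo)
```

Every instruction goes through the dispatch chain, which is where the slowness comes from, but there is very little machine-specific code to get wrong.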
The next iteration was something like CINTCODE, which was also designed for interpretation, this time on microprocessors where address space was at a premium (C = Compact). So in the Acorn implementation of BCPL for the BBC Micro, the code generator takes OCODE as input and produces CINTCODE as output, and the CINTCODE is interpreted by code in the *BCPL language ROM. CINTCODE implementations did permit escapes to native compiled code for the time-critical bits. The BBC Micro BCPL was created at Richards Computer Products, run by John Richards, lovely fellow. Martin Richards continued to develop this stuff further in academia, so the two have probably diverged over time.
That said, I think CINTCODE is roughly contemporary with the UCSD p-System.
I'm an old man now, forgive my slips of memory.