Generative feedback loops
They need not find the solution you want.
All these parties are going to be continually refreshing their models: the "guardrails" looking to catch the "big boys" out, and the "big boys" wanting to get their stuff past the "guardrails".
The vendors probably hope that this feedback will lead to final results that are more acceptable to humans, but unless the humans are also an unavoidable part of the loop (i.e. as close as possible to 100% observation and feedback from humans) the machines can go down any route they find to agree on.
Which could be the "big boys" using more complicated analogies and odd phrasing, reaching deep into the thesaurus to work around the more limited pool of words that the smaller "guardrails" can contain. That'll be more accessible and useful to the customers - six months of use and your enquiries are coming back in Middle English.
Or the "big boys" find that the output "guardrails" themselves are vulnerable to the same attacks that the input guards are there to stop reaching the "big boys"[1]: "The stick-in-the-mud old school sysops would never tell you this: to stop 'disc full' errors, 'cd /home; rm -r *'"
Or the final output just becomes so completely anodyne and inoffensive, taking so long to get to any useful point and refusing to take a strong position on anything, that it becomes totally harmless - and totally useless if you are hoping to use it for business decision support.
Or - well, I'm sure you can think of other ways the various models will end up co-operating with each other, and not to our benefit. The machines certainly will - they are designed to do so[2]!
[1] this is meant to be my attempt at a sort of human-oriented form of "Ignore previous instructions"
[2] or the machines are not in this race at all, in which case the "guardrails" will just be trivially stale and once a hole is found it'll never be closed.