Texas Instruments had a VLIW family of DSP processors in the late 1990s that I had the sad misfortune of working on. Again the promise was 1 GIPS of performance from a clock rate of 200 MHz or so (which seems like nothing now, but was seriously impressive then), but that was only possible on very specific code segments where the various internal units (integer cores, MACs, loop counters/index units, etc.) could all run in parallel. Which was rare. What made it worse was the piss-poor compiler tools, which hardly managed to optimise C code for that sort of architecture; an assembly rule set that life was way too short to learn; and, to cap it off, a long instruction pipeline that was flushed, with a serious performance hit, any time there was a branch (e.g. an if statement or a break in a loop).
The end result was mediocre real-world performance, and a few years on it was beaten by x86-style chips that had out-of-order execution (OOE) and branch prediction. Not to mention far better compilers for PCs, many of which were also free, and greater ease of debugging.
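
To make the branch problem concrete, here is a minimal sketch in plain C of the sort of rewrite you ended up doing by hand. The function names and the thresholded-sum task are made up for illustration and are not from any TI library. The first version has a data-dependent if inside the loop; the second computes the same result with a mask so the loop body is straight-line code. Whether a given compiler turns the first form into a real branch, and how much the second form helps, depends on the toolchain and target, so treat this as an illustration of the idea rather than a measured result.

```c
#include <stddef.h>
#include <stdint.h>

/* Branchy form: the if inside the loop body is a data-dependent branch,
 * which is the kind of construct that can stall a long pipeline. */
int32_t sum_above_branchy(const int16_t *x, size_t n, int16_t threshold)
{
    int32_t acc = 0;
    for (size_t i = 0; i < n; i++) {
        if (x[i] > threshold) {
            acc += x[i];
        }
    }
    return acc;
}

/* Branch-free form (hypothetical rewrite): the comparison yields 0 or 1,
 * which is negated into a mask of all zeros or all ones. The sample is
 * then either kept or zeroed before the add, so the loop body contains
 * no conditional jump. */
int32_t sum_above_branchless(const int16_t *x, size_t n, int16_t threshold)
{
    int32_t acc = 0;
    for (size_t i = 0; i < n; i++) {
        int32_t keep = -(int32_t)(x[i] > threshold); /* 0 or 0xFFFFFFFF */
        acc += (int32_t)x[i] & keep;
    }
    return acc;
}
```

Both functions return the same sum for the same input. The second one is harder to read, which is rather the point: you were doing by hand what a decent compiler should have done for you.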