Finally, a good use for AI: Machine-learning tool guesstimates how well your code will run on a CPU core

MIT boffins have devised a software-based tool for predicting how processors will perform when executing code for specific applications. In three papers released over the past seven months, ten computer scientists describe Ithemal (Instruction THroughput Estimator using MAchine Learning), a tool for predicting the number …

  1. BebopWeBop

    I can certainly see the application of a more generalised system, but at the moment it goes to the heart of the problem with black-box magic predictors: it needs to be fast enough to be a useful part of the design-test loop.

  2. Pascal Monett Silver badge

    "dropping the mean absolute percent error by more than 50 per cent across all benchmarks"

    Okay, they halve the error margin and still produce a result "quickly". Nice to know. It would be even nicer if we had a ballpark number for the analytical error margin, to get a better idea of how important this is.

    As it is, this 50% improvement in guesstimate precision could be half of 80 per cent or half of 10 per cent; we just don't know.

    1. Michael Wojcik Silver badge

      Re: "dropping the mean absolute percent error by more than 50 per cent across all benchmarks"

      There's a link to the paper in the article, you know.

      A quick skim suggests that typical error margins for llvm-mca and IACA are around 17-24%, while those for Ithemal are in the 8-9% range.

      As the paper points out, though, a number of applications just care which of several alternatives is likely to be fastest, and the accuracy of the specific predicted timings doesn't matter as long as the winner is correct. They use Spearman correlation to gauge that, and there Ithemal came in at around 96%, versus ~91% for llvm-mca and ~92% for IACA. So if Ithemal is, say, being used by a compiler to select particular optimization strategies in inner loops, it might well squeeze out a not-insignificant performance improvement over existing implementations.
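
      In case it helps to see what that ranking check looks like, here's a minimal Python sketch of the idea (the cycle counts are invented, and this is not the paper's evaluation code; scipy's spearmanr just computes the same rank correlation the paper reports):

        # Toy illustration of "rank agreement matters more than absolute error".
        # The measured/predicted numbers below are made up, not from the paper.
        from scipy.stats import spearmanr

        measured  = [120.0, 95.0, 140.0, 88.0, 101.0]   # cycles per code variant (hypothetical)
        predicted = [110.0, 99.0, 150.0, 80.0, 104.0]   # a model's estimates (hypothetical)

        rho, _ = spearmanr(measured, predicted)
        print(f"Spearman rank correlation: {rho:.2f}")

        # Even with 10-20% absolute error, a rho near 1.0 means the model still
        # picks the same fastest variant as the hardware would.
        print("Model picks the true winner:",
              min(range(5), key=measured.__getitem__) == min(range(5), key=predicted.__getitem__))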

      Ithemal also seems to frequently do better in edge cases where the other models are way off (see Figure 3 of the paper and accompanying discussion). The authors theorize that these represent cases dominated by undocumented Intel microarchitectural optimizations that even Intel's own IACA does not model.

      The ANN in this application, by the way, is a pretty standard RNN-based Deep Learning network, with LSTM (Long Short-Term Memory) components. That seems like a good choice for this sort of application, since what you want to do is train the network on a set of small, well-labeled atoms (instructions) and then on sequences of them. It's interesting, though, that they also tried a DAG-RNN and it didn't perform as well. Graph ANNs have done well in some other domains; Colyer summarized a paper on them not that long ago. The authors ascribe its underperformance here to the strong effect of specific instruction ordering on microarchitectural optimization.
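
      For anyone curious what that sort of model looks like in code, here's a very rough PyTorch sketch of an LSTM regressor over instruction-token embeddings. To be clear, this is not Ithemal's actual architecture (the real thing is a hierarchical LSTM with its own token encoding and training pipeline); the vocabulary size, dimensions and toy input below are my own assumptions, purely to show the shape of the approach:

        # Loose sketch of an LSTM-based throughput regressor -- NOT Ithemal's model.
        # Vocabulary, embedding/hidden sizes and the toy input are assumptions.
        import torch
        import torch.nn as nn

        class ThroughputLSTM(nn.Module):
            def __init__(self, vocab_size=1024, embed_dim=64, hidden_dim=128):
                super().__init__()
                self.embed = nn.Embedding(vocab_size, embed_dim)    # one id per instruction token
                self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
                self.head = nn.Linear(hidden_dim, 1)                # scalar throughput estimate

            def forward(self, token_ids):
                # token_ids: (batch, sequence_length) integer ids for a basic block
                x = self.embed(token_ids)
                _, (h_n, _) = self.lstm(x)                          # final hidden state summarises the block
                return self.head(h_n[-1]).squeeze(-1)

        model = ThroughputLSTM()
        block = torch.randint(0, 1024, (1, 12))   # made-up token ids, not a real x86 encoding
        print(model(block))                       # untrained, so the value is meaningless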

  3. Michael Wojcik Silver badge

    Auto-vectorization

    The third paper mentioned at the end of the article, on auto-vectorization, also uses an ANN approach, but here it's a graph neural network (specifically a Gated Graph Neural Network), with a classic multilayer-perceptron stack and softmax on top to derive the output. Also, this network is trained using supervised and imitation learning, rather than the plain supervised regression on measured throughputs used for the model in the first paper. If you're interested in the use of ANNs in code optimization, the two make for an interesting comparison of which techniques work in different areas.
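
    And for comparison with the sketch in the previous comment, here's an equally rough PyTorch outline of the "gated graph network, then MLP, then softmax" shape described above. Again, this is not the paper's model: the message-passing scheme, feature sizes, pooling and number of output classes are all my assumptions:

      # Minimal gated-graph-network classifier in the spirit described above --
      # not the paper's design; sizes, propagation steps and classes are assumptions.
      import torch
      import torch.nn as nn

      class TinyGGNN(nn.Module):
          def __init__(self, node_dim=32, steps=4, num_classes=5):
              super().__init__()
              self.steps = steps
              self.msg = nn.Linear(node_dim, node_dim)    # transform neighbour messages
              self.gru = nn.GRUCell(node_dim, node_dim)   # gated update of node states
              self.mlp = nn.Sequential(                   # perceptron stack over the pooled graph
                  nn.Linear(node_dim, 64), nn.ReLU(),
                  nn.Linear(64, num_classes),
              )

          def forward(self, node_feats, adj):
              # node_feats: (num_nodes, node_dim); adj: (num_nodes, num_nodes) 0/1 matrix
              h = node_feats
              for _ in range(self.steps):
                  m = adj @ self.msg(h)                   # aggregate messages along edges
                  h = self.gru(m, h)                      # gated state update per node
              graph_repr = h.mean(dim=0)                  # simple mean pooling over nodes
              return torch.softmax(self.mlp(graph_repr), dim=-1)

      n = 6
      print(TinyGGNN()(torch.randn(n, 32), (torch.rand(n, n) > 0.7).float()))   # untrained class probabilities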
