Thank you for not using the word "parameter"
> Large language models are, by nature, orders of magnitude bigger than that, and they are not human-readable code. They are vast tables of billions or trillions of numerical values, calculated by huge numbers of machines. They cannot be checked or verified or tweaked: it would take cities full of people working for millennia to read them, let alone understand and amend them.
Most articles, including those from The Register, keep using the word "parameters" to describe all of those numerical values. As nicely described by this article, they are not now, never have been and (barring a miracle of computer science) never will be "parameters"!
A parameter is something you feed into a function to generate a known effect: box(x, y, height, width) has four parameters and - here is the important bit - *you*, the human, the programmer, KNOW what they mean! You can set them - and if you don't get the (usually blindingly) obvious results then you can declare that fact as having found a bug[1].
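To make that concrete, here is a minimal sketch of what I mean. The body of box() is my own invention for illustration (the original is just a signature), and I am assuming screen-style coordinates with (x, y) as the top-left corner:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    left: float
    top: float
    right: float
    bottom: float


def box(x: float, y: float, height: float, width: float) -> Box:
    """Hypothetical box(): (x, y) is the top-left corner, y grows downwards."""
    return Box(left=x, top=y, right=x + width, bottom=y + height)


a = box(10, 20, height=5, width=8)
b = box(10, 21, height=5, width=8)

# Changing y by one moves the box by exactly one, and nothing else changes.
# If this ever fails, you have found a bug (in the code or in the docs).
assert b.top - a.top == 1
assert b.bottom - a.bottom == 1
assert (b.left, b.right) == (a.left, a.right)
```

Every argument has a name, a documented meaning and a predictable effect, which is the whole point of calling it a parameter.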
You sometimes come across a "parameter" that isn't well-described, may not even be labelled and then you are either in an "artistic" situation[2] or are considering revenge on the idiot who created this. Consider the annoyance at coming across "I passed in (q + 17.63) here 'cos that made it work, don't know why"!
But when you are faced with billions of values, where you cannot say *why* they are set at that value[3], or how they may differ in meaning or effect from any of the other values in the array, not even the one in an adjacent cell, then they are not "understood". Unless you make an extraordinary effort - and get very lucky - you cannot even say "look, I can't put it into words, but if you add one to this specific value you can see the effect on the output"![4]
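For contrast, a toy sketch of the other situation. The numbers are made up by me and come from no real model, and forward() is a hypothetical two-layer toy, nothing like a production LLM:

```python
import math

# Made-up values standing in for learned weights. None of them has a name
# or a documented meaning; they are just cells in a table.
W1 = [[0.31, -1.27, 0.88],
      [-0.45, 0.09, 1.62]]
W2 = [0.73, -1.14]


def forward(w1, w2, x):
    """Toy network: one tanh hidden layer, then a weighted sum."""
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in w1]
    return sum(w * h for w, h in zip(w2, hidden))


x = [1.0, 0.5, -0.25]
base = forward(W1, W2, x)

# "Add one to this specific value" and see what happens.
nudged_w1 = [row[:] for row in W1]
nudged_w1[1][2] += 1.0
nudged = forward(nudged_w1, W2, x)

print(base, nudged)
```

With five numbers you can still trace by hand why the output moved. Nothing in the table tells you what cell [1][2] is *for*, though, and with billions of cells the hand-tracing is gone as well.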
Do not use the word "parameter" to describe these values.
The numbers inside an LLM are, at best, "weights", as in a "weighted graph". But even that starts to lose meaning when you can't pretend to comprehend what the resulting graph represents. They are, individually, little more than near as damnit arbitrary numbers.
[1] ok, the bug *can* be in either the code itself *or* in the documentation, the thing that provided you with the knowledge in the first place. For example, you may be upset that making 'y' larger moved the rectangle in "the wrong direction", but the error may not be in the code but in the docs, which forgot to remind you this is a screen-graphics function, not a proper maths function, so y points down, not up. Or it is even weirder and you are looking at an (x, z) projection so y is ignored - until you switch on perspective. But all of those interpretations are still explainable, understandable and decently parametric.
[2] e.g. plenty of knobs on synths, especially circuit-bent ones, are very unhelpfully labelled and you aren't even supposed to be able to predict what they'll do, just twiddle, have fun and maybe get lucky with the next Top Ten hit. But even then, *someone* exists who actually knows what is going on: "Of course changing that resistor makes it sound like that, I spent days getting rid of that cross-coupling through that power line!" Or even "bloody singers, if they bothered to read the manual for Autotune it is *obvious* that setting those values will turn this $2000 precise audio analysis and resynthesis tool into a $15 over-clipped crappy diode ring modulator! Why look so surprised?!"
[3] logically, you could, of course, log whenever the value was changed and derive some chain from that, but the log would itself be so ludicrously huge, dwarfing the existing table of numbers, that it could never be done. A shame, as then we could start to work on figuring out what percentage of this bit of output we can attribute to Fred's writings, what percentage came from Jim's copyright work and pay them accordingly.
[4] yes, there was the report where a group found where an LLM "stored the word 'Paris'" and by changing that they affected all the outputs to now say things like "the Eiffel Tower in Margate" - except they put in lots of effort, figured out *one* number (and not, say, all of the entries in this block or row), and didn't report that this gave them a mechanism to find, say, Moscow, London and other capital cities (or anything similar), so they basically got lucky. Now, about the remaining billions of "parameters"...