Re: LLMs cannot summarise
Or it could just be you got lucky.
You'd need to repeat this experiment at least 1000 times, with careful review of each output, before you could draw even a vague conclusion. Three sample points are not enough to conclude anything. And if it still performed perfectly every time after that, I would eat my hat.
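To put rough numbers on why three trials tell you almost nothing, here's a quick sketch (plain Python; the counts are made up purely for illustration): a 95% confidence interval on the success rate after 3/3 perfect runs is still enormous, while 950/1000 actually pins it down.

    import math

    def wilson_interval(successes, n, z=1.96):
        """95% Wilson score interval for a binomial proportion."""
        p = successes / n
        denom = 1 + z**2 / n
        centre = (p + z**2 / (2 * n)) / denom
        half = (z / denom) * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2))
        return centre - half, centre + half

    # Hypothetical counts, just to show the effect of sample size:
    print(wilson_interval(3, 3))      # ~(0.44, 1.00) -- still consistent with near coin-flip reliability
    print(wilson_interval(950, 1000)) # ~(0.93, 0.96) -- now you actually know something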
"Whatever's going on behind the curtain, it's doing more than echoing it's training data."
Sort of. It's a huge tangled ball of wtf. Things like word2vec showed you can do some interesting algebraic 'reasoning' from unsupervised training at the word level, the classic example being king - man + woman ≈ queen, which actually makes sense when you look at how it works.
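You can reproduce that analogy yourself with gensim's pre-trained Google News word2vec vectors (assuming you have gensim installed and don't mind the large download, on the order of a couple of GB):

    import gensim.downloader as api

    # Pre-trained word2vec vectors (Google News, 300 dimensions).
    wv = api.load("word2vec-google-news-300")

    # The classic analogy: vec("king") - vec("man") + vec("woman") ~= vec("queen")
    print(wv.most_similar(positive=["king", "woman"], negative=["man"], topn=3))
    # "queen" typically comes back as the top hit, followed by related royalty terms.

The point is that the arithmetic falls out of nothing but unsupervised co-occurrence statistics, with no explicit reasoning anywhere in the pipeline.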
At the scale of models many orders of magnitude larger than that, nobody really knows why the network does anything it does. Some portions of it may simply be undertrained, but we don't know, because nobody has looked (and we will never have enough data to train them fully anyway, or adding more data will cause previously working parts to start barfing). For anything that requires auditing and accountability, this should be an enormous red flag.