robots.txt is a machine-understandable copyright notice
There is an assumption that if something is available on the web then it can be downloaded for free and used for any purpose.
OK: we know that is not true; there is plenty of content that cannot be used for some purposes, e.g. it may not be sold on.
AI people like to pretend that they have no way of knowing that content cannot be used by them. This is exactly what robots.txt would be perfect for. A tweak that lists allowed/disallowed purposes would mean that operators of web crawlers could no longer claim ignorance of a web site owner's wishes; this could make it easier to sue them in court. A sketch of such a tweak follows.
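To make the idea concrete, here is a hypothetical extension to robots.txt. The Purpose-Disallow directive and the purpose names are invented for illustration; they are not part of the current robots.txt standard (RFC 9309) or any existing proposal:

    # Hypothetical directive: declare purposes for which crawled
    # content may NOT be used. "Purpose-Disallow" and the purpose
    # names below are invented for illustration only.
    User-agent: *
    Allow: /

    # Content may be indexed for search, but not used for
    # AI training or resale.
    Purpose-Disallow: ai-training
    Purpose-Disallow: resale

Since any well-behaved crawler already fetches robots.txt before crawling, a crawler operator would have downloaded and parsed these lines, so "we didn't know" stops being a plausible defence.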
The well-funded AI crowd would fight tooth and nail to be allowed to break copyright with impunity, but some large web sites might be able to win and set a precedent... which would probably still get ignored unless you had enough money to enforce it.