The thing is, NLP models are the ultimate black boxes. We have no way to apply binding constraints on a model's behavior: if a model has been trained on something, you cannot prove with mathematical certainty that it won't reproduce it eventually. And being "almost quite sure" that your model won't reveal my private data to everyone is simply not good enough.
The best way to defend against this kind of attack is to not train models on personally identifiable information. At all. Like, any model that has any PII in its training set should be legally considered a derivative work of that PII, with all that entails.
This makes training them harder? Tough luck.
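To sketch what "no PII in the training set, at all" could mean in practice: the only defense that survives the black-box argument above is filtering *before* training, so the data never reaches the model in the first place. Here's a minimal, illustrative filter that drops whole examples on a few common PII patterns. The pattern list and helper names are my own invention, and real PII detection needs far more than a handful of regexes; this is a sketch of the policy, not a production scrubber.

```python
import re

# Illustrative PII patterns. A real pipeline would use a proper PII
# detection system; regexes alone will miss names, addresses, etc.
PII_PATTERNS = [
    re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),   # email addresses
    re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),      # US SSN-shaped numbers
    re.compile(r"\b\d{3}[ -]\d{3}[ -]\d{4}\b") # phone-shaped numbers
]

def contains_pii(text: str) -> bool:
    """True if any PII pattern fires anywhere in the text."""
    return any(p.search(text) for p in PII_PATTERNS)

def filter_corpus(examples: list[str]) -> list[str]:
    """Drop entire examples that contain PII, rather than redacting
    in place: what never enters the corpus can't be memorized."""
    return [ex for ex in examples if not contains_pii(ex)]

corpus = [
    "The cat sat on the mat.",
    "Contact me at jane.doe@example.com for details.",
    "My SSN is 123-45-6789.",
]
print(filter_corpus(corpus))  # only the first sentence survives
```

Dropping the whole example instead of masking the match is deliberate: redaction leaves the surrounding context in the corpus, and context is often enough to re-identify someone.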