...could scan all emails, find any with red flags and then only store those ones
Right, the red flag being that it's not the 538925th copy of some spam mail. This way they can easily claim to not store most of the email traffic. Surely, advanced algorithms can be used to filter out other trivial dross ("I'll be late tonight, darling" and similar) and they'll be left with the truly dangerous original thought crimes.