Re: since they’re all publicly available anyway
Wrong on all counts.
"Using an image to train an AI does not necessarily entail copying that image, so copyright issues would not apply."
Wrong. Training requires the software to read features off the image, and it can only do that if it has read access to the image, which means the software has the data. Images published online are not automatically licensed for any purpose, and it is illegal to treat them as if they were. If the image is under a noncommercial license, they're in violation. If it's under a royalty-required license, they're in violation. If no license is stated, they could be in violation. It's important to note that copyright isn't just about distribution; it can prevent you from reading without permission too.
"Nor would it violate copyright if you reduced an image to a set of numbers that you stored so that recognition software could use to match two images of the same face."
Wrong. Data is a series of numbers. Text can be copyrighted, and if you store an array of the Unicode values for each character of my text, you've violated my copyright. Now, if you summarized my data into new numbers, those numbers aren't covered by my copyright unless they include most of the original data, but the data you summarized to get them is. You are allowed to keep those numbers, but you weren't allowed to generate them. A court could order them seized or destroyed as the products of criminal activity. That's unlikely, but possible.
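To make the "it's just numbers" point concrete, here is a minimal Python sketch. It is only an illustration under my own assumptions: the function names and the toy summary (length plus a checksum) are made up for this comment and have nothing to do with how any real recognition system works. It shows that an array of Unicode values is the text in another form, while a summary cannot be turned back into the text.

```python
def to_code_points(text):
    """Store text as a list of Unicode code point numbers."""
    return [ord(ch) for ch in text]


def from_code_points(numbers):
    """Rebuild the original text from its code point numbers."""
    return "".join(chr(n) for n in numbers)


def summarize(text):
    """Hypothetical lossy summary: length and a one-byte checksum.

    Many different texts map to the same pair, so the text
    cannot be recovered from it.
    """
    numbers = to_code_points(text)
    return (len(numbers), sum(numbers) % 256)


if __name__ == "__main__":
    original = "You retain ownership rights in your Content."
    numbers = to_code_points(original)

    # The number array is a full copy: it round-trips exactly.
    print(from_code_points(numbers) == original)

    # The summary is derived from the text but does not contain it.
    print(summarize(original))
```

The round trip is the important part: storing `numbers` is storing the work. The summary is a different thing, but you still had to read the whole text to compute it.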
"In addition, you quite likely gave away your copyright to the site that you uploaded the image to (e.g. Facebook, snapchat YouTube etc.) when you clicked on "I agree" to their terms & conditions."
No, you almost certainly did not. Read the terms and conditions. They all include a statement giving the site the right to display your content until you revoke it (some omit the revocation part), but few if any make you turn the copyright over to them. The ones you mention do not, and in fact explicitly state that you retain ownership*. Even if you had turned it over, copyright would still apply, and Clearview didn't get the rights to the data.
*Let's look at the text of some of these:
YouTube: "You retain ownership rights in your Content. However, we do require you to grant certain rights to YouTube and other users of the Service, as described below. [...] For clarity, this license does not grant any rights or permissions for a user to make use of your Content independent of the Service."
Snapchat: "Many of our Services let you create, upload, post, send, receive, and store content. When you do that, you retain whatever ownership rights in that content you had to begin with. But you grant us a license to use that content. How broad that license is depends on which Services you use and the Settings you have selected. [...] Snap Inc. respects the rights of others. And so should you. You therefore may not use the Services, or enable anyone else to use the Services, in a manner that: violates or infringes someone else’s rights of publicity, privacy, copyright, trademark, or other intellectual property right."
Facebook: I'd have to disable a block to read it. It's not worth it.