I didn't say you had to trust me or them. I didn't say all their statistics were wrong or falsified, although when you see statistics from lawyers, you know they are the set that supports their case. That may be the only correct way to view the data, but it might also be very wrong. I do not want to convince you of a particular point. I will, however, explain some things you questioned.
First, you don't understand why I said that 16 employees created the divide they're talking about. Here's how I got that. There were, as the article states, 248 people over 50. 60% of them were fired, which comes to 149 people. Of the set below 50, 54% were fired. Had they fired 54% of those over 50, that would have come to 133.9 people, which we can round to 134, although I rounded it to 133. The 149 actually fired is 16 more than my original rounding of 133, or 15 more than 134. Therefore, the difference that they are pointing to, which is 6 percentage points, is 15-16 people. This may, but does not necessarily, indicate a small sample problem. For a simplified example, if they fired the three-person team mentioned in my original comment, you could say that they fired 100% more women than men, but the difference is one person, which makes it less likely that gender was the reason. This is a simplistic reading, as there are plenty of explanations other than simple chance that could be in play. Some would indicate discrimination and some would indicate the normal running of a business. If the sample were large enough, it would make sense to assume that some factor other than chance was at work. Since it is smaller, that is not as evident, and other factors must be investigated.
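The arithmetic can be checked with a short sketch. It assumes the figures are as quoted from the article (248 people over 50, firing rates of 60% and 54%); the variable names are mine and purely illustrative.

```python
import math

# Figures as quoted from the article (assumed accurate).
over_50_total = 248
over_50_rate_pct = 60    # percent of the over-50 group fired
under_50_rate_pct = 54   # percent of the under-50 group fired

# People over 50 actually fired: 148.8, rounded to a whole person.
actually_fired = round(over_50_total * over_50_rate_pct / 100)

# People over 50 who would have been fired at the under-50 rate: 133.92.
expected_at_lower_rate = over_50_total * under_50_rate_pct / 100

# The gap depends on which way the 133.92 is rounded.
excess_low = actually_fired - math.ceil(expected_at_lower_rate)    # vs. 134
excess_high = actually_fired - math.floor(expected_at_lower_rate)  # vs. 133

print(f"Actually fired: {actually_fired}")
print(f"Expected at the lower rate: {expected_at_lower_rate:.2f}")
print(f"Excess: {excess_low} to {excess_high} people")
```

The point of the sketch is only that a 6 percentage point gap in a group of 248 corresponds to roughly 15 or 16 individuals, depending on rounding.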
The difference becomes larger in other sets. For example, the female-to-male divide consists of 221 people, which is a much larger sample. That still doesn't necessarily indicate discrimination, but it is much less likely to happen by random chance. By far the best number they have is the one limited to engineering roles. The reason is that, if there were a lot of women and few men on the moderation team, then when Musk decided to demolish the moderation team to get started on destroying the business early, the women would end up in a worse situation for a reason that will not count as discrimination in court. They will probably have to build a lot of similar subsets to demonstrate discrimination once you control for the kind of job the fired person was doing.
Bringing in random chance is already a problem, since even with a process as poor as the one used here, people don't get fired by random chance. Calculating how likely this would be if the decisions were made with dice is not the right way to determine whether there was discrimination, any more than you would expect your raise to be determined by flipping a coin. You still might be treated unfairly, but it would be because your employer didn't want to give you more money, didn't value you accurately, or discriminated against you, none of which is random. When you start comparing something to random chance, you open yourself to lots of arguments about what counts as a positive result, and the answer will dramatically change the probability you calculate. When the comparison adds nothing, it is usually better not to bring it up, because you'll end up in a stats fight.