Michael Wojcik

OK - let's put it another way - of the 100 people investigated, and charged with writing the infringing application, 34 of them will be completely innocent and the chances are not good that 32 of the others had anything to do with the application either.

I cannot for the life of me figure out what scenario you are describing, but it doesn't appear to be at all related to anything described in the paper.

First, they're talking about single authorship, so of the hypothetical "100 people investigated" (by, apparently, the world's least-competent police force), only zero or one would be guilty, and at least 99 would be innocent.

Second, let's assume the 0.64 accuracy rate does extend to some pool of 100 candidates that the model has been trained on, and the single guilty party is among them. The classifier is presented with input and indicates candidate A is the closest match. Disregarding all other factors, for some reason, the investigators interview candidate A. There's a 0.64 chance they have the guilty party, and a 0.36 chance they don't. So what? It's a place to start. Picking a starting interviewee at random has only a 0.01 chance of being correct, so they've improved their odds significantly.
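To make that arithmetic concrete, here is a minimal sketch of the calculation. It is not anything from the paper. It assumes a uniform prior over the 100 candidates and assumes that when the classifier is wrong, its wrong answer is spread evenly over the other 99; the function name and parameters are made up for illustration.

```python
from fractions import Fraction


def posterior_top_match(accuracy, pool_size):
    """P(candidate A is the author | classifier picked A), via Bayes' rule.

    Assumptions (illustrative, not from the paper): uniform prior over the
    pool, and errors distributed evenly across the other candidates.
    """
    prior = Fraction(1, pool_size)
    p_pick_given_guilty = accuracy
    p_pick_given_innocent = (1 - accuracy) / (pool_size - 1)
    numerator = p_pick_given_guilty * prior
    denominator = numerator + p_pick_given_innocent * (1 - prior)
    return numerator / denominator


accuracy = Fraction(64, 100)   # the 0.64 figure discussed above
pool_size = 100                # the hypothetical 100 candidates

post = posterior_top_match(accuracy, pool_size)
random_guess = Fraction(1, pool_size)
improvement = post / random_guess

print(float(post), float(random_guess), float(improvement))
```

Under those assumptions the posterior for the top match comes out at 0.64, against 0.01 for a random pick, so the classifier's suggestion is 64 times better than guessing. With a different prior (say, other evidence already pointing at or away from candidate A) the posterior would shift accordingly, which is the point: the output is one piece of evidence to be weighed, not a verdict.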

Third, the hypothetical suggestion that someone might make stupid decisions based on weak evidence doesn't negate the importance of that evidence. A Perfect Bayesian Reasoner would already know it was weak, and would treat it as such. Any other process for accounting for that evidence is inferior, but that's not the fault of the evidence. Nor does that suggestion vacate the importance of the mechanism used to extract that evidence, or of the research that led to the mechanism.

We see in posts like yours a typical Reg commentator fallacy: if there's any objection that can be raised to research, then that research is useless. It's tiresome, sophomoric anti-intellectualism.

