* Posts by jhnc

1 publicly visible post • joined 6 Mar 2022

Apple didn't engage with the infosec world on CSAM scanning – so get used to a slow drip feed of revelations

jhnc

I suspect the reason for your "inexplicable" matches may often be related to Debian bug #87013:

The code finds pairs of images whose fingerprints really do differ by less than the threshold (say: a1=a2, b1=b2, c1=c2, a1=c1, b1=c1). For user convenience it then assumes transitivity and coalesces these pairs into sets (leading to: a1=a2=c1=c2=b1=b2), even though two members of a set, such as a2 and b2, may differ by far more than the threshold. In some cases this is fine; in others, especially with large corpora, it is not.
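To make the effect concrete, here is a toy sketch in Python. It is not the code from the bug report: the 8-bit fingerprints, the threshold of 3, the use of Hamming distance and the function names are all assumptions chosen so that the close pairs come out exactly as in the example above.

```python
from itertools import combinations

def hamming(a: int, b: int) -> int:
    """Number of bit positions in which two fingerprints differ."""
    return bin(a ^ b).count("1")

def close_pairs(fps: dict, threshold: int) -> list:
    """All pairs of names whose fingerprints differ by less than threshold."""
    return [(x, y) for x, y in combinations(sorted(fps), 2)
            if hamming(fps[x], fps[y]) < threshold]

def coalesce(names, pairs) -> list:
    """Merge pairs into sets as if 'close' were transitive (union-find)."""
    parent = {n: n for n in names}

    def find(n):
        while parent[n] != n:
            parent[n] = parent[parent[n]]  # path halving
            n = parent[n]
        return n

    for x, y in pairs:
        parent[find(x)] = find(y)

    groups = {}
    for n in names:
        groups.setdefault(find(n), set()).add(n)
    return list(groups.values())

# Hypothetical 8-bit fingerprints, invented for illustration only.
fingerprints = {
    "a1": 0b00001100, "a2": 0b00001101,
    "b1": 0b11111100, "b2": 0b11111110,
    "c1": 0b00111100, "c2": 0b00111000,
}
THRESHOLD = 3  # assumed value: "close" means fewer than 3 differing bits

pairs = close_pairs(fingerprints, THRESHOLD)
groups = coalesce(fingerprints, pairs)

print(pairs)   # only the genuinely close pairs
print(groups)  # everything merged into a single set
print(hamming(fingerprints["a2"], fingerprints["b2"]))  # distance inside that set
```

Every pair reported by close_pairs is a real match, yet the coalesced set puts a2 and b2 together even though they differ in 6 of 8 bits. The bigger the corpus, the more "bridge" images like c1 there are to chain unrelated images into one set.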

Of course, sometimes the fingerprints of very dissimilar images really are close to each other.