Say I write some code and put my copyright message in it, maybe even register the copyright by whatever pointless method the EU wants ...

How on earth will a system detect infringement of my copyright?

Assuming the code is non-trivial, there will be all sorts of clashes with other code

e.g. my code has a helper method to read text from a file. Yes, it's trivial, but it's handy for unit tests, and if file reading needs ever change it can be done in one place

Let's assume a signature a bit like this

string ReadFileAsText(string fullFilePath)

It's quite possible that the same method signature already exists somewhere among all the other code in a repository

There certainly will be matches on small code fragments. Who has not seen code that counts something and has a line that defines and initializes the count?

e.g. int count = 0

How much "match" will count as infringement: 5%, 10%, 90%, 100%?
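To put a number on the small-fragment problem, here is a rough Python sketch of the most naive "percentage match" I can think of: the share of my distinct lines that also turn up in someone else's code. The function names and both snippets are made up for illustration, and I am not claiming any real scanner works this way.

```python
def distinct_lines(source):
    """Distinct non-blank lines, with leading/trailing whitespace removed."""
    return {line.strip() for line in source.splitlines() if line.strip()}


def line_overlap_percent(mine, theirs):
    """Percentage of my distinct lines that also appear in their code."""
    my_lines = distinct_lines(mine)
    if not my_lines:
        return 0.0
    shared = my_lines & distinct_lines(theirs)
    return 100.0 * len(shared) / len(my_lines)


# Two hypothetical, unrelated fragments: one counts error lines,
# the other counts items above a limit.
MINE = '''
int count = 0;
foreach (var line in lines)
{
    if (line.Contains("ERROR")) count++;
}
return count;
'''

THEIRS = '''
int count = 0;
for (int i = 0; i < items.Length; i++)
{
    if (items[i] > limit) count++;
}
return count;
'''

print(round(line_overlap_percent(MINE, THEIRS), 1))
```

The two loops do unrelated jobs, yet four of my six distinct lines match (about 67%), purely from `int count = 0;`, the braces and `return count;`. Any threshold low enough to catch real copying would flag this too.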

What if it's only 5%, but that 5% happens to be the most innovative methods they have "stolen" from my code, the key "secret sauce" that made it valuable?

What if they take the code and obfuscate it? All the special functionality is there, but the code "text" looks vastly different from the stolen original
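A small Python sketch of why obfuscation defeats text matching. Both functions are invented for illustration, and `difflib.SequenceMatcher` is just a stand-in for whatever text comparison a scanner might use (that part is my assumption): the second function does the same job as the first but shares very little of its text.

```python
import difflib

# Hypothetical "original" code.
ORIGINAL = '''
def count_errors(lines):
    count = 0
    for line in lines:
        if "ERROR" in line:
            count += 1
    return count
'''

# Same behaviour, renamed and restructured.
OBFUSCATED = '''
def f(x):
    return sum(1 for y in x if "ERROR" in y)
'''


def load(source, name):
    """Execute the source text and return the function called `name`."""
    namespace = {}
    exec(source, namespace)
    return namespace[name]


def text_similarity(a, b):
    """Character-level similarity between two strings, from 0.0 to 1.0."""
    return difflib.SequenceMatcher(None, a, b).ratio()


sample = ["ok", "ERROR: disk full", "ERROR: timeout", "done"]
same_behaviour = load(ORIGINAL, "count_errors")(sample) == load(OBFUSCATED, "f")(sample)
similarity = text_similarity(ORIGINAL, OBFUSCATED)

print(same_behaviour, round(similarity, 2))
```

Both versions return the same count for the sample input, while the text similarity comes out well below 1.0. A real obfuscator would do far more damage than this hand-written rename.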

It means lots of work parsing the code, and the parsing will need to be language specific so that comments can be "ignored" when evaluating code copying (otherwise just altering or adding comments would easily bring the degree of matching down)
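As a rough illustration of the comment problem, this Python sketch uses the standard `tokenize` module to compare two fragments with comments and layout thrown away. The fragments and the `code_tokens` helper are made up for the example. Note that it only understands Python source: every other language in a repository would need its own lexer, which is exactly the language-specific work I mean.

```python
import io
import tokenize

# Token types that carry no code: comments, line breaks, indentation markers.
SKIPPED = {
    tokenize.COMMENT, tokenize.NL, tokenize.NEWLINE,
    tokenize.INDENT, tokenize.DEDENT, tokenize.ENDMARKER,
}


def code_tokens(source):
    """Token strings of Python source, skipping comments and layout tokens."""
    tokens = tokenize.generate_tokens(io.StringIO(source).readline)
    return [tok.string for tok in tokens if tok.type not in SKIPPED]


PLAIN = "count = 0\nfor line in lines:\n    count += 1\n"

# Same code, padded with comments and a blank line.
PADDED = (
    "# totally original work\n"
    "count = 0  # start at zero\n"
    "\n"
    "# walk the lines\n"
    "for line in lines:\n"
    "    count += 1  # bump\n"
)

print(PLAIN == PADDED)                            # raw text comparison
print(code_tokens(PLAIN) == code_tokens(PADDED))  # comparison on code tokens only
```

The raw strings differ, but the token lists are identical once comments are dropped. That is the easy case; it does nothing about renamed variables or reordered code.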

Yet more magical thinking that can only cause problems.

There's enough legislation dealing with copyright breaches already. Making repository hosts run copyright scans (which will produce masses of false positives & negatives) achieves little other than letting a bureaucrat tick the "something must be done" box

