Reply to post: Well-established, proven dictionaries and encyclopedias

Oh dear... AI models used to flag hate speech online are, er, racist against black people

Il'Geller

Well-established, proven dictionaries and encyclopedias

You see, companies like OpenAI prefer to annotate using random texts mined from nowhere, which in practice means creating dictionaries and encyclopedias (for annotating) from scratch. For this they use astronomical volumes of raw text, gigabytes upon gigabytes. I tried this method while preparing for NIST TREC QA and came to the conclusion that standard dictionaries and encyclopedias are more suitable, not least because they are compact. For example, Merriam takes only 25 nicely indexed megabytes.

In particular, general dictionaries and encyclopedias are very good because they contain an absolute minimum of bias. Thus, if you want to avoid manifestations of AI racism completely, or at least minimize them, you have to use well-established, proven dictionaries and encyclopedias, and not random texts.
