Lexical noise
Consider the sentence "Alice and Bob exercise merrily, she trains a lot."
The word "merrily" is an example of lexical noise: a typically superfluous pattern that does not explain the central themes of the text. Removing such noise often improves the quality of the structured data.
Now suppose a reader is unsure whether "Merrily" is a name, that is, an athlete who trains a lot and is being exercised by both Alice and Bob. Or is it an adverb? Or both? Read that way, the sentence has "merrily" acting as both a noun and an adverb.
With my patented lexical noise deletion, AI-parsing gets FIVE patterns:
- Alice exercise merrily - 0.25
- Bob exercise merrily - 0.25
- Alice trains a lot - 0.5
- she trains a lot - 0.5
- she trains merrily - 0.5
The AI sees from context and subtext, by comparing dictionary-encyclopedia definitions, that the word "merrily" is an adverb and not a name.
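Here is a minimal Python sketch of what such a dictionary check could look like. The tiny `DICTIONARY` table and the `classify` and `tag` functions are hypothetical, made up for illustration only. This is not the patented parser, and a real dictionary-encyclopedia would also weigh context and subtext.

```python
from typing import List, Tuple

# Toy dictionary: word -> part of speech. Hypothetical data for illustration only.
DICTIONARY = {
    "and": "conjunction",
    "exercise": "verb",
    "merrily": "adverb",
    "she": "pronoun",
    "trains": "verb",
    "a": "article",
    "lot": "noun",
}


def classify(token: str, sentence_initial: bool) -> str:
    """Return a part-of-speech guess for one token.

    Assumption: a capitalized word is treated as a proper noun (a name) when it
    is missing from the dictionary or is capitalized in mid-sentence.
    """
    entry = DICTIONARY.get(token.lower())
    capitalized = token[:1].isupper()
    if capitalized and (entry is None or not sentence_initial):
        return "proper noun"
    return entry if entry is not None else "unknown"


def tag(sentence: str) -> List[Tuple[str, str]]:
    """Split a sentence on whitespace and punctuation, then classify each token."""
    tokens = sentence.replace(",", " ").replace(".", " ").split()
    return [(tok, classify(tok, i == 0)) for i, tok in enumerate(tokens)]


tags = dict(tag("Alice and Bob exercise merrily, she trains a lot."))
print(tags["merrily"])  # the dictionary entry wins, so this is not a name
print(tags["Alice"])    # not in the dictionary and capitalized
```

In this sketch, lowercase "merrily" is resolved through its dictionary entry, while "Alice" and "Bob" fall through to the proper-noun rule.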
Without lexical noise deletion, AI-parsing gets SEVEN patterns (0.1(6) is the repeating decimal 0.1666...):
- Alice exercise merrily - 0.1(6)
- Bob exercise merrily - 0.1(6)
- merrily exercise merrily - 0.1(6)
- merrily exercise a lot - 0.1(6)
- Alice trains a lot - 0.5
- she trains a lot - 0.5
- she trains merrily - 0.5
With the dictionary-encyclopedia, the AI gets TWO synonymous clusters; without it, THREE! The extra one:
- merrily exercise merrily - 0.1(6)
- merrily exercise a lot - 0.1(6)
If the search term "Merrily exercise a lot" comes in, the above sentence will be found even though it has nothing to do with anyone named Merrily!
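The false match can be sketched in Python. The two pattern tables are copied from the lists above, with 0.1(6) written as 1/6. The exact-match `search` function is a simplifying assumption for illustration, not the actual retrieval method.

```python
from typing import Dict, Optional

# Patterns and weights with lexical noise deletion (five patterns).
WITH_DELETION: Dict[str, float] = {
    "alice exercise merrily": 0.25,
    "bob exercise merrily": 0.25,
    "alice trains a lot": 0.5,
    "she trains a lot": 0.5,
    "she trains merrily": 0.5,
}

# Patterns and weights without it (seven patterns); 0.1(6) is written as 1/6.
WITHOUT_DELETION: Dict[str, float] = {
    "alice exercise merrily": 1 / 6,
    "bob exercise merrily": 1 / 6,
    "merrily exercise merrily": 1 / 6,
    "merrily exercise a lot": 1 / 6,
    "alice trains a lot": 0.5,
    "she trains a lot": 0.5,
    "she trains merrily": 0.5,
}


def search(query: str, patterns: Dict[str, float]) -> Optional[float]:
    """Return the weight of the pattern equal to the query, or None if no match.

    Assumption: matching is a case-insensitive, whitespace-normalized exact
    comparison, chosen only to keep the sketch short.
    """
    return patterns.get(" ".join(query.lower().split()))


# Patterns that exist only when "merrily" is also read as a name.
extra = sorted(set(WITHOUT_DELETION) - set(WITH_DELETION))
print(extra)

query = "Merrily exercise a lot"
print(search(query, WITHOUT_DELETION))  # a weight: the sentence is wrongly found
print(search(query, WITH_DELETION))     # None: no false match
```

The set difference contains exactly the two extra "merrily ..." patterns, and only the seven-pattern table returns a hit for the query.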
OpenAI and other companies that do not use my patented technology, and that train on random data rather than on a dictionary-encyclopedia, cannot remove lexical noise. They get results like this one with "Merrily" and are not trustworthy at all.
Do not invest in them! You are gonna lose money!
My AI database can, however, be 100% trusted.