semantic topics within this corpus
First, I cannot take anybody seriously who writes sentences like this: “First, we take a large online corpus, Wikipedia, and use a well-known technique from computational linguistics to identify lists of words constituting semantic topics within this corpus,”
Second, this is yet another one of those prediction methods that are based on the assumption that history repeats itself. It doesn't. Yes, there are some somewhat repeating patterns out there, quite a few in fact. But this is a system with a memory: the last outcome carries over and ensures that you never again have the same set of conditions required for the same outcome.
Third, the funny thing about economics is that it is a man-made system, but not a fixed one. The rules change constantly, and so does the knowledge of the participants (us), which influences how they behave. As a result, any new method of describing the system will change the system*. Thus, if this method actually worked with any certainty, it would cease to do so as soon as people started to believe that it can.
*It is only required that the description is known to at least one participant who changes his behaviour because of it. It is not required that the description or method actually has any validity whatsoever.