Where is this Internet of which they speak?
It seems to me that if you scrape Reddit & Twatter for your training text, the "AI" will be generating an awful lot of 'interesting' language.
Project Gutenberg might be more useful, but the generated patterns/word sequences would resemble turn-of-the-last-century (circa 1900) language...