Re: AI indexing
"In RL, a software agent takes sequential actions aiming to maximize a reward function, or a negative cost function, that embodies the target problem. Successful training of an RL agent depends on balancing exploration of unknown territory with exploitation of existing knowledge." https://www.nature.com/articles/s41534-019-0141-3.pdf