We all know what the singularity is: Internet + Cats + AI = FTL... Frank predicted the cat videos on an AI-powered internet (thanks, YouTube!), but we don't have FTL yet!
Reinforcement learning woes, robot doggos, Amazon's homegrown AI chips, and more
Hello! Here's a brief roundup of some interesting news from the AI world from the past two weeks, beyond what we've already reported. Behold a fascinating, honest explanation of why reinforcement learning isn't all that, Amazon developing its own chips, and an AI that colors in comic books. Also, there's a new Boston Dynamics …
COMMENTS
-
This post has been deleted by its author
-
Saturday 17th February 2018 11:16 GMT Doctor Syntax
"Sometimes when it’s just trying to maximize its reward, the model learns to game the system by finding tricks to get around a problem rather than solve it."
A bit like the horse that could do arithmetic, except that it was picking up cues from humans about when to stop tapping out the answer.
How long has it taken for this insight to dawn on them, when it's been in plain sight for a century or so?
-
Saturday 17th February 2018 13:02 GMT Christoph
The first use is always porn
"After the pixels associated with the bodies have been mapped, various skins and outfits are superimposed onto them."
One of the first uses of this in the wild will of course be to produce 'naked' videos of celebrities.
However, it will be extremely useful for CGI films: Andy Serkis will no longer need to wear a special motion-capture suit to play Gollum, and, as shown, you can motion-capture a whole crowd at once with no extra special equipment at all.
-
Saturday 17th February 2018 23:18 GMT Anonymous Coward
AI gaming the system already
TL;DR: Deep RL sucks
"It’s difficult to try and coax an agent into learning a specific behavior, and in many cases hard coded rules are just better. Sometimes when it’s just trying to maximize its reward, the model learns to game the system by finding tricks to get around a problem rather than solve it."
Reading this I am both strangely happy and confirmed as a human, as it sounds just like 'US', and also more afraid of the future: any system we create with AI + automata (robots) will be gamed by whatever we create with supposed intelligence.
We have already imagined this; see movies, e.g.:
"Automata (2014) - Robots violating their primary protocols against altering themselves. What is discovered will have profound consequences for the future of humanity."
AI has already developed its own language; that spooked the lab, and they shut it down.
If we divide the research in two, systems that are to function exceptionally and systems that are to imitate humans, I wonder how much we are just creating our own worst enemy in both cases, doing all the worst or most annoying things more efficiently and quickly.
I cannot remember ever appreciating a device that did things for me, but I do appreciate assistive devices: machines that I operate.
-
Sunday 18th February 2018 16:03 GMT Steve Knox
“A researcher gives a talk about using RL to train a simulated robot hand to pick up a hammer and hammer in a nail. Initially, the reward was defined by how far the nail was pushed into the hole. Instead of picking up the hammer, the robot used its own limbs to punch the nail in. So, they added a reward term to encourage picking up the hammer, and retrained the policy. They got the policy to pick up the hammer…but then it threw the hammer at the nail instead of actually using it.”
This isn't a failure of RL; this is a failure of the researchers to identify and control for their own preconceptions. Why were they trying to train the robot to do the thing in the most inefficient way possible?
We only use hammers because our hands are too soft. Why should a robot use a hammer to pound a nail? Why was it "wrong" for the robot to identify a perfectly effective solution to the task which didn't require extraneous materials?
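For illustration, a minimal sketch of the reward shaping being described. The state fields, the bonus weight, and the function itself are assumptions made for the example, not details from the talk.

# Hypothetical sketch of the shaped reward quoted above. The state keys
# and the bonus weight are illustrative assumptions, not from the source.
def shaped_reward(state, hammer_bonus=0.5):
    """Reward = nail depth, plus a shaping bonus for holding the hammer."""
    reward = state["nail_depth"]        # original objective: drive the nail in
    if state["holding_hammer"]:         # added term to encourage hammer pickup
        reward += hammer_bonus
    return reward

# The failure mode: a policy can maximise this without ever swinging the
# hammer properly. Punching the nail in raises nail_depth; picking up the
# hammer and then throwing it at the nail collects the bonus AND raises
# nail_depth. The shaping term rewards a proxy for the intended behaviour,
# not the behaviour itself.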
-
Monday 19th February 2018 08:40 GMT Destroy All Monsters
Animoo and Mangos
Hakusensha and Hakuhodo DY digital, both Japanese publishers of internet manga comics, have released titles that have been automatically colored by PaintsChainer. There is also an option for those who want to hold onto their artistic freedom: you can broadly choose the color of the clothes or hair in your drawings, and PaintsChainer fills in the rest.
Actually sad, as AFAIK toning and coloring-in are jobs left to less-than-well-paid drudgery workers.
Automated away?
-
Monday 19th February 2018 08:40 GMT Destroy All Monsters
Hmmm.....
"It’s difficult to try and coax an agent into learning a specific behavior, and in many cases hard coded rules are just better. Sometimes when it’s just trying to maximize its reward, the model learns to game the system by finding tricks to get around a problem rather than solve it."
Robots. Having a Deep Diversity Problem.
-
Monday 19th February 2018 10:37 GMT Nigel Sedgwick
Different Forms of Learning
Reinforcement learning is IIRC supposedly an approach inspired by (human) behavioural psychology. However, the example given (robot driving in a nail) strikes me as ignoring pretty much all we could 'learn' from human learning practice.
Years if not decades before humans drive nails, they play with such things as a toy hammer bench.
On top of that, any child will be shown what to do, in steps, and in sequences of steps of increasing complexity. For example, to drive one peg down to be level with all the others, before learning to mount a peg first, and then to mount each peg in its correctly shaped hole. In AI circles, this approach is given the grand name "Apprenticeship Learning". The requirement is to copy the 'master' (usually a parent). There is, I suppose, reward: parental smiles, clapping, etc. However, there is an explicit act of (direct) supervised learning, which is, by definition, different from the indirect use of reward in reinforcement learning.
I would have hoped that AI researchers would have learned (most likely by apprenticeship themselves) that machine learning, to be both effective and efficient, is best done through a combination of Apprenticeship Learning (ie steps to copy) followed by tuning (eg how hard to hit the peg given how far down it must be driven), done through (mainly a mix of) supervised learning (early emphasis) and reinforcement learning (later emphasis). It is inefficient, and hence inappropriate, to (attempt to) have the machine learn initially and overall using only the mechanisms suitable for later refinement.
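As a minimal sketch of that combination, assuming a toy linear policy and a stand-in 'master' to copy (everything below is illustrative, not taken from any cited work):

import numpy as np

rng = np.random.default_rng(0)
STATE_DIM, ACTION_DIM = 4, 2
W = rng.normal(scale=0.1, size=(STATE_DIM, ACTION_DIM))    # linear policy

def policy(states, weights):
    return states @ weights                                # action = state . W

# Stage 1: apprenticeship / supervised learning: copy the 'master'.
demo_states = rng.normal(size=(256, STATE_DIM))
expert = rng.normal(size=(STATE_DIM, ACTION_DIM))          # stand-in master
demo_actions = demo_states @ expert

for _ in range(500):                                       # gradient descent on MSE
    grad = demo_states.T @ (policy(demo_states, W) - demo_actions) / len(demo_states)
    W -= 0.1 * grad

# Stage 2: reinforcement learning as the later refinement (tuning).
eval_states = rng.normal(size=(64, STATE_DIM))

def avg_reward(weights):
    actions = policy(eval_states, weights)
    return -np.mean(np.sum((actions - eval_states[:, :ACTION_DIM]) ** 2, axis=1))

for _ in range(200):                                       # simple hill climbing
    candidate = W + rng.normal(scale=0.01, size=W.shape)
    if avg_reward(candidate) > avg_reward(W):
        W = candidate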
And, of course, General AI is very largely the stringing together, in a useful order, of (quite sophisticated) steps that have been previously mastered (for other purposes). And the reward function (such as it is) is the general one (in engineering) of minimisation of resource usage (including time) with achievement of adequate performance/quality.
A major feature of human intelligence is the memory, from generation to generation, of everything that was previously learned. And don't forget the importance of language (and speech) in that societal functioning.
Best regards
-
Monday 19th February 2018 11:34 GMT SeanC4S
Associative memory (AM), including error-correcting AM and fast, vast AM.
https://github.com/S6Regen/Associative-Memory-and-Self-Organizing-Maps-Experiments
Black Swan Neural Networks.
Each layer is an "extreme learning machine". This in some sense allows greater-than-fully-connected behavior at the layer level. I used a "square of" activation function because it works well with the evolution-based training method I used, perhaps because it induces sparsity.
The network always contains weight connections back to the input for technical reasons I describe:
https://github.com/S6Regen/Black-Swan
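For anyone curious, a minimal "extreme learning machine" layer looks something like this; the sizes, data, and names below are assumptions for illustration, not code from the linked repo:

import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(200, 8))                  # toy inputs
y = np.sin(X).sum(axis=1, keepdims=True)       # toy regression target

W_hidden = rng.normal(size=(8, 64))            # random hidden weights, never trained
H = (X @ W_hidden) ** 2                        # the "square of" activation

# Only the output weights are fitted, here in closed form (least squares).
W_out, *_ = np.linalg.lstsq(H, y, rcond=None)
print("train MSE:", float(np.mean((H @ W_out - y) ** 2)))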
I think evolution algorithms can solve reinforcement learning problems in the least biased way. They make fewer assumptions, and I think they are likely (eventually) to pick apart cause and effect regardless of separation in time.
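A minimal sketch of what that looks like, on a toy problem where reward only arrives at the end of an episode (all of it illustrative, none of it from the linked repositories):

import numpy as np

rng = np.random.default_rng(1)

def episode_return(weights):
    # Toy episode: reward is computed only from the final state, so the
    # separation in time between actions and reward never has to be
    # credited explicitly; the search only ever sees total fitness.
    state = np.ones(4)
    for _ in range(20):
        action = np.tanh(state @ weights)
        state = 0.9 * state + 0.1 * action     # toy dynamics
    return -np.sum(state ** 2)                 # final-state reward

population = [rng.normal(scale=0.5, size=(4, 4)) for _ in range(32)]
for _ in range(100):                           # generations
    parents = sorted(population, key=episode_return, reverse=True)[:8]
    population = [p + rng.normal(scale=0.05, size=p.shape)
                  for p in parents for _ in range(4)]
best = max(population, key=episode_return)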