Dear Fellow Scholars, this is Two Minute Papers
with Károly Zsolnai-Fehér. This one is going to be huge, certainly one
of my favorites. This work is a combination of several techniques
that we have talked about earlier. If you don’t know some of these terms, it’s perfectly
okay, you can remedy this by clicking on the popups or checking the description box, but
you’ll get the idea even watching only this episode. So, first, we have a convolutional neural
network – this helps process images and understand what is depicted in an image. And a reinforcement learning algorithm – this
helps create strategies, or, to be more exact, it decides what our next action should
be, which buttons we push on the joystick. This technique mixes these two
concepts together, and we call it Deep Q-learning. It is able to learn to play games the
same way a human would – it is not exposed to any additional information in the code;
all it sees is the screen and the current score. When it starts learning to play an old game,
Atari Breakout, at first, the algorithm loses all of its lives without any signs of intelligent
action. If we wait a bit, it becomes better at playing
the game, roughly matching the skill level of an adept player. But here’s the catch, if we wait for longer,
we get something absolutely spectacular. It finds out that the best way to win the
game is digging a tunnel through the bricks and hit them from behind. I really didn’t
know this, and this is an incredible moment – I can use my computer, this box next to
me that is able to create new knowledge, find out new things I haven’t known before. This
is completely absurd: science fiction is not the future, it is already here. It also plays many other games – the percentages
show the algorithm’s score relative to a human player’s. Above 70% means that it’s
great, and above 100% it’s superhuman. As a follow-up work, scientists at DeepMind
started experimenting with 3D games, and after a few days of training, it could learn to
drive on ideal racing lines and pass others with ease. I’ve had a driving license for
a while now, but I still don’t always get the ideal racing lines right. Bravo. I have heard the complaint that this is not
real intelligence, because it doesn’t know the concept of a ball or what exactly it is
doing. Edsger Dijkstra once said, “The question of whether machines can think…
is about as relevant as the question of whether submarines can swim.” Beyond the fact that rigorously defining intelligence
leans more into the domain of philosophy than science, I’d like to add that I am perfectly
happy with effective algorithms. We use these techniques to accomplish different tasks,
and they are really good problem solvers. In the Breakout game, you, as a person, learn
the concept of a ball so that you can use this knowledge as machinery to perform
better. If this is not the case, then whoever knows a lot but can’t use it to achieve anything
useful is not an intelligent being, but an encyclopedia. What about the future? There are two major
unexplored directions: the algorithm doesn’t have long-term memory,
and even if it did, it wouldn’t be able to generalize its knowledge to other, similar
tasks. Super exciting directions for future work. Thanks for watching and for your generous support, and I’ll see you next time!
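
P.S., for the Fellow Scholars who like to tinker: here is a minimal sketch of the Q-learning update at the heart of Deep Q-learning. This is a toy, tabular version on a made-up five-state corridor, not DeepMind’s actual setup; in the real DQN, the Q-table below is replaced by a convolutional neural network that reads the raw screen pixels, and every number here (states, rewards, hyperparameters) is an illustrative assumption.

```python
import random

# Toy environment: a corridor of 5 states (0..4). Reaching state 4 gives
# reward 1 and ends the episode. Two "joystick" actions: step left or right.
N_STATES = 5
ACTIONS = [-1, +1]
ALPHA, GAMMA, EPS = 0.5, 0.9, 0.3  # learning rate, discount, exploration rate

# The Q-table: estimated value of taking each action in each state.
Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}

def step(state, action):
    """Move along the corridor; reward 1 only at the right end."""
    nxt = min(max(state + action, 0), N_STATES - 1)
    reward = 1.0 if nxt == N_STATES - 1 else 0.0
    done = nxt == N_STATES - 1
    return nxt, reward, done

random.seed(0)
for episode in range(200):
    s, done = 0, False
    while not done:
        # Epsilon-greedy: mostly exploit the current estimates, sometimes explore.
        if random.random() < EPS:
            a = random.choice(ACTIONS)
        else:
            a = max(ACTIONS, key=lambda act: Q[(s, act)])
        s2, r, done = step(s, a)
        # The Q-learning update: nudge Q(s, a) toward
        # reward + discounted best value of the next state.
        target = r + (0.0 if done else GAMMA * max(Q[(s2, b)] for b in ACTIONS))
        Q[(s, a)] += ALPHA * (target - Q[(s, a)])
        s = s2

# Greedy policy after training; we expect it to always step right, toward the reward.
policy = [max(ACTIONS, key=lambda act: Q[(s, act)]) for s in range(N_STATES - 1)]
print(policy)
```

Run it, and the greedy policy learns to walk straight toward the reward: the same trial-and-error loop that, at DQN’s scale, digs tunnels in Breakout.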