Using Reinforcement Learning To Train Robots.

Picture of robot

The Problem.

Most autonomous robots today can only do very specialized things, like placing parts on an assembly line at specific times, and they already cost thousands of dollars apiece. If you have many different items to package, you need many different robots. But there is a better way: a subset of Machine Learning known as Reinforcement Learning can solve this problem. Machine Learning is a subset of Artificial Intelligence in which the AI learns how to do something from data. Reinforcement Learning builds on that idea.

What is Reinforcement Learning?

Reinforcement Learning differs from regular Machine Learning in that the AI controls an agent, which is placed in a virtual environment and learns to perform a task there. It’s the same idea as regular Machine Learning, except that the AI gets its data from the environment. You cannot label every possible action as good or bad in advance, so instead you give the AI a positive reward when it gets closer to its goal and a negative one when it gets farther away. These rewards can help it learn that crashing into objects is bad but picking them up is good, for example. Each of the agent's actions is sent to the environment, the system judges whether it was a good action or a bad one, and it returns the reward along with a state: the new state of the environment after the agent has interacted with it.
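To make that loop concrete, here is a minimal sketch in Python. Everything in it is an assumption made up for illustration, not something from a real robot or library: a hypothetical `LineWorld` environment where the agent walks along a line toward a goal, a `step` method that returns the new state, the reward, and a "done" flag, and an agent that just picks random actions. The reward follows the rule described above: positive when the agent gets closer to the goal, negative when it gets farther away.

```python
import random
from dataclasses import dataclass


@dataclass
class LineWorld:
    """Hypothetical toy environment: the agent walks along a line toward a goal."""
    size: int = 10      # positions 0 .. size - 1
    goal: int = 9       # position the agent is trying to reach
    position: int = 0   # where the agent currently is

    def reset(self) -> int:
        """Put the agent back at the start and return the initial state."""
        self.position = 0
        return self.position

    def step(self, action: int):
        """Apply an action (-1 = left, +1 = right); return (state, reward, done)."""
        old_distance = abs(self.goal - self.position)
        # Clamp so the agent cannot walk off either end of the line.
        self.position = max(0, min(self.size - 1, self.position + action))
        new_distance = abs(self.goal - self.position)
        if new_distance < old_distance:
            reward = 1.0    # got closer to the goal
        elif new_distance > old_distance:
            reward = -1.0   # got farther away
        else:
            reward = 0.0    # bumped into the wall, no progress
        done = self.position == self.goal
        return self.position, reward, done


def run_random_episode(env, max_steps=1000, seed=0):
    """Run one episode with an agent that acts at random (no learning yet)."""
    rng = random.Random(seed)
    state = env.reset()
    total_reward = 0.0
    for _ in range(max_steps):
        action = rng.choice([-1, 1])            # the agent chooses an action
        state, reward, done = env.step(action)  # the environment responds
        total_reward += reward
        if done:
            break
    return state, total_reward
```

Calling `run_random_episode(LineWorld())` plays one episode. A learning agent would replace the random choice with a rule that it improves using the rewards it receives.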

Explanation Diagram of reinforcement learning
Robot cooking in the kitchen

How can Reinforcement Learning Help Train Autonomous Robots?

Reinforcement learning can help with the problem of training autonomous robots. Going back to the assembly line example, the robot can use computer vision to detect what kind of item is on the line, and reinforcement learning to figure out how to orient itself to pick that item up. We could use it for more advanced problems too. A crazy, not necessarily practical idea that reinforcement learning could theoretically accomplish is cutting and cooking vegetables for you. You can create a simulation with a positive reward for cutting veggies, placing them in the pot, and cooking them. After hundreds of thousands of training runs (episodes), the agent can learn how to cut and cook the veggies. This illustrates the broad range of problems where Reinforcement Learning is useful!
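As a rough idea of what "learning over many runs" can look like in code, here is a small sketch using tabular Q-learning, one common reinforcement learning algorithm. This is an assumption chosen for illustration, since the article does not name a specific algorithm. The task is also made up: a hypothetical gripper moves along six positions of a conveyor and is rewarded for reaching the item at the last one. The reward values, the learning rate `alpha`, the discount `gamma`, and the exploration rate `epsilon` are all illustrative choices.

```python
import random

N_STATES = 6          # hypothetical gripper positions 0..5; the item is at 5
ACTIONS = (-1, +1)    # move the gripper left or right
GOAL = N_STATES - 1


def step(state, action):
    """Toy simulation: return (next_state, reward, done)."""
    next_state = max(0, min(GOAL, state + action))
    if next_state == GOAL:
        return next_state, 10.0, True    # reached the item: big reward
    return next_state, -1.0, False       # small penalty for every wasted move


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    """Learn a table q[state][action_index] of expected future reward."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            if rng.random() < epsilon:
                a = rng.randrange(2)     # explore: try a random action
            else:
                # Exploit: pick the action with the highest estimate so far.
                a = max((0, 1), key=lambda i: q[state][i])
            next_state, reward, done = step(state, ACTIONS[a])
            # Q-learning update: nudge the estimate toward the reward plus
            # the discounted value of the best action in the next state.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][a] += alpha * (target - q[state][a])
            state = next_state
    return q


def greedy_policy(q):
    """For each position, the action the trained table prefers."""
    return [ACTIONS[max((0, 1), key=lambda i: row[i])] for row in q]
```

After training, `greedy_policy(train())` should prefer moving right (`+1`) from every position before the item. A real robot task would need a far larger state space and usually a neural network instead of a table, but the reward-driven update is the same idea.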

