In recent months, the field of humanoid robotics has taken some major steps forward. These steps take us ever closer to the future portrayed by Hollywood, in which robots are as common as fridges or laptops.
Every day, teams across the world develop humanoid robots capable of performing disaster-rescue missions. Far above Earth, NASA’s “Robonaut 2” performs mundane and dangerous tasks on the International Space Station, ensuring the safety of its human comrades.
One of the biggest challenges in the field of robotics is to take behaviours that humans perform subconsciously, and to translate them into machinery and code.
A perfect example is the challenge of “self-localisation”, which has been the recent focus of my collaborative research with the University of Newcastle’s NUbots group.
Imagine that you awake in a pitch-black room, with no memory of how you arrived (much like the plot of many low-budget horror flicks).
As soon as the lights come on, you would look around, process your surroundings and draw some conclusions as to where you actually are. Perhaps you are in the middle of the room, or in the corner, and perhaps you are directly opposite the door leading to your freedom.
Through these observations of features in your environment, you have determined your position relative to the door, which is information that you require in order to leave. This is the process of “self-localisation”, and is something that a human would do instantly and subconsciously.
To separate the concept of self-localisation from “mapping”, consider a simpler scenario in which you are given a map of the room in advance.
You are able to inspect the map to identify the key features (such as the door or telephone), but have no idea where in the room you will awake. Upon awakening, you can immediately search for these features (ignoring all else), again assisting with your escape.
Solving this problem without a map leads to an area of active research called “simultaneous localisation and mapping” (SLAM). But the simpler scenario (in which a map is provided) is still incredibly complex to implement on an autonomous humanoid robot, and involves several major steps, from detecting visual features to refining position estimates with probabilistic filters.
Each of these major steps has been the focus of much research. But we recently realised that a major question had been overlooked: where should the robot be looking? More concretely, if you wake in a pitch-black room and the lights are switched on, what is the quickest and most efficient way to look around the room to determine where you are?
Given the excellent results of modern machine-learning research, the solution was clear: let the robot work it out for itself!
So what’s the best way for a robot to learn an inherently human trait? This is a complex and unsolved question, but there is intuition in trying to best replicate the process by which a human would learn.
For this work, so-called “motivated reinforcement learning” was implemented, which combines traditional reinforcement learning with motivation theory.
Simply put, reinforcement learning is a framework by which an agent (such as the robot) can determine an optimal policy (or sequence of actions) to maximise expected reward within some environment (the soccer field). In this scenario, the problem is structured so the robot determines a policy of head motions that minimise self-localisation error.
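The loop described above can be sketched with a toy example. This is purely illustrative: the state space, actions, rewards and hyperparameters below are hypothetical stand-ins, not the NUbots implementation. It uses tabular Q-learning, in which the robot's "state" is a discretised head pan angle and the reward is the negative self-localisation error (here, smallest when looking at the field centre).

```python
import random

random.seed(0)

# Toy environment: head pan is discretised into positions 0..4, and the
# (made-up) localisation error is smallest when looking at the centre
# (pan = 2). Reward = -error, so the robot should learn to centre its gaze.
ACTIONS = {"pan_left": -1, "pan_right": +1, "hold": 0}
ALPHA, GAMMA, EPSILON = 0.5, 0.9, 0.2  # learning rate, discount, exploration

q_table = {}  # maps (state, action) -> estimated long-term reward

def q(state, action):
    return q_table.get((state, action), 0.0)

def step(state, action):
    """Apply a head motion and return the new state and its reward."""
    next_state = min(4, max(0, state + ACTIONS[action]))
    reward = -abs(next_state - 2)  # negative self-localisation error
    return next_state, reward

for episode in range(200):
    state = random.randrange(5)
    for _ in range(10):
        # Epsilon-greedy: mostly exploit the best-known head motion.
        if random.random() < EPSILON:
            action = random.choice(list(ACTIONS))
        else:
            action = max(ACTIONS, key=lambda a: q(state, a))
        next_state, reward = step(state, action)
        # Standard Q-learning update toward reward + discounted future value.
        best_next = max(q(next_state, a) for a in ACTIONS)
        q_table[(state, action)] = q(state, action) + ALPHA * (
            reward + GAMMA * best_next - q(state, action))
        state = next_state

# After learning, the greedy policy from the left edge pans right,
# toward the view with the lowest localisation error.
best = max(ACTIONS, key=lambda a: q(0, a))
```

After enough episodes, the learned policy steers the head toward whichever viewpoint minimises the error, which is the essence of learning a head-actuation policy rather than hand-coding one.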
Motivated reinforcement learning extends this framework by having the robot seek out “similar-but-different” experiences when it cannot otherwise determine an optimal action. This behaviour is based on the idea that animals seek “novelty of sensation”, and has previously been applied to studying the progression of architectural designs.
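One simple way to encode this "novelty of sensation" idea is to add an intrinsic bonus to the external reward that decays as an observation becomes familiar. The sketch below uses a count-based bonus of 1/√count; this is a common choice for illustration and is an assumption here, not necessarily the mechanism used in the NUbots work.

```python
import math
from collections import Counter

# How many times each observation has been seen so far.
visit_counts = Counter()

def motivated_reward(observation, external_reward):
    """Combine the task reward with an intrinsic novelty bonus.

    The bonus is largest the first time an observation occurs and
    shrinks as it becomes familiar, nudging the agent toward
    "similar-but-different" experiences.
    """
    visit_counts[observation] += 1
    novelty_bonus = 1.0 / math.sqrt(visit_counts[observation])
    return external_reward + novelty_bonus

# A repeated observation earns a smaller and smaller bonus.
first = motivated_reward("goal_post", 0.0)   # bonus 1/sqrt(1) = 1.0
second = motivated_reward("goal_post", 0.0)  # bonus 1/sqrt(2), smaller
```

In a full system, this combined reward would simply replace the external reward inside the learning loop, so novel viewpoints are favoured whenever the ordinary reward signal cannot distinguish between actions.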
Science favours simplicity, and there is often merit in taking a step back from the textbook methodology for solving a well-known problem.
The problem of self-localisation is no exception, with an optimised head actuation policy shown to yield an 11% improvement in self-localisation performance.
Although this may not sound like much, recent research on improving self-localisation through better probabilistic filters (mentioned earlier) has demonstrated that improving self-localisation accuracy by just 5cm translates to a 14% improvement in goals scored (over 750 games).
And that might just be the difference needed to win us the cup.
Check out some of the fierce moves on display in the RoboCup 2013 trailer.