Bridging the Reality Gap

14 December 2017

Transfer Learning for Connected Autonomous Vehicles: Training in Simulation and Operating in Reality

As part of the FLOURISH project, React AI is deploying their neural network-based AI (known as ‘Brain Squared’) to control the Lutz Pod, in co-ordination with the Transport Systems Catapult.

The Lutz Pod is too large and expensive a piece of machinery for day-to-day testing and development, so we needed a more practical solution to act as a mobile test-bed. We developed ShoeBot, so called because it's roughly the size of a shoebox - we like literal names! ShoeBot hosts the Intel Euclid computing platform, which provides it with a range of sensors, including GPS, inertial sensors, a colour camera and infrared depth sensors, as well as an onboard processor and Wi-Fi connectivity.

A great way to train a neural network is inside a simulation; you can run endless simulations to generate huge sets of training data without the risks and cost of running the real system. But then you hit the ‘reality gap’ – this is where your simulation is never quite the same as reality; it is never as rich, and does not have the quirks and artifacts of the specific sensors deployed to the physical system. Neural nets often struggle to transfer learning from one to the other, hence the specific branch of machine learning dedicated to Transfer Learning.

We decided that the first thing we would do with ShoeBot was to try to bridge the reality gap and demonstrate Transfer Learning - and we decided to make it quite a gap, by training a neural net in a simple 2D game, far removed from reality.

Image 1: Cookie Monster and ShoeBot

We ran up a 'cookie monster' game, where the object is to eat cookies while not hitting the walls, and hooked it up to an untrained instance of Brain Squared. The game sends Brain Squared a stream of simulated sensor messages along with the score. After a few hours, Brain Squared learns how to control the monster to maximise the game's score. Then we wrote data processing code to take ShoeBot's sensor inputs and make them look like the simulated sensors in the game. In place of the game's cookie signal, we processed the camera image to highlight red objects.
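The post doesn't detail the image processing, but a minimal sketch of the kind of red-object highlighting step described might look like this - the threshold values and function name are illustrative assumptions, not the values used on ShoeBot:

```python
import numpy as np

def highlight_red(image, red_min=150, margin=50):
    """Return a binary mask flagging strongly red pixels.

    `image` is an H x W x 3 uint8 RGB array; a pixel counts as red
    when its red channel is high and clearly exceeds green and blue.
    The thresholds here are illustrative only.
    """
    r = image[..., 0].astype(int)
    g = image[..., 1].astype(int)
    b = image[..., 2].astype(int)
    mask = (r >= red_min) & (r - g >= margin) & (r - b >= margin)
    return mask.astype(np.uint8) * 255

# Tiny example: one strongly red pixel, one grey pixel
img = np.array([[[200, 30, 30], [120, 120, 120]]], dtype=np.uint8)
print(highlight_red(img)[0].tolist())  # [255, 0]
```

A mask like this can then be streamed to the network in place of the game's cookie positions.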

First off, Brain Squared did not make much of its new senses; it did not understand its new view of the world. The next step was to visualize the signals in real time and compare the game with ShoeBot, then add further processing to bring other aspects of the signals, such as the range of values from the depth sensors, roughly into line with the game. And it worked! ShoeBot, controlled by a neural net trained in a 2D world, spun around and wheeled off towards the fire extinguisher! You can see in the video clip that it displays the same distinctive twisting motion as the cookie monster agent does in the game.
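Bringing the depth readings "into line with the game" is essentially a rescaling problem. As a rough sketch, assuming the sensor reports distances in millimetres over a working range and the game used a 0-1 range (both limits here are assumptions, not the post's actual calibration):

```python
import numpy as np

def rescale_to_game_range(depth_mm, sensor_min=300.0, sensor_max=4000.0,
                          game_min=0.0, game_max=1.0):
    """Linearly map raw depth readings into the range the game used.

    The sensor limits are illustrative (roughly an infrared depth
    camera's working range in millimetres); readings outside the
    working range are clamped before mapping.
    """
    d = np.clip(depth_mm, sensor_min, sensor_max)
    scale = (game_max - game_min) / (sensor_max - sensor_min)
    return (d - sensor_min) * scale + game_min

readings = np.array([300.0, 2150.0, 4000.0])
print(rescale_to_game_range(readings).tolist())  # [0.0, 0.5, 1.0]
```

Visualising the real and simulated signals side by side, as described above, is what reveals whether a simple linear mapping like this is enough or further shaping is needed.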

Video: Cookie Monster and ShoeBot, Demonstrating Neural Network Transfer Learning

What's Next?

This was a fairly straightforward task (though solved with some of the latest neural net techniques). But it provides a great base for developing techniques for the deployment of automated vehicles, and for mixing training across different environments, physical and simulated. Next up, we will train Brain Squared further using data from physical ShoeBot; specific steps after that will be aligned with our end goals for control of the Lutz Pod. There are a lot of opportunities to transfer learning between the 2D game and its 3D equivalent, the simulated ShoeBot and the real ShoeBot, and the simulated Lutz Pod and the real Lutz Pod. Being able to bridge the reality gap successfully means we can take full advantage of them all, and speed along the development of smarter machine learning for automated vehicles.

Authored by: Nic Greenway, React AI
