comma.ai’s Quest to Build Self-Driving Cars Better than Google

Brad Templeton takes us inside comma.ai's quest to build a self-driving car almost entirely from deep neural networks, one that learns from the humans driving it.


For now, most people are not willing to have the neural network be the entire system, as comma.ai wishes. This approach is probably capable of producing an autopilot that requires human supervision; it is less clear whether it is a good path to a vehicle capable of unmanned or unsupervised operation. If you do have a camera, however, neural networks are going to do a lot to help you understand what the camera sees.

More LIDAR vs. Vision

It may surprise some people to know that Google’s early generation cars barely made use of their cameras. The LIDAR is so useful that the camera was mostly there to see things like the difference between red and green lights (LIDAR doesn’t see lights or color). Since then, however, Google has established itself as one of the world leaders in neural network technology, so it’s a likely supposition that Google has made strong internal efforts to do “sensor fusion” of the LIDAR, cameras and other sensors, and is probably very good at using neural networks to assist its vehicles. Other companies, such as Zoox and Daimler, have shown good skill at fusing camera and LIDAR information.

In 2013, I published an article on the contrast between using LIDAR and cameras. In the article, I pointed out that LIDARs work today and are assured to get cheaper, while vision does not yet work and needs a breakthrough to get the job done. While we have not yet crossed that threshold, new neural network technology holds the promise of being the technology to make the leap.

One of LIDAR’s flaws is that it is generally low resolution. As such, while it is very unlikely to miss an obstacle in front of the car, it might have trouble figuring out just what the obstacle is. Fusion of LIDAR and camera with CNNs will make those systems much better at this, and knowing what things are means making better predictions about where they will go in the future.
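
As a rough illustration of that division of labor, here is a minimal sketch in which LIDAR supplies position and a camera classifier supplies identity. Every name in it (`LidarObstacle`, `CameraDetection`, `fuse`, the bearing-overlap matching rule and the `min_score` threshold) is a hypothetical stand-in chosen for the example, not a description of how Google, Zoox, Daimler or anyone else actually does fusion.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class LidarObstacle:
    bearing_deg: float  # angle from straight ahead, negative = left
    range_m: float      # distance measured by the LIDAR


@dataclass
class CameraDetection:
    left_deg: float     # left edge of the detection, as a bearing
    right_deg: float    # right edge of the detection, as a bearing
    label: str          # class name, e.g. from a CNN (assumed)
    score: float        # classifier confidence in [0, 1]


@dataclass
class FusedObject:
    bearing_deg: float
    range_m: float
    label: str


def fuse(obstacles: List[LidarObstacle],
         detections: List[CameraDetection],
         min_score: float = 0.5) -> List[FusedObject]:
    """Attach a camera label to each LIDAR obstacle when bearings overlap."""
    fused = []
    for ob in obstacles:
        candidates = [
            d for d in detections
            if d.left_deg <= ob.bearing_deg <= d.right_deg
            and d.score >= min_score
        ]
        best = max(candidates, key=lambda d: d.score, default=None)
        # The obstacle is kept even when the camera has no opinion:
        # LIDAR says something is there, vision only says what it is.
        label = best.label if best is not None else "unknown"
        fused.append(FusedObject(ob.bearing_deg, ob.range_m, label))
    return fused


obstacles = [LidarObstacle(-2.0, 40.0), LidarObstacle(10.0, 25.0)]
detections = [CameraDetection(-4.0, 0.5, "cyclist", 0.9)]
for obj in fuse(obstacles, detections):
    print(obj)
```

The design choice worth noting is that an unlabeled obstacle is never dropped; the camera can only add information to what the LIDAR has already sensed.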

Cameras might also help with LIDAR’s other limitation - the roughly 100m range of near infrared LIDAR for dark objects (such as black cars). At highway speed, you want more than 100m of range. Vision doesn’t have a range limit (except at night), though the further things are from you, the more resolution you need in your image, at least in the areas you are paying attention to (mostly the road far ahead). The more resolution, the more CPU it takes to run the vision processing. That’s why MobilEye’s newer units feature 3 fields of view. One is a “telephoto” view for seeing things further away directly in front, and there are two wider views to see things closer and more to the sides. This is a good strategy for making use of vision. It’s even better if, knowing the curvature of the road ahead, you can focus your attention only on the road, and not waste pixels or processing on the things to the side of the road.
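
To put rough numbers on both points, here is a small back-of-the-envelope sketch. All the figures in it are assumptions picked for illustration: a 1.5 second reaction time, 7 m/s² of braking, a 1.8m wide car, a 1280 pixel wide image, and 30 and 120 degree fields of view. None of them are MobilEye specifications, and the camera is treated as an ideal pinhole.

```python
import math


def stopping_distance_m(speed_kmh, reaction_s=1.5, decel_ms2=7.0):
    """Reaction distance plus braking distance (assumed constants)."""
    speed_ms = speed_kmh / 3.6
    return speed_ms * reaction_s + speed_ms ** 2 / (2 * decel_ms2)


def pixels_on_target(object_width_m, distance_m, hfov_deg, image_width_px):
    """Horizontal pixels covered by an object straight ahead (pinhole model)."""
    # Focal length expressed in pixels, derived from the horizontal field of view.
    focal_px = (image_width_px / 2) / math.tan(math.radians(hfov_deg) / 2)
    return focal_px * object_width_m / distance_m


print(f"stopping distance at 120 km/h: {stopping_distance_m(120):.0f} m")
for hfov in (120, 30):
    px = pixels_on_target(1.8, 200, hfov, 1280)
    print(f"{hfov:>3} degree view: car at 200 m covers about {px:.1f} px")
```

Under these assumptions the stopping distance at 120 km/h comes out above 100m, and narrowing the field of view from 120 to 30 degrees gives several times as many pixels on a distant car from the same sensor, which is the case for a dedicated telephoto view.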

Some people hope vision systems will get good enough to make a car that, like a human, can drive any road without a map. Mostly this technique is applied to very simple roads, like highways, which are all similar and easy to understand. However, this is an example of the wrong type of thinking about AI. Just as airplanes do not fly by flapping their wings like birds, robotic systems should not necessarily be aimed at doing what humans do the way that we do it. A vision system that can classify everything in its view sufficiently well to drive without a map is also a system that’s very useful in building a map with less or no human assistance.

The map-making system can do its work with the benefit of a car driving over the territory more than once, and with the benefit of as much cloud supercomputer time as is necessary to do the best job. It would be foolish to throw away the great things a map can give you for the false goal of driving unknown roads with no data. Of course, cars need some ability to drive when the real world has changed from their map, but that’s a fairly rare event (which only happens to the very first car to encounter the change) and so it is not necessary that the vehicle be quite as capable in those situations - or if it has a human driver on board, it need not be capable at all.

Where this approach (probably) falls down

I wrote in January about how testing is the real blocking problem in robocars - that while there are many challenges in getting robocars working, one of the biggest is proving (to yourself and others) that you have really done it.

Neural networks face a problem here because it’s harder to know that they’re working. You don’t know why they are working; you can only measure their performance. You can re-test your network on all your old sensor data, but you have a hard time being sure that the latest training you have given it won’t create a problem that wasn’t there before in some new situation.
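
The re-testing step can be sketched as a simple regression check over recorded drives. This is a toy, and everything in it is assumed for the example: the recording format, the `find_regressions` helper, the idea of scoring against the human driver's steering angle, and the 2 degree tolerance.

```python
from typing import Callable, Dict, List, Sequence

# A "model" here is any function from sensor values to a steering angle.
Model = Callable[[Sequence[float]], float]


def find_regressions(recordings: List[Dict],
                     old_model: Model,
                     new_model: Model,
                     tolerance_deg: float = 2.0) -> List[str]:
    """Return ids of recordings the old model handled but the new one does not."""
    regressions = []
    for rec in recordings:
        target = rec["human_steering_deg"]
        old_err = abs(old_model(rec["sensors"]) - target)
        new_err = abs(new_model(rec["sensors"]) - target)
        # Only flag cases that used to pass and now fail.
        if old_err <= tolerance_deg < new_err:
            regressions.append(rec["id"])
    return regressions


# Toy data: one sensor value that happens to equal the right steering angle.
recordings = [
    {"id": "straight", "sensors": [0.0], "human_steering_deg": 0.0},
    {"id": "gentle_curve", "sensors": [5.0], "human_steering_deg": 5.0},
    {"id": "sharp_curve", "sensors": [20.0], "human_steering_deg": 20.0},
]


def old_model(sensors):
    return sensors[0]


def new_model(sensors):
    # After "retraining", this toy model no longer steers hard enough.
    return min(sensors[0], 10.0)


print(find_regressions(recordings, old_model, new_model))
```

A check like this only covers situations that have already been recorded. It says nothing about a new situation the retrained network has never seen, which is exactly the gap described above.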

On the other hand, traditional systems are so complex that it is difficult to judge their performance too. If the test is, “Drive one million km with fewer than 2 incidents, including every complex situation imagined in the simulator or recorded in the real world,” then it could be that the regimen is the same no matter how the system makes its decisions.
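
One way to see how much such a test can prove is to ask how often a system with a given true incident rate would pass it. The sketch below assumes incidents are independent and arrive as a Poisson process, which is a simplification made for the example; `pass_probability` is a hypothetical helper, not part of any testing standard.

```python
import math


def pass_probability(true_rate_per_million_km,
                     test_distance_million_km=1.0,
                     max_incidents=1):
    """Chance of seeing at most max_incidents during the test (Poisson model)."""
    expected = true_rate_per_million_km * test_distance_million_km
    return sum(
        math.exp(-expected) * expected ** k / math.factorial(k)
        for k in range(max_incidents + 1)
    )


# "Fewer than 2 incidents" means at most 1 incident in one million km.
for rate in (0.5, 1.0, 2.0, 5.0):
    print(f"true rate {rate} per million km -> "
          f"passes with probability {pass_probability(rate):.2f}")
```

Under this model the calculation depends only on the incident count and the distance driven, not on whether the decisions came from a neural network or from traditional code, which fits the point that the regimen could be the same either way.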

Strange legal benefit

Neural networks may also have a bizarre - I might even say perverse - advantage in the legal system. When there is an accident, there will be a scramble to figure out why. In particular, the plaintiff’s lawyers will be keen to show some negligence on the part of the developers.

With traditional code, you might discover the cause was a classic old-style bug, like the famous off-by-one error or any other such problem. You will see the cause of the bug (and fix it), but you might now be able to claim that the programmer, or the QA process, was negligent in some way.

With a neural network, there is no traditional code. If the network makes an error, we won’t know a lot about why, and so there is less likely to be a particular negligent human or negligent act, unless the court decides the whole idea of using the neural network is negligent.

That’s a more complex question, but generally if a team is following established good practices, it’s harder to find negligence. Negligence isn’t just any sort of mistake; it’s a mistake that good and diligent people should have avoided but didn’t because they got careless.

The perverse factor here is that knowing less about how your system works may make it less likely somebody can claim you were careless.

This article was republished with permission from Brad Templeton’s Robocars Blog.




About the Author

Brad Templeton · Brad Templeton is a developer of and commentator on self-driving cars. He writes and researches the future of automated transportation at Robocars.com.
Contact Brad Templeton: 4brad@templetons.com


