You are behind the wheel of your car, but you are exhausted. Your shoulders droop, your neck sags, your eyelids slide down. With your head tilted forward, you veer off the road and hurtle through a field, crashing into a tree.
But what if your car's monitoring system recognized the tell-tale signs of drowsiness and prompted you to pull off the road and park instead? The European Commission has ruled that from this year, new vehicles must be equipped with systems to detect distracted and drowsy drivers to help prevent accidents. Now a number of startups are training artificial intelligence systems to recognize the giveaways in our facial expressions and body language.
These companies are taking a novel approach to AI. Instead of filming thousands of real drivers falling asleep and feeding that footage into a deep-learning model to "learn" the signs of sleepiness, they are creating millions of fake human avatars to re-enact the sleepy cues.
“Big data” defines the field of AI for a reason. To accurately train deep learning algorithms, the models must have a large number of data points. That poses problems for a task like recognizing a person falling asleep at the wheel, which would be difficult and time-consuming to film in thousands of cars. Instead, companies have started building virtual data sets.
Synthesis AI and Datagen are two companies using this approach. They collect raw data from real people via 3D full-body scans, including detailed face scans, and motion data captured by sensors placed all over the body. This data is fed through algorithms that tweak various dimensions many times over to create millions of 3D representations of humans, resembling characters in a video game, exhibiting different behaviors across different simulations.
In the case of someone falling asleep at the wheel, they might film a human actor dozing off and combine it with motion capture, 3D animation and other techniques used to make video games and animated films, to build the desired simulation. "You can map [the target behavior] across thousands of different body types, different angles, different lighting, and also add variability in movement," said Yashar Behzadi, CEO of Synthesis AI.
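The idea of mapping one target behavior across many body types, camera angles and lighting conditions is a form of domain randomization. The real pipelines at companies like Synthesis AI are proprietary, so the following is only a minimal sketch of the concept; every parameter name and value here is an illustrative assumption, not their actual system.

```python
import random

# Hypothetical rendering parameters -- purely illustrative stand-ins
# for the knobs a real synthetic-data pipeline would expose.
BODY_TYPES = ["slim", "average", "broad"]
CAMERA_YAWS_DEG = [-30, -15, 0, 15, 30]
LIGHTING = ["dawn", "noon", "dusk", "night"]

def sample_scene(behaviour, rng):
    """Sample one rendering configuration for a target behaviour.

    Each dict describes a single labeled synthetic sample: the same
    behaviour (the label) rendered under randomized conditions.
    """
    return {
        "behaviour": behaviour,                   # e.g. "falling_asleep"
        "body_type": rng.choice(BODY_TYPES),
        "camera_yaw_deg": rng.choice(CAMERA_YAWS_DEG),
        "lighting": rng.choice(LIGHTING),
        "motion_jitter": rng.uniform(0.0, 1.0),   # variability in movement
    }

def generate_dataset(behaviour, n, seed=0):
    """Generate n randomized, pre-labeled scene configurations."""
    rng = random.Random(seed)
    return [sample_scene(behaviour, rng) for _ in range(n)]

dataset = generate_dataset("falling_asleep", 1000)
```

Because every sample is generated from a known label, the usual manual annotation step disappears: the label is attached at creation time rather than inferred afterwards.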
Using synthetic data sidesteps much of the messiness of the more traditional way of training deep-learning algorithms. Normally, companies would have to amass a vast collection of real-life footage, and low-paid workers would painstakingly label each of the clips. These would then be fed into the model, which would learn to recognize the behaviors.
The big sell for the synthetic data approach is that it is faster and cheaper by a wide margin. But these companies also claim it can help tackle the bias that gives AI developers huge headaches. It is well documented that some AI facial-recognition software is poor at recognizing and correctly identifying certain demographic groups. This tends to be because those groups are underrepresented in the training data, meaning the software is more likely to misidentify these people.
Niharika Jain, a software engineer and expert on gender and racial bias in generative machine learning, points to the infamous example of Nikon Coolpix's "blink detection" feature, which, because the training data consisted mostly of white faces, disproportionately judged Asian faces to be blinking. "A good driver-monitoring system should avoid misidentifying members of a certain demographic as asleep more often than others," she says.
The typical response to this problem is to gather more data from the underrepresented groups in real-life settings. But companies like Datagen say this is no longer necessary. The company can simply create more faces from the underrepresented groups, meaning they will make up a bigger proportion of the final data set. Real 3D face-scan data from thousands of people is churned into millions of AI composites. "There's no bias baked into the data; you have full control of the age, gender and ethnicity of the people that you're generating," said Gil Elbaz, co-founder of Datagen. The uncanny faces that result don't look like real people, but the company claims they're similar enough to teach AI systems how to respond to real people in similar scenarios.
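Controlling the demographic mix of a generated data set amounts to deciding how many synthetic faces to produce per group so that each group reaches the same total. Datagen's actual tooling is proprietary; this is a hedged back-of-the-envelope sketch, with made-up group names and counts, of that rebalancing arithmetic.

```python
def generation_plan(real_counts, target_total):
    """Given counts of real scans per demographic group, compute how many
    synthetic samples to generate so every group is equally represented
    in a data set of target_total samples.

    real_counts: dict mapping group name -> number of real samples.
    Returns a dict mapping group name -> synthetic samples to generate.
    """
    groups = list(real_counts)
    per_group = target_total // len(groups)  # equal share per group
    # Generate only the shortfall; well-represented groups may need none.
    return {g: max(0, per_group - real_counts[g]) for g in groups}

# Illustrative numbers: group_a is overrepresented in the real scans.
real = {"group_a": 7000, "group_b": 2000, "group_c": 1000}
plan = generation_plan(real, target_total=30000)
# plan -> {"group_a": 3000, "group_b": 8000, "group_c": 9000}
```

The point of the sketch is that with a generator on hand, balancing the final distribution is a deliberate design parameter rather than an accident of whatever footage was collected.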
However, there is some debate as to whether synthetic data can really eliminate bias. Bernease Herman, a data scientist at the eScience Institute at the University of Washington, says that while synthetic data can improve the robustness of facial recognition models on underrepresented groups, she doesn’t believe that synthetic data alone can close the gap between the performance of those groups and others. While the companies sometimes publish academic papers showing how their algorithms work, the algorithms themselves are proprietary, so researchers cannot independently evaluate them.
In areas such as virtual reality, as well as robotics, where 3D mapping is important, synthetic data companies argue that it might actually be preferable to train AI on simulations, especially as 3D modeling, visual effects and gaming technologies improve. “It’s only a matter of time before… you can create these virtual worlds and fully train your systems in a simulation,” Behzadi says.
This kind of thinking is gaining traction in the autonomous-vehicle industry, where synthetic data is becoming increasingly important for teaching self-driving AIs to navigate the road. The traditional approach – filming hours of footage and feeding it into a deep-learning model – was enough to get cars navigating roads relatively well. But the problem vexing the industry is how to get cars to reliably handle what are known as "edge cases" – events rare enough that they seldom appear even in millions of hours of training data. For example, a child or a dog running into the road, complicated roadworks, or even a set of traffic cones placed in an unexpected position, which was enough to confound a driverless Waymo vehicle in Arizona in 2021.
With synthetic data, companies can create endless variations of scenarios in virtual worlds that rarely occur in the real one. "Instead of driving millions of miles to collect more samples, they can artificially generate as many samples as they need of the edge case for training and testing," said Phil Koopman, an associate professor of electrical and computer engineering at Carnegie Mellon University.
AV companies such as Waymo, Cruise and Wayve increasingly rely on real-life data combined with simulated driving in virtual worlds. Waymo has created a simulated world using AI and sensor data collected from its self-driving vehicles, complete with artificial raindrops and sun glare. It uses this to train vehicles on normal driving situations, as well as the trickier edge cases. In 2021, Waymo told the Verge that it had simulated 15 billion miles of driving, versus a mere 20 million miles of real driving.
An added benefit of testing autonomous vehicles in virtual worlds first is minimizing the chance of very real accidents. "A large reason self-driving is at the forefront of a lot of the synthetic data stuff is fault tolerance," says Herman. "A self-driving car making a mistake 1% of the time, or even 0.01% of the time, is probably too much."
In 2017, Volvo's self-driving technology, which had been taught how to respond to large North American animals such as deer, was baffled when it encountered kangaroos for the first time in Australia. "If a simulator doesn't know about kangaroos, no amount of simulation will create one until it is seen in testing and designers figure out how to add it," Koopman says. For Aaron Roth, a professor of computer and cognitive science at the University of Pennsylvania, the challenge will be to create synthetic data that is indistinguishable from real data. He thinks it is plausible that we are at that point for face data, as computers can now generate photorealistic images of faces. "But for a lot of other things" – which may or may not include kangaroos – "I don't think we're there yet."