For the last two months I’ve been closely following the progress of Cory Strassburger of Los Angeles-based VR developer Kite and Lightning as he started working with an iPhone X and its TrueDepth camera. He’s trying to use the new iPhone’s marquee feature as an inexpensive way of capturing facial expressions for the startup’s upcoming VR game, Bebylon: Battle Royale.
The idea is that if Strassburger can squeeze enough subtle expression out of the sensor, he could bring to life the absolutely ridiculous characters (“Bebys”) that will inhabit the game’s world, all on a budget that won’t break the bank.
Last month he showed a pretty compelling example of the facial capture system and outlined plans to combine the data with an Xsens full body suit to do complete performance capture.
Now, you can check out the hysterical results.
“Our whole game is based on having serious fun while expressing the wildest version of yourself in VR to your opponent and a live VR audience,” Strassburger wrote in an email. “So the best part of seeing these results for me is knowing I have in my grasp a means to accomplish a part of this.”
He said he bought a “tactical” helmet (like one used for paintball) and GoPro mounting hardware that he modified to hold his iPhone. His first test (seen near the end of the video) was a huge failure: the mounting arm snapped off, sending his shiny new iPhone straight to the floor. Luckily, it survived, and after that he added elastic cords as a safety tether. The whole rig cost him less than $200.
“The key to getting a tactical helmet that will work is the head fastener feature: basically the same fastener you find in a construction helmet, where you tighten an inner strap inside the helmet to secure it to your head,” Strassburger wrote. “I mention this because many of the cheap tactical helmets didn’t have this feature, and without it the helmet would not stay on your head with the additional weight of the arm and iPhone X.”
He recommends an aluminum arm for the phone mount, since a plastic one is likely to warp out of shape, and he added a counterweight to the back of the helmet to balance it out. And because he apparently doesn’t have enough phones on his head already, he added a second one to record his voice and film the iPhone X so people could see the raw facial capture.
“Initially I thought the results were going to be terrible because the drastic head rotation and movement caused the iPhone X to constantly lose its stable track, so I saw a lot of flickering of the virtual head,” Strassburger wrote. “Oddly, this had no apparent effect on the actual expression/blendshape data, so regardless of how terribly it flickered during a capture (you can see it in the video), the resulting expression data was as solid as if I were holding still. In fact, not a single test resulted in lower-than-expected-quality facial capture data. Of course, if I were using the iPhone X to capture the head position and rotation, that data may have been compromised, but I was using the Xsens motion capture suit for the head.”
The app he built for the iPhone X includes features to streamline the capture process, converting the data on the fly to Maya’s .ANIM format. He also made it easy to AirDrop the recorded data to his Mac’s desktop, saving him the step of physically plugging the phone into his computer to copy the files.
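Strassburger hasn’t published the app itself, but ARKit’s face-tracking API makes the general shape of it easy to picture. Below is a minimal sketch of per-frame blendshape capture with a simplified .ANIM-style export; the class name, file layout, and timing scheme are our assumptions, not Kite and Lightning’s code, and the real Maya .anim format carries more header data than shown here.

```swift
import ARKit
import QuartzCore

// Sketch of per-frame blendshape capture with a simplified .ANIM-style
// export. Illustrative assumptions throughout; not Kite and Lightning's code.
final class FaceCaptureRecorder: NSObject, ARSessionDelegate {
    private var frames: [(time: TimeInterval,
                          shapes: [ARFaceAnchor.BlendShapeLocation: NSNumber])] = []

    func start() -> ARSession {
        let session = ARSession()
        session.delegate = self
        session.run(ARFaceTrackingConfiguration())
        return session
    }

    // ARKit updates the ARFaceAnchor every frame. The blendshape
    // coefficients (one 0-1 value per expression, e.g. jawOpen) sit in a
    // dictionary separate from the head transform.
    func session(_ session: ARSession, didUpdate anchors: [ARAnchor]) {
        guard let face = anchors.compactMap({ $0 as? ARFaceAnchor }).first else { return }
        frames.append((CACurrentMediaTime(), face.blendShapes))
    }

    // Write one animation curve per blendshape as plain ASCII keyframes.
    func export(to url: URL, fps: Double = 60) throws {
        let start = frames.first?.time ?? 0
        var out = "animVersion 1.1;\n"
        for location in Set(frames.flatMap { $0.shapes.keys }) {
            out += "anim \(location.rawValue);\nkeys {\n"
            for frame in frames {
                let value = frame.shapes[location]?.doubleValue ?? 0
                out += String(format: "  %.2f %.4f;\n", (frame.time - start) * fps, value)
            }
            out += "}\n"
        }
        try out.write(to: url, atomically: true, encoding: .utf8)
    }
}
```

Notably, the expression coefficients and the head transform are separate outputs in this API, which lines up with Strassburger’s observation that the flickering head track left the blendshape data intact.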
They bought the Xsens MVN Link performance capture suit about a year ago but Strassburger hadn’t unboxed it until this test.
“It was super easy to set up and use,” he wrote. “I literally followed a few online tutorials and, within a ridiculously short time, was dancing around my living room and capturing data, wirelessly. I had to learn nothing… so far. The resulting data was better and cleaner than I expected.”
He said it took about 10 minutes after a performance to see it translated into Maya, and even that only because he had to manually sync up the audio.
“As I create trailers and cinematics, this data will serve as a really good basis and minimize cleanup and augmentation. It is basically a movie pre-vis animation pipeline that’s cheap, very easy to use and good enough for us to use as real data in our game without any notable sacrifices,” he wrote.
Next Steps
Strassburger plans to add voice capture to his iPhone X app. Right now he’s manually syncing the sound to the video, so this addition would save time and eliminate the second phone.
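He hasn’t shared how he’ll implement it, but on iOS the obvious route is an AVAudioRecorder running alongside the ARKit session. A rough sketch, with the file format and clock handling as assumptions:

```swift
import AVFoundation

// Rough sketch of in-app voice capture running alongside the face capture.
// Recording audio and blendshape frames against the same start time would
// remove the manual sync step. Format and settings here are assumptions.
func startVoiceCapture(to url: URL) throws -> AVAudioRecorder {
    let audioSession = AVAudioSession.sharedInstance()
    try audioSession.setCategory(.record, mode: .default)
    try audioSession.setActive(true)

    let settings: [String: Any] = [
        AVFormatIDKey: Int(kAudioFormatMPEG4AAC),  // AAC in an .m4a container
        AVSampleRateKey: 44_100.0,
        AVNumberOfChannelsKey: 1
    ]
    let recorder = try AVAudioRecorder(url: url, settings: settings)
    recorder.record()
    // Capturing a shared reference clock here (e.g. CACurrentMediaTime())
    // and stamping blendshape frames with it would let Maya line the two
    // tracks up without hand syncing.
    return recorder
}
```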
He’s also considering capturing split-screen video to show the performer’s face without the Beby overlaid, as well as streaming the data wirelessly to the desktop, which would be useful when he’s running capture sessions with other performers.
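Neither feature exists yet, so anything here is speculative, but the streaming half could be as simple as firing each frame’s coefficients at a desktop listener over UDP. One possible approach, sketched with Apple’s Network framework; the host, port, and JSON wire format are all assumptions:

```swift
import Foundation
import Network

// Speculative sketch of streaming blendshape frames to a desktop listener
// over UDP. Host, port, and the JSON wire format are assumptions.
final class BlendShapeStreamer {
    private let connection: NWConnection

    init(host: String, port: UInt16) {
        connection = NWConnection(host: NWEndpoint.Host(host),
                                  port: NWEndpoint.Port(rawValue: port)!,
                                  using: .udp)
        connection.start(queue: .global())
    }

    // Encode one frame as {"time": t, "shapes": {"jawOpen": 0.42, ...}} and
    // fire it off. UDP keeps latency low, and an occasionally dropped frame
    // is harmless for a live preview.
    func send(time: TimeInterval, shapes: [String: Double]) {
        let frame: [String: Any] = ["time": time, "shapes": shapes]
        guard let data = try? JSONSerialization.data(withJSONObject: frame) else { return }
        connection.send(content: data, completion: .contentProcessed { _ in })
    }
}
```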
We’ll keep following Strassburger as he progresses with his efforts and report back with any significant updates.