Meta AI researchers are getting closer to delivering realistic avatar legs without extra tracking hardware.
Out of the box, current VR systems only track the position of your head and hands. The position of your elbows, torso, and legs can be estimated using a class of algorithms called inverse kinematics (IK), but this is only sometimes accurate for elbows and rarely correct for legs. There are simply too many potential body poses for any given set of head and hand positions.
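To see why IK is so under-determined, consider a toy example. The sketch below is purely illustrative (hypothetical code, not from any of the papers discussed here): it solves IK for a simple two-link planar arm, and even with just two joints and a single tracked point there are already two valid elbow poses. A full body constrained only at the head and hands has far more joints than constraints.

```python
import numpy as np

# Hypothetical illustration only: IK for a two-link planar arm.
# One target position for the "hand" already yields two valid joint solutions
# (elbow-down and elbow-up). A full body tracked only at head and hands has
# many more joints than constraints, so the ambiguity is far worse.
def two_link_ik(x, y, l1=0.3, l2=0.3):
    """Return both (shoulder, elbow) angle pairs, in radians, that reach (x, y)."""
    cos_elbow = (x**2 + y**2 - l1**2 - l2**2) / (2 * l1 * l2)
    cos_elbow = np.clip(cos_elbow, -1.0, 1.0)  # guard against rounding just outside [-1, 1]
    solutions = []
    for elbow in (np.arccos(cos_elbow), -np.arccos(cos_elbow)):
        shoulder = np.arctan2(y, x) - np.arctan2(l2 * np.sin(elbow), l1 + l2 * np.cos(elbow))
        solutions.append((shoulder, elbow))
    return solutions

print(two_link_ik(0.4, 0.2))  # two distinct arm poses for the same hand position
```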
Given the limitations of IK, some VR apps today show only your hands, and many give you only an upper body. PC headsets using SteamVR tracking support extra wearable trackers such as HTC's Vive Tracker, but buying enough of them for body tracking costs hundreds of dollars, so most games don't support it.
In September, Meta AI researchers showed off QuestSim, a neural network trained with reinforcement learning that estimates a plausible full-body pose using just the tracking data from Quest 2 and its controllers. But QuestSim's latency was 160ms, more than 11 frames at 72Hz (each frame lasts roughly 14ms at that refresh rate). That makes it really only suitable for rendering other people's avatar bodies, not your own when you look down. The paper also didn't mention the system's runtime performance or what GPU it ran on.
In a new paper titled Avatars Grow Legs (AGRoL), other Meta AI researchers, along with intern Yuming Du, demonstrated an approach they claim "achieves state-of-the-art performance" with lower computational requirements than previous AI approaches. AGRoL is a diffusion model, the same class of model behind recent AI image generation systems such as Stable Diffusion and OpenAI's DALL·E 2.
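In broad strokes, a diffusion model generates output by starting from random noise and repeatedly denoising it, here conditioned on the sparse head-and-hand tracking signal, until a plausible full-body pose emerges. The sketch below is a deliberately simplified illustration of that sampling loop with a placeholder standing in for the trained network; it is not AGRoL's architecture or code, and the joint count, step count, and update rule are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
NUM_JOINTS = 22        # made-up full-body joint count, for illustration only
POSE_DIM = NUM_JOINTS * 3
STEPS = 50             # illustrative number of denoising steps

def denoiser(noisy_pose, tracking_signal, t):
    # Placeholder for the trained network. A real model would predict the clean
    # pose (or the added noise) from the noisy pose, the timestep t, and the
    # head/hand conditioning signal.
    return 0.9 * noisy_pose

def sample_full_body_pose(tracking_signal):
    pose = rng.standard_normal(POSE_DIM)                # start from pure noise
    for t in reversed(range(STEPS)):
        predicted = denoiser(pose, tracking_signal, t)  # guess the clean pose
        noise = rng.standard_normal(POSE_DIM) if t > 0 else 0.0
        pose = predicted + 0.1 * noise                  # step toward it, re-injecting a little noise
    return pose

head_and_hands = rng.standard_normal(3 * 7)  # e.g. position + rotation for 3 tracked points
full_body_pose = sample_full_body_pose(head_and_hands)
print(full_body_pose.shape)                  # (66,): one denoised full-body pose
```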
Unlike most diffusion models, and unlike the systems in most AI research papers, AGRoL "can run in real-time", the researchers say, reaching around 41 FPS on an NVIDIA V100. That's a $15,000 GPU, but machine learning algorithms often start out requiring that class of hardware and end up running on smartphones after a few years of optimization advances. That was the case for the speech recognition and synthesis models used in Google Assistant and Siri, for example.
Still, there's no indication that body pose estimation of AGRoL's quality will arrive in Meta Quest products any time soon. Meta did announce that its avatars will get legs this year, but that feature will probably be powered by a much less technically advanced algorithm, and the legs will only appear on other people's avatars, not your own.