Boxes, little boxes, ticky tacky boxes
So, what’s a minimum viable holographic camera? Perhaps we can build a hand-waving intuition: grab an envelope, flip it over, and let’s start calculating. First, take a closer look at that water-reflecting image from earlier. I’ve grabbed a section of the hull that’s about the same size as today’s BBOCs: ~50 cm. The sunlight dapple-pattern will show us how much spatial resolution we’ll need:

The first image is as-shot, at 720 samples across. Each pixel represents a location where we would have to place a camera — and yes, that’s a silly number of cameras, but the result is high enough resolution that we can consider it ‘ground truth’: our eyes can’t perceive spatial differences finer than our pupil width (≥~2 mm). Since the first dapple-pattern image has about half-a-millimeter resolution, it samples comfortably above the Nyquist rate for anything we could perceive. So far, so good. That’s almost 400,000 cameras, though. Somewhat impractical.
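If you want to follow the envelope math, here’s a quick sketch. The 720-sample width and ~50 cm hull section come from the text above; the 540-pixel height is my assumption to get to the roughly 400,000 ‘cameras’ quoted:

```python
# Back-of-envelope check on the as-shot image.
hull_width_mm = 500        # ~50 cm section of hull
samples_across = 720       # as-shot horizontal resolution
pitch_mm = hull_width_mm / samples_across
print(f"sample pitch: {pitch_mm:.2f} mm")          # ~0.69 mm per 'camera'

# Nyquist: to capture structure at the ~2 mm pupil-width limit,
# we need a sample at least every 1 mm. 0.69 mm clears that bar.
pupil_mm = 2.0
print("finer than Nyquist:", pitch_mm <= pupil_mm / 2)   # True

samples_down = 540         # assumed vertical resolution (4:3-ish crop)
print(f"total cameras: {samples_across * samples_down:,}")  # 388,800
```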

Each resampled image is downsampled by a linear factor of four from the previous one (a 16x reduction in total pixels, or ‘cameras’). Rather than averaging, I just grabbed every fourth pixel; this simulates the aperture function, which throws away most of the light. In the first resampled image, our camera array is down to a merely-stupid 25,000 cameras. And the structure looks… basically identical. So far, so good: we’re not losing much by tossing out those extra 375,000 cameras, or roughly 94% of the light.
Next I threw away 94% of the remaining cameras, so we’ve got about 1,500 in our array now, one every 8 mm or so (about the largest pupil diameter). And visually, it looks like most of the glints would be captured by at least one camera in this array, so we could reconstruct the proper, sparkly reflectance. And with only one hundred times the size of our current BBOCs!
After another 4x linear reduction, we’re pushing about 100 cameras now, and though the image is significantly fuzzier, the overall structure is still pretty visible. Will it look good enough in a headset? I have no idea; we’re literally looking at a boat, people. But it’s possible a 100 camera array would fall into the ‘good enough for us to be fooled’ category.
Finally, at an additional 4x reduction, there’s not much detail left. This, by the way, is the current state of the BBOC: this represents about four cameras to capture the spatial structure of the scene. (In a ~16 camera array, you don’t have more than about 4 cameras looking at any one object point.) You would not see real glitter or sparkle in this scenario.
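The whole cascade above can be sketched in a few lines. The starting dimensions are my assumption (720 across is from the text; 540 down gets us to the ~400,000-camera starting point), and each step keeps every fourth sample in each dimension:

```python
# Sketch of the downsampling cascade: each step keeps every fourth
# pixel in each dimension, a 16x reduction in 'cameras' per step.
cols, rows = 720, 540   # assumed as-shot dimensions (~400,000 samples)
for step in range(5):
    print(f"step {step}: {cols * rows:>7,} cameras ({cols} x {rows})")
    cols //= 4
    rows //= 4
```

This prints counts of 388,800 → 24,300 → 1,485 → 88 → 4 — which lines up, give or take rounding, with the ~400,000 / 25,000 / 1,500 / ~100 / ~4 figures walked through above.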
So, quick caveats: first, we need to see results in an actual headset (or at least a calibrated boat?). Second, this is only one example; glass, smoke, reflections off of cars, and calmer or rougher water will all behave differently. So don’t take this test too seriously, but what I hope we’re seeing with this exercise is that we’re only one or two orders of magnitude away from a more ‘holographic,’ light field-like approach. Yes, one or two orders of magnitude is a lot, but it’s not ‘infinite’; is there a Moore’s law for BBOCs?
Focusing to infinity… and beyond
We’ve focused on virtual reality in this article, but light fields are useful for a lot more. Holographic capture lets you create multi-perspective content for AR and future holographic displays, and that future isn’t far off — our helpful holographer Linda Law can rattle off a bunch of companies working on the problem: Zebra, Holografika, Leia, Ostendo, and several others in stealth mode. (And no, that Tupac thing is not actually a hologram.) All these systems are currently limited to computer-generated imagery, until we get our holographic camera.
And of course there’s AR: HoloLens, Magic Leap, others. And yeah, if Magic Leap is going to play anything on their mythical glasses besides computer-generated robots and whales, we’re going to need holographic cameras. (They’ve also been hiring holographers, for the record, so maybe they’re making cameras, too. I wouldn’t put it past them; I wouldn’t put anything past them.)
Your punchline, sir
So what have we learned? Well, all these displays need content. And we need light field capture — holographic cameras — to deliver live action content to these displays properly. This is a very big deal, because the success or failure of these future platforms hinges on content: until they have good, live action content, these platforms are akin to an iPhone without an app store, or a TV with nothing but cartoons (okay, that’s not even an analogy, that’s just what they are).
But if you can bring the real world into the virtual world, then VR/AR/holographic displays are a ton more relevant. They’re not just toys anymore, or specialist tools; they can grow to be major platforms, the future of mass media… and suddenly all those overblown predictions about VR/AR eating the world start to sound more reasonable.
So those are the stakes. That’s why we need holographic, or ‘light field’ cameras. And perhaps that’s why people are so eager to brand their cameras ‘light field,’ even when they’re clearly not — because deep down everyone really, really wants holographic cameras.
Because we all want to live in the future, right?