At Facebook Reality Labs, we are building a future where real and virtual worlds can mix freely, making our daily lives easier, more productive, and better connected. Power consumption is one of the challenges in getting to that future. In order to get to augmented reality and virtual reality (AR/VR) devices that can be worn comfortably for however long you want, up to and including all day, VR headsets must become more power-efficient, and AR glasses have to consume much less power still. As part of building AR/VR systems that get to that next level, we’re developing graphics systems that dramatically decrease power consumption without compromising image quality.
DeepFovea is one of several neural network-based approaches that we’ve developed to meet this challenge. It is a rendering system that applies the recently invented AI concept of generative adversarial networks (GANs) to mimic the way our peripheral vision perceives the world in everyday life, leveraging that perceptually matched architecture to deliver unprecedented graphics efficiency. DeepFovea’s neural rendering goes well beyond the traditional foveated rendering systems used in Oculus products today, enabling the generation of images that are perceptually indistinguishable from full-resolution images yet require fewer than 10 percent as many pixels to be rendered. Existing foveated rendering methods still require rendering about half as many pixels as the full-resolution image, so DeepFovea’s order-of-magnitude reduction in rendering requirements represents a new milestone in perceptual rendering.
We first presented the DeepFovea method at SIGGRAPH Asia in November 2019. Today, we are publishing the complete demo to our DeepFovea repository to help the graphics research community deepen its exploration into state-of-the-art perceptual rendering.
The key to DeepFovea is the physiology of the human eye. When the eye looks directly at an object, the photons from that object land in what is known as the foveal region of the retina, or fovea for short. The fovea is the only part of the retina with high resolution, and it is a very small fraction of the overall retina. The highest-resolution region is only 3 degrees across, out of the more than 150-degree field of view of the human eye, and resolution drops by an order of magnitude within 10 degrees from the center of the fovea. We feel like we have high-resolution vision for a much wider field of view, but that is because our brain maintains a model of our surroundings and fills in missing details, while at the same time moving the fovea to any object of interest so quickly we’re not aware that we couldn’t see that object clearly a split second earlier. In truth, we have a tiny area of high-resolution vision — only the size of your thumbnail held at arm’s length — with only very blurry perception of everything that surrounds it.
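To make that falloff concrete, here is a minimal sketch of the commonly used hyperbolic acuity model. It is not taken from DeepFovea; the constant e0 is an illustrative value chosen so that acuity drops by roughly an order of magnitude at 10 degrees of eccentricity, matching the figure above.

```python
import numpy as np

def relative_acuity(eccentricity_deg, e0=1.1):
    """Illustrative hyperbolic acuity falloff: 1.0 at the center of the fovea,
    decaying with angular distance. e0 is a hypothetical constant chosen so
    acuity is ~1/10 at 10 degrees; it is not a DeepFovea parameter."""
    return e0 / (e0 + np.abs(eccentricity_deg))

for ecc in [0, 3, 10, 30, 75]:
    print(f"{ecc:3d} deg -> relative acuity {relative_acuity(ecc):.3f}")
# 0 deg -> 1.000, 3 deg -> 0.268, 10 deg -> 0.099, 30 deg -> 0.035, 75 deg -> 0.014
```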
That’s not to say that peripheral vision isn’t important. It is important for balance, motion detection, and ambient awareness, and cues the brain where to look next. But its ability to distinguish detail is sharply limited.
DeepFovea generates images whose detail matches the resolution of the retina, using the minimum amount of rendered data needed to do so. Given an image that is rendered sparsely, with a variable resolution that matches the resolution of the retina at each point based on where the fovea is pointed at any given moment, DeepFovea infers the missing data. Crucially, it does so in a way that, given the resolution and image-processing characteristics of the retina, produces results that are perceptually indistinguishable from a full-resolution image. That doesn’t mean the results are identical — in fact, they’re generally not even close when you look at them with the fovea, as you can see in the example below — but the lower-resolution processing of peripheral vision can’t perceive the difference.
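As an illustration of what "rendered sparsely" could mean in practice, here is a small, hypothetical sketch (not the DeepFovea implementation) of a gaze-contingent sampling mask whose density follows the same kind of eccentricity falloff described above. A renderer would shade only the pixels where the mask is 1, and the network would infer the rest; all constants here are assumptions for illustration.

```python
import numpy as np

def sampling_density(h, w, gaze_xy, px_per_degree=10.0, e0=1.1, min_density=0.01):
    """Per-pixel probability of being rendered this frame.
    Density is 1.0 at the gaze point and falls off hyperbolically with
    angular eccentricity. All constants are illustrative assumptions."""
    ys, xs = np.mgrid[0:h, 0:w]
    ecc_deg = np.hypot(xs - gaze_xy[0], ys - gaze_xy[1]) / px_per_degree
    return np.clip(e0 / (e0 + ecc_deg), min_density, 1.0)

def sparse_mask(h, w, gaze_xy, seed=0):
    """Stochastic binary mask: 1 where the renderer shades a pixel."""
    rng = np.random.default_rng(seed)
    density = sampling_density(h, w, gaze_xy)
    return (rng.random((h, w)) < density).astype(np.float32)

# A 600x400 image covering roughly 60x40 degrees, with the gaze at the center:
mask = sparse_mask(400, 600, gaze_xy=(300, 200))
print(f"fraction of pixels shaded: {mask.mean():.1%}")  # well under 10% here
```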
DeepFovea infers the missing peripheral information by using GANs. We train DeepFovea’s neural network by feeding it millions of real videos whose peripheral quality has been artificially degraded. These degraded videos simulate the sparse input the renderer produces in the periphery, and the GAN-based design helps the network learn how to fill in the missing details based on the statistics of all the videos it has seen.
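The sketch below shows the adversarial training idea in a heavily simplified, per-frame form, assuming hypothetical Generator and Discriminator modules. The actual DeepFovea network is a recurrent video-to-video model, and its full training objective involves more than this single adversarial-plus-reconstruction loss; see the paper for the real design.

```python
import torch
import torch.nn.functional as F

def gan_train_step(G, D, opt_G, opt_D, full_frame, sparse_frame, mask):
    """One toy GAN update. G reconstructs a dense frame from sparse samples;
    D tries to tell reconstructions apart from real frames.
    G, D, and the optimizers are hypothetical placeholder modules;
    mask is a (N, 1, H, W) binary tensor of shaded pixels."""
    gen_input = torch.cat([sparse_frame, mask], dim=1)  # mask appended as an extra channel

    # --- Discriminator step: real frames vs. detached reconstructions ---
    with torch.no_grad():
        fake = G(gen_input)
    logits_real, logits_fake = D(full_frame), D(fake)
    d_loss = (F.binary_cross_entropy_with_logits(logits_real, torch.ones_like(logits_real))
              + F.binary_cross_entropy_with_logits(logits_fake, torch.zeros_like(logits_fake)))
    opt_D.zero_grad(); d_loss.backward(); opt_D.step()

    # --- Generator step: fool D while preserving the samples it was actually given ---
    fake = G(gen_input)
    logits_fake = D(fake)
    adv_loss = F.binary_cross_entropy_with_logits(logits_fake, torch.ones_like(logits_fake))
    recon_loss = F.l1_loss(fake * mask, full_frame * mask)
    g_loss = adv_loss + 10.0 * recon_loss  # weighting is an arbitrary illustrative choice
    opt_G.zero_grad(); g_loss.backward(); opt_G.step()
    return d_loss.item(), g_loss.item()
```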
As a result, a renderer can render an order of magnitude fewer pixels, or even fewer than that — pixel density can decrease by as much as 99 percent along the periphery of a 60x40-degree field of view — substantially reducing power consumption. Yet thanks to DeepFovea, together with eye tracking, the viewer will perceive exactly the same scene at exactly the same quality.
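A back-of-envelope calculation shows how a 99 percent density reduction in the periphery translates into an overall pixel budget well under 10 percent. The two-tier split below is purely illustrative; DeepFovea's actual density falloff is smooth rather than a hard fovea/periphery boundary.

```python
import math

# Two-tier approximation of a 60x40-degree view: a small full-resolution
# foveal disc plus a periphery rendered at 1% density (a 99% reduction).
fov_area = 60 * 40                       # square degrees
fovea_area = math.pi * 5 ** 2            # ~5-degree-radius full-density region (illustrative)
periphery_area = fov_area - fovea_area

rendered_fraction = (fovea_area * 1.00 + periphery_area * 0.01) / fov_area
print(f"pixels rendered vs. full resolution: {rendered_fraction:.1%}")  # about 4%
```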
DeepFovea also ensures that flicker, aliasing, and other video artifacts in the periphery remain undetected by the human eye.
Our ultimate goal is to make real-time foveated rendering run on lightweight, highly power-efficient AR/VR devices that can be worn all day. DeepFovea marks an important step toward that goal, setting a new standard for the efficiency of perceptual rendering by demonstrating no perceived quality loss while rendering fewer than 10 percent as many pixels as conventional renderers. The approach is also hardware agnostic, making DeepFovea compatible with a wide range of AR/VR research systems.
While DeepFovea unlocks an important method for efficient rendering in AR and VR, this is just the beginning of the exploration of ultra-low-power perceptual rendering. By releasing the DeepFovea demo in addition to our research paper, we hope to provide a useful framework for researchers in graphics and vision science who are interested in contributing to the advancement of perceptual and neural rendering technologies.