Announcing Project Aria: a research project on the future of wearable AR
September 16, 2020
Update (September 13, 2021): As we announced at Facebook Connect last year, the goal of Project Aria is to help Facebook Reality Labs researchers build the software and hardware necessary for future AR glasses by capturing data from a first-person perspective in the types of situations in which people will eventually use them. While Project Aria has been used in public spaces in the US for the past year, we want to ensure our eventual products work equally well in different cultural contexts. As such, we need to start collecting a wider variety of data and training our algorithms on it. Singapore is home to a wide range of cultures, ethnicities, and religions, and also serves as our hub for teams in the region. Starting in September, some Facebook employees and contractors will capture data with the Project Aria device in a handful of public landmarks in Singapore. While this is the first time we will be using the device in public places outside of the US, we are excited to bring this research to other regions in the coming months.
As part of our continuing series of updates from Facebook Reality Labs Research, we’ve shared how our researchers are designing lifelike avatars for VR and showcased our latest work in brain-computer interface and spatial audio and enhanced hearing. Today, we’re excited to unveil Project Aria — a new research project that will help us build the first generation of wearable augmented reality devices.
A spectacular challenge
Imagine a pair of glasses that add a 3D layer of useful, contextually relevant, and meaningful information on top of the physical world. Such a device could help us perform everyday tasks better, like finding our keys, navigating a new city, or capturing a moment, but it could also open up an entirely new way of moving through the world. Smartphones are amazing devices, and they’re getting better all the time.
But at Facebook Reality Labs, we’re envisioning a time when we have all the benefits of connectivity (and more), without the need to keep our heads and our eyes down, looking at a device. Imagine calling a friend and chatting with their lifelike avatar across the table. Imagine a digital assistant smart enough to detect road hazards, offer up stats during a business meeting, or even help you hear better in a noisy environment. This is a world where the device itself disappears entirely into the ebb and flow of everyday life.
Of course, a lot of this is still the domain of science fiction. To actually build glasses flexible enough to work for most face shapes and sizes, and create the software to support them, we still need several generations of breakthroughs, like systems to enhance audio and visual input, contextualized AI, and a lightweight frame to house it all. This kind of AR requires a foundational shift in computing technology that mirrors the leap from libraries and landlines to personal computers and smartphones.
A visionary research tool
Project Aria is a step in that direction. We’ve built a research device that will help us understand how to build the software and hardware necessary for AR glasses. The Project Aria glasses are not a consumer product, nor are they a prototype. They won’t display any information on the inside of the lens, and research participants cannot view or listen to the raw data captured by the device. As a research device, the glasses will use sensors to capture video and audio from the wearer’s point of view, as well as eye movement and location data to help our engineers and programmers figure out how AR can work in practice. The glasses will encrypt, compress, and store data until it’s uploaded to a separate, designated back-end storage space.
From its conception to its implementation, Project Aria was designed as a way to help us innovate safely and responsibly. To help us develop the safeguards, policies, and even social norms necessary to govern the use of AR glasses and future wearable devices, we’re gathering feedback both from people wearing the device and from people who encounter other people wearing the device in public. And to better understand how this technology can benefit people with varying physical abilities, we're starting a pilot program with Carnegie Mellon University’s Cognitive Assistance Laboratory to build 3D maps of museums and airports that will have multiple applications, including helping people with visual impairments better navigate their surroundings.
Again, we are not releasing this device to the general public and it won’t be for sale. Starting in September, it will be made available to a limited group of Facebook employees and contractors in the United States, trained in both where and when to use the device, and where and when not to. As participants wear these devices while they go about their day, at home, on Facebook campuses (once they reopen), and in public, the data they gather will support the development of head-tracking, eye-tracking, and audio algorithms that will one day make the dream of AR glasses real.
Project Aria will also provide the first-person perspective data necessary to build LiveMaps — the virtual 3D map needed to fulfill the potential of AR — and train a personalized assistant that can one day help people sense and understand the world. Because this initial dataset has to be relevant for as many people as possible, we’ll be asking people of diverse genders, heights, body types, ethnicities, and abilities to participate in the research program.
Privacy at the outset
Before getting into some of the technical details on how the research device will work, it’s important to clarify the steps we’re taking to protect people’s privacy. We’ve written some FAQs on this topic, which can be found at fb.com/projectaria, but here are the highlights:
When and where data gathering will take place
The initial batch of research participants will be limited to about one hundred Facebook employees and contractors, primarily located in the San Francisco Bay Area and Seattle.
All participants will first undergo training to learn where and when they should collect data. And they won’t be able to directly view or listen to the raw data captured by the device.
All devices will display a prominent white light that indicates when data is being collected.
All devices will have a physical mute button that stops data collection when pressed.
Project Aria research participants will record in Facebook offices, wearers’ private homes, and public spaces. They won’t record in venues such as stores or restaurants without written consent from such venues.
Recording is never permitted in sensitive areas like restrooms, prayer rooms, locker rooms, or in sensitive meetings and other private situations — and it is only allowed in the homes of wearers with consent from all members of the household.
When wearing the research device in public, each participant will be required to wear a lanyard and clothing that (a) identifies them as participants in the Project Aria research study and (b) includes a website where people can learn more about the project and its research goals. Over time, we’ll continue to evaluate these external indicators to ensure we’re being transparent.
At any time, participants can turn off the device, stop recording, or delete a segment of data. Participants can do so by selecting the timestamp on a companion app installed on their smartphones.
The companion app will also include instructions that reiterate the initial training lessons, e.g., how to delete recordings, where and when it’s appropriate to capture data, and other technical information on the device itself.
Data privacy protections
As with any mapping data, security of recorded data is paramount. Facebook will keep Project Aria data secure by using encryption and a secure ingestion system to upload the data from the device to a separate, designated storage space.
Facebook will securely store data and the maps that are derived from that data in this separate, designated storage space, and the data will be accessible only to researchers with approved access.
Once uploaded, captured data is kept under quarantine for three days, meaning it is not made available to researchers. During the quarantine period, participants can again choose to delete segments of captured data from the system by looking at low-resolution thumbnails.
Before any data gathered in a public place is made available to our researchers, it will be automatically scrubbed to blur faces and vehicle license plates.
The research glasses do not use facial recognition identification technology, and we don’t use this data to inform the ads people see across Facebook products.
These privacy and data protection controls were developed with the input of privacy experts, including Eli Dourado, Senior Research Fellow, Center for Growth and Opportunity, and Daniel Castro, Director of the Center for Data Innovation, Information Technology and Innovation Foundation.
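To make the quarantine and scrubbing rules above concrete, here is a minimal sketch of how such a gate could be expressed. It is an illustration only, not a description of Facebook's actual system: the `Recording` type, the `scrubbed` flag, and the `available_to_researchers` function are hypothetical names introduced for this example. Only the three-day window and the blur-before-release rule come from the text.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Three-day quarantine window, as stated in the post.
QUARANTINE = timedelta(days=3)


@dataclass
class Recording:
    """Hypothetical record for one uploaded capture."""
    uploaded_at: datetime
    public_place: bool       # was the data gathered in a public place?
    scrubbed: bool = False   # assumed flag: faces and license plates blurred


def in_quarantine(rec: Recording, now: datetime) -> bool:
    """True while the recording is still inside the quarantine window."""
    return now < rec.uploaded_at + QUARANTINE


def available_to_researchers(rec: Recording, now: datetime) -> bool:
    """Apply both rules: quarantine first, then scrubbing for public data."""
    if in_quarantine(rec, now):
        return False
    if rec.public_place and not rec.scrubbed:
        return False
    return True
```

In this sketch, a public recording uploaded on day 0 stays unavailable on day 4 until its `scrubbed` flag is set, and any recording stays unavailable during the first three days regardless of that flag.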
As we continue to explore the possibilities of AR, we’ll continue to engage with privacy experts and refine our policies. The more we learn about this technology, the better we can anticipate issues and address them ahead of time. Any data we collect today is done in service of finding a way to do the least intrusive data capture with our hardware tomorrow, and to make sure whatever data we do collect provides as much user value as possible.
The biggest initial challenge we faced was finding a way to take all of the existing sensors necessary to make AR work, and fitting them on a pair of glasses that participants could actually wear. The Project Aria glasses are unique in that they not only sport the full sensor suite used in VR headsets for spatial awareness, they also compute location from GPS, take high-res pictures, and capture multichannel audio and eye images.
As a research device, the current sensor configuration is not set in stone. We’re particularly interested in whether we can build products that use a collection of sensors for multiple tasks in a power-efficient manner. Ideally, such a product would be able to use all of its sensors in tandem, to better understand your intent and offer information only when it’s useful. So, if you’re at a grocery store, the front-facing cameras might scan the room and identify the store’s contents, while the eye-tracking cameras recognize your gaze has fallen on a fruit, and the display function pops up with an identification, its price, its nutritional value, potential recipes, and so on.
This is just one example, but it’s easy to imagine how useful something like this could be for people with mental or physical disabilities, people with memory problems, people facing language barriers, skill barriers, or simply trying to navigate a new city — anyone for whom a little context and information delivered at the right time would make a world of difference.
We’re also exploring how the device’s head-tracking sensors can be used to add virtual objects to physical environments. So instead of owning a physical TV, you could conjure one up, place it virtually anywhere you’d like, and the image would stay fixed, even as you moved around. Longer term, to lower the overall power and thermal demands of future glasses and reduce the need for image sensing, we’ll use data from Project Aria to see whether we can leverage techniques such as Tight Learned Inertial Odometry.
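The virtual TV example comes down to keeping an object at a fixed position in the world and recomputing where it sits relative to the wearer's head each time the head moves. The following is a simplified sketch of that idea in a top-down 2D plane with a yaw-only head pose. The function name `world_to_head` and the frame conventions (head +x is forward, +y is to the wearer's left) are assumptions made for this illustration, and a real system would track full 3D position and orientation.

```python
import math


def world_to_head(point, head_pos, head_yaw):
    """Express a world-frame (x, y) point in the wearer's head frame.

    head_yaw is the heading in radians, measured counterclockwise from
    the world +x axis. In the head frame, +x is forward and +y is left.
    """
    # Translate so the head is at the origin.
    dx = point[0] - head_pos[0]
    dy = point[1] - head_pos[1]
    # Rotate by -head_yaw to undo the head's rotation.
    c, s = math.cos(-head_yaw), math.sin(-head_yaw)
    return (c * dx - s * dy, s * dx + c * dy)


# The TV's world position never changes; only the head pose does.
tv_world = (3.0, 0.0)
print(world_to_head(tv_world, head_pos=(0.0, 0.0), head_yaw=0.0))
print(world_to_head(tv_world, head_pos=(1.0, 0.0), head_yaw=math.pi / 2))
```

Because the anchor is stored in world coordinates, the rendered image can stay fixed in the room while the head-frame coordinates change with every movement.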
Building a new kind of map
All that is technically challenging, but it gets even more interesting when you add a virtual object and want it to stay fixed in place even after you walk away. That’s because we need a framework for storing and indexing your virtual objects. And if you want to share virtual objects with other people, that requires a common coordinate system. And then if you want to be able to share space with an avatar of a friend who lives in another state, not only are precise location and orientation needed, but we also need to understand the real world objects in the space around you, so the avatar doesn't end up cut in half by a table, and can be placed in natural poses such as sitting in a chair.
And in order for AR glasses to know where they are in space and efficiently see and understand your surroundings, they need a virtual, 3D map of things around you. But it’s far too power-intensive to scan and reconstruct a space in real time from scratch, so AR glasses will need to rely on an existing map that they can then update with changes in real time.
At Oculus Connect 6, we shared our solution to these problems: LiveMaps. By using computer vision to construct a virtual, 3D representation of the parts of the world that are relevant to you, LiveMaps will let future glasses effectively mix the virtual world and real world. AR glasses will download the most recent data from the 3D map, and then only have to detect changes, like new street names or the appearance of a new parking garage, and update the 3D map with those changes. The Project Aria device is testing how this can work in practice.
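The download-then-detect-changes pattern described above can be illustrated with a toy example. This is a sketch under strong simplifying assumptions, not how LiveMaps is implemented: the map is reduced to a dictionary from a hypothetical landmark ID to a label, and the function names `detect_changes` and `apply_changes` are invented for this example. It also only covers new or changed entries in what the device currently observes, not features that have disappeared.

```python
def detect_changes(cached_map, observed):
    """Return only the observed entries that differ from the cached map."""
    changes = {}
    for key, value in observed.items():
        if key not in cached_map or cached_map[key] != value:
            changes[key] = value
    return changes


def apply_changes(cached_map, changes):
    """Return an updated copy of the map; the cached input is not modified."""
    updated = dict(cached_map)
    updated.update(changes)
    return updated


# Hypothetical cached map downloaded earlier, and what the device sees now.
cached = {"corner_1": "Main St", "lot_7": "empty lot"}
observed = {"corner_1": "Main St", "lot_7": "parking garage"}

changes = detect_changes(cached, observed)
latest = apply_changes(cached, changes)
```

The point of the design is that the device only has to process and transmit the small `changes` set rather than reconstruct the whole space from scratch.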
Looking toward the future
Transforming our vision of AR devices and LiveMaps into reality is one of our most ambitious and far-reaching goals at FRL Research. We envision a world in which LiveMaps enables our AR products to help people connect with the world and one another in all kinds of meaningful ways.
People from all walks of life will be able to use AR in different and deeply significant ways: getting intuitive navigation support if they have a disability, sharing space with a lifelike avatar of a sister or parent who lives in another country, receiving better worker training, and hearing better in noisy environments, particularly as they get older and their hearing fades.
This is a world with fewer devices and more time spent enjoying moments with family and friends, where opportunity is defined by passions, not geography or circumstance. It’s a world in which advanced social presence enables people to work together in ways that aren’t possible today, leading to fresh ideas and new endeavors in entertainment, education, and beyond.
This is a world we’re looking forward to.
Watch the Facebook Connect keynote to see how our various platforms are adapting to these unprecedented times with faster, more intuitive tools for developers, creators, and businesses of all sizes.