When thinking about the future of augmented reality, one useful framework is special effects in movies. What if those effects were available to everyone, all the time?
In 1993, Jurassic Park wowed the world with its incredible CGI (computer-generated imagery) of dinosaurs. That movie, directed by Steven Spielberg, cost $63 million to make, which is about $114 million in 2021 dollars. To achieve the effects, the production hired the world’s leading 3D animators and computer programming experts to invent many brand-new technologies. Each shot had to be carefully choreographed to give the illusion of the dinosaurs and the live actors appearing in the same space, matching the exact placement, lighting, and shadows. When working on an effects shot, the artists and editors could only see very rough versions frame by frame. To see a finished sequence, even just a few seconds of footage, they would have to wait for it to render (export), which could take hours or even days, just to play it back. Then they would go back, tweak it, and render it out again and again.
Today, almost 30 years later, there are multiple free apps available for your phone that can drop realistic 3D dinosaurs into your field of vision on the fly. The picture is rendered in real time, and as you move your phone, you can view the dinosaurs from any angle you want. (Needless to say, you can only watch Jurassic Park from one angle: the one it was shot from.) Movie magic is readily available to you as soon as you choose to open an app and hold your phone in front of your face. With AR, the screen will always be in front of your face.
There are four main ways that computer technology progresses that continue to enable the amazing innovations we see year after year. Things keep getting smaller, cheaper, faster, and more automated.
Let’s compare the production of Jurassic Park with the dinosaur app on your phone. What used to require a big room full of dedicated machines now fits in your pocket. (smaller) Instead of costing tens or hundreds of millions of dollars, a phone costs hundreds of dollars, and the software on it often costs zero or close to zero. Even $5 for an app is considered expensive! (cheaper) The speed is no longer measured in hours or days; it renders in real time. (faster) And rather than needing to manually and methodically plan out each shot, each angle, the lighting and shadows, and so on, that’s all handled by the software. You don’t need to think about it or “do” anything to get a great result. (automated)
Over the span of 30 years, something that was only available to Steven Spielberg is now free for any chump with an iPhone. Broadening from there, we can imagine that everything available in Hollywood today will eventually become accessible to all of us. We should look at their trends as indicators of what’s to come for everyone, and especially how it’s all contributing to an AR future.
There are too many technologies at play to cover them exhaustively, but let’s look at a few big trends in the entertainment industry that are likely to apply to AR. One is the act of replacing practical (physical) effects with software effects. Think about a big explosion in an action movie. It used to be that the only way to achieve that sort of effect was by actually blowing something up! Today, you have the option of hiring a team of pyrotechnic experts and spending a lot of money, or using software to generate the appearance of an explosion. The same goes for weather: a lot of the rain and snow we see in movies and TV today is added with software, and so is the visible breath when actors exhale in cold weather. Try to notice the next time you’re watching a scene that is supposed to take place in cold weather; it’s likely CGI.
The puppets and animatronics once used for monsters and aliens are now being replaced with software characters. Even for human actors, rather than using stunt doubles, many shots are now 100% software generated. A superhero flying around? That is likely not a person hanging on strings anymore. Software is even being used to make actors’ faces look younger or older, rather than making them suffer through hours of makeup.
It would not make sense for most of us to have makeup teams, stunt doubles, puppeteers, or pyrotechnic advisors, but now that these are in the realm of software, it’s only a matter of time before things get small, cheap, fast, and automated enough to give us the ability to do all of the above.
The technology that turns actors into fully digital characters is worth lingering on for a moment. You might not realize it yet, but altering the appearance of faces is a fundamental building block of an AR future. Using movie tech as a guide, we can look to the ongoing evolution of facial recognition and motion tracking.
In 2001, the movie The Lord of the Rings: The Fellowship of the Ring featured a computer-generated character named Gollum, played by human actor Andy Serkis. That was one of the first times that motion capture acting really entered the mainstream conversation. If you watch behind-the-scenes footage of movies like that one, you will find actors with little sensors all over their bodies and/or dots of light projected on their faces.
The sensors help to track movement. Imagine one on your shoulder, one on your elbow, and one at your wrist. If the order top to bottom is shoulder, elbow, wrist, then your arm must be down. But if the order top to bottom is wrist, elbow, shoulder, then your arm must be up above your head. The computer software doesn’t know about your arm per se, but it can track the sensors, and then animation software can use those cues to create digital characters that match your real movements.
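The logic above is simple enough to sketch in a few lines of code. This is a toy illustration of the ordering idea, not any real motion-capture system’s API; the function name and thresholds are made up for the example.

```python
# Toy sketch: infer arm pose from the vertical order of three hypothetical
# motion-capture sensors. Larger y means higher off the ground.

def arm_pose(shoulder_y, elbow_y, wrist_y):
    """Classify an arm pose from the top-to-bottom order of the sensors."""
    if shoulder_y > elbow_y > wrist_y:
        return "arm down"          # shoulder, elbow, wrist from top to bottom
    if wrist_y > elbow_y > shoulder_y:
        return "arm raised"        # wrist, elbow, shoulder from top to bottom
    return "arm somewhere in between"

print(arm_pose(1.5, 1.2, 0.9))  # arm down
print(arm_pose(1.5, 1.8, 2.1))  # arm raised
```

Animation software does the same thing with dozens of sensors at once, turning their relative positions over time into a moving digital skeleton.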
Projected dots do something similar: they track the contours and movement of your face. Imagine tiny dots being projected at your face. On your cheeks they would appear far apart, but under your nose they’d appear very close together. The software can track the location of the dots and reconstruct your face and its movements digitally. This is how an actor controls the facial expressions of a CGI character: their face naturally moves the dots.
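The geometric intuition can be sketched with a toy model: if the projector’s rays diverge at a fixed angle, the gap between neighboring dots on a surface grows roughly in proportion to that surface’s distance, so measured dot spacing becomes a rough depth map. Every name and number below is illustrative; real structured-light systems are far more sophisticated.

```python
import math

# Assumed angle between adjacent projector rays (illustrative value).
RAY_ANGLE = math.radians(1.0)

def depth_from_spacing(spacing_mm):
    """Estimate distance to a surface patch from the measured dot gap,
    under the toy diverging-rays model above."""
    return spacing_mm / math.tan(RAY_ANGLE)

# Tighter spacing implies a nearer patch, wider spacing a farther one,
# so a row of measured gaps becomes a rough depth profile of the face.
gaps = [3.0, 2.4, 3.1]  # hypothetical gaps (mm) measured across the face
profile = [depth_from_spacing(g) for g in gaps]
```

Repeating this for thousands of dots, many times per second, is what lets software reconstruct a moving face in 3D.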
What you might not know is that your iPhone uses this exact same technology to unlock every time you use Face ID. I wrote a piece in 2017 called “Replace Your Face: iPhone X as the First Mass Market AR Device” when it was first released. This incredible movie-magic tech was suddenly available in your phone, not only for Face ID but for other things like photo filters and what I call “digital masks”: what Apple calls Animoji and Memoji, and what Snapchat uses for its filters, where you overlay an animal or humanoid face onto yours that moves and acts exactly like your real face is doing into the camera. You are Gollum in that moment.
Perhaps the most basic and current example of movie effects making their way to consumer technology is changing your Zoom background. There was a time when using green screens (they actually started as blue screens) in Hollywood was considered incredibly high tech. This technique, developed in the early 1900s, was expensive, cumbersome, and took an insane amount of painstaking work. Once computers entered the fray, green screen effects required not only the latest and most powerful hardware and software, but also domain experts from the most elite effects studios. The first time I remember getting to use something akin to the green screen effect was when Macs started shipping with the app Photo Booth in 2005, which allowed you to swap out your background just to make silly recordings for yourself. Today, kindergarteners are using sophisticated virtual backgrounds during their Zoom classes without a second thought.
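The core trick behind green screens and virtual backgrounds, chroma keying, can be sketched in a few lines: find the pixels that are predominantly green and swap in the background. This is a minimal illustration with made-up images and an arbitrary threshold; real keyers handle soft edges, color spill, and uneven lighting.

```python
# Minimal chroma-key ("green screen") sketch. Images are modeled as lists
# of rows of (r, g, b) tuples; the threshold is an illustrative value.

def chroma_key(foreground, background, threshold=50):
    """Composite foreground over background by keying out green pixels."""
    out = []
    for fg_row, bg_row in zip(foreground, background):
        row = []
        for (r, g, b), bg_px in zip(fg_row, bg_row):
            # A pixel counts as "green screen" if green dominates red and blue.
            if g - max(r, b) > threshold:
                row.append(bg_px)        # show the virtual background
            else:
                row.append((r, g, b))    # keep the real subject
        out.append(row)
    return out

fg = [[(255, 0, 0), (0, 255, 0)]]  # one subject pixel, one green-screen pixel
bg = [[(0, 0, 255), (0, 0, 255)]]  # a blue virtual background
print(chroma_key(fg, bg))          # [[(255, 0, 0), (0, 0, 255)]]
```

What took elite effects studios rooms of hardware is now, at its core, a per-pixel comparison your laptop runs live during a video call.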
Like all things in tech, we are not at the end of that progression; we are in the endless middle. We know how we got here, but where do we go next? You can try to extrapolate what will happen further along the continuum by drawing a line from purely physical backgrounds to Hollywood green screen effects, from there to consumer-level video conferencing, and if we keep extending that line, that’s how we can start to make predictions about the future.
There is a lot more to explore in Hollywood effects and their increasing overlap with video game technology, both of which are laying the bricks for ubiquitous AR.
For now, just remember that as the tech becomes cheap and small enough for us all to have access, it is also becoming faster and more automated, so that we don’t need to be experts to use it; in many cases we don’t have to learn how to use it at all. What was once novel and specialized becomes completely commonplace and second nature. We don’t even think about how Face ID or Snapchat filters work. They just do, and we intuitively know what to do with them.
If you want to predict the capabilities of AR, just watch a lot of highly produced movies and imagine that what you see on your TV will eventually be integrated into your field of vision, all day, every day.