In anticipation of Apple's WWDC and the rumors around a new AR device, this is an exploration of how a new OS for Apple smart glasses might work from a UX perspective.

Every year, Apple announces a new iPhone, and every year, you will hear the familiar critique, "It's basically the same, just with a better camera."

Damn straight! The camera is the most important part of the device, and not just because you like taking pictures of your brunch in Portrait mode. The camera is where the next battle over internet dominance will take place, and Apple knows it.

Check out the art for this year's WWDC– that circle around the Swift logo...

https://developer.apple.com/wwdc22/

...comes into focus in the individual invites.

https://twitter.com/ChristianSelig/status/1528783701143855105

Looks pretty camera-y to me! In 2017, I wrote that iPhone X was the first mass market AR device. True, but AR is not the iPhone's primary mode. Smart glasses will put the camera* front and center.

Think of how Snapchat opens right into the camera. But now imagine if every app was that way, because the screen is always in front of your eyes.

How might Apple build a UX around that?

*Btw, whenever I say "camera", I really mean the camera + a variety of sensors and computations that give the device and the user data about physical surroundings. See The future of photos isn’t cameras.

Building a camera-based OS

Each type of Apple device has its own OS and UX paradigm.

While each is quite different, iOS, tvOS, and watchOS all launch apps into full screen, where you experience them one at a time.

On macOS, you can have various apps visible at once, arranged however you'd like, but only one active in the foreground at a time.

iPadOS has Split View where multiple apps can be visible at once.

There are also other kinds of elements like notifications and widgets that allow a user to see and interact with an app's functionality outside the confines of the app itself.

Apple's AR device will have its own novel UX and its own OS, and it will all be built around the camera. While there are heavy rumors about it being called realityOS, I am calling it cameraOS here to emphasize the camera-ness.

Camera as the base layer

In the default mode, you will see the world as it really exists– this is base reality, 1:1 with the physical reality you'd see with your own eyes**. I would imagine there will be an easy and accessible way to get to this mode at any time, from anywhere, perhaps with a hardware button. This is the equivalent of hitting the lock button on your iPhone, except you still need to be able to see through it. (Note: You can also always take off your glasses.)

Apps will not take over your whole field of vision (they will in VR-type experiences, but I'm talking about AR here)– instead, UI elements from apps will live on top of that base camera layer.
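For a rough sense of what that layering means with today's tools, here's a minimal RealityKit sketch: the view draws the live camera feed as the background, and an app only contributes entities on top of it. The class name, sizes, and placement are mine, purely illustrative.

```swift
import RealityKit
import UIKit

// A minimal sketch of "UI on top of a camera base layer" using today's RealityKit.
// ARView renders the live camera feed as the background; the app only adds
// entities on top of it. Names and sizes here are illustrative.
final class BaseLayerViewController: UIViewController {
    private let arView = ARView(frame: .zero)

    override func viewDidLoad() {
        super.viewDidLoad()
        arView.frame = view.bounds
        arView.autoresizingMask = [.flexibleWidth, .flexibleHeight]
        view.addSubview(arView)

        // One piece of app UI: a small sphere fixed one meter ahead of where the
        // session starts. Everything behind it is "base reality" from the camera.
        let anchor = AnchorEntity(world: [0, 0, -1])
        anchor.addChild(ModelEntity(mesh: .generateSphere(radius: 0.05)))
        arView.scene.addAnchor(anchor)
    }
}
```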

One initial question is whether Apple will be the only provider of the camera layer or if they will allow third-party apps to provide their own camera views. In the latter scenario, when you open such an app, you would be switching to its proprietary camera.

That seems... unlikely. What we will probably see is an Apple-provided camera base layer that every other app must build on top of.

The native Apple apps will work seamlessly right out of the box, and third-party developers will be scrambling to ship camera-ready versions of their apps.

**A quick comment on optical AR (seeing the world with your eyes + digital artifacts superimposed on top) vs. passthrough AR (seeing the world through the camera + digital artifacts superimposed on top)– regardless of what that base layer ends up being, the rest of these challenges stay the same.

Integrated vs. floating UI

Some AR objects will be integrated into the scene, dependent on the particulars of the world in front of you. Objects might be anchored to the ground or to buildings or even to people. Your precise location in the world, which way you're facing, whether you're moving or still, what is around you– all these things will affect the UI in realtime.

Imagine opening Maps to see walking directions overlaid onto the sidewalk, constellations labeled next to stars with Night Sky, Honey discounts displayed outside every retail store, or LinkedIn profiles floating above everyone's heads at a networking event.
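Something like the Maps example already has a rough analogue in ARKit's geo anchors, which pin content to a latitude/longitude so it stays glued to the sidewalk as you move (in supported cities). A sketch, with a placeholder coordinate and no rendering wired up:

```swift
import ARKit
import CoreLocation

// A sketch of "integrated" UI: pinning content to a real-world coordinate with
// ARKit geo tracking. The coordinate is a placeholder; a real app would get it
// from a routing or places service, and a renderer would attach visuals to it.
func addSidewalkWaypoint(to session: ARSession) {
    ARGeoTrackingConfiguration.checkAvailability { available, _ in
        guard available else { return } // geo tracking only works in supported areas

        session.run(ARGeoTrackingConfiguration())

        let coordinate = CLLocationCoordinate2D(latitude: 40.7414, longitude: -73.9882) // placeholder
        let anchor = ARGeoAnchor(coordinate: coordinate)
        session.add(anchor: anchor)
        // ARSCNView / RealityKit would then render an arrow or label at this
        // anchor, and it stays put in the world as the user walks.
    }
}
```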

That kind of scene-dependent UI is in contrast to what I'll call "floating" UI– things that are independent of the scene. Think about checking your email or Messages in AR. The elements will be in your field of vision, but what's coming through the camera doesn't really matter and likely won't influence how the UI is presented. That being said, you might want to keep specific areas of your vision clear, so it is likely that you'll be able to place, size, and hide windows as you see fit. (More like a Mac and less like an iPhone.)
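Floating UI maps naturally onto content anchored to the camera itself rather than to the world. Another minimal RealityKit sketch, with arbitrary sizes and distance:

```swift
import RealityKit

// A sketch of "floating" UI: a panel anchored to the camera rather than to the
// world, so it follows your head like a pinned window. The plane stands in for
// an app window; sizes and the 0.7 m distance are arbitrary.
func makeFloatingPanel() -> AnchorEntity {
    let cameraAnchor = AnchorEntity(.camera)

    let panel = ModelEntity(mesh: .generatePlane(width: 0.3, height: 0.2))
    panel.position = [0, 0, -0.7]   // 0.7 m in front of your eyes, wherever you look
    cameraAnchor.addChild(panel)

    return cameraAnchor             // an ARView would add this to its scene
}
```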

We can see examples of both in current AR apps available for iPhone. But with a phone, even an immersive experience is happening within the confines of a screen that is a few inches big. You can simply move your entire phone out of the way to see what's behind it. Once the screen takes up your entire field of vision, managing the UI is going to be a huge challenge.

App switching and overlapping apps

If using this sort of device is like using an iPhone, then you will open and switch between apps one at a time. You check Maps, then Messages, then Twitter, and so on. But that would be selling the possibilities of AR short.

Instead, I imagine multiple apps being open simultaneously, each overlaying different sorts of information and functionality. Like in the real world, you see everything all at once. However, unlike the physical world, in AR multiple things can be in the same place at the same time.

This immediately presents some difficult UX problems. What if apps have pieces of UI that overlap? I pass by the Shake Shack in Madison Square Park, and there is UI present from the Shake Shack app itself (Your order is ready!) but also from Yelp (4 stars!) and from Google (4.6 stars!) and from Wikipedia (Shake Shack is an American fast casual restaurant chain based in New York City!). That's not going to work.

I could see something like Siri Suggestions working here.

Perhaps there will be a hierarchy, where there is one primary app that you choose to actively use at any given time, but you can still see parts of other apps in the background. Maybe app developers will be able to provide two different types of UIs– their primary interfaces and then secondary, more passive views, akin to widgets.

There might even be prescriptive types of widgets that limit custom UI and streamline functionality, like complications on Apple Watch. Like a "local business" widget that allows for a star rating and a featured review only. Or contextual, inline notifications.
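To make the prescriptive-widget idea concrete, here is what a constrained local-business surface could look like, sketched in Swift. None of these types exist; they're only meant to show how limited such a secondary view's API might be.

```swift
import Foundation

// Purely hypothetical: a prescriptive "local business" widget for a camera OS.
// The point is the constraint: no custom UI, just a fixed set of fields that
// the system lays out and anchors near the storefront. Not a real Apple API.
struct LocalBusinessWidgetEntry {
    let businessName: String
    let starRating: Double        // 0 to 5, rendered by the system
    let featuredReview: String    // one short quote, truncated by the system
}

protocol LocalBusinessWidgetProvider {
    /// Called when the wearer looks at a place the app has data for.
    func entry(forPlaceID placeID: String) -> LocalBusinessWidgetEntry?
}

// What a reviews app's conformance might look like.
struct ReviewsAppWidget: LocalBusinessWidgetProvider {
    func entry(forPlaceID placeID: String) -> LocalBusinessWidgetEntry? {
        LocalBusinessWidgetEntry(
            businessName: "Shake Shack",
            starRating: 4.0,
            featuredReview: "Get the crinkle fries."
        )
    }
}
```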

Maybe when you have no primary apps open, the "Home Screen" is the base camera view with these more minimal, contextual widgets and notifications from multiple apps all visible as you look around.

Look up at the sky and see the weather. Look at the entrance to a subway station to see the wait time to the next train. Look at a historic monument to see what year it was built. Today we use the entire internet through a small rectangle located in one place– on your desk or in your hand. With smart glasses, the internet will be overlaid on top of the world, spread out like reality itself.

User generated content

One big can of worms is social apps and crowd-sourced AR content. Social posts could be tagged to specific physical locations, you could create digital graffiti, and the rabbit hole goes deep and dark from there. I don't have a good sense of how Apple might anticipate the potential problems here, but content moderation will take on an interesting new twist.

Imagine you are at a Holocaust museum. There might be plenty of legitimate use cases for AR there to enhance the educational experience. But it might be that Apple and other platform providers just make a place like that a no-fly zone for AR content to avoid hateful and offensive content. They will ultimately control not only what can be viewed, but where it can be viewed.
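Mechanically, the simplest version of a no-fly zone is a platform-maintained list of exclusion regions checked before any third-party content is rendered. A sketch using CoreLocation, with a placeholder region; nothing like this policy layer is confirmed.

```swift
import CoreLocation

// A sketch of enforcing a "no-fly zone" for user-generated AR content: the
// platform keeps a list of exclusion regions and refuses to render third-party
// content inside them. The coordinate, radius, and identifier are placeholders.
let exclusionZones: [CLCircularRegion] = [
    CLCircularRegion(center: CLLocationCoordinate2D(latitude: 0, longitude: 0), // placeholder
                     radius: 300, // meters
                     identifier: "memorial-site")
]

func mayRenderUserContent(at coordinate: CLLocationCoordinate2D) -> Bool {
    !exclusionZones.contains { $0.contains(coordinate) }
}
```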

Who controls the rights to the exteriors of buildings? To public spaces? To the sky? To the appearances of other people?! These are big messes waiting to happen.

Input

Ideally, you will be able to control the UI using hand gestures that the camera interprets. But it's also important to remember that presumably everyone using this device will also have an iPhone.

When it comes time to input text on your Apple TV, you have the option of using your phone to type with the keyboard instead of suffering through selecting each character with the Apple TV remote. I could see AR working similarly, where gestures are used for simple navigation, but your phone can be used for more nuanced or cumbersome inputs.
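The gesture half of that split already has building blocks on iPhone: Vision can extract hand joints from camera frames, which is enough to detect something like a pinch. A rough sketch; the confidence and distance thresholds are arbitrary.

```swift
import Vision
import CoreVideo

// A sketch of camera-interpreted gestures using today's Vision framework:
// detect a pinch by measuring the distance between thumb tip and index tip.
// The 0.3 confidence cutoff and 0.05 distance threshold (in normalized image
// coordinates) are arbitrary choices for illustration.
func isPinching(in pixelBuffer: CVPixelBuffer) -> Bool {
    let request = VNDetectHumanHandPoseRequest()
    request.maximumHandCount = 1

    let handler = VNImageRequestHandler(cvPixelBuffer: pixelBuffer, options: [:])
    guard (try? handler.perform([request])) != nil,
          let hand = request.results?.first,
          let thumbTip = try? hand.recognizedPoint(.thumbTip),
          let indexTip = try? hand.recognizedPoint(.indexTip),
          thumbTip.confidence > 0.3, indexTip.confidence > 0.3
    else { return false }

    let dx = Double(thumbTip.location.x - indexTip.location.x)
    let dy = Double(thumbTip.location.y - indexTip.location.y)
    return (dx * dx + dy * dy).squareRoot() < 0.05
}
```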

Eventually, we will no longer have phones as all screens will be virtualized (smart glasses are the last screen you'll ever need), but for the time being, some reliance on phones might make sense.

Will anyone wear it?

This is where someone in the back of the room stands up and yells, "But what about Google Glass!"

I will simply state that many people thought that having a computer at home was a dumb idea, that a touch screen keyboard was a dumb idea, that AirPods looked too dumb to wear, and that Apple Watch was pointless.

Does that mean this new device will necessarily be a smash hit? No.

Does it mean that you should quiet any knee jerk reactions you might have to the idea of wearing smart glasses? Yes.

Don't assume the way things have been is how they will always be. In fact, it's a safe bet that will not be the case.

A final word on predictions

I do not know if this type of Apple AR device will be announced this year. It could be something more basic like a stripped-down, heads-up display, providing notifications and weather. It could be something VR-based, opaque and intended to be used while stationary. It could be just a vague teaser for developers to start thinking more seriously about ARKit. Or it could be that they announce nothing.

But some day Apple (and others) will have an AR device, and whenever that is, these are the sorts of UX challenges that will need to be solved to create the next mass market internet device.

I've been waiting for this for a long time. (Me, circa 2017)

I'm writing a book called After the Smartphone about how smart glasses are going to change our world all over again.

I also have a free weekly newsletter called Product Tips that is exactly what it sounds like.