Overview
Atheer acquired Flype in 2019, aiming to merge Flype’s capabilities with its existing tools focused on remote assistance and augmented reality. Flype, a low-code enterprise platform, offered features like digitization, integrations, user controls, and workflow building.
With Flype still in its early stages, the integration process was somewhat simplified. The goal was to combine both products into a unified enterprise platform supporting digitization and offering a front-end experience for frontline workers, accessible across various devices, including hands-free devices and AR glasses.
Guiding Strategic & Product Design Considerations
1. Low-Code Enterprise Digitization
We decided to move forward as a low-code platform. Our differentiating factor was our ability to support the authoring of AR experiences and make them easy to build and deploy quickly at scale.
2. Mobile-First
Atheer’s product strategy and design system had always been led by glasses (which introduce many limiting constraints), so this was exciting and liberating! But it did mean we needed to create an experience optimized for mobile devices.
3. Optimize per Form Factor
Splitting and optimizing our experiences per form factor, with as much of a shared base as possible, would enable us to create a more intuitive experience on every device, wearables included (see the sketch after this list).
4. Flexible, Modular UX/UI
Since the end-user experience was the manifestation of whatever was built in our low-code Studio environment, experiences and components needed to be modular (e.g. a customer could hypothetically disable entire swaths of the application - changing navs, settings, listings, etc.).
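To make consideration 3 above (Optimize per Form Factor) a little more concrete, here is a minimal sketch of the “shared base, per-form-factor presentation” idea. Everything below - the names, types, and labels - is a hypothetical illustration, not actual platform code: the model layer stays identical across devices, and only the presentation branches where device constraints genuinely differ.

```typescript
// Hypothetical sketch: one shared model, per-form-factor presentation.

type FormFactor = "mobile" | "tablet" | "monocular" | "binocular";

interface TaskCard {
  id: string;
  title: string;
  stepCount: number;
}

// The model layer is shared: every device class reads the same data.
const exampleTask: TaskCard = { id: "t-1", title: "Check in vehicle", stepCount: 4 };

// Only the presentation branches, and only where device constraints differ
// (e.g. monocular wearables favor large, voice-addressable targets).
function renderTaskLabel(task: TaskCard, formFactor: FormFactor): string {
  switch (formFactor) {
    case "monocular":
      return `SAY "OPEN ${task.title.toUpperCase()}"`; // voice-command prompt
    case "binocular":
      return `[pinned card] ${task.title}`; // spatially anchored card
    default:
      return `${task.title} (${task.stepCount} steps)`; // touch list row (mobile/tablet)
  }
}

console.log(renderTaskLabel(exampleTask, "monocular"));
```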
Information Architecture
The first step was IA. Flype had a shell of a mobile app that was pretty in line with our general IA approach, so we started there. Typically, when starting to define IA for a project like this, we’d want to know:
1) Who are our users?
2) What are our use cases/critical user journeys?
3) What do these users need or want to do?
From there, we would define the organizing principle(s) for both the low-code Studio arm of our product and the end-user application, which end-users would experience through a mobile app with a consistently structured UI/UX across nav, IA, components, etc.
But in our situation, we had some unique constraints: namely, how do you create a consistently structured interface that is modular and flexible enough to support any experience authored in our low-code Studio? And how do you have that interface scale across all edge devices - mobile, tablet, monocular, and binocular?
We needed to abstract out high-level user needs to identify our core constructs. That meant covering the straightforward requirements of our typical use cases (e.g. a profile, settings, a place to search for resources (discover/explore), a place to communicate (connect), and a place to track work) while also creating an organizational, structural paradigm that was modular and flexible. Our approach would need to handle different experiences authored on the backend and accommodate the different feature sets and/or configurations that any organization (aka customer or “workspace”) could define.
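As a rough sketch of what that could look like in practice - the config shape and module names below are hypothetical, not the actual platform API - the end-user app shell would read a per-workspace definition and only mount the modules that workspace has enabled:

```typescript
// Hypothetical sketch of a workspace-defined configuration driving the client.

type ModuleId = "home" | "discover" | "connect" | "work" | "profile";

interface WorkspaceConfig {
  workspaceId: string;
  enabledModules: ModuleId[]; // everything else is hidden for this customer
}

const ALL_MODULES: { id: ModuleId; label: string }[] = [
  { id: "home", label: "Home" },
  { id: "discover", label: "Discover" },
  { id: "connect", label: "Connect" },
  { id: "work", label: "My Work" },
  { id: "profile", label: "Profile" },
];

// The app shell builds its navigation from the config instead of hard-coding it,
// so navs, settings, and listings can all vary per workspace.
function buildNav(config: WorkspaceConfig) {
  return ALL_MODULES.filter((m) => config.enabledModules.includes(m.id));
}

// e.g. a workspace that turns off the social/communication layer entirely:
const acmeFieldService: WorkspaceConfig = {
  workspaceId: "acme-field-service",
  enabledModules: ["home", "work", "profile"],
};

console.log(buildNav(acmeFieldService).map((m) => m.label)); // ["Home", "My Work", "Profile"]
```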
There were a few structural approaches we considered for how to split and organize the end-user application in a way that would create a consistent, intuitive experience while responding to workspace-defined packages of features and capabilities:
1. The Absolute Basics of Work
Home/“For You”: contextual, curated summary of information and actions
Pushed information: data we would contextually serve users based on location, user profile, audience, etc.
Pulled information: users searching or scanning for something
Communication: social and collaboration layer across app
2. Window to the Augmented World (by action)
Detecting the augmented world
Interacting with and experiencing this AR world via persistent, dynamic and static elements
Manipulating and/or adding to this augmented world (create and edit)
Interacting with people or content outside of the AR world (e.g. capturing with the camera, viewing a PDF, receiving assistance via video chat)
Entry Point
Once the IA was defined, we needed to determine our priorities and criteria for defining the end-user’s entry point into the application. This decision would have large implications not only for the end-user experience but also for how we marketed, sold, and pitched ourselves. We had to raise a round of funding on the promise of this product. Unsurprisingly, there were strong opinions and valid concerns (you have to love working with people who care).
Being at the intersection of many individual stakeholders, business functions/needs, customer-facing teams (Sales and CS) and engineering constraints is one of my favorite parts of working in Design and Product. We get to take all that in, filter it through an (ideally) unbiased evaluation, research, and design process to make the best experience we can, all the while considering our responsibility as the advocate for the end-user.
From these interactions and research, we defined a few clear goals to guide our process. Our users are just trying to get work done:
How can we provide access to what they need with as little friction as possible?
1. Leverage existing interaction paradigms our users are familiar with.
2. Support our existing use cases and account for future, potential use cases to target.
3. Contextualization is key to a good AR experience - so how can we make this as intelligent as possible?
4. Add value in 90 days: a very real constraint for us, as organizations would not buy us if we could not prove this.
Application Entry Point Options
There were three viable options to consider for an end-user’s entry experience in our app.
Did we want to be the Snapchat of Enterprise? Having the camera experience open by default did create some exciting opportunities - and it supported some of our existing use cases really well (e.g. a mechanic at a Porsche dealership). For many of these technicians, the first step whenever they begin a job is to open the camera, take photos/videos of the car they are checking in (both for compliance and for insurance or warranty documentation), and create a ticket afterwards. A camera-first entry point would certainly make that easier.
This navigational paradigm could also carry over to wearables, particularly monocular devices, which tend to be RGB-camera-only devices with no depth sensors. There, it might allow us to lean on large swipe navigation gestures, which are easier to detect than discrete, point-based gestures.
Unfortunately, this whole approach was predicated on the assumption that landing end-users in the camera would yield a better experience or increase output and efficiency. That assumption wasn’t particularly sound in our target use cases and industries. It didn’t help that in our initial tests, a significant group of users did not find this paradigm intuitive and would freeze in such an open-ended initial experience.
We also knew that once we could actually build in all of the intelligence we were planning on, it would be awesome. Once we could know who the user is, where they are, what job(s) they’ve been assigned, what car they’re looking at, that car’s service history, etc., we could make this experience rich and seriously improve that technician’s workflow.
Another thing to consider was the potential interaction overlap with glasses navigation that a camera-first experience would provide. Snapchat was groundbreaking in how its user interface was laid out in such a linear way - users could move through all of the screens just by swiping their thumb left, right, up, or down.
Providing a curated, directed experience through a Home construct was a clear candidate for the initial landing screen, and it was ultimately the option we chose to move forward with. It’s not particularly sexy, but we could no doubt provide our users with a great personalized, contextualized experience by taking advantage of our audience segmentation, integrations, and extensible user profiles. It’s also the paradigm our users were best equipped to navigate without any training.
Our third candidate was an assistant-led chat experience. We keep hearing that voice is the future of the interface. It definitely will play a big role in effective, frictionless AR experiences (especially with input). It would also provide some nice overlaps with glasses interactions (our most commonly used device today is the RealWear HMT-1, a monocular wearable primarily interacted with via voice commands).
My teammate and I got very excited about the visual and UX challenge that an assistant-led experience would present. We dreamed of robust, flexible card components that combined information and actions, and of mixing the somewhat ephemeral nature of chat with the convenience of embedded actions and interactive chat UI elements. We learned a lot about virtual assistants and chat-centered interfaces. One takeaway stood out: to achieve adoption, a chat-centered experience requires a certain level of sophistication. We knew we could get there and were excited about the journey, but we also needed something that added value immediately. We didn’t have time to reach the maturity required for an effective chat experience; and as it stood, it presented too much friction and required users to learn a new interaction paradigm to navigate the interfaces that accompany and support them.
Final Decision
The assistant- and camera-led experiences were exciting to explore and would have been interesting to watch mature over time (there’s definitely still potential to transition to those entry points in the future). But it was clear that the Home entry point was the most practical approach for creating an intuitive experience, given our user base.