The Adoption Problem
Michael opened Pillar 4 with a warning that landed hard.
"Most AI projects fail not because the technology doesn't work, but because people don't use it."
Lisa nodded slowly. NorthRidge had seen this pattern before. The company had invested in workflow automation tools that sat unused, collaboration platforms that employees routed around, and analytics dashboards that no one checked. The technology worked; the experience didn't.
"We have governance. We have data. Now I need to know that people will actually use these agents—not because they're mandated, but because they're genuinely better than the alternative."
David added the operations perspective. "My field teams are already skeptical. They've been burned by 'productivity tools' that created more work. If this feels like another checkbox, they'll ignore it."
What the Survey Revealed
Michael pulled up findings from the Superintelligent survey—data that would shape every design decision.
"You already have rich data about how your people actually work. Not how they're supposed to work, but how they really operate day-to-day. This becomes the foundation for experience design."
He highlighted four insights that changed how the team thought about the agents:
1. Field surveyors work in disconnected bursts.
"They spend hours in the field without connectivity, then upload data in batches," Mike Torres confirmed. "Any AI interaction needs to accommodate that rhythm, not fight against it."
2. QA reviewers are overwhelmed by context switching.
Jennifer Liu recognized this immediately. "We jump between reports, regulations, and client requirements constantly. If AI requires us to go find context, it's adding burden, not removing it."
3. Senior experts are protective of their judgment.
"Twenty years of experience isn't something you just hand over to a machine," said Tom Rodriguez, a senior QA reviewer who'd joined the session. "I need to know this thing respects what I bring to the table."
4. Informal workarounds are everywhere.
Employees had developed personal systems—spreadsheets, templates, note-taking habits—that made their work manageable. The survey revealed dozens of these workarounds.
"The best AI experience is invisible," Michael concluded. "It enhances work without demanding attention or requiring users to change how they think."
The Design Sessions
Ana Rodriguez, Orion's experience designer, facilitated three design sessions—one for each prioritized agent. Unlike the governance sessions (which included Legal and Compliance), these sessions brought together the people who would actually use the tools.
Participants:
- End users — Field surveyors, QA reviewers, project managers
- Workflow owners — People who understood how work actually flowed
- Senior practitioners — Veterans who had "seen everything" and understood edge cases
Each session followed the same structure: understand the current state, design the interaction model, and build in transparency.
Session 1: Pre-QA Validation Agent
The session opened with Jennifer walking through a typical day for her QA team.
"A report comes in. I open it. I check it against maybe sixty different things—boundary completeness, methodology documentation, regulatory markers. Most of these checks are mechanical. Did they include the right forms? Did they follow the right sequence? I can tell in thirty seconds if something's wrong, but I still have to verify everything."
Ana asked the key question: "Where do you want the AI to fit into that flow?"
Jennifer thought for a moment. "Before I even open the report. If the AI could tell me 'here are the three things that need your attention,' I could skip the mechanical stuff and focus on what actually requires my judgment."
Tom added a concern. "But I need to know why it's flagging something. If it just says 'problem here' without context, I'm going to check everything anyway because I don't trust it."
"That's exactly what we need to design for," Ana said. "Transparency isn't optional—it's the core of the experience."
The group designed the interaction model:
- Entry: Upload report to review queue → agent begins validation automatically
- Handoff: Agent presents summary → "I checked 47 items. Here are 3 that need your attention."
- Context: Each flag includes the specific issue, the rule it triggered, confidence level, and one-click access to the relevant regulation
- Resolution: Reviewer accepts, modifies, or dismisses each flag
- Exit: Reviewer marks validation complete → agent logs the decision
Jennifer tested the model mentally. "So it never says 'this report is good.' It says 'here's what I found.' And I make the call."
"Exactly," Michael confirmed. "The agent is a thorough assistant, not an authority."
Session 2: Field Note Normalization Agent
Mike Torres arrived with printed examples—the same inconsistent field notes that had surfaced in Pillar 2.
"This one says 'IP w/ cap.' This one says 'iron post capped.' Same thing, completely different notation. My surveyors waste hours cleaning this up before they can even start on the actual report."
Ana asked: "When do they do that cleanup?"
"Usually at the end of the day, or on the drive back. They're tired. They're rushing. That's when mistakes happen."
A field surveyor named Carlos Mendez spoke up. "What I want is something that cleans up my notes while I'm still in the field. So when I get back, it's already done."
"That's the rhythm we need to design for," Ana said. "Submit notes in the field, get normalized output before you're back at your desk."
The group designed the interaction model with mobile-first thinking:
- Entry: Submit field notes (text, photos) from mobile device
- Processing: Real-time progress indicator shows normalization in progress
- Handoff: Side-by-side comparison—original notes next to normalized version
- Review: Changes highlighted by type (terminology in blue, inferred data in yellow)
- Exit: Accept all, review individually, or revert to original
Carlos liked the transparency. "If I can see what it changed, I can trust it. If it's a black box, I'll redo everything myself."
David raised a practical concern. "What about offline? Half their day they don't have signal."
Ana noted the requirement. "Queue for sync when connectivity returns. The surveyor sees 'pending normalization' until it processes."
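A queue-and-sync sketch captures that requirement; the names below (FieldNote, NormalizationQueue) and the toy normalization step are illustrative assumptions, not the production design. Notes are held on the device, shown as "pending normalization," and processed when connectivity returns, with each change recorded by type so the side-by-side review has something to highlight.

```python
# A sketch of the mobile submission flow with offline queueing. Names and the
# stand-in normalization step are assumptions for illustration only.
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class FieldNote:
    raw_text: str
    photos: list = field(default_factory=list)
    status: str = "pending normalization"        # what the surveyor sees while queued
    normalized_text: Optional[str] = None
    changes: list = field(default_factory=list)  # each change tagged by type


class NormalizationQueue:
    """Hold notes on the device when offline; process them when signal returns."""

    def __init__(self):
        self._pending = []

    def submit(self, note: FieldNote, online: bool) -> None:
        if online:
            self._normalize(note)
        else:
            self._pending.append(note)           # queued for sync

    def sync(self) -> None:
        """Called when the device regains connectivity."""
        while self._pending:
            self._normalize(self._pending.pop(0))

    def _normalize(self, note: FieldNote) -> None:
        # Stand-in for the model call: map shorthand to standard terminology and
        # record the change so the surveyor can compare original and normalized text.
        note.normalized_text = note.raw_text.replace("IP w/ cap", "iron post, capped")
        note.changes.append({"type": "terminology",
                             "from": "IP w/ cap", "to": "iron post, capped"})
        note.status = "ready for review"
```

The accept-all, review-individually, and revert options would sit on top of the changes list, which is why every change carries a type rather than being applied silently.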
Session 3: Exception Routing Agent
Sarah Martinez, VP Operations, owned this session. The stakes were different—lower volume but higher consequences.
"When something unusual comes in—a high-value property, a client with history, a weird methodology situation—it needs to go to the right expert. Right now that's me triaging based on gut feel. I miss things."
Ana probed: "What happens when you miss something?"
"Best case, it takes longer because it bounces around. Worst case, the wrong person handles it and we have to redo the work—or worse, we deliver something that shouldn't have left the building."
Tom offered the expert perspective. "I don't want AI deciding what's important. I want it telling me why it thinks something might be important, so I can make that call."
This insight shaped the entire design:
- Entry: Automatic monitoring—agent continuously analyzes incoming work
- Detection: Agent identifies exception based on defined risk patterns
- Context: Alert includes why it was flagged, similar past cases, recommended expert, urgency assessment
- Decision: Human assigns routing or handling
- Learning: Agent learns from routing decisions to improve recommendations
Sarah tested the model. "So it's surfacing information, not making decisions."
"Right. It's saying 'this looks like the Thompson case from last year, and Maria handled that well.' You decide if that's relevant."
Testing with Real Users
Before finalizing, Ana conducted walk-through sessions with clickable prototypes. The feedback changed several designs:
Finding 1: Confidence indicators needed calibration
Users initially ignored "high confidence" flags because they didn't know what the threshold meant.
"What does 85% confidence mean?" Tom asked. "Is that good? Should I be worried?"
Solution: Show confidence with context the reviewer can interpret rather than a bare percentage, such as "Similar to 94% of cases I've seen."
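One way to implement that framing, assuming the agent keeps scores from past validated cases (an assumption about the design, not the shipped logic), is to report where the current flag falls relative to that history instead of the raw number:

```python
# A sketch of confidence-with-context: compare the current score against scores
# from previously validated cases instead of showing a bare percentage.
def confidence_with_context(score: float, past_scores: list) -> str:
    if not past_scores:
        return f"Confidence {score:.0%} (no history yet)"
    share = sum(1 for s in past_scores if s <= score) / len(past_scores)
    return f"Similar to {share:.0%} of cases I've seen"


# Example: an 85% score framed against five prior cases.
print(confidence_with_context(0.85, [0.4, 0.6, 0.7, 0.8, 0.9]))
# -> "Similar to 80% of cases I've seen"
```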
Finding 2: Explanations needed to be optional
Power users wanted to move fast; new users wanted to understand.
Jennifer: "After a week, I won't need the explanation every time. But new reviewers will."
Solution: Collapsed explanations with "Why?" links that expand on demand.
Finding 3: Mobile experience was critical
Field surveyors needed to interact with the normalization agent from tablets in the field.
Carlos: "If I have to wait until I'm at my desk, I'll just clean up the notes myself on the drive home like I always have."
Solution: Mobile-first design for field-facing features.
Finding 4: Expert matching needed human override
The exception routing agent's expert recommendations were sometimes impractical: the best-matched expert might be overloaded or unavailable.
Sarah: "Maria might be the best person for this type of case, but if she's already handling five emergencies, I need to route it somewhere else."
Solution: Easy reassignment with a feedback mechanism so the agent learns from overrides.
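A sketch of that loop, using assumed names (suggest_expert, RoutingOverride) and an assumed capacity check inspired by Sarah's example, shows both halves: the recommendation takes current workload into account, and any reassignment is captured with a reason so the agent can weigh availability next time.

```python
# A sketch of reassignment with feedback: workload-aware suggestion plus an
# override log the agent can learn from. All names are illustrative assumptions.
from dataclasses import dataclass


@dataclass
class RoutingOverride:
    case_id: str
    recommended_expert: str
    actual_expert: str
    reason: str                      # e.g. "recommended expert already at capacity"


def suggest_expert(ranked_experts: list, workload: dict, capacity: int = 5) -> str:
    """Prefer the best-matched expert who still has capacity."""
    for expert in ranked_experts:
        if workload.get(expert, 0) < capacity:
            return expert
    return ranked_experts[0]         # everyone is loaded; surface the best match anyway


def reassign(case_id: str, recommended: str, chosen: str, reason: str,
             feedback_log: list) -> RoutingOverride:
    """One-step reassignment; the override and its reason feed future recommendations."""
    override = RoutingOverride(case_id, recommended, chosen, reason)
    feedback_log.append(override)
    return override
```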
The Experience Test
Before any agent went into development, Michael posed a question that became NorthRidge's adoption litmus test:
"Why would someone choose to use this instead of their current approach?"
For each agent, the answer had to be compelling:
- Pre-QA Validation: "Because it surfaces the three things I need to focus on instead of making me check sixty things I already know are fine."
- Field Note Normalization: "Because my notes are cleaned up before I get back to my desk instead of taking another hour of my day."
- Exception Routing: "Because the right expert sees the case immediately instead of it bouncing around until someone notices it's urgent."
If the answer wasn't compelling, the design wasn't ready.
What Changed at NorthRidge
By the end of Pillar 4, NorthRidge had something rare: AI designs that users actually wanted.
Tom, the skeptical veteran, summarized the shift: "I came in thinking this was about replacing what I do. It's actually about letting me do more of what I'm good at."
Carlos was more direct: "Finally, something that actually helps instead of creating more work."
The experience design sessions had accomplished several things:
- Users felt ownership — They had shaped how the agents would work, not just been told about them
- Experts felt valued — The designs explicitly positioned AI as an amplifier of their judgment
- Adoption barriers were identified early — Issues that would have killed adoption post-launch were caught in design
- Trust was built into the architecture — Transparency wasn't an afterthought but a core design element