Insights | Propel Ventures

Where Every Step Matters: Designing AI for High-Consequence Work

Written by Amy Johnson | Chief Product Officer, Propel | Mar 3, 2026

In the middle of a product validation cycle, a user of the product we were testing framed their evaluation criteria in a single line:

“If it hallucinates when I try it on something I know well, we’re done.”

They were describing how they would test the system: take it into their own area of deep expertise and see how it behaved under scrutiny. One confident but incorrect result in a domain they understood intimately would be enough to end the experiment.

That moment captured something many AI teams underestimate: in high-stakes environments, adoption depends less on the breadth of capability and more on perceived reliability under pressure. If trust in the product breaks, the conversation rarely continues.

I was working with an organisation developing an AI-powered assistant designed to support complex professional workflows. Think of a legal practice, an engineering firm, a scientific research group, or a financial advisory team. The ambition was to create an intelligent workspace that helps experts move from question to analysis to interpretation while preserving traceability and professional rigour.

Early users could see the potential to reduce time spent wading through dense material and repetitive analysis. However, that single comment revealed the real constraint shaping adoption.

When a product is powered by AI, trust cannot be treated as an enhancement to be layered on after the core functionality is complete. It influences architectural decisions, data flows, governance models, and user experience. If it is not designed deliberately from the outset, driving adoption may well be an insurmountable challenge.

How risk-conscious professionals evaluate products

Like many AI teams, we initially focused on the obvious value proposition. Professionals are overwhelmed by information, and an intelligent assistant can summarise, synthesise, and highlight what matters most. Time is saved, cognitive load is reduced, and productivity improves.

Many users described meaningful time savings and relief from information overload. Yet what we observed repeatedly was that risk-conscious experts (scientists, lawyers, financial advisers) do not adopt AI tools in the same way that some of us adopt a new tool to create slide decks or synthesise interviews. They approach them as they would any critical instrument in their practice.

  • They begin in their area of greatest expertise.

  • They test edge cases.

  • They probe for weaknesses.

  • They look for evidence of fabrication or overconfidence.

If your product surfaces an obvious error in an area they know intimately, the impact is disproportionate. From a product team’s perspective, it may be a defect among many. From the expert’s perspective, it calls into question the reliability of the entire system.

Professionals whose reputation depends on accuracy will not persist with a tool that compromises it.

This leads to a critical design question. What does your architecture do when the model is wrong? Because it will be wrong.

We all know by now that generative AI, regardless of sophistication, can produce confident inaccuracies. Research across domains, from scientific literature review to legal drafting, consistently shows that hallucinations require structural mitigation.
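One structural mitigation is to make ungrounded answers impossible to emit silently. The sketch below (all names hypothetical, not a real implementation) gates a draft response on whether every source it cites was actually retrieved, surfacing uncertainty instead of a confident fabrication:

```python
from dataclasses import dataclass

@dataclass
class Draft:
    text: str
    citations: list[str]   # source identifiers the model claims to rely on

def gate(draft: Draft, verified_sources: set[str]) -> str:
    """Pass a draft only if every cited source exists in the retrieved set.

    A draft with no citations, or with citations the system never fetched,
    is replaced by an explicit refusal rather than shown as fact.
    """
    missing = [c for c in draft.citations if c not in verified_sources]
    if not draft.citations or missing:
        return "No sufficiently supported answer - please review the sources directly."
    return draft.text
```

The design choice is deliberate: the failure mode becomes a visible, honest non-answer, which is the behaviour risk-conscious experts are testing for.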

The deeper concern: “What happens to what I put in?”

Accuracy is only the most visible trust issue. In many professional contexts, the more destabilising concern relates to inputs rather than outputs.

When experts consider using an AI assistant, they inevitably ask what happens to the information they provide. In a research environment, this may include unpublished hypotheses or experimental designs. In a legal firm, it could involve confidential client strategies. In finance, it may encompass sensitive deal information. Across domains, these inputs represent competitive advantage, fiduciary responsibility, and professional liability.

“If I put my ideas into this system, will they be used by someone else?”

Sophisticated professionals may not know the nuances of each vendor’s policy, but they understand risk. In the absence of clear, enforceable boundaries, they assume exposure.

When these concerns are taken seriously, “trust as architecture” becomes a set of concrete enabling constraints.

The architectural decisions trust forces you to make

Instead of identifying a list of trust features, the focus shifts to defining trust properties that must hold true across the system.

1. Define and enforce the trust boundary

If users are unsure whether their inputs might be used to train models or shared beyond their intended context, reassurance in onboarding flows is insufficient. The credible response requires alignment between policy, contracts, and system behaviour.

This involves deliberate choices about data retention, model training practices, access controls, permissioning, audit trails, and default configurations. Defaults are particularly important because they shape behaviour and expectations.

Where possible, trust boundaries should be made explicit and configurable. If a project or workspace can be designated as private, restricted, or shareable, that designation must govern data handling in practice.
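What "that designation must govern data handling in practice" can mean in code: the designation lives on the workspace itself and every data-handling decision is derived from it, with private as the safe default. This is a minimal sketch with hypothetical names, not a prescribed schema:

```python
from dataclasses import dataclass
from enum import Enum

class Designation(Enum):
    PRIVATE = "private"        # never leaves the workspace; excluded from training
    RESTRICTED = "restricted"  # visible to named members; excluded from training
    SHAREABLE = "shareable"    # may be shared across the organisation

@dataclass(frozen=True)
class Workspace:
    name: str
    designation: Designation

def may_use_for_training(ws: Workspace) -> bool:
    """Only explicitly shareable workspaces may feed model improvement."""
    return ws.designation is Designation.SHAREABLE

def may_retrieve_across_workspaces(ws: Workspace) -> bool:
    """Private content must never surface in another workspace's context."""
    return ws.designation is not Designation.PRIVATE

def new_workspace(name: str) -> Workspace:
    """Safe-by-default: a new workspace starts private unless deliberately changed."""
    return Workspace(name, Designation.PRIVATE)
```

Because the functions take the workspace rather than a user-supplied flag, policy, contract, and system behaviour stay aligned: there is no code path where a private workspace's content reaches training or cross-workspace retrieval.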

2. Prioritise traceability over abstract explainability

In high-stakes professions, users are less concerned with theoretical discussions about model interpretability and more concerned with practical traceability.

They want to know which sources were used, which steps were taken, which tools were executed, and where uncertainty enters the process. They need to reproduce results, inspect intermediate outputs, and apply human judgment at critical points.

Designing for traceability often means integrating AI into existing workflows rather than attempting to replace them wholesale. When an AI system orchestrates or augments established processes that professionals already trust, it inherits some of that credibility.
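The traceability requirements above (which sources, which steps, which tools, where uncertainty enters) can be captured as a simple audit structure. A sketch under assumed names, not a reference design:

```python
from dataclasses import dataclass, field
import json

@dataclass
class TraceStep:
    tool: str                 # which tool or model was executed
    sources: list[str]        # documents or records consulted at this step
    output_summary: str       # intermediate result, inspectable by the user
    uncertain: bool = False   # flags points where human judgement is needed

@dataclass
class Trace:
    steps: list[TraceStep] = field(default_factory=list)

    def record(self, step: TraceStep) -> None:
        self.steps.append(step)

    def uncertain_steps(self) -> list[TraceStep]:
        """The points a professional should review before relying on the result."""
        return [s for s in self.steps if s.uncertain]

    def to_audit_log(self) -> str:
        """A reproducible, exportable record of how the answer was produced."""
        return json.dumps([vars(s) for s in self.steps], indent=2)
```

The point is not the format but the guarantee: every answer arrives with an inspectable chain from question to sources to intermediate outputs, so human judgement can be applied at the flagged steps.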

3. Make quality visible and continuously managed

A common pattern in AI teams is to treat model quality as an internal optimisation problem and user trust as an external perception issue. In practice, they are tightly coupled.

In domains where mistakes carry reputational, financial, or legal consequences, a single high-profile error can have lasting effects on adoption. Addressing this requires more than disclaimers. It demands mechanisms for monitoring output quality, capturing structured feedback, reviewing edge cases, and iterating deliberately.

When users can see that quality is measured and improved over time, trust becomes a shared endeavour rather than a blind leap of faith.

4. Recognise that trust decays without maintenance

Trust evolves alongside models, regulations, vendor policies, and user expectations. Changes in data retention rules, legal requirements, or third-party dependencies can alter the effective trust boundary without a single line of product code changing.

Sustaining trust requires ongoing governance, risk assessment, and transparent communication. It resembles operating critical infrastructure more than releasing periodic feature updates.

Trust as infrastructure

Rather than thinking of the product as a clever assistant layered on top of professional practice, we began treating it as professional software that incorporated AI components. This reframing influenced prioritisation, investment, and internal language.

We focused less on persuading users to trust the system and more on engineering behaviours that deserved trust. We treated integration with established tools and processes as part of the trust equation, reducing risky workarounds. 

We also became more precise about appropriate use cases. In any professional setting, there are tasks where AI assistance is valuable and others where reliance may be premature. A well-architected system helps users calibrate their expectations, signalling where it is robust and where caution is warranted.

A question for product leaders

If your most valuable user entered their most sensitive client matter, hypothesis, strategy, or deal into your product today, could you clearly and accurately explain where that information goes, who can access it, how long it is retained, and whether it influences any future model training?

The answer needs to be visible in the architecture, enforced by defaults, and observable in how your products behave.