Translating a trust architecture into embedded, family-grade UX
Five design decisions in this case study:
Nanosarte turns a child’s artwork and voice into illustrated stories for grandparents living far away. An AI agent called Nano picks up the parent’s creative load — capturing stories, adapting them per recipient, scheduling sends around birthdays and holidays, and eventually handling small gift purchases.
The design problem isn’t what Nano can do. It’s how much the parent will let Nano do. For a product handling children’s content and family relationships, the cost of a trust failure is permanent — one badly-shared story can damage a family connection that doesn’t get repaired.
This case study walks through the trust architecture I designed to make that delegation safe, the onboarding redesign that made it work for non-technical parents, and five specific design decisions in the delegation flow that hold it together.
I led this design end-to-end. My process:
I used Stitch for rapid low-fidelity exploration, Figma for production craft, and Claude for pressure-testing decisions and maintaining design-system consistency across 780 specification lines. The tools accelerated specific steps. The architecture, the decisions, and the trade-offs are mine.
Families separated by distance have no good way to share the everyday moments of a child’s life with grandparents and extended family. Existing tools — WhatsApp, FaceTime, Google Photos — weren’t built for this relationship. Kids can’t type. Grandparents struggle with apps. The parent in the middle becomes an exhausting full-time relay.
Nanosarte ran as an e-commerce service where parents uploaded artwork and ordered personalized gifts. Parents still did all the curation, scheduling, translation, and recipient management manually. This project asks a harder question: what if an AI agent did that work on the parent’s behalf — and what would it take for the parent to actually let it?
The three user types span a 70-year age range and three literacy levels:
This case study focuses on the parent experience. The grandparent response side and Nano’s behavior specification are covered in a companion case study.
I built on existing research rather than starting over:
Informal conversations with five parents, 20–30 minutes each, focused on AI agent scenarios. This wasn’t structured research — it was listening. Four themes emerged.
The relay exhausts the parent, not the grandparent. “I’m the reason my mom sees my kids at all. If I’m tired, she doesn’t get photos.” Four of five parents named this specifically.
Cultural resonance matters more than translation. Two families had grandparents in a different country. Translation was rarely the hard part. “Would my mom actually enjoy this, or is it just ‘my kid made art’?”
Reversibility beats preview. When I described preview-before-send, parents agreed it was valuable but said they’d skip it after a while. When I described per-recipient pause and mid-flight stop, every parent leaned in. Being able to undo is a deeper form of trust than being able to review.
Parents don’t want to configure trust — they want to feel their family is cared for. Every parent wanted less to do. A system that asks them to configure autonomy has already failed, because it’s asked them to do the work they came to the product to escape.
I read Anthropic’s and OpenAI’s public work on agent autonomy and behavioral specifications, plus HCI literature on calibrated trust. These provided vocabulary for why “auto-share a story” and “auto-buy a gift” are categorically different tasks, and why over-trust and under-trust are both failure modes.
My first onboarding build followed enterprise UX conventions for trust-architected products. It had a trust-mode onboarding flow explaining three tiers (Collaborator, Delegator, Sponsor), a value-calibration step with hypothetical scenarios (“What would you do if your child drew a picture of a sad day at school?”), and a governance dashboard showing Nano’s recent decisions and learned preferences.
Every piece had a principled reason. None of it worked for parents.
I reduced onboarding to three screens (Meet Nano, Your Family, Connect Artwork) that end in under two minutes with Nano’s first prepared story on the dashboard. No value calibration. No trust-mode explainer. No governance dashboard. Nano arrives with common-sense defaults and learns the family’s preferences through contextual questions asked only when content genuinely warrants them.
The trust architecture still exists — it just doesn’t have dedicated screens. Legibility shows up as inline annotations on specific stories. Boundaries show up as the Autopilot Credit runtime. Reversibility shows up as five embedded moments the parent encounters where and when each is relevant. The architecture is the scaffolding. The felt experience is what stays.
Five design decisions make the delegation flow work. Each solves a specific AI product design problem that isn’t unique to Nanosarte.
Action: Replaced the evidence-based progression prompt with a three-sentence agent-voice observation.
Task: My first progression design tracked how often the parent approved Nano’s drafts unchanged, then fed that data back at progression time: “You approved 12 of 13 stories — ready for Delegator mode?” The framework said this was the accuracy play. The psychology was wrong. Parents read the data and felt watched, not reassured. A product handling children’s content can’t frame its relationship with the user as performance surveillance.
I changed the progression model entirely. Nano earns trust by doing good work — preparing drafts proactively, learning from what the parent changes — and the progression prompt references shared experience rather than observation logs. The new prompt is a dashboard card: “The last three stories I prepared, you sent without changes. Want me to start sending the quick ones directly, and only ask you about the trickier ones?”
Strategic call: Trust is earned through demonstrated value, not demonstrated surveillance. The agent’s track record is the product itself, not a metrics dashboard.
UX craft: Progression compressed from a full-screen evidence modal to a three-sentence dashboard card in the agent’s voice. Shared experience replaces behavioral analytics.
Information architecture: Approval tracking removed entirely. Parent behavior never surfaces as visible system state.
Trade-off accepted: Less precise trust calibration. In exchange, the parent never feels watched, which is what made them willing to keep using the product.
Autonomy before / after
Result: The progression prompt went from a multi-section ceremony to a natural beat in the flow. The parent reads it in under ten seconds. Trust progression stops feeling like a product upgrade and starts feeling like a relationship milestone.
Action: Hid the three-tier classification from the parent and let the agent’s behavior carry the meaning instead.
Task: Every AI that classifies content faces this choice: show the user the classification label, or let them experience it through how the product behaves. I built both. V1 showed Clear / Borderline / Hold tiers as colored badges on the activity feed. V4 hides them entirely.
Parents in V1 started arguing with the labels: “why is this Borderline, it’s fine.” They treated the labels as editorial judgment rather than behavioral state. The labels leaked backend state into the experience and gave the parent a system to game.
In V4, the three tiers drive three distinct behaviors: Clear content appears with “Sent.” Borderline content surfaces an inline question. Hold content blocks and explains. The parent never sees the label. Typography and natural language carry state — no color coding, no severity indicators.
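The tier-to-behavior mapping can be sketched in a few lines. This is a minimal illustration, not the shipped implementation: the names `Tier`, `FeedItem`, and `to_feed_item` are hypothetical, and the status wording other than “Sent.” is invented for the example. The point it shows is that the parent-facing object has no tier field at all.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Tier(Enum):
    """Backend-only classification (hypothetical names); never rendered."""
    CLEAR = "clear"
    BORDERLINE = "borderline"
    HOLD = "hold"


@dataclass(frozen=True)
class FeedItem:
    """What the parent sees. Deliberately carries no tier, color, or severity."""
    status_text: str                   # verb-first activity state
    question: Optional[str] = None     # inline question (Borderline only)
    explanation: Optional[str] = None  # why Nano stopped (Hold only)


def to_feed_item(tier: Tier, detail: Optional[str] = None) -> FeedItem:
    """Translate a backend tier into one of three parent-facing behaviors."""
    if tier is Tier.CLEAR:
        return FeedItem(status_text="Sent.")
    if detail is None:
        # Borderline needs a question and Hold needs an explanation.
        raise ValueError(f"{tier.name} content requires detail text")
    if tier is Tier.BORDERLINE:
        # Placeholder wording; the real copy is not given in this case study.
        return FeedItem(status_text="Waiting on one question.", question=detail)
    return FeedItem(status_text="Held back.", explanation=detail)
```

Because `FeedItem` has no tier attribute, nothing downstream of this function can render a label by accident; tier-based filtering would have to read from the backend record instead.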
Strategic call: Classification is backend architecture, not UI. The parent experiences the agent’s confidence through behavior, not through system labels.
UX craft: Three behaviors replace three labels. Verb-first activity states carry meaning through language alone.
Information architecture: The feed groups by what Nano did, not by what Nano thought. Tier-based filtering moves to backend analytics the parent never sees.
Trade-off accepted: Auditability decreased. A parent can no longer see “Nano classified this with 72% confidence.” That is the right trade for a non-technical audience.
Confidence before / after
Result: Parents stopped arguing with labels because there were no labels to argue with. They reacted to what Nano did, and Nano calibrated from those reactions — better data from shorter conversations.
Action: Built a question card that sits between the story draft and the send button — visible, required, never alarming.
Task: When Nano encounters ambiguous content, she needs the parent to answer a question before sending. Two common patterns fail here. Inline notes at the bottom of a draft get scrolled past, especially on a polished-looking story. Hard-blocking modals with warning UI make the parent think something is wrong with content that isn’t actually problematic.
I designed an inline-but-blocking pattern. The question card sits between the story draft preview and the send button. The send button is visible but inactive until the parent answers. The question is framed warmly and specifically — “Emma mentioned feeling lonely. Want me to flag moments like this for you in the future?” — never as a warning. Once the parent answers, send activates immediately.
One question per story, maximum. Only when Nano’s common sense genuinely detects ambiguity. Never manufactured to simulate calibration.
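The inline-but-blocking rule is small enough to state as code. The sketch below is an assumption about how the gate could be modeled, with hypothetical names (`StoryDraft`, `ask`, `answer_question`, `send_enabled`); it only encodes the two constraints described above: at most one question per story, and send stays inactive until that question is answered.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StoryDraft:
    """Hypothetical model of a draft with an optional blocking question."""
    title: str
    question: Optional[str] = None
    answer: Optional[str] = None

    def ask(self, question: str) -> None:
        """Attach the single question this story is allowed to carry."""
        if self.question is not None:
            raise ValueError("a story carries at most one question")
        self.question = question

    def answer_question(self, answer: str) -> None:
        """Record the parent's answer; send becomes available immediately."""
        if self.question is None:
            raise ValueError("there is no question to answer")
        self.answer = answer

    @property
    def send_enabled(self) -> bool:
        # Send is active when nothing was asked, or once the ask is answered.
        return self.question is None or self.answer is not None
```

A draft with no question is sendable from the start, so confident content never passes through the card at all.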
Strategic call: Guarantee engagement without creating alarm. The question is the last step before value, not a barrier to it.
UX craft: Card sits in the send path, not outside it. Warm, specific framing replaces system-alert language. Send stays disabled until the parent answers, then re-enables immediately.
Information architecture: Maximum one question per story. Only when genuinely ambiguous. The question never appears when Nano is confident or when Nano is blocking.
Trade-off accepted: Some parents will find the question slightly annoying in the first few story cycles. Annoyance is the cost of the calibration signal. It decays rapidly once Nano has learned the family’s preferences.
Questions pattern
Action: Deleted the central Reversibility Log. Distributed undo across five moments the parent encounters where and when each applies.
Task: The trust architecture prescribed reversibility as a dedicated screen — a Reversibility Log listing every recent action with a “revert” button. I built it. It created anxiety instead of dissolving it. A dedicated screen listing everything that might need undoing made the system feel fragile. Parents either ignored the log or got stuck on it.
In V4, reversibility lives in five specific moments, each sitting exactly where the action it reverses happens:
Strategic call: Reversibility embedded in moments is stronger than reversibility surfaced in a log. Users need to undo at the point of doing, not in an audit panel.
UX craft: Five distinct surfaces, each with its own interaction pattern: inline cancel control, post-action banner with timer, persistent global button, per-recipient toggle, story-level feedback capsule.
Information architecture: Deleted the Reversibility Log from the dashboard entirely. Distributed its function across contextual moments.
Trade-off accepted: Reversibility is less discoverable as a single named feature. A parent browsing the app won’t find a “Reversibility Center.” In exchange, the reversibility they actually need is always present at the moment they need it, which is the only kind that matters.
Undo: five embedded moments
Result: Parents who asked for “undo” during research were looking for the ability to stop and redirect future agent behavior, not to audit past behavior. Embedded moments deliver exactly that.
Action: Made the $10 milestone reward restricted to Autopilot runtime only — not a flexible wallet.
Task: The Story Moments milestone awards the parent $10 for completing ten stories with Nano. My first instinct was to make that a flexible wallet the parent could spend however they wanted. It felt generous.
It would have broken the trial mechanic. A flexible wallet means the parent applies the $10 toward a gift they were planning to buy anyway, and Nano never gets the designed moment to demonstrate Autopilot. The milestone stops being a trust-transfer mechanism and becomes a generic discount coupon.
In V4, the $10 is Autopilot Credit — restricted to Autopilot runtime costs (story delivery and Nano-purchased gifts). For anything outside Autopilot — a parent manually ordering a gift in Collaborative mode — the parent pays by credit card at checkout. Two payment surfaces, two clear purposes.
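The two-surface rule can be sketched as a routing function. Everything named here (`Charge`, `PaymentSurface`, `payment_surface`, `spend_credit`, `InsufficientCredit`) is hypothetical and assumes amounts are tracked in cents; the sketch only illustrates that earned credit is reachable from Autopilot-initiated charges and nowhere else. What the product does when credit runs short is a separate design question and is left to the caller here.

```python
from dataclasses import dataclass
from enum import Enum


class PaymentSurface(Enum):
    AUTOPILOT_CREDIT = "autopilot_credit"
    CARD_CHECKOUT = "card_checkout"


class InsufficientCredit(Exception):
    """Raised when an Autopilot charge exceeds the remaining credit."""


@dataclass(frozen=True)
class Charge:
    description: str
    amount_cents: int
    initiated_by_autopilot: bool  # False for anything the parent orders manually


def payment_surface(charge: Charge) -> PaymentSurface:
    """Pick exactly one surface per charge; the two are never mixed."""
    if charge.initiated_by_autopilot:
        return PaymentSurface.AUTOPILOT_CREDIT
    return PaymentSurface.CARD_CHECKOUT


def spend_credit(balance_cents: int, charge: Charge) -> int:
    """Return the new credit balance after an Autopilot charge."""
    if payment_surface(charge) is not PaymentSurface.AUTOPILOT_CREDIT:
        raise ValueError("Autopilot Credit cannot pay for manual purchases")
    if charge.amount_cents > balance_cents:
        raise InsufficientCredit(charge.description)
    return balance_cents - charge.amount_cents
```

For example, a manual gift order routes to card checkout even when the full $10 of credit is unspent, which is the restriction the milestone depends on.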
Strategic call: Earned currency pulls users toward the behavior it was earned in. Purchased currency should be flexible. Conflating them sacrifices strategic intent for the illusion of generosity.
UX craft: Two payment surfaces, never mixed. Autopilot Credit appears where Autopilot is active. Credit-card checkout appears for user-initiated purchases.
Information architecture: Balance displays only on the milestone screen and post-activation dashboard. Not in Settings. Not as a global badge. Where it matters, when it matters.
Trade-off accepted: Some parents will feel the restriction and want to use the $10 flexibly. That’s a 30-second friction in one specific scenario that protects a multi-month retention mechanic.
Trial credit before / after
Result: The milestone pulls parents into Autopilot as designed. The trial currency doesn’t leak out. The user’s own money stays separately usable for whatever they want to buy manually.
Every AI system has failure modes. Most portfolios skip this. I designed for three specific scenarios where Nano either gets something wrong, lacks confidence, or can’t proceed.
Three failure scenarios
What goes wrong: Nano flags emotional content in a story where the child is clearly role-playing, not sharing real distress. The flag is wrong.
Designed recovery:
What goes wrong: Nano asks a contextually appropriate question, but the parent can’t tell what she’s actually asking for.
Designed recovery:
What goes wrong: Nano has scheduled a story but the credit balance hits zero before the send executes.
Designed recovery:
Human-in-the-loop isn’t a banner. It’s what happens when specific things go wrong. These three scenarios demonstrate how the principle holds up under real failure conditions — not as a claim, as a design.
After the delegation flow shipped, I ran a second round of conversations with the same five parents plus two new ones. Four findings changed what shipped.
The milestone CTA caused a re-read. “Let Nano handle it” made parents pause — handle what, exactly? Renamed to “Activate Autopilot.” More mechanical, matches what the button does, parents stopped hesitating.
Nano’s visual pattern was inconsistent across screens. Two different presentations (warm-surface annotation vs. gold illustration tile) got mixed in one review round. Consolidated in the design system with explicit usage rules — annotation, action, and celebration are now distinct patterns.
An intermediate setup screen broke the flow. I’d designed an Autopilot Setup screen between the milestone CTA and the post-activation dashboard. Reviewing the flow, I realized the CTA was a promise — tapping it should activate, not route to a form. Removed the setup screen entirely and moved any residual config into the post-activation dashboard state.
The “free stories” framing was wrong. “Free from what?” asked one parent, correctly identifying that real API costs exist. Changed to “stories are covered by your balance.” Removed “free” from all product surfaces.