Designing Trust into an AI Agent

How an AI agent earns the right to act for a parent and knows when it shouldn’t.

At a glance

What it is. A storytelling app for far-apart, multilingual families. A child’s drawing and voice become a short animated story for grandparents, and over time it helps keep the family’s language and culture alive.
My role. Product strategy, end-to-end UX, and the build, designed and developed with AI assistance. A self-initiated study under Nanosarte.
Scope. This goes deep on one part: the trust architecture. The rest of the app is out of scope here on purpose.
Timeline. 6 weeks
Tools. Stitch for exploration, Figma for design, React for a working prototype, with the design system kept as a structured, machine-readable spec.

Overview

Nanosarte connects far-apart, multilingual families. A child draws and tells a story about it; the app turns that into a short animated story, helps it reach a grandparent in their own language, and over time carries the family’s language and culture forward.

This case study is about one part of it: the trust architecture, how an AI agent earns the right to act on a parent’s behalf. The job the parent is hiring it for: keep my child close to family across distance and language, without making me the full-time middleman, and without ever doing something with my child’s words that I wouldn’t.

RESEARCH

Problem: Drifting Grandparent/Grandchild bond

Parents want their kids creative, grounded in the family’s values, and close to their grandparents, but language and distance thin the bond, and grandparents can’t take on new technology.

By the third generation, native languages are being lost, which makes it hard for grandkids to communicate and connect with their grandparents and their culture. The bond thins from both sides at once, while the child’s attention drifts toward passive screen time and a weekly language class runs at the wrong cadence to fix it.

Prior research

I built on existing research rather than starting over:

12 months of customer email review: Common themes around cultural adaptation, mockup requests, and product-fit concerns
Data Analytics: 79.2% new visitors, 52.9% mobile, 55+ age group growing 10% YoY
Usability testing (6 users) on the pre-AI product
Personas, empathy map, journey map already established

Qualitative research: 10 parent interviews

Informal conversations with 10 parents, 20–30 minutes each, focused on AI agent scenarios. Five themes emerged.

A grandparent call runs only two or three minutes, a few times a week, and stalls on the language gap, so the kids drift to YouTube, Lego videos, and games instead.
On the other end, the grandparents are simply lonely.
The relay exhausts the parent, not the grandparent: “I’m the reason my mom sees my kids at all.”
Reversibility beats preview: every parent leaned in at “pause” and “undo” far more than at review.
Parents don’t want to configure trust; they want to feel their family is cared for.

The opening this points to: give the child their own art and voice as a story to make and send, more engaging than a stalled call, and easy enough that a grandparent needs no new app to receive it.

Secondary Research: NDSU Extension, “Strengthening Grandparent/Grandchild Ties,” Fuller, H. et al. (2022).

DEFINE

Personas: The family and the agent

In an agentic product the cast is two-sided: the people you design for, and the agent you design. Both shaped this.

The people I designed for

The agent I designed: Nano

Before it can earn trust, an agent needs a clear character, so I defined who Nano is, what it does, and the boundaries it holds. In short: a warm, gradual family helper, a friend who gets better over time, never a salesperson and never a system studying you. It reads artwork, drafts and translates stories, spots occasions, and prepares gifts, and it never shows metrics, never acts without earned permission, never shares without confirmation, and always escalates a child’s distress to the parent rather than deciding itself.

DESIGN

Onboarding: Mapping the parent’s mental model

Onboarding V1: Failed

The first onboarding put the three trust modes up front and asked parents to pre-answer hypothetical scenarios so Nano could “calibrate” (“What should I do if your child draws a sad day at school?”).
In usability testing with a focus group of parents, none of it worked.

What failed, specifically

Value calibration felt like programming an AI. Parents said “I don’t know how I’d feel about that” and “it depends on who’s asking.” The hypothetical-scenario approach asked them to declare values rather than reveal them through use.
The governance dashboard was noise. Parents don’t care what Nano’s approval rate is. They care whether the last story was good. Showing metrics framed the relationship as surveillance.
The three-mode explainer was premature. Introducing the trust architecture to a parent who hadn’t yet experienced the product was like listing a friend’s three capability tiers before the friendship started.

Onboarding V2: Success

The create-and-send loop

Onboarding was reduced to three screens (Meet Nano, Your Family, Connect Artwork), ending in under two minutes with Nano’s first prepared story on the dashboard. No value calibration. No trust-mode explainer. No governance dashboard. Nano arrives with common-sense defaults and learns the family’s preferences through contextual questions asked only when content genuinely warrants them.

The trust architecture still exists — it just doesn’t have dedicated screens. Legibility shows up as inline annotations on specific stories. Boundaries show up as the spend limits on the Fund Card. Reversibility shows up as five embedded moments the parent encounters where and when each is relevant. The architecture is the scaffolding. The felt experience is what stays.

Nano has a question

The trust progression: onto the loop, then out of it

The create-and-send loop above is where every family starts, fully in the loop, nothing leaving without them. From there, trust earns two handoffs, and each tells the same story at a higher volume: what the parent does, what they give up, and the safeguard that makes that much autonomy safe. As they hand over more, the safety net grows to match, which is exactly why the trust can be earned.

Delegator (onto the loop)

After enough stories, Nano has learned the family’s taste, so it offers to take the first pass: “I can draft the next one for you, want to try?” Now Nano drafts, auto-sends the clear ones, and brings the parent only the judgment calls to review. This is the handoff that matters most: the first time a story can reach the family without the parent’s eyes on it.

Trade-off: far less to review, but the parent gives up seeing every one.

What makes it safe: Nano can only send what the parent has authorised, a key it can’t forge; the parent can recall a mistake; and a trusted deputy can stand in when they’re away, so no timer ever auto-releases a held story.

Autopilot (out of the loop, by the parent’s choice)

The parent hands over the routine (“just handle Grandma’s monthly gift”); Nano runs it and reports what it sent, what it spent, and how the family reacted.

Trade-off: almost nothing to do, but live visibility is traded for summaries. The parent keeps the wheel: one tap pauses Nano, turns review back on, or pauses it for a single recipient.

What makes it safe: a spend cap the parent sets; recall before it’s opened, remove after, or erase everywhere; and one honest limit named plainly: once a story reaches family, Nano can’t stop a screenshot, so it’s designed to be unlikely, not pretended away.
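The Autopilot safeguards can be read as two small rules: a cap check before any spend, and a retraction menu that depends on whether the story has been opened. The sketch below is illustrative only. The names `withinSpendCap`, `availableRetractions`, and the two delivery states are assumptions made for this write-up, not the prototype’s code, and the cap is treated as a single running total because the period it covers is not specified here.

```typescript
// Hypothetical sketch: all names and states are illustrative assumptions.

// Amounts are whole cents to avoid floating-point rounding.
function withinSpendCap(spentSoFarCents: number, nextCostCents: number, capCents: number): boolean {
  return spentSoFarCents + nextCostCents <= capCents;
}

type DeliveryState = "sent" | "opened";
type Retraction = "recall" | "remove" | "erase-everywhere";

// Recall applies before the story is opened, remove applies after,
// and erase-everywhere is offered in both states.
function availableRetractions(state: DeliveryState): Retraction[] {
  return state === "sent"
    ? ["recall", "erase-everywhere"]
    : ["remove", "erase-everywhere"];
}
```

Neither rule can address a screenshot taken after delivery, which is the limit the design names plainly.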

Throughout, three things never bend: nothing shares unless Nano is sure; a child’s distress goes straight to the parent as its own alert, never handled as a sharing question; and the parent can see, and reset, everything Nano has learned, at any time.
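The progression and its fixed rules can be read as one routing decision per story. The sketch below is a minimal illustration under stated assumptions: the `routeStory` function, the mode names as string literals, and the three boolean flags are hypothetical stand-ins for whatever signals the real agent uses.

```typescript
// Hypothetical sketch: type names, flags, and routes are illustrative assumptions.
type TrustMode = "in-the-loop" | "delegator" | "autopilot";

interface StoryDraft {
  recipientId: string;
  showsDistress: boolean;    // assumed flag: the child's story may signal distress
  nanoIsSure: boolean;       // assumed flag: Nano judges the story clearly fine to share
  parentAuthorised: boolean; // assumed flag: the send falls within what the parent authorised
}

type Route =
  | { kind: "alert-parent" }    // distress is its own alert, never a sharing question
  | { kind: "hold-for-parent" } // waits for the parent or a deputy; no timer releases it
  | { kind: "send" };

function routeStory(mode: TrustMode, draft: StoryDraft): Route {
  // Rule that never bends: distress goes straight to the parent.
  if (draft.showsDistress) return { kind: "alert-parent" };
  // Every family starts here: nothing leaves without the parent.
  if (mode === "in-the-loop") return { kind: "hold-for-parent" };
  // Delegator and Autopilot: only authorised, clear-cut stories go out.
  if (!draft.parentAuthorised || !draft.nanoIsSure) return { kind: "hold-for-parent" };
  return { kind: "send" };
}
```

The distress check comes first, so no mode and no amount of earned autonomy can route around it.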

PROCESS

How I built it, with AI

I designed and built this end to end with AI, and how I worked with it matters as much as what I made. I explored in Stitch, refined in Figma, and built a working React prototype, directing the AI rather than treating it as a black box. To keep it disciplined, I wrote the design system as a structured, machine-readable spec the tools could read, so the build stayed true to my decisions. A decision log recorded every hard call, its tension, and the principle it left behind, and the trust moment alone went through four versions before it was right.

Process deliverables:

Information architecture — mapped the agent delegation model, user roles, and system structure before any screens existed
Product requirements doc — scope, constraints, success criteria, what’s in v1 vs. deferred
Feature prioritization — what lands first, what earns its place later, what gets cut
Design, iterate, document — every decision captured in a 28-entry structured log with intent, trade-off, resolution, and principle

Outcome: success is a family staying close

Success here isn’t stories sent on time. It’s whether a family stays close, and whether a child keeps the language and culture that tie them to where they came from. The product measures it that way: the dashboard tracks the grandparent’s engagement, how many times a story was played and how many were opened, not how many tasks ran. The key learning is the reframe: I caught my own design making the parent feel watched, and fixed it by changing who the agent watches: the work, not the family.

How it’s designed to hold up: The milestone is timed to invite parents into Autopilot when trust is at its peak. Story Cashback stays scoped to stories, so it can’t be spent like cash, while the parent’s own money on the Fund Card stays separate for anything they choose to buy.

Learning

The most important lesson came from catching my own mistake. My first design quietly made the parent the subject — the thing being watched — and testing surfaced the dread before I’d named it. The fix wasn’t a gentler surveillance screen; it was inverting who the agent watches. The check I’ll carry forward: does this design make the user the subject, or the client?
Frameworks are vocabulary, not templates. The trust mechanisms (legibility, boundaries, reversibility) are durable. The surfaces a textbook application produces — dashboards, logs, configuration screens — are not. My job was to keep the mechanisms and rebuild the surfaces.
Embedded is harder than explicit. The conventional build was faster because each trust mechanism got its own screen. The shipped version took longer because every mechanism had to find its moment in a flow that already had other work to do. Embedded UX is the harder discipline, and the one that separates consumer AI products from ones that explain themselves to death.
Users want to stop the system more than they want to audit it. Pause All, per-recipient pause, and the 60-second undo did more for trust than any audit log ever could have.
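The stop controls in the last lesson reduce to two checks: whether a send is currently paused, and whether a sent story is still inside its undo window. This is a minimal sketch, not the shipped logic. `PauseState`, `canSendTo`, and `canUndo` are hypothetical names, and timestamps are assumed to be milliseconds.

```typescript
// Hypothetical sketch: names and data shapes are illustrative assumptions.
const UNDO_WINDOW_MS = 60_000; // the 60-second undo

interface PauseState {
  pauseAll: boolean;             // Pause All
  pausedRecipients: Set<string>; // per-recipient pause
}

// A send is allowed only if neither the global nor the recipient pause is on.
function canSendTo(recipientId: string, pause: PauseState): boolean {
  return !pause.pauseAll && !pause.pausedRecipients.has(recipientId);
}

// Undo is available from the moment of sending until the window closes.
function canUndo(sentAtMs: number, nowMs: number): boolean {
  const elapsed = nowMs - sentAtMs;
  return elapsed >= 0 && elapsed <= UNDO_WINDOW_MS;
}
```

Both checks are cheap enough to run at the moment of action, which keeps them embedded in the flow instead of on a dedicated settings screen.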