AWS QUICK SUITE • Concept 2025
Evaluating model outputs

My Role
UX Lead, Product requirements, User research
Timeline
3-week sprint
The Product
Amazon Quick Flows is an AI-powered automation tool that lets business users build workflows for repetitive tasks using natural language prompts, no technical skills required. It connects data across Amazon Quick Suite and third-party apps like Jira, enables real-time web data retrieval and browser automation, and supports conversational refinement of outputs. The tool can generate images and visualizations using AI models and QuickSight dashboards. Examples include sales teams automating meeting prep and reports, marketers generating social media content, and HR creating job descriptions and onboarding materials.
The Challenge
Vision for the future of Quick Flows
In Q4 2025, our product and engineering teams lacked a unified direction, leading to stakeholder misalignment and stalled velocity. I was tasked with defining a long-term UX vision to move the team from ambiguity to execution. The vision was executed in two phases: the first focused on redesigning the tool to address usability challenges surfaced in previous research studies, and the second on making the experience sticky for both the users and creators of Flows.
Key question: How might we enable Flow creators to define a sense of success or quality?
Solution
Evaluate model-generated outputs with ease using test cases and success criteria
Generate test cases
The goal is to give users an easy way to generate test cases using AI-generated input data. Users start with a simple screen to generate test cases; this is the primary path.
Add your own data
In research, we heard that some users want to add their own data while testing, especially for use cases like financial data.
Edit test cases
Edit the success criteria and inputs for each test case to evaluate the output against.

Test Summary and fixes
Replaced verbose logs with a scannable summary view that highlights critical failures immediately, paired with AI-driven fixes that provide contextual suggestions to resolve errors instantly.
Team Ideation
I was given a set of ambiguous key questions to investigate and explore within the theme. I conducted ideation sessions with the design team and product partners to gather inputs, which I then synthesized into a coherent UX flow.

Iterations
To meet aggressive timelines, I bypassed low-fidelity wireframing in favor of rapid, high-fidelity iteration. This allowed for immediate, high-signal reviews with Engineering and Product stakeholders. To ensure quality wasn't sacrificed for speed, I conducted targeted user research sessions to validate the desirability of the new feature and refine the core concepts early in the cycle.
Iteration 1 (Tested with users)

Empty state communicates a clear Generate intent

Show what has been generated

KPIs, Test Summary, and recommendations
Empty state iterations

User Insights
What went well
1. The AI-generated performance summary provides metrics like accuracy and hallucination, which builds trust
2. Test cases solve a real problem for users today, especially for use cases like data handling, connections, and duplicate prevention.
Opportunities to improve
1. Optimize the AI evaluation loop
Mitigate cognitive overload by reducing signal noise in test cases, providing clearer performance benchmarks, and translating dense logs into actionable insights.
2. 'One-click' error resolution is the primary value proposition
Users viewed AI-assisted fixes as the critical aha moment, turning friction into resolution.
3. AI-generated test data and test cases need flexibility
Allow users to add their own data and test cases as needed.