For 14 days, I let AI agents run the first 90 minutes of my morning. Calendar prep, news brief, email triage, a workout plan tailored to my soreness, even what I should eat. I logged every minute, every API call, and every time something broke.
The outcome surprised me. Not because the agents were magical, but because the failure modes were so specific and so human. Here's the full breakdown - what to copy, what to skip, and what it actually costs.
Key Takeaways
- A 5-agent morning stack saved 38 minutes per day on average across 14 days, but only after a 6-day tuning phase
- Daily token spend stabilized at $1.40 - under $45/month - using a mix of Claude Haiku 4.5 for routine tasks and Sonnet 4.6 for synthesis
- The biggest wins came from agents that summarized existing data; the worst failures came from agents asked to decide without enough context
- Email triage was the highest-value automation; "what should I eat" was the lowest, despite feeling impressive
- An agent that runs 7 days a week without occasional human review will drift - schedule a weekly audit or it gets worse, not better
Short Answer
Did automating my morning routine actually work? Yes, but not the way the influencer videos promise. The agents that read inputs and produced summaries (calendar, news, email) saved real time from day one. The agents that tried to make decisions for me (workouts, meals, priorities) needed constant correction until I narrowed their scope. The honest verdict: AI agents are excellent personal interns and mediocre personal assistants - at least until you spend a week teaching them.
The full prompt set, agent definitions, and cost log from this experiment are open-sourced in the openbooklet/morning-stack repo. Skill packages for each agent are searchable on OpenBooklet.
The Stack: 5 Agents, 1 Orchestrator
I picked 5 high-impact tasks that consumed predictable time every morning. Each became its own agent, and a single orchestrator chained them together.
| Agent | Job | Model | Avg time saved/day |
|---|---|---|---|
| Brief | Pull 8 news sources, summarize what matters in my domain | Haiku 4.5 | 12 min |
| Calendar | Review today's meetings, prep notes per attendee from CRM | Sonnet 4.6 | 9 min |
| Triage | Sort overnight email into respond / archive / ignore | Haiku 4.5 | 11 min |
| Coach | Suggest a workout based on yesterday's strain + sleep | Sonnet 4.6 | 4 min |
| Kitchen | Recommend breakfast given fridge contents + macros | Haiku 4.5 | 2 min |
The orchestrator was a simple workflow that ran at 6:55 AM, fanned out to all 5 agents in parallel, and dropped results into a single markdown file in my Obsidian vault by 7:02.
Why fan-out, not chain
I tried both. Chaining made each agent depend on the previous one's output - clean architecturally, but a single failure killed the whole brief. Fan-out is uglier but resilient. If Coach times out (which happened twice), I still get my email triage and news. The trade-off is worth it for a daily workflow where partial output beats none.
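Stripped to its core, the fan-out looks like this - a minimal sketch rather than my actual script, with two stub agents standing in for the real five:

```python
import asyncio
from datetime import date
from pathlib import Path

# Stand-ins for the real agents - each is an async function returning a markdown section.
async def run_brief() -> str:
    return "- 3 stories matter today ..."

async def run_triage() -> str:
    raise TimeoutError("inbox API hung")  # simulate a bad morning

AGENTS = {"Brief": run_brief, "Triage": run_triage}  # the real stack maps all 5

VAULT = Path.home() / "obsidian" / "morning"

async def run_agent(name: str, fn) -> tuple[str, str]:
    # Cap each agent so one slow call can't hold the whole brief hostage
    try:
        return name, await asyncio.wait_for(fn(), timeout=90)
    except Exception as exc:
        # Degrade gracefully: a failed agent becomes a visible stub, not a crash
        return name, f"_{name} failed this morning: {exc}_"

async def main() -> None:
    # gather() runs every agent concurrently; failures are already absorbed above
    results = await asyncio.gather(*(run_agent(n, f) for n, f in AGENTS.items()))
    VAULT.mkdir(parents=True, exist_ok=True)
    body = "\n\n".join(f"## {name}\n{text}" for name, text in results)
    (VAULT / f"{date.today()}.md").write_text(body)

asyncio.run(main())
```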
"If your agent system can't degrade gracefully, it's not a system - it's a single point of failure with extra steps."
That's a lesson I learned the hard way on day 4, when a malformed JSON response from one agent took down the entire chain and I missed two morning meetings I should've prepped for.
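The guard that would have saved me is a few lines. A minimal sketch, assuming each agent returns a JSON string with a `summary` field - the schema here is illustrative, not a fixed contract:

```python
import json

def safe_section(name: str, raw: str) -> str:
    # Never let one malformed response poison the merged brief
    try:
        payload = json.loads(raw)
        return payload["summary"]
    except (json.JSONDecodeError, KeyError, TypeError):
        # Fall back to a visible stub so the gap is obvious at 7 AM
        return f"_{name} returned malformed output - check the log_"
```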
The Tuning Phase Nobody Talks About
Days 1–6 were rough. The pipeline technically worked, but the output was practically useless: the agents summarized the wrong things, prioritized the wrong emails, and one of them kept suggesting leg day even when my watch reported I'd just done one.
The fix wasn't in the model. It was in context.
```
# Before - generic, brittle
You are a productivity assistant.
Summarize my calendar for today and prep me.
```

```
# After - scoped, opinionated
You are reviewing today's calendar for Priya, who works in growth marketing
at a B2B SaaS company. For each meeting:
- Pull the attendee from the CRM (provided as context)
- Note any open Linear tickets they own
- Flag if this is a first meeting or recurring
- Skip internal standups unless attendee count > 8
Output: one-line briefs only. No filler.
```
The "after" version cut prep output by 60% but increased usefulness by far more. Generic prompts produce generic output. Specific prompts produce useful output. Anthropic's prompt engineering guidance makes this point repeatedly, and it bit me hard until I followed it.
Treat each agent's prompt as a contract you renegotiate weekly. What you wrote in week 1 will be wrong by week 3 - your context, your priorities, and your input data all shift.
What Worked: Summarization Beats Decision-Making
Looking at the 14-day log, the clearest pattern was this: agents that summarized data I already had performed reliably. Agents that tried to decide on my behalf failed often.
- Brief had a 0% override rate by day 14 - by the end I wasn't editing the news summary at all
- Calendar had a 12% override rate - I added context the agent didn't have access to
- Triage had an 18% override rate - but rarely on what to ignore, mostly on tone of replies
- Coach had a 47% override rate - too many factors it couldn't see
- Kitchen had a 71% override rate - basically a glorified random suggester
The pattern holds straight down that list: the further an agent moved from "compress information" toward "make a value judgment," the more its accuracy collapsed. This matches what GitHub's Octoverse 2024 found across coding agents - humans still trust AI more for explanation than for prescription.
What I'd Do Differently
If I were starting fresh today, I'd cut the stack in half.
- Keep Brief, Calendar, and Triage - they pay for themselves daily
- Drop Coach and Kitchen - they were novelty, not utility
- Add a 7th-day audit - every Sunday, review the week's logs and tune the prompts
- Pin a model version - when Anthropic shipped a new Sonnet mid-experiment, two of my agents started behaving differently overnight (see the pinning sketch after this list)
- Don't chain - fan out - fragility hides until you're already running 7 days deep
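On pinning specifically, here's a minimal sketch using the Anthropic Python SDK. The dated model ID below is illustrative - copy the exact snapshot ID from the current model list rather than trusting mine:

```python
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

# Pin the dated snapshot, not a floating alias - aliases can silently move
# under you mid-experiment. This ID is illustrative; use the one you tuned against.
MODEL = "claude-sonnet-4-5-20250929"

def ask(prompt: str) -> str:
    message = client.messages.create(
        model=MODEL,
        max_tokens=1024,
        messages=[{"role": "user", "content": prompt}],
    )
    return message.content[0].text
```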
I'd also build the stack inside a single OpenBooklet skill package instead of scattered scripts. When I tried sharing this with two friends, the fact that it was a hand-rolled mess of cron jobs and Python files killed any chance of them adopting it.
The Real Cost
Here's the 14-day cost breakdown, including the false starts and re-runs.
| Item | Cost |
|---|---|
| Anthropic API (Haiku + Sonnet) | $19.60 |
| OpenAI embeddings (for the brief retrieval) | $1.40 |
| n8n self-hosted (free, but $4 hosting allocation) | $4.00 |
| Total over 14 days | $25.00 |
| Projected monthly (steady state, no tuning) | ~$45 |
For 38 minutes per weekday, that's roughly $0.06 per saved minute. But the comparison isn't really cost-vs-time - it's cost-vs-cognitive-load. The actual win wasn't the minutes saved; it was starting the day with my decisions already prioritized, instead of fighting an inbox.
FAQ
Should I run agents on a cron, or trigger them manually?
It depends on your routine consistency. I run mine on a cron because my mornings start within a 20-minute window. If your schedule varies more, a manual trigger from your phone (an iOS Shortcut or a Telegram bot) is more reliable than a cron that fires when you're not there to consume the output.
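For reference, a weekday 6:55 trigger in crontab looks like this - the paths are illustrative:

```
# m  h  dom mon dow  command
55   6  *   *   1-5  /usr/bin/python3 /home/priya/morning/orchestrator.py >> /home/priya/morning/cron.log 2>&1
```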
What's the smallest version of this someone could try?
Start with one agent: email triage. It's the highest-ROI single automation in this stack. Spend a week tuning it before adding anything else. Most people who try to launch a 5-agent stack from day one give up by day 4 because the maintenance overhead is brutal during the tuning phase.
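If you want a concrete starting point, here's a minimal sketch of that single-agent version with the Anthropic Python SDK. The prompt, buckets, and model ID are illustrative, and it assumes you've already fetched the overnight emails as plain text:

```python
import anthropic

client = anthropic.Anthropic()

TRIAGE_PROMPT = """Sort each email into exactly one bucket: respond / archive / ignore.
respond = needs a reply from me today; archive = keep for reference; ignore = noise.
Return one line per email: <bucket>: <sender> - <subject>."""

def triage(emails: list[str]) -> str:
    # A Haiku-class model is plenty for classification; save the bigger model for synthesis
    message = client.messages.create(
        model="claude-haiku-4-5",  # illustrative - pin the dated snapshot you tuned against
        max_tokens=1024,
        messages=[{"role": "user", "content": TRIAGE_PROMPT + "\n\n" + "\n---\n".join(emails)}],
    )
    return message.content[0].text
```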
Can I do this without coding?
Yes, but with caveats. Tools like Make.com and Zapier now support AI agent steps, and they're sufficient for a 2–3 agent stack. However, once you hit 5+ agents that need shared context (like my CRM integration), you spend more time fighting the visual editor than it saves you.
How do I know when an agent has drifted?
The honest answer: you don't, until something embarrassing happens. The defense is the weekly audit - read 3 random outputs from each agent every Sunday and ask "would I have written this?" If the answer is no, retune the prompt. Drift is not always visible; only deliberate review catches it early.
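The audit itself barely needs tooling. A sketch, assuming each agent appends its daily output to `logs/<agent>/<date>.md` - the layout is illustrative:

```python
import random
from pathlib import Path

LOGS = Path("logs")  # illustrative layout: logs/<agent>/<YYYY-MM-DD>.md

# Surface 3 random outputs per agent for the Sunday read-through
for agent_dir in sorted(p for p in LOGS.iterdir() if p.is_dir()):
    outputs = sorted(agent_dir.glob("*.md"))
    for sample in random.sample(outputs, k=min(3, len(outputs))):
        print(f"--- {agent_dir.name} / {sample.name} ---")
        print(sample.read_text())
```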
Is this worth it for someone with a normal job?
If your morning involves any of: a calendar with 3+ meetings, an inbox with 30+ overnight emails, or content monitoring across multiple sources - yes, the email triage agent alone will pay for the entire stack. If your morning is "coffee, gym, code" with no information work, the answer is no.
Closing Key Takeaways
- Summarization wins, prescription fails - automate "compress this for me" tasks before you automate "decide for me" tasks
- The tuning phase is unavoidable - budget 6+ days before you trust any output
- Fan-out beats chaining for daily resilience - partial output beats none
Further reading: The Solopreneur's AI Stack | 10 AI Agent Skills Every Developer Should Install | Browse the Ideas hub | Explore agent skills on OpenBooklet