
How I Built My AI Stack: 7 Tools That Actually Work Together

10 min read

Building a personal AI stack isn’t about finding the “one perfect tool”—it’s about orchestrating multiple specialized tools that complement each other. After 8 months of experimentation (and plenty of failures), I’ve landed on a 7-tool stack that handles everything from email triage to voice-activated task management. Total cost: under $40/month. Time saved: roughly 12 hours per week.

This article breaks down exactly what I use, why these specific tools, and how you can build your own version.


TL;DR — The Quick Take

My personal AI stack uses Claude as the brain, Whisper for voice transcription, n8n for automation, Qdrant for memory, Ollama for local models, Khal for calendar, and Ntfy for notifications. Total: ~$20-40/month plus one-time infrastructure setup.


The Morning That Changed Everything

Eight months ago, I woke up to 147 unread emails, three calendar conflicts I’d somehow missed, and a sticky note on my monitor that said “REMEMBER: client call at 9.” It was 9:14.

I’d already tried ChatGPT for drafting emails. I had Claude for coding help. Notion for notes. Zapier for automation. Google Calendar obviously. But none of them talked to each other in any meaningful way. I was the glue—manually copying context from one tool to another, re-explaining who I was and what I needed, watching productivity gains evaporate in the overhead of context-switching.

That morning, after apologizing profusely to a very understanding client, I made a decision: I would build an AI stack that actually worked as a system. Not a collection of disconnected tools. A stack.

Eight months later, I have exactly that. My AI stack now:

  • Triages my email and drafts responses in my voice
  • Manages my calendar with actual intelligence about my preferences
  • Transcribes voice notes into actionable tasks
  • Remembers context across conversations—who I talked to, what we discussed, what I care about
  • Runs on my own infrastructure (mostly)

Here’s exactly how I built it.


What I Mean by “Stack”

Before diving into the tools, let me define what a “personal AI stack” actually means—because it’s different from just using AI tools.

A stack has three characteristics:

  1. Integration: Tools pass information to each other without manual intervention
  2. Context: The system maintains memory of who you are and what you’ve done
  3. Orchestration: There’s a central “brain” that coordinates actions across tools

Think of it like the difference between a pile of ingredients and a kitchen. The ingredients are useful, sure. But the kitchen—with its workflow, organization, and your cooking instincts—is what turns ingredients into meals.

Most people have AI ingredients. I wanted a kitchen.


The 7-Tool Stack: Overview

Here’s what I settled on after testing dozens of alternatives:

| Tool | Role | Cost |
|------|------|------|
| Claude (Opus/Sonnet) | Primary reasoning engine | $20/mo |
| Whisper | Voice transcription | Free (self-hosted) |
| n8n | Workflow orchestration | Free (self-hosted) |
| Qdrant | Vector memory/search | Free (self-hosted) |
| Ollama | Local model runner | Free (self-hosted) |
| Khal + vdirsyncer | Calendar management | Free (open source) |
| Ntfy | Push notifications | Free (self-hosted) |

Total: ~$20-40/month depending on API usage, plus one-time infrastructure setup.

Let me break down why each piece exists and how they connect.


Tool #1: Claude — The Brain

What it does: Primary reasoning, writing, coding, and decision-making.

I tried being “model-agnostic” for a while, spreading my usage across ChatGPT, Claude, Gemini, and various open-source models. It was a mistake. Not because the other models are bad—they’re not—but because context fragmentation kills the stack.

Claude became my primary brain for three reasons (see our ChatGPT vs Claude comparison for why I chose Claude):

  1. Consistent personality — After enough interaction, Claude genuinely learns my writing style, preferences, and working patterns. Switching models means losing that continuity.

  2. Tool use reliability — Claude’s function calling (especially with MCP) is remarkably stable. When I need the AI to actually do things—query a database, send a notification, check my calendar—it works.

  3. Long context window — I can dump entire project contexts, conversation histories, and reference docs into a single conversation. That’s critical for the “stack” concept.

How it fits: Claude is the reasoning layer. It receives context from other tools (calendar events, email threads, voice transcriptions) and produces outputs that flow back to those tools. It doesn’t need to do everything—it needs to think about everything.

Config tip: I maintain a SOUL.md file that defines my assistant’s personality, preferences, and constraints. This gets injected into every conversation. The difference between “generic AI responses” and “responses that actually sound like they came from someone who knows me” is entirely in this context engineering.
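The injection step is simple in spirit: read the persona file, prepend it to whatever task-specific context you have, and use the result as the system prompt. A minimal sketch (the `SOUL.md` contents and the helper function here are illustrative, not the author's actual file):

```python
from pathlib import Path

def build_system_prompt(soul_path: str, task_context: str = "") -> str:
    """Prepend the SOUL.md persona file to task-specific context."""
    persona = Path(soul_path).read_text(encoding="utf-8")
    sections = [persona]
    if task_context:
        sections.append("## Current context\n" + task_context)
    return "\n\n".join(sections)

# Throwaway persona file for the demo
Path("SOUL.md").write_text(
    "# Assistant persona\n"
    "- Write in a direct, informal tone\n"
    "- Never schedule calls before 9 AM\n",
    encoding="utf-8",
)

prompt = build_system_prompt("SOUL.md", task_context="Drafting a reply to a client email.")
print(prompt.splitlines()[0])  # prints "# Assistant persona"
```

The resulting string goes into the `system` field of every API call, so each conversation starts already knowing the persona and constraints.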


Tool #2: Whisper — My Voice

What it does: Converts speech to text with surprising accuracy.

This was a game-changer I didn’t expect. I used to think voice interfaces were gimmicky—the stuff of sci-fi movies and frustrated smart speaker interactions. I was wrong.

Here’s the reality: I can speak roughly 150 words per minute. I can type maybe 70. That’s a 2x throughput improvement for capturing thoughts, and thoughts captured are thoughts not lost.

I run Whisper locally using the small model (best balance of speed and accuracy for my English/Dutch mix). Setup was trivial:

pip install openai-whisper
whisper voice_note.m4a --model small --output_format txt

How it fits: Whisper feeds into n8n workflows. I record a voice note on my phone, it syncs to my server, Whisper transcribes it, Claude interprets the intent, and actions happen. For dedicated meeting transcription tools, see our AI meeting assistants guide. “Remind me to email Sarah about the Q3 numbers tomorrow morning” becomes a calendar event and a draft email—without me touching a keyboard.

Real example from last week: I was driving when I remembered a crucial detail about a client project. Voice note → transcription → task created in my inbox → reminder set for when I arrived at my desk. Total effort: 15 seconds of speaking.
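The interpretation step is where Claude earns its keep, but the shape of the pipeline is easy to sketch. Here is a toy stand-in that pattern-matches a transcript into an action type; the real stack sends the transcript to the LLM instead of using regexes, and the `Intent` structure is my illustration, not the stack's actual schema:

```python
import re
from dataclasses import dataclass

@dataclass
class Intent:
    action: str   # "reminder", "task", or "note"
    subject: str

def parse_transcript(text: str) -> Intent:
    """Toy stand-in for the Claude step: classify a Whisper transcript."""
    lowered = text.lower()
    if lowered.startswith("remind me to"):
        return Intent("reminder", text[len("remind me to"):].strip())
    if re.match(r"(add|create) (a )?task", lowered):
        return Intent("task", re.sub(r"^(add|create) (a )?task( to)?", "", text, flags=re.I).strip())
    return Intent("note", text.strip())

intent = parse_transcript("Remind me to email Sarah about the Q3 numbers tomorrow morning")
print(intent.action, "-", intent.subject)
```

Once the intent is structured like this, n8n can route it: reminders become calendar events, tasks land in the inbox, and everything else gets filed as a note.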


Tool #3: n8n — The Nervous System

What it does: Connects everything. Routes data. Triggers actions.

If Claude is the brain, n8n is the nervous system. It’s an open-source workflow automation platform—think Zapier, but self-hosted, more powerful, and free.

My n8n instance runs about 40 active workflows, including:

  • Email triage: New emails get classified by priority, sender relationship, and required action. Low-priority newsletters get archived. High-priority client messages trigger immediate notifications.

  • Morning briefing: Every day at 7 AM, a workflow compiles my calendar, weather, pending tasks, and any flagged emails into a single summary that gets pushed to my phone.

  • Voice note processing: The Whisper → Claude → action pipeline I mentioned earlier.

  • Memory management: Important conversations get summarized and stored in Qdrant for later retrieval.

How it fits: n8n is the orchestration layer. It doesn’t make decisions—it routes information to Claude, which makes decisions, then executes those decisions across other tools. The separation of concerns matters: orchestration logic stays in n8n (where it’s visible and debuggable), reasoning stays in Claude (where it’s flexible and intelligent).

Setup tip: Start with one workflow. Get it working perfectly. Then add another. I tried building my entire system at once initially, and debugging was impossible. Incremental building is everything.
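A good first workflow is a Webhook trigger: anything that can make an HTTP POST can feed the stack. A sketch of the call such a workflow would receive, using n8n's default port (the `/webhook/voice-note` path and payload fields are hypothetical, set by whatever you name the trigger):

```python
import json
from urllib import request

N8N_WEBHOOK = "http://localhost:5678/webhook/voice-note"  # hypothetical endpoint

def make_webhook_call(transcript: str, source: str) -> request.Request:
    """Prepare the POST an n8n Webhook-trigger workflow would receive."""
    body = json.dumps({"transcript": transcript, "source": source}).encode()
    return request.Request(
        N8N_WEBHOOK,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = make_webhook_call("Remind me to email Sarah tomorrow", source="phone")
# request.urlopen(req)  # fire it for real once the workflow is active
print(req.get_method(), req.full_url)
```

Building the request separately from sending it also makes the workflow easy to test before anything downstream exists.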


📬 Building your AI stack? Get weekly tool reviews and productivity tips — subscribe to the newsletter.

Tool #4: Qdrant — Long-Term Memory

What it does: Vector database for semantic search across everything I’ve saved.

Here’s the dirty secret of AI assistants: they don’t actually remember you. They simulate memory by stuffing context into prompts, but when the conversation ends, it’s gone. Tomorrow’s Claude doesn’t remember today’s revelations.

Qdrant fixes that. It’s a vector database—meaning it stores information as embeddings that can be searched semantically rather than just by keywords.

Practical example: Three months ago, I had a conversation about a client’s specific technical requirements. Yesterday, I asked my stack “What did Alex from TechCorp say about their API constraints?” Qdrant found that conversation, Claude read the relevant excerpt, and I had my answer in seconds.

How it fits: Important conversations, decisions, and notes get embedded and stored in Qdrant. When I start a new task, relevant context gets automatically retrieved and injected into Claude’s prompt. It’s like having a research assistant who’s read everything I’ve ever written and can instantly recall the relevant bits.

Storage stats: After 8 months, I have about 2,400 memory chunks stored. Qdrant search returns useful results in under 200ms. The entire database is about 1.2GB.
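The retrieval idea itself is just nearest-neighbor search over embeddings. A stdlib sketch with hand-made 3-dimensional vectors standing in for real embeddings (in the actual stack, an embedding model produces the vectors and Qdrant does the ranked search):

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b)))

# Toy "memory chunks": in the real stack these vectors come from an
# embedding model and live in a Qdrant collection.
memory = [
    ("Alex from TechCorp: API rate limit is 100 req/min", [0.9, 0.1, 0.0]),
    ("Dentist appointment moved to Friday",               [0.0, 0.2, 0.9]),
    ("Jordan prefers calls after 2 PM",                   [0.1, 0.9, 0.1]),
]

def retrieve(query_vec, top_k=1):
    ranked = sorted(memory, key=lambda item: cosine(query_vec, item[1]), reverse=True)
    return [text for text, _ in ranked[:top_k]]

# A query about "TechCorp API constraints" embeds close to the first chunk.
print(retrieve([0.8, 0.2, 0.1]))
```

Semantic search is why "What did Alex say about API constraints?" matches a chunk that never contains the word "constraints": the query and the memory land near each other in embedding space.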


Tool #5: Ollama — The Local Option

What it does: Runs open-source models locally for specific tasks.

Not everything needs Claude’s full reasoning power. Some tasks are better handled by smaller, faster, local models—especially when privacy matters or I want to avoid API costs.

I run Ollama with a few models:

  • Llama 3.2 (3B): Quick classification and simple extraction tasks
  • CodeLlama: Code review and simple refactoring when I’m working offline
  • Mistral (7B): Drafting and summarization when I’m rate-limited

How it fits: n8n routes simple tasks to Ollama and complex tasks to Claude. Email classification? Ollama. Nuanced client response requiring my voice? Claude. This hybrid approach cut my API costs by about 40% without noticeable quality degradation.
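That split can be sketched as a simple routing rule (the model names and task labels below are illustrative; the real routing lives in n8n, not in a Python function):

```python
# Hypothetical routing rule: cheap, bounded tasks go to a local Ollama
# model; anything needing nuance or my writing voice goes to Claude.
LOCAL_TASKS = {"classify_email", "extract_fields", "summarize_newsletter"}

def pick_model(task_type: str, needs_my_voice: bool = False) -> str:
    if needs_my_voice:
        return "claude"
    return "llama3.2:3b" if task_type in LOCAL_TASKS else "claude"

print(pick_model("classify_email"))                     # llama3.2:3b
print(pick_model("draft_reply", needs_my_voice=True))   # claude
```

The point of keeping the rule explicit is cost visibility: you can see exactly which task types burn API tokens and which stay local.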

Reality check: Local models aren’t as good as Claude or GPT-5. Don’t pretend they are. But for specific, bounded tasks, they’re plenty good—and instant, private, and free.


Tool #6: Khal + vdirsyncer — Calendar That Understands

What it does: Calendar management with CalDAV sync.

Yes, I could just use Google Calendar. But I wanted my calendar in my stack—meaning Claude needed to read and write events, n8n needed to trigger on schedule changes, and I needed it all to sync with my actual calendar (which my wife also uses for family coordination).

Khal is a command-line calendar that stores events as plain iCalendar files on disk. Vdirsyncer syncs those files with Google Calendar bidirectionally over CalDAV. The combination gives me:

  • Full programmatic access: Claude can create, modify, and query events
  • Real sync: Changes appear on my phone and my wife’s shared calendar
  • Offline capability: Calendar works without internet (syncs when reconnected)

How it fits: Claude queries Khal before making scheduling decisions. “Schedule a call with Jordan next week” triggers a lookup of my availability, Jordan’s preferred times (stored in Qdrant), and creates an event that syncs everywhere.

Why not Google Calendar API directly? I tried. The OAuth flow is painful, rate limits are aggressive, and Google’s API is frankly overengineered for personal use. CalDAV is simpler.
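The availability lookup behind "schedule a call with Jordan" is a gap search over busy intervals. A sketch of that logic (in the stack, the `busy` list would be parsed from `khal list` output; here it's hard-coded):

```python
from datetime import datetime, timedelta

def first_free_slot(busy, day_start, day_end, duration):
    """Return the start of the first gap of at least `duration`
    between sorted busy intervals, or None if the day is full."""
    cursor = day_start
    for start, end in sorted(busy):
        if start - cursor >= duration:
            return cursor
        cursor = max(cursor, end)
    return cursor if day_end - cursor >= duration else None

day = datetime(2025, 6, 2)
busy = [
    (day.replace(hour=9),  day.replace(hour=10, minute=30)),
    (day.replace(hour=11), day.replace(hour=12)),
]
slot = first_free_slot(busy, day.replace(hour=9), day.replace(hour=17),
                       timedelta(minutes=45))
print(slot.strftime("%H:%M"))  # prints "12:00"
```

Claude layers preferences on top of this (Jordan's preferred times from Qdrant, my no-calls-before-9 rule), but the core query is just this interval arithmetic.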


Tool #7: Ntfy — The Alert System

What it does: Push notifications to my phone and desktop.

The stack needs a way to get my attention. Ntfy is a dead-simple push notification service. Self-hosted or use their free tier—either works.

When my stack needs me to know something:

  • Urgent email from a key client
  • Calendar conflict detected
  • Voice note couldn’t be parsed (needs manual review)
  • Daily briefing ready

…it sends a notification via Ntfy. I have different priority levels: critical (makes noise), high (banner on lock screen), normal (appears in notification center), low (batched into end-of-day summary).

How it fits: n8n sends to Ntfy. That’s it. Simple integration, reliable delivery, no dependencies on proprietary ecosystems.
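Publishing to ntfy is a single HTTP POST: the message is the body, and metadata like title and priority travel as headers (ntfy's priority levels are `max`/`urgent`, `high`, `default`, `low`, `min`). A sketch that builds the request without sending it; the topic name is hypothetical:

```python
from urllib import request

def ntfy_request(topic: str, message: str,
                 priority: str = "default", title: str = "") -> request.Request:
    """Build (but don't send) a publish request for an ntfy server."""
    headers = {"Priority": priority}
    if title:
        headers["Title"] = title
    return request.Request(
        f"https://ntfy.sh/{topic}",   # or your self-hosted instance
        data=message.encode(),
        headers=headers,
        method="POST",
    )

req = ntfy_request("my-stack-alerts", "Calendar conflict detected for Tuesday 10:00",
                   priority="urgent", title="Conflict")
# request.urlopen(req)  # actually deliver it
print(req.full_url)
```

The header-based API is why n8n integration is a one-node affair: any HTTP Request node can publish.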


The Integration: How They Actually Connect

Here’s a real workflow to illustrate how the pieces fit:

Scenario: It’s Monday morning. My stack prepares my weekly briefing.

  1. n8n triggers at 6:45 AM (cron job)
  2. Khal query retrieves this week’s calendar
  3. Qdrant query retrieves pending commitments from past conversations
  4. Email check pulls unread count and flagged messages
  5. Claude receives all this context, generates a natural-language briefing
  6. Ntfy pushes the briefing to my phone
  7. Memory stores the briefing in Qdrant for future reference

Total time from trigger to notification: about 8 seconds. I wake up to a personalized summary that actually knows what I care about.
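The seven steps compose naturally as a pipeline. A sketch with stub functions standing in for each tool call (the real versions query khal, Qdrant, IMAP, Claude, and ntfy; the data here is made up):

```python
# Each stub stands in for one stack component.
def get_calendar():      return ["Mon 10:00 client call", "Wed 14:00 review"]
def get_commitments():   return ["Send Alex the API doc"]
def get_flagged_email(): return ["TechCorp: contract question"]

def compose_briefing():
    context = {
        "calendar": get_calendar(),
        "commitments": get_commitments(),
        "email": get_flagged_email(),
    }
    # Claude would turn this context into natural language;
    # here we just format it deterministically.
    lines = [f"{key}: {'; '.join(items)}" for key, items in context.items()]
    return "\n".join(lines)

briefing = compose_briefing()
print(briefing.splitlines()[0])
```

Because each step is an isolated function (or n8n node), a failure in one source degrades the briefing instead of killing it, which is most of what makes the workflow debuggable.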


What Didn’t Work (The Failures)

The stack you see is version 4. Let me be honest about what I tried and abandoned:

LangChain: Overly complex for my needs. Great for building products; overkill for personal use. I replaced it with direct API calls and n8n.

Notion as the central hub: The API is slow and limited. Moving to plain markdown files with Qdrant indexing was dramatically better.

Voice-first everything: I tried making voice the primary input method. Turns out, some things are just better typed. Now voice is one input among several.

Over-automation: Early versions tried to act without confirmation on too many things. I added a “draft and confirm” pattern for any action with external consequences. Less efficient, but I sleep better.


Building Your Own: Where to Start

If you want to build something similar, here’s my recommended order:

  1. Start with Claude + one integration. Get email triage working before adding more complexity.

  2. Add n8n early. Even if you only have two tools, having an orchestration layer makes the third tool easier to add.

  3. Memory comes later. Qdrant is powerful but not essential for v1. Start with session-based context.

  4. Self-hosting is optional. Most of these tools have cloud-hosted alternatives. Don’t let infrastructure block progress.

  5. Document everything. Your future self needs to debug this. Write down why you made each choice.


The Results

After 8 months with this stack:

  • Email processing time: 2 hours/day → 25 minutes/day
  • Missed calendar conflicts: ~4/month → 0 in last 3 months
  • Ideas captured vs. lost: Immeasurable, but I have 400+ voice notes transcribed
  • Context switching overhead: Significant reduction (hard to quantify, but real)

The stack isn’t perfect. It breaks sometimes. Claude misunderstands context occasionally. Some integrations are fragile. But it’s mine—tuned to my workflows, my preferences, my voice.

That’s the real point. Not using AI tools. Building an AI system that works for how you actually work.


What’s Next

I’m currently experimenting with:

  • MCP (Model Context Protocol) for tighter Claude integration — learn about MCP here
  • Local embedding models to reduce Qdrant query latency
  • Multi-agent coordination for complex tasks

The future isn’t using one AI tool. It’s building your own AI stack.

If you’re not ready to build your own infrastructure, check our guide on how to build your AI tech stack from scratch for simpler options at every budget level. Just getting started? Our beginner’s guide to AI tools covers the fundamentals. For free alternatives that don’t require infrastructure, see our best free AI tools guide.


Want to share your stack setup? Drop a comment—I learn as much from others’ setups as I do from my own experiments.


📬 Get weekly AI tool reviews and comparisons delivered to your inbox: subscribe to the AristoAIStack newsletter.




Last updated: February 2026