Process
1
Architecture
RAG pipeline grounded in real organizational content
- Structured 75 session transcripts (45 minutes each) into a ChromaDB vector store using FastEmbed for embedding.
- Built multi-step prompt chains that draw from verified organizational content — not generic model knowledge.
- Designed intake patterns, response formatting, confidence framing, and edge-case handling for each agent.
2
Infrastructure
Self-hosted on WD PR4100 NAS — no cloud dependency
- Deployed all three systems on a self-hosted NAS with Cloudflare tunnel for remote access.
- Implemented a watchdog process for automatic crash recovery — maintaining consistent availability.
- Investigated and documented the full NAS boot chain to resolve persistent auto-start failures — identified volatile RAM rebuild as root cause and established a reliable restart procedure.
3
Production Bug Fix
Diagnosed and resolved a message fragmentation issue in production
- Identified an asyncio message fragmentation bug causing disconnected multi-part replies in the live agent.
- Implemented a debounce buffer at the code level to restore coherent response behavior.
- Documented the fix and updated the recovery procedures for future incidents.
Outcomes
- Three production AI systems running continuously — FAQ agent handling common inquiries automatically, two RAG coaching bots grounded in real client content.
- Self-hosted infrastructure with no ongoing cloud costs and full control over data.
- Production bug resolved without service interruption — system has maintained reliable operation since.