All work

Process

1

Architecture

RAG pipeline grounded in real organizational content

  • Structured 75 session transcripts (45 minutes each) into a ChromaDB vector store using FastEmbed for embedding.
  • Built multi-step prompt chains that draw from verified organizational content — not generic model knowledge.
  • Designed intake patterns, response formatting, confidence framing, and edge-case handling for each agent.
2

Infrastructure

Self-hosted on WD PR4100 NAS — no cloud dependency

  • Deployed all three systems on a self-hosted NAS with Cloudflare tunnel for remote access.
  • Implemented a watchdog process for automatic crash recovery — maintaining consistent availability.
  • Investigated and documented the full NAS boot chain to resolve persistent auto-start failures — identified volatile RAM rebuild as root cause and established a reliable restart procedure.
3

Production Bug Fix

Diagnosed and resolved a message fragmentation issue in production

  • Identified an asyncio message fragmentation bug causing disconnected multi-part replies in the live agent.
  • Implemented a debounce buffer at the code level to restore coherent response behavior.
  • Documented the fix and updated the recovery procedures for future incidents.

Outcomes

  • Three production AI systems running continuously — FAQ agent handling common inquiries automatically, two RAG coaching bots grounded in real client content.
  • Self-hosted infrastructure with no ongoing cloud costs and full control over data.
  • Production bug resolved without service interruption — system has maintained reliable operation since.