
Bella Chat Service

The Bella Chat Service is the AI orchestration engine for Bella Keys. It routes each user query to the right tool (financial data via MCP, knowledge search via RAG, or a direct LLM response) and streams the result back token by token over SSE.
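The routing step can be pictured as a small dispatch table. The following is only a minimal sketch, not the service's actual code: the tool names `financial_data` and `knowledge_search`, the handler functions, and the shape of the `tool_call` dictionary are all hypothetical, and the handlers return placeholder strings instead of calling MCP or Qdrant.

```python
from typing import Callable, Dict, Optional


# Hypothetical handlers: the real service would call the EMS MCP Server
# and the Qdrant-backed RAG pipeline here.
def query_financial_data(args: dict) -> str:
    return f"[MCP] financial query: {args.get('query', '')}"


def search_knowledge(args: dict) -> str:
    return f"[RAG] knowledge search: {args.get('query', '')}"


# Hypothetical tool registry keyed by the tool name the LLM emits.
TOOLS: Dict[str, Callable[[dict], str]] = {
    "financial_data": query_financial_data,
    "knowledge_search": search_knowledge,
}


def route(tool_call: Optional[dict], direct_answer: str) -> str:
    """Dispatch an LLM tool call if one is present; otherwise return the direct answer."""
    if tool_call is None:
        # No tool requested: the LLM answered directly.
        return direct_answer
    handler = TOOLS.get(tool_call["name"])
    if handler is None:
        raise ValueError(f"unknown tool: {tool_call['name']}")
    return handler(tool_call.get("args", {}))
```

For example, `route({"name": "knowledge_search", "args": {"query": "backup policy"}}, "")` would go to the RAG handler, while `route(None, "Hello!")` returns the direct LLM response unchanged.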


Financial Data Access: Natural language queries against accounts, spending entries, and budgeting periods. Routed to the EMS MCP Server via LLM tool calls.

Semantic Knowledge Search (RAG): Retrieves grounded answers from the personal wiki using Qdrant vector search. Responses include source hyperlinks.

Persistent Conversation Memory: Multi-turn sessions are checkpointed to PostgreSQL via AsyncPostgresSaver. Conversation history survives Electron window restarts.

SSE Streaming: Responses stream token by token as text/event-stream. Event types: thinking, tool_call, tool_result, response, error, done.
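To make the wire format concrete, here is a minimal sketch of how such events could be framed as Server-Sent Events. The `event:` and `data:` fields and the blank-line terminator follow the standard SSE format; the JSON payload shape (`{"token": ...}`) and the function names are assumptions for illustration, not the service's documented schema.

```python
import json
from typing import Iterable, Iterator

# Event types listed in this document.
EVENT_TYPES = {"thinking", "tool_call", "tool_result", "response", "error", "done"}


def format_sse(event: str, payload: dict) -> str:
    """Frame one SSE message: an event line, a data line, and a blank line."""
    if event not in EVENT_TYPES:
        raise ValueError(f"unsupported event type: {event}")
    # json.dumps escapes newlines, so the data field stays on a single line.
    return f"event: {event}\ndata: {json.dumps(payload)}\n\n"


def stream_tokens(tokens: Iterable[str]) -> Iterator[str]:
    """Yield one `response` event per token, then a final `done` event."""
    for token in tokens:
        # Assumed payload shape; the real schema is not given here.
        yield format_sse("response", {"token": token})
    yield format_sse("done", {})
```

A client reading the stream would typically append each `response` token to the visible message and close the connection when it sees `done`.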

Configurable LLM Backend: Supports Ollama (local, default: qwen2.5vl:7b) and Google Gemini, switchable via SYNTHESIS_MODEL_PROVIDER.
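The provider switch might be resolved along these lines. This is a sketch under stated assumptions: the accepted values `ollama` and `gemini`, the fallback to Ollama when the variable is unset, and the `ModelConfig` type are all hypothetical. Only the variable name `SYNTHESIS_MODEL_PROVIDER` and the Ollama default model come from this document, and no Gemini model name is assumed.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

# Assumed spellings of the provider values; the real ones are not documented here.
SUPPORTED_PROVIDERS = ("ollama", "gemini")
OLLAMA_DEFAULT_MODEL = "qwen2.5vl:7b"


@dataclass(frozen=True)
class ModelConfig:
    provider: str
    model: Optional[str]  # None means "must be configured elsewhere"


def resolve_provider(env: Mapping[str, str] = os.environ) -> ModelConfig:
    """Read SYNTHESIS_MODEL_PROVIDER and validate it against the supported backends."""
    # Assumption: Ollama is used when the variable is unset.
    provider = env.get("SYNTHESIS_MODEL_PROVIDER", "ollama").strip().lower()
    if provider not in SUPPORTED_PROVIDERS:
        raise ValueError(f"unsupported provider: {provider!r}")
    model = OLLAMA_DEFAULT_MODEL if provider == "ollama" else None
    return ModelConfig(provider=provider, model=model)
```

Passing the environment in as a mapping keeps the function easy to test: `resolve_provider({"SYNTHESIS_MODEL_PROVIDER": "gemini"})` selects Gemini without touching the real process environment.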

Observability: LangChain traces are instrumented with Arize Phoenix via openinference. Trace data is stored in PostgreSQL.