Knowledge Base

The Knowledge Base is an AI-compiled wiki system. Add documents, URLs, or text, and a dedicated LLM compiles them into structured, cross-linked wiki pages that your agents and MCP clients can search and read.

You can create multiple knowledge bases to organize knowledge by domain, team, or purpose. One is always the default, and agents can be linked to specific knowledge bases for reading and writing.

How It Works

  1. Add sources — upload documents (PDF, DOCX, CSV, text), paste URLs, or let conversations auto-ingest
  2. Automatic compilation — a dedicated LLM reads each source and produces structured wiki pages with cross-references
  3. Agents read the wiki — the knowledge_base built-in tool lets agents search and read pages from their linked knowledge bases
  4. MCP clients access it too — Claude Desktop, ChatGPT, and other MCP clients can search and add to the wiki

Managing Knowledge Bases

Via the BlueNexus App

Navigate to Knowledge in the top menu. You'll see three tabs:

  • Wiki — browse compiled pages, search, view with rendered markdown, edit pages manually
  • Sources — view ingested sources, add new sources (text, URL, or file upload), monitor compilation status
  • Settings — enable/disable the knowledge base system, toggle conversation auto-ingest, configure per-KB settings (max pages, compilation instructions)

You can create additional knowledge bases and switch between them from the Knowledge page.

Via the API

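All pages and sources are scoped to a knowledge base via the :kbId path parameter. The examples below assume $TOKEN holds your API token and $KB_ID holds the ID of the knowledge base you are working with.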

# List all knowledge bases
curl -H "Authorization: Bearer $TOKEN" \
  https://api.bluenexus.ai/api/v1/knowledge/bases

# Create a new knowledge base
curl -X POST -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"name":"Product Docs","description":"Product documentation and FAQs"}' \
  https://api.bluenexus.ai/api/v1/knowledge/bases

# List wiki pages in a knowledge base
curl -H "Authorization: Bearer $TOKEN" \
  "https://api.bluenexus.ai/api/v1/knowledge/bases/$KB_ID/pages"

# Search pages
curl -H "Authorization: Bearer $TOKEN" \
  "https://api.bluenexus.ai/api/v1/knowledge/bases/$KB_ID/pages/search?q=pricing"

# Read a specific page
curl -H "Authorization: Bearer $TOKEN" \
  "https://api.bluenexus.ai/api/v1/knowledge/bases/$KB_ID/pages/pricing-policy"

# Add a text source
curl -X POST -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"type":"text","name":"Product FAQ","rawContent":"# FAQ\n\nQ: What is the pricing?\nA: ..."}' \
  "https://api.bluenexus.ai/api/v1/knowledge/bases/$KB_ID/sources"

# Add a URL source
curl -X POST -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"type":"url","name":"Documentation","url":"https://docs.example.com/guide"}' \
  "https://api.bluenexus.ai/api/v1/knowledge/bases/$KB_ID/sources"
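
The search example above puts the query directly in the URL, which needs URL-encoding once the query contains spaces or special characters. Below is a minimal sketch of a wrapper that lets curl do the encoding through its -G and --data-urlencode options. The function name kb_search and the API_BASE variable are illustrative and not part of the BlueNexus API; $TOKEN is assumed to be set as in the examples above.

```shell
# Hypothetical helper: API_BASE and kb_search are illustrative names.
API_BASE="https://api.bluenexus.ai/api/v1"

kb_search() {
  # $1 = knowledge base ID, $2 = free-text query (may contain spaces)
  # -G sends the --data-urlencode value as a query string on a GET request.
  curl -sS -G \
    -H "Authorization: Bearer $TOKEN" \
    --data-urlencode "q=$2" \
    "$API_BASE/knowledge/bases/$1/pages/search"
}

# Example call (requires a valid $TOKEN and $KB_ID):
# kb_search "$KB_ID" "enterprise pricing tiers"
```

Defining the function makes no network request; only the commented-out call at the bottom would.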

Agent Integration

Enable the knowledge-base built-in tool on your agent. The agent gets access to these actions:

| Action | Description |
| --- | --- |
| search | AI-powered search: reads the wiki index, uses an LLM to find the most relevant pages, and returns their full content. Falls back to keyword search automatically. |
| get_page | Read the full content of a specific page by slug (includes backlinks and relationships) |
| get_related | Find pages related to a given page, optionally filtered by relationship type (e.g., works_at, attended, part_of) |
| list_pages | Browse pages filtered by tag or type |
| get_index | Read the table of contents, which lists all pages with summaries |

Linking knowledge bases to agents

Each agent has two knowledge base settings in its configuration:

  • Linked Knowledge Bases — which knowledge bases the agent can read from. When the agent searches, it searches across all linked KBs. If none are linked, the account default is used.
  • Target Knowledge Base — where conversation auto-ingest writes to. When a conversation with this agent is auto-ingested, the source is created in this KB. Falls back to the account default if not set.
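
The two fallback rules above can be sketched with shell parameter expansion. The function names, the space-separated list format, and the placeholder IDs are assumptions for illustration only; they are not actual configuration keys.

```shell
# Hypothetical sketch of the fallback rules; names and ID formats are assumed.
resolve_read_kbs() {
  # $1 = space-separated linked KB IDs (may be empty), $2 = account default KB ID
  # If no knowledge bases are linked, reads fall back to the account default.
  printf '%s\n' "${1:-$2}"
}

resolve_target_kb() {
  # $1 = agent's target KB ID (may be empty), $2 = account default KB ID
  # Auto-ingest writes to the target KB, or to the default when none is set.
  printf '%s\n' "${1:-$2}"
}

# Example: an agent with no linked KBs reads from the default.
# resolve_read_kbs "" "kb-default"
```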

When the knowledge base tool is enabled, the agent's system prompt automatically includes the table of contents from all linked knowledge bases, so the agent knows what information is available upfront.

Confidence scoring

Each page has a confidence level based on how many sources back it:

  • Low — 1 source
  • Medium — 2 sources
  • High — 3+ sources

Confidence is included in search results and page responses, helping agents gauge reliability.
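
As a rough illustration of the thresholds above, the mapping from source count to confidence level could look like the sketch below. The function name is hypothetical, and the handling of a page with zero sources is an assumption, since that case is not described here.

```shell
confidence_level() {
  # $1 = number of sources backing a page
  if [ "$1" -ge 3 ]; then
    echo "high"
  elif [ "$1" -eq 2 ]; then
    echo "medium"
  elif [ "$1" -eq 1 ]; then
    echo "low"
  else
    # Assumption: fewer than one source is outside the documented range.
    return 1
  fi
}

# Example: confidence_level 2
```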

Typed relationships

Pages are connected by typed relationships like works_at, founded, attended, part_of, and more. Agents can use get_related to traverse the knowledge graph — for example, finding all people who work at a company, or all topics mentioned in a meeting.

MCP Integration

Two MCP tools are available for external clients:

search-knowledge-base (read)

Search and read the compiled wiki from the account's default knowledge base. Requires universal-mcp-read scope.

{
  "name": "search-knowledge-base",
  "arguments": {
    "action": "get_index"
  }
}

Actions: get_index, search (with query param), get_page (with slug param), get_related (with slug and optional relationship_type param).
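
To show how the get_related parameters fit together, the sketch below prints a request payload in the same shape as the get_index example. The function name and the sample slug are hypothetical. It performs no JSON escaping, so it assumes slugs and relationship types contain only simple characters.

```shell
mcp_get_related() {
  # $1 = page slug, $2 = optional relationship type (e.g. works_at)
  if [ -n "${2:-}" ]; then
    printf '{"name":"search-knowledge-base","arguments":{"action":"get_related","slug":"%s","relationship_type":"%s"}}\n' "$1" "$2"
  else
    printf '{"name":"search-knowledge-base","arguments":{"action":"get_related","slug":"%s"}}\n' "$1"
  fi
}

# Example with a hypothetical slug:
# mcp_get_related acme-corp works_at
```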

add-to-knowledge-base (write)

Add documents to the account's default knowledge base. Requires universal-mcp-read-write scope.

{
  "name": "add-to-knowledge-base",
  "arguments": {
    "name": "Meeting Notes — April 15",
    "content": "# Meeting Notes\n\n## Attendees\n- Alice\n- Bob\n\n## Decisions\n..."
  }
}

The tool description instructs LLMs to use it proactively — any content encountered during a conversation that could be useful in the future should be added.

Conversation Auto-Ingest

When enabled in Settings, conversations are automatically captured:

  1. The system monitors all agent conversations
  2. Conversations with 10+ substantive messages are flagged
  3. After 30 minutes of inactivity, the conversation is compiled into the agent's target knowledge base (or the account default)
  4. Active conversations are never compiled — the timer resets on each new message
  5. All content is captured: user messages, assistant responses, tool results (including fetched webpages), and URLs

Enable via Knowledge > Settings > Auto-Ingest Conversations.
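
The eligibility rules above (at least 10 substantive messages, then 30 minutes without activity) can be sketched as a single check. This is an illustration under stated assumptions, not the actual scheduler: timestamps are taken to be epoch seconds, the message count is assumed to be computed elsewhere, and how a message qualifies as substantive is not specified here.

```shell
MIN_MESSAGES=10                # documented threshold for flagging
IDLE_SECONDS=$((30 * 60))      # documented inactivity window

ready_to_compile() {
  # $1 = substantive message count
  # $2 = epoch seconds of the most recent message
  # $3 = current epoch seconds
  # A new message moves $2 forward, which restarts the idle window.
  [ "$1" -ge "$MIN_MESSAGES" ] && [ $(($3 - $2)) -ge "$IDLE_SECONDS" ]
}

# Example: ready_to_compile 12 "$LAST_MESSAGE_AT" "$(date +%s)" && echo "compile"
```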

File Uploads

In Chat

Attach files directly in the chat interface:

  • Click the paperclip icon in the chat composer
  • Drag and drop files onto the chat window
  • Paste images from clipboard

Supported formats: PDF, DOCX, CSV, TXT, MD, PNG, JPG, GIF, WEBP.

Files are stored persistently and their content is extracted for the LLM (text from documents, base64 data URIs for images).

In Knowledge Sources

Upload files directly to the knowledge base via Knowledge > Sources > Add Source > File. Supported: PDF, DOCX, CSV, TXT, MD.

Compilation Details

The compilation step:

  • Receives the source content plus the current wiki index
  • Produces structured pages with [[wiki-link]] cross-references and typed relationships
  • Enforces one concept per page: broad topics are split into focused, granular pages for better retrieval precision
  • Extracts typed relationships (e.g., works_at, founded, attended) from both structured LLM output and deterministic regex patterns
  • Computes confidence scores automatically based on how many sources back each page
  • Rebuilds the index programmatically after each compilation (not dependent on LLM output)
  • Cleans broken links automatically (only valid page references are stored)
  • Uses models from the KNOWLEDGE_BASE_LLM_MODELS environment configuration (falls back to PROVIDER_AGENT_LLM_MODELS)
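
The [[wiki-link]] handling can be illustrated with standard text tools. This sketch extracts link targets and keeps only those that match an existing page; it is not the compiler's actual implementation. The file of known slugs is a hypothetical input, and only plain [[slug]] links without display text are assumed.

```shell
extract_links() {
  # Reads page text on stdin and prints each [[target]] without its brackets,
  # one per line.
  grep -o '\[\[[^]]*]]' | sed 's/^\[\[//; s/]]$//'
}

valid_links() {
  # $1 = file listing existing page slugs, one per line (hypothetical input)
  # Reads candidate targets on stdin and keeps exact matches only.
  grep -Fx -f "$1"
}

# Example:
# extract_links < page.md | valid_links known-slugs.txt
```

Targets that do not appear in the slug file are dropped, which mirrors the rule that only valid page references are stored.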

User Edits

You can manually edit any wiki page. Edited pages are marked as lastCompiledBy: "user" — the compiler won't overwrite your changes during future compilations.

Wiki Health

A daily lint job checks for:

  • Broken cross-references
  • Orphan pages (not linked or indexed)
  • Stale pages (not updated in 90+ days)
  • Duplicate titles
  • Overly broad pages (exceeding 2500 words — flagged for review)
  • Similar pages that should be consolidated (merged via LLM)
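
Two of the checks above have explicit thresholds and are easy to sketch. The functions below are illustrative only: they assume each page is available as a text file and that last-updated times are epoch seconds, neither of which is stated about the real lint job.

```shell
MAX_WORDS=2500    # pages above this word count are flagged as overly broad
STALE_DAYS=90     # pages not updated for this many days are flagged as stale

is_overly_broad() {
  # $1 = path to a page's text (assumed file layout)
  words=$(wc -w < "$1" | tr -d '[:space:]')
  [ "$words" -gt "$MAX_WORDS" ]
}

is_stale() {
  # $1 = last-updated epoch seconds, $2 = current epoch seconds
  [ $((($2 - $1) / 86400)) -ge "$STALE_DAYS" ]
}

# Example: is_overly_broad page.md && echo "flag for review"
```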

Supported File Formats

| Format | Parsing |
| --- | --- |
| PDF | Text extraction via pdf-parse |
| DOCX | Text extraction via mammoth |
| CSV | Direct text read |
| TXT / Markdown | Direct text read |
| Images (PNG, JPG, GIF, WEBP) | Sent to vision-capable LLMs as base64 |
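
The table above amounts to a dispatch on file extension. A minimal sketch follows; the function name and the labels direct-read and vision-base64 are hypothetical names for the handling described, not identifiers from the product. Only the extensions listed in this document are covered.

```shell
parser_for() {
  # $1 = file name; prints the handling described for its extension.
  case "$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')" in
    *.pdf)                      echo "pdf-parse" ;;
    *.docx)                     echo "mammoth" ;;
    *.csv|*.txt|*.md)           echo "direct-read" ;;     # hypothetical label
    *.png|*.jpg|*.gif|*.webp)   echo "vision-base64" ;;   # hypothetical label
    *)                          return 1 ;;               # not a listed format
  esac
}

# Example: parser_for "Quarterly-Report.PDF"
```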