Read-Only RAG · Built for Agenvoy

Drop a Folder.
Get a Searchable Knowledge Base.

KuraDB watches a folder, parses every document, and embeds it automatically — then serves semantic and CJK-aware keyword search over a strictly read-only local API.

curl -fsSL https://kuradb.agenvoy.com/scripts/install.sh | bash
kura
kura add notes
  created inbox ~/Kura_notes -> ~/.config/kuradb/notes/inbox
  drop your PDFs, docs, sheets, or markdown here
 
kura
  watch notes · 14 files · 312 chunks
  embed 312/312 · text-embedding-3-small · dim 512
  serving http://127.0.0.1:48213
 
curl '127.0.0.1:48213/api/semantic?db=notes&q=retry+backoff+strategy'
 
{ "results": [{
  "source": "resilience-patterns.pdf",
  "matches": [{ "chunk": 7, "content": "Exponential backoff with jitter..." }]
}] }
Features

A RAG Backend That Stays Out of Your Way

Watch, parse, embed, and serve — a single Go binary with zero external vector database.

Automatic Ingestion

Drop a file in the watched folder. KuraDB sniffs, parses, chunks, and queues it for embedding — no commands, no schemas.

Semantic Vector Search

Two-stage cosine search: coarse source-level ranking, then a parallel chunk-level scan across CPU cores. Self-implemented, no ToriiDB.

CJK Keyword Search

gse-powered Chinese segmentation with stopwords. Scores rows by keyword hit count for precise, language-aware matching.

🔒

Read-Only Contract

The HTTP API has no mutation endpoints by design. Data enters one way only — through the watcher pipeline. A trust boundary, not a config flag.

SQLite Source of Truth

Every chunk and vector lives in SQLite. The in-memory vector cache is rebuildable — never authoritative, never lost on restart.

Smart Re-embedding

Upsert only invalidates a vector when content actually changed. Identical files keep their embeddings — no wasted OpenAI round-trips.

Query Cache

Query embeddings are cached in a global SQLite store. Repeated searches skip the embedding call and go straight to the vector scan.

Soft-Delete Consistency

Removed files are dismissed, not dropped. Every semantic and keyword query strictly filters dismissed content from results.

🔌

Agenvoy-Native

Binds a random local port and publishes its URL to an endpoint file, so Agenvoy discovers and consumes it as a child process automatically.

Pipeline

One-Way Ingestion, by Design

Data enters through a single path. The database package is the only write entry point — nothing else touches SQLite.

Step 1

Watch

Poll the inbox every 10s. Detect changes via a persisted size + mtime snapshot.

walkFiles
Step 2

Parse

Sniff text, dispatch by type, split into chunks — PDF, Office, sheets, markdown.

parser
Step 3

Embed

Tick every 5s, batch of 64. Embed pending rows, write vectors back to SQLite.

openai
Step 4

Serve

Read-only HTTP over localhost. Two-stage vector + keyword search over fresh rows.

gin
Formats

Bring Any Document

Text sniffing skips binaries and media automatically. Everything else is parsed, chunked, and embedded.

📄

PDF

Full text extraction, page-aware chunking

📝

DOCX · PPTX

Word and PowerPoint document bodies

📊

XLSX · CSV

Tabular sheets flattened to rows

Markdown

Markdown and plain UTF-8 text

REST API

Read-only HTTP for any consumer

Interface

Read-Only API & a Tiny CLI

Four endpoints to query, four subcommands to manage. No mutation surface anywhere.

HTTP Endpoints

All under /api, bound to localhost on a random free port. limit defaults to 10, max 100.

GET/api/health
GET/api/list
GET/api/semantic?db=&q=&limit=
GET/api/keyword?db=&q=&limit=

CLI Commands

Run kura with no args to start the server. Subcommands manage the database registry.

kura · start server kura add <name> kura list kura remove <name> kura edit <old> <new> kura help
Under the Hood

Lean Stack, No Heavy Dependencies

Personal-scale data needs a linear scan and a good index — not a distributed vector cluster.

Go 1.25
Single static binary
SQLite
Source of truth
OpenAI
embedding-3-small · 512
Gin
Read-only HTTP
go-sqlkit
Read/write pools
go-pkg
Parsers · keychain
go-ego/gse
CJK segmentation
Vector
Self-implemented
Get Started

From Folder to Search in Four Steps

Install once, register a database, drop your files, and start querying.

1

Install

Bootstraps Go and builds the kura binary. macOS and Linux.

curl -fsSL https://kuradb.agenvoy.com/scripts/install.sh | bash
2

Add a Database

Creates an inbox and a ~/Kura_notes symlink.

kura add notes
3

Drop Files

Copy documents into the symlinked inbox folder.

cp ~/Documents/*.pdf ~/Kura_notes/
4

Start & Query

Server auto-loads, watches, and embeds.

kura

Turn Your Files Into Knowledge

One binary. One folder. Agent-ready search.