Commit graph

4 commits

Author SHA1 Message Date
lichenblankie
5ded9f1339 added hybrid semantic search with reranking
Implements a three-stage search pipeline:
1. BM25 keyword search via FTS5 with column weights
2. Semantic search via Snowflake arctic-embed-s bi-encoder + HNSW index
3. Optional cross-encoder reranking (on by default, toggleable in settings)

Top 20 results are reranked for precision, next 10 appended from RRF
for coverage, giving 30 total results across 3 pages.

- New embeddings.py with ONNX Runtime inference, text chunking, HNSW
  index management, RRF fusion, and cross-encoder reranking
- Meta description extraction for authentic page snippets with centroid
  extractive fallback
- Stopword filtering in FTS5 queries to avoid overly strict matching
- /reindex page for batch embedding of existing pages
- Semantic embedding of remote pages during subscription sync
- ~125MB dependency footprint (onnxruntime, tokenizers, hnswlib, numpy)
- Models: 34MB bi-encoder + 22MB cross-encoder (downloaded on first use)
2026-06-05 05:29:35 +00:00
lichenblankie
e802ed4fe3 added Dockerfile + compose setup 2026-06-05 05:29:35 +00:00
lichenblankie
104bb7ba2d created themes folder with kodama template
Save the custom kodama template to themes/kodama.html so it's
version-controlled as a file rather than only living in the database.
Stop tracking index.db since it's runtime data, not source code.
2026-06-05 05:29:35 +00:00
lichenblankie
4b4e7e8081 ported everything to Reticulum mesh
Replace HTTP server with Reticulum-native architecture. The server
now speaks only Reticulum, with a client-side gateway providing
browser access by translating HTTP to/from RNS requests.

- Extract db layer (db.py), templates (templates.py), handlers (handlers.py)
- app.py is now the RNS server with persistent identity and destination
- gateway.py bridges HTTP on localhost:8080 to RNS link requests
- Add rns dependency, add .gitignore
2026-06-05 05:29:35 +00:00