- Bulk delete and retag from browse page with checkboxes
- Select all / deselect all toggle
- Delete confirmation shows count of selected pages
- Auto-cleanup orphaned tags on delete, edit, and bulk actions
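The orphaned-tag cleanup can be pictured roughly as follows. This is a minimal sketch, not the project's code: the `pages`, `tags`, and `page_tags` tables and the `cleanup_orphaned_tags()` helper are hypothetical names chosen for illustration.

```python
import sqlite3


def cleanup_orphaned_tags(conn: sqlite3.Connection) -> int:
    """Delete tags that no page references any more; return how many were removed."""
    # NOT EXISTS is used instead of NOT IN so that a NULL tag_id cannot
    # silently disable the cleanup.
    cur = conn.execute(
        "DELETE FROM tags WHERE NOT EXISTS "
        "(SELECT 1 FROM page_tags WHERE page_tags.tag_id = tags.id)"
    )
    conn.commit()
    return cur.rowcount


def demo() -> list:
    """Bulk-delete one page, then show which tags survive the cleanup."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical schema, assumed for this example only.
    conn.executescript(
        """
        CREATE TABLE pages (id INTEGER PRIMARY KEY, url TEXT);
        CREATE TABLE tags (id INTEGER PRIMARY KEY, name TEXT UNIQUE);
        CREATE TABLE page_tags (page_id INTEGER, tag_id INTEGER);
        INSERT INTO pages VALUES (1, 'https://a.example'), (2, 'https://b.example');
        INSERT INTO tags VALUES (1, 'mesh'), (2, 'search');
        INSERT INTO page_tags VALUES (1, 1), (2, 2);
        """
    )
    selected = [2]  # page ids ticked on the browse page
    marks = ",".join("?" for _ in selected)
    conn.execute(f"DELETE FROM page_tags WHERE page_id IN ({marks})", selected)
    conn.execute(f"DELETE FROM pages WHERE id IN ({marks})", selected)
    cleanup_orphaned_tags(conn)
    names = [row[0] for row in conn.execute("SELECT name FROM tags ORDER BY name")]
    conn.close()
    return names
```

Running the same helper after delete, edit, and bulk actions keeps the tag list free of entries that no longer point at any page.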
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Replace Google Fonts with system font stacks across all themes
- Add Referrer-Policy, X-Content-Type-Options, X-Frame-Options, CSP headers
- Add rel="noreferrer noopener" on all outbound links
- Add no-referrer and dns-prefetch-control meta tags to all themes
- Clean tracking params on outbound links from trusted/remote sources
- Remove Google domains from CSP whitelists
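As an illustration of the tracking-parameter cleanup, here is a minimal sketch using only the standard library. The parameter list and the `strip_tracking()` name are assumptions for this example; the set the project actually strips is not stated in this log.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed examples of tracking parameters; the real list may differ.
TRACKING_PREFIXES = ("utm_",)
TRACKING_KEYS = {"fbclid", "gclid"}


def strip_tracking(url: str) -> str:
    """Return url with known tracking query parameters removed."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if not key.lower().startswith(TRACKING_PREFIXES)
        and key.lower() not in TRACKING_KEYS
    ]
    # Rebuild the URL with only the parameters that were kept.
    return urlunsplit(parts._replace(query=urlencode(kept)))
```

For example, `strip_tracking("https://example.org/post?id=7&utm_source=x")` keeps `id=7` and drops the `utm_source` parameter.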
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Set min-height: 100vh on html/body so the cursor-bearing elements
fill the viewport even when content is short.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Adds pagination, meta, and success message styles, plus input
selectors for new form fields (edit page, manual entry, transport node).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add pyinstaller.spec and GitHub/Forgejo CI workflows for cross-platform builds
- Add AGPLv3 license
- Move data storage to ~/.tinyweb/
- Add --version and --port CLI flags
- Add transport node selection in /style (smart regeneration preserves Reticulum config)
- Add "discover more nodes" link to rmap.world
- Add semantic_search setting to toggle AI-powered search on/off
- Skip embedding generation, hybrid search, and model preloading when disabled
- Use site owner's meta description as snippet instead of heuristic extraction
- Remove _generate_summary() and snippet() - no more generated snippets
- Show reranker/reindex controls grayed out when semantic search is off
- AI dependencies (onnxruntime, hnswlib, etc.) are now fully optional
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Case-insensitive meta description extraction (fixes sites like Lemmy
with capitalized "Description" meta name)
- Strip aside and noscript tags for cleaner body text
- Extract paragraph text separately for better sentence quality
- Prefer sentences mentioning the site name, then first quality
paragraph, then title as fallback
- Skip meta descriptions under 20 chars (e.g. just "Lemmy")
- Remove embedding/centroid dependency from summary generation
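The case-insensitive extraction and the 20-character minimum could look something like the sketch below. It uses the standard-library `html.parser`; the class and function names are hypothetical and the project may parse HTML differently.

```python
from html.parser import HTMLParser
from typing import Optional

MIN_DESCRIPTION_CHARS = 20  # threshold taken from the note above


class _MetaDescriptionParser(HTMLParser):
    """Collect the first usable <meta name="description"> content."""

    def __init__(self) -> None:
        super().__init__()
        self.description: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        if tag != "meta" or self.description is not None:
            return
        attr = dict(attrs)
        # HTMLParser lowercases attribute names but not values, so the
        # value of name= has to be lowercased before comparing.
        if (attr.get("name") or "").strip().lower() != "description":
            return
        content = (attr.get("content") or "").strip()
        if len(content) >= MIN_DESCRIPTION_CHARS:
            self.description = content


def extract_meta_description(html: str) -> Optional[str]:
    """Return the meta description, or None if missing or too short."""
    parser = _MetaDescriptionParser()
    parser.feed(html)
    return parser.description
```

A caller would fall back to the sentence and paragraph heuristics listed above whenever this returns `None`.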
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Lemmy and other JS-heavy sites include noscript fallback text like
"Javascript is disabled" that pollutes the stored body text and
generated snippets/summaries.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Previously reindex skipped pages that already had chunks, leaving stale
embeddings in place. It also overwrote good meta description summaries
with auto-generated ones. Now it clears all chunks first so everything
is re-embedded, and only generates summaries for pages missing one.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Implements a three-stage search pipeline:
1. BM25 keyword search via FTS5 with column weights
2. Semantic search via Snowflake arctic-embed-s bi-encoder + HNSW index
3. Optional cross-encoder reranking (on by default, toggleable in settings)
The top 20 results are reranked for precision, and the next 10 are
appended in reciprocal rank fusion (RRF) order for coverage, giving 30
total results across 3 pages.
- New embeddings.py with ONNX Runtime inference, text chunking, HNSW
index management, RRF fusion, and cross-encoder reranking
- Meta description extraction for authentic page snippets with centroid
extractive fallback
- Stopword filtering in FTS5 queries to avoid overly strict matching
- /reindex page for batch embedding of existing pages
- Semantic embedding of remote pages during subscription sync
- ~125MB dependency footprint (onnxruntime, tokenizers, hnswlib, numpy)
- Models: 34MB bi-encoder + 22MB cross-encoder (downloaded on first use)
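The fusion and result-assembly steps described above can be sketched as follows. This is an illustration under assumptions, not the code in embeddings.py: the function names are hypothetical, `k=60` is the constant conventionally used with RRF rather than a value confirmed by this log, and `rerank` stands in for the cross-encoder.

```python
from typing import Callable, Dict, List, Sequence


def rrf_fuse(rankings: Sequence[Sequence[str]], k: int = 60) -> List[str]:
    """Fuse ranked lists with reciprocal rank fusion: score = sum of 1 / (k + rank)."""
    scores: Dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken by id for a stable order.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


def assemble_results(
    fused: List[str],
    rerank: Callable[[List[str]], List[str]],
    rerank_top: int = 20,
    extra: int = 10,
) -> List[str]:
    """Rerank the head of the fused list and append the next slice unchanged."""
    head = rerank(fused[:rerank_top])
    tail = fused[rerank_top:rerank_top + extra]
    return head + tail
```

With the BM25 list `["a", "b", "c"]` and the semantic list `["b", "c", "d"]`, documents found by both retrievers (`b`, `c`) outrank those found by only one.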
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The previous fix only normalized \r\n for comparison but stored the raw
template with browser line endings. Now all \r\n and \r are converted to
\n before both comparing and storing, preventing the bare skeleton from
ever being saved as a custom template.
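A minimal sketch of the normalize-then-compare-and-store flow. The `DEFAULT_TEMPLATE` value and the `template_to_store()` helper are placeholders made up for this example; only the newline handling reflects the fix described above.

```python
from typing import Optional

# Placeholder default; the real DEFAULT_TEMPLATE is not shown in this log.
DEFAULT_TEMPLATE = "<nav>{{ navbar }}</nav>\n<main>{{ content }}</main>\n"


def normalize_newlines(text: str) -> str:
    """Convert \\r\\n and bare \\r line endings to \\n."""
    # Order matters: handle \r\n first so it does not become two newlines.
    return text.replace("\r\n", "\n").replace("\r", "\n")


def template_to_store(submitted: str) -> Optional[str]:
    """Return the normalized template to save, or None if it equals the default."""
    normalized = normalize_newlines(submitted)
    if normalized == normalize_newlines(DEFAULT_TEMPLATE):
        return None  # unchanged default: do not save a custom template
    return normalized
```

Because the same normalized string is used for both the comparison and the stored value, a textarea round trip cannot turn the default into a "custom" template.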
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Browser textarea submissions convert \n to \r\n, causing the template
comparison against DEFAULT_TEMPLATE to always fail. This saved the bare
skeleton as a custom template, overriding the default navbar.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace cramped table layout with card-based design that works
better in narrow viewports and across different themes.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
With share_instance = Yes, announces weren't being sent over TCP
in Docker environments. Setting it to No ensures each TinyWeb
instance manages its own Reticulum interfaces directly.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The announce was firing before the TCP transport connection was fully
established, causing Docker instances to never announce over the mesh.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
New TinyWeb instances now auto-connect to reticulum.derickphan.com:4242
so users get internet mesh connectivity out of the box without any
manual Reticulum configuration. Env var overrides still supported.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replaces static CMD with an entrypoint that generates RNS config from
environment variables (RNS_TCP_HOST/PORT), enabling TCP transport for
environments without LAN auto-discovery (e.g. Docker on macOS).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Better fit for a curated personal search engine: keeps pages
focused and renders faster over low-bandwidth mesh links.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
WAL + pooling:
- Enable WAL journal mode for concurrent read/write support
- Add connection pool (size 4) with return_db() to reuse connections
instead of opening/closing on every request
Pagination:
- Search results, /pages, and /tags/<name> now paginate at 50 per page
- Prev/next navigation links appear when results exceed one page
Delta sync:
- Pages table gains last_modified timestamp, set on insert/update
- /api/sites accepts ?since= param to return only changed pages
- Subscription sync uses last_sync timestamp for incremental fetches
- Remote pages upserted instead of delete-all/re-insert
- Full sync includes all_urls list for detecting remote deletions
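The WAL + pooling part of this change might look roughly like the sketch below. The `ConnectionPool` class and `get_db()` are hypothetical names; only `return_db()`, the pool size of 4, and WAL mode come from the notes above.

```python
import queue
import sqlite3


class ConnectionPool:
    """Tiny SQLite connection pool that hands out WAL-mode connections."""

    def __init__(self, path: str, size: int = 4) -> None:
        self._path = path
        self._idle: "queue.Queue[sqlite3.Connection]" = queue.Queue(maxsize=size)

    def _connect(self) -> sqlite3.Connection:
        # check_same_thread=False lets a pooled connection be reused by
        # another request thread; callers must not share it concurrently.
        conn = sqlite3.connect(self._path, check_same_thread=False)
        conn.execute("PRAGMA journal_mode=WAL")
        return conn

    def get_db(self) -> sqlite3.Connection:
        """Reuse an idle connection if one exists, otherwise open a new one."""
        try:
            return self._idle.get_nowait()
        except queue.Empty:
            return self._connect()

    def return_db(self, conn: sqlite3.Connection) -> None:
        """Give a connection back; close it if the pool is already full."""
        try:
            self._idle.put_nowait(conn)
        except queue.Full:
            conn.close()
```

A request handler would call `get_db()`, run its queries, and call `return_db()` in a `finally` block so the connection goes back to the pool even on error.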
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
clean_url() now canonicalizes URLs: it upgrades http to https, strips
www., removes trailing slashes, drops default ports, and sorts query
params. This prevents the same page from being indexed multiple times
under different URL variations.
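One way to picture these canonicalization rules is the sketch below. It is not the project's clean_url(); the name `clean_url_sketch()` is hypothetical, and edge cases such as userinfo and IPv6 hosts are deliberately ignored.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def clean_url_sketch(url: str) -> str:
    """Canonicalize a URL using the rules listed above (illustrative only)."""
    parts = urlsplit(url.strip())
    # Upgrade http to https; leave any other scheme untouched.
    scheme = "https" if parts.scheme in ("http", "https") else parts.scheme
    host = (parts.hostname or "").lower()
    if host.startswith("www."):
        host = host[len("www."):]
    # Keep only non-default ports.
    if parts.port and parts.port not in (80, 443):
        host = f"{host}:{parts.port}"
    path = parts.path.rstrip("/")
    # Sort query parameters so ?b=2&a=1 and ?a=1&b=2 compare equal.
    query = urlencode(sorted(parse_qsl(parts.query, keep_blank_values=True)))
    return urlunsplit((scheme, host, path, query, parts.fragment))
```

With these rules, `http://www.example.com/` and `https://example.com` map to the same key, so the page is stored once.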
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
lastrowid is not a reliable row ID when ON CONFLICT DO UPDATE fires on
an existing row (it returned 0 here), so links were not cleaned up or
associated correctly on re-index. Now fetches the actual row ID with a
SELECT after the upsert. Also adds try/finally for connection safety.
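The pattern can be sketched as below. The `pages` schema and the `init_db()` / `index_page()` helpers are assumptions made for this example; the point is the SELECT after the upsert and the try/finally around the connection.

```python
import sqlite3


def init_db(db_path: str) -> None:
    """Create the hypothetical pages table used in this example."""
    conn = sqlite3.connect(db_path)
    try:
        conn.execute(
            "CREATE TABLE IF NOT EXISTS pages "
            "(id INTEGER PRIMARY KEY, url TEXT UNIQUE, title TEXT)"
        )
        conn.commit()
    finally:
        conn.close()


def index_page(db_path: str, url: str, title: str) -> int:
    """Insert or update a page and return its real row id."""
    conn = sqlite3.connect(db_path)
    try:
        conn.execute(
            "INSERT INTO pages (url, title) VALUES (?, ?) "
            "ON CONFLICT(url) DO UPDATE SET title = excluded.title",
            (url, title),
        )
        # Do not trust cursor.lastrowid here: look the row up by its key.
        row = conn.execute("SELECT id FROM pages WHERE url = ?", (url,)).fetchone()
        conn.commit()
        return row[0]
    finally:
        # Runs on success and on error, so the connection is never leaked.
        conn.close()
```

Re-indexing the same URL then returns the same id both times, which is what link cleanup needs in order to find the existing rows.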
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>