Commit graph

61 commits

Author SHA1 Message Date
Derick Phan
b86e139bdd
Privacy hardening: degoogle, security headers, referrer protection
- Replace Google Fonts with system font stacks across all themes
- Add Referrer-Policy, X-Content-Type-Options, X-Frame-Options, CSP headers
- Add rel="noreferrer noopener" on all outbound links
- Add no-referrer and dns-prefetch-control meta tags to all themes
- Clean tracking params on outbound links from trusted/remote sources
- Remove Google domains from CSP whitelists

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-08 10:11:57 -07:00
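As a rough illustration of the tracking-parameter cleanup mentioned in b86e139bdd, the sketch below strips a few well-known tracking parameters from an outbound URL. The function name `strip_tracking_params` and the blocklist are assumptions made for this example; the commit message does not say which parameters the project removes.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical blocklist: the commit does not list the parameters it strips.
TRACKING_PREFIXES = ("utm_",)
TRACKING_NAMES = {"fbclid", "gclid"}


def strip_tracking_params(url: str) -> str:
    """Return url with query parameters from the blocklist removed."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if not key.lower().startswith(TRACKING_PREFIXES)
        and key.lower() not in TRACKING_NAMES
    ]
    # urlunsplit omits the "?" entirely when no parameters remain.
    return urlunsplit(parts._replace(query=urlencode(kept)))
```

Parameters that are not on the blocklist keep their original order, so a cleaned link still points at the same resource.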
Derick Phan
23b634d0e0
Fix kodama2 custom cursor disappearing on scroll
Set min-height: 100vh on html/body so the cursor-bearing elements
fill the viewport even when content is short.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-08 09:08:19 -07:00
Derick Phan
aff8c654cc
Add kodama2 theme with styles for new handler features
Adds pagination, meta, and success message styles, plus input
selectors for new form fields (edit page, manual entry, transport node).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-08 09:05:12 -07:00
lichenblankie
c844e2c81e Disabled semantic search and reranker by default
2026-04-08 05:21:08 +00:00
Test User
57a79e5e8e Add PyInstaller builds, AGPLv3 license, transport node selection, and rmap.world link
- Add pyinstaller.spec and GitHub/Forgejo CI workflows for cross-platform builds
- Add AGPLv3 license
- Move data storage to ~/.tinyweb/
- Add --version and --port CLI flags
- Add transport node selection in /style (smart regeneration preserves Reticulum config)
- Add discover more nodes link to rmap.world
2026-04-08 04:36:28 +00:00
696a32cef9 Update add form 2026-03-30 23:14:54 +00:00
Test User
f2f4682fa1 Hide toggle for now 2026-03-30 23:13:00 +00:00
Test User
387714a221 Move extra line break after note for spacing before tags 2026-03-30 23:06:45 +00:00
Test User
da95e580f4 Add extra line break between note and tags for better spacing 2026-03-30 23:06:12 +00:00
Test User
3bebb5734b Remove CSS, use consistent br spacing in add form 2026-03-30 23:04:52 +00:00
Test User
756493e286 Fix toggle gap using CSS margin for consistent spacing 2026-03-30 23:03:57 +00:00
Test User
fb4d4dbaec Fix inconsistent spacing in add form when toggling input type 2026-03-30 23:03:00 +00:00
Test User
ea8f256882 Fix gap in add form between URL and note fields 2026-03-30 23:01:23 +00:00
Test User
395e38d2ab Merge branch 'test-reticulum-hash' of https://git.derickphan.com/lichenblankie/tinyweb into test-reticulum-hash 2026-03-30 22:55:56 +00:00
Test User
7795662154 Add radio toggle for URL vs Reticulum hash input in add page 2026-03-30 22:54:29 +00:00
67fc2f7649 Merge branch 'test-reticulum-hash' of https://git.derickphan.com/lichenblankie/tinyweb into test-reticulum-hash 2026-03-30 22:50:02 +00:00
d6616f69d5 Update handler 2026-03-30 22:49:57 +00:00
Test User
a3429409eb Add dropdown to switch between add site and subscribe in same input box 2026-03-30 22:48:45 +00:00
Test User
80a1d44dee Add reticulum destination hash option to add URL page 2026-03-30 22:36:58 +00:00
blankie
6119ed3aef Updated manual entry 2026-03-28 21:52:53 -07:00
426aa670fa Update manual add 2026-03-29 04:37:31 +00:00
blankie
5593d802b3 Added manual entry 2026-03-28 21:24:10 -07:00
Derick Phan
c959ee98ae
Make semantic search and reranking optional, use site meta descriptions for snippets
- Add semantic_search setting to toggle AI-powered search on/off
- Skip embedding generation, hybrid search, and model preloading when disabled
- Use site owner's meta description as snippet instead of heuristic extraction
- Remove _generate_summary() and snippet() - no more generated snippets
- Show reranker/reindex controls grayed out when semantic search is off
- AI dependencies (onnxruntime, hnswlib, etc.) are now fully optional

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-28 20:58:04 -07:00
Derick Phan
c9a8cba9d1
Improve snippet generation with heuristic extraction instead of AI
- Case-insensitive meta description extraction (fixes sites like Lemmy
  with capitalized "Description" meta name)
- Strip aside and noscript tags for cleaner body text
- Extract paragraph text separately for better sentence quality
- Prefer sentences mentioning the site name, then first quality
  paragraph, then title as fallback
- Skip meta descriptions under 20 chars (e.g. just "Lemmy")
- Remove embedding/centroid dependency from summary generation

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-27 15:44:07 -07:00
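The case-insensitive meta description lookup and the 20-character cutoff described in c9a8cba9d1 could look roughly like the following. This is a sketch built on the standard library `html.parser`; the class and function names are invented for the example, and the project's real parser may work differently.

```python
from html.parser import HTMLParser

MIN_DESCRIPTION_CHARS = 20  # cutoff named in the commit message


class MetaDescriptionParser(HTMLParser):
    """Collects the first usable <meta name="description"> content."""

    def __init__(self):
        super().__init__()
        self.description = None

    def handle_starttag(self, tag, attrs):
        if tag != "meta" or self.description is not None:
            return
        # HTMLParser lowercases attribute names; values keep their case,
        # so the name value is lowercased here before comparing.
        attr = {key: (value or "") for key, value in attrs}
        if attr.get("name", "").lower() == "description":
            content = attr.get("content", "").strip()
            if len(content) >= MIN_DESCRIPTION_CHARS:
                self.description = content


def extract_meta_description(html: str):
    """Return the meta description, or None if absent or too short."""
    parser = MetaDescriptionParser()
    parser.feed(html)
    return parser.description
```

A caller would fall back to paragraph text or the page title when this returns None, in line with the fallback order listed in the commit.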
Derick Phan
570d876b8e
Strip noscript tags when parsing pages to remove JS-disabled messages
Lemmy and other JS-heavy sites include noscript fallback text like
"Javascript is disabled" that pollutes the stored body text and
generated snippets/summaries.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-27 14:18:54 -07:00
Derick Phan
fd20454fa4
Fix reindex to re-embed all pages and preserve existing summaries
Previously reindex skipped pages that already had chunks, leaving stale
embeddings in place. It also overwrote good meta description summaries
with auto-generated ones. Now it clears all chunks first so everything
is re-embedded, and only generates summaries for pages missing one.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-27 14:08:04 -07:00
Derick Phan
299735f816
Add junimo theme and increase browse page size to 50
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-27 10:59:37 -07:00
Derick Phan
395fc17092
Add hybrid semantic search with optional cross-encoder reranking
Implements a three-stage search pipeline:
1. BM25 keyword search via FTS5 with column weights
2. Semantic search via Snowflake arctic-embed-s bi-encoder + HNSW index
3. Optional cross-encoder reranking (on by default, toggleable in settings)

Top 20 results are reranked for precision, next 10 appended from RRF
for coverage, giving 30 total results across 3 pages.

- New embeddings.py with ONNX Runtime inference, text chunking, HNSW
  index management, RRF fusion, and cross-encoder reranking
- Meta description extraction for authentic page snippets with centroid
  extractive fallback
- Stopword filtering in FTS5 queries to avoid overly strict matching
- /reindex page for batch embedding of existing pages
- Semantic embedding of remote pages during subscription sync
- ~125MB dependency footprint (onnxruntime, tokenizers, hnswlib, numpy)
- Models: 34MB bi-encoder + 22MB cross-encoder (downloaded on first use)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-27 03:24:41 -07:00
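To make the result assembly in 395fc17092 concrete, here is a minimal sketch of reciprocal rank fusion (RRF) followed by the "rerank the top 20, append the next 10" step. The function names, the `rerank` callable, and the constant `k=60` (a commonly used RRF default) are assumptions; the commit does not state the values embeddings.py uses.

```python
def rrf_fuse(rankings, k=60):
    """Fuse ranked lists of document ids with reciprocal rank fusion.

    Each document scores sum(1 / (k + rank)) over the lists it appears in.
    k=60 is an assumed default, not a value taken from the commit.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)


def assemble_results(fused, rerank, rerank_top=20, extra=10):
    """Rerank the head of the fused list and append the next few as-is."""
    head = rerank(fused[:rerank_top])
    tail = fused[rerank_top:rerank_top + extra]
    return head + tail
```

For example, fusing a BM25 ranking `["a", "b", "c"]` with a semantic ranking `["b", "c", "d"]` puts `b` first, because it ranks highly in both lists.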
Derick Phan
2df92752b6
Normalize template line endings before storing to fix navbar disappearing
The previous fix only normalized \r\n for comparison but stored the raw
template with browser line endings. Now all \r\n and \r are converted to
\n before both comparing and storing, preventing the bare skeleton from
ever being saved as a custom template.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-26 22:29:45 -07:00
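The fix in 2df92752b6 comes down to normalizing line endings once and using the normalized text both for the comparison and for storage. A minimal sketch follows; `DEFAULT_TEMPLATE` here is a placeholder value, and `template_to_store` with its "None means no custom template" convention is hypothetical.

```python
# Placeholder standing in for the project's real default template.
DEFAULT_TEMPLATE = "<nav>links</nav>\n{{content}}\n"


def normalize_newlines(text: str) -> str:
    """Convert \r\n and bare \r line endings to \n."""
    return text.replace("\r\n", "\n").replace("\r", "\n")


def template_to_store(submitted: str):
    """Return the template to save, or None if it matches the default.

    The same normalized string is used for comparing and for storing,
    so browser-style \r\n endings never reach the database.
    """
    normalized = normalize_newlines(submitted)
    if normalized == normalize_newlines(DEFAULT_TEMPLATE):
        return None
    return normalized
```

Replacing `\r\n` before bare `\r` matters: doing it the other way round would turn every `\r\n` into two newlines.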
Derick Phan
6070e09834
Fix navbar disappearing when saving customize form
Browser textarea submissions convert \n to \r\n, causing the template
comparison against DEFAULT_TEMPLATE to always fail. This saved the bare
skeleton as a custom template, overriding the default navbar.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-26 22:00:27 -07:00
Derick Phan
7c225145f0
Redesign subscriptions page with card layout
Replace cramped table layout with card-based design that works
better in narrow viewports and across different themes.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-26 21:51:58 -07:00
Derick Phan
ffdfb821c8
Set share_instance = No for reliable mesh announces
With share_instance = Yes, announces weren't being sent over TCP
in Docker environments. Setting it to No ensures each TinyWeb
instance manages its own Reticulum interfaces directly.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-26 21:39:37 -07:00
Derick Phan
fe0e15edc4
Add delay before announce to ensure TCP interface is ready
The announce was firing before the TCP transport connection was fully
established, causing Docker instances to never announce over the mesh.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-26 21:32:11 -07:00
Derick Phan
2adef21ec6
Add default internet transport node for zero-config mesh connectivity
New TinyWeb instances now auto-connect to reticulum.derickphan.com:4242
so users get internet mesh connectivity out of the box without any
manual Reticulum configuration. Env var overrides still supported.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-26 20:13:35 -07:00
Derick Phan
b49b1df8b5
Set PYTHONUNBUFFERED=1 in Dockerfile for real-time log output
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-26 19:23:03 -07:00
Derick Phan
5f8863ce77
Add entrypoint script for configurable Reticulum networking in Docker
Replaces static CMD with an entrypoint that generates RNS config from
environment variables (RNS_TCP_HOST/PORT), enabling TCP transport for
environments without LAN auto-discovery (e.g. Docker on macOS).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-26 18:44:26 -07:00
Derick Phan
ddaba43710
Add Dockerfile and Docker Compose for one-command setup
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-26 17:59:08 -07:00
Derick Phan
fede0287e9
Change pagination from 50 to 10 results per page
Better fit for a curated personal search engine — keeps pages
focused and renders faster over low-bandwidth mesh links.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-26 12:03:03 -07:00
Derick Phan
f2e8dd042a
Add WAL mode, connection pooling, pagination, and delta sync
WAL + pooling:
- Enable WAL journal mode for concurrent read/write support
- Add connection pool (size 4) with return_db() to reuse connections
  instead of opening/closing on every request

Pagination:
- Search results, /pages, and /tags/<name> now paginate at 50 per page
- Prev/next navigation links appear when results exceed one page

Delta sync:
- Pages table gains last_modified timestamp, set on insert/update
- /api/sites accepts ?since= param to return only changed pages
- Subscription sync uses last_sync timestamp for incremental fetches
- Remote pages upserted instead of delete-all/re-insert
- Full sync includes all_urls list for detecting remote deletions

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-26 12:00:43 -07:00
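A connection pool with WAL enabled, as described in f2e8dd042a, might be sketched as below. `get_db()` and `return_db()` are named in the project's commit messages and the pool size of 4 comes from this commit, but the `ConnectionPool` class and everything inside it are assumptions for illustration.

```python
import queue
import sqlite3

POOL_SIZE = 4  # pool size named in the commit message


class ConnectionPool:
    """A fixed-size pool of SQLite connections in WAL journal mode."""

    def __init__(self, path, size=POOL_SIZE):
        self._pool = queue.Queue(maxsize=size)
        for _ in range(size):
            # check_same_thread=False lets a pooled connection be handed
            # to whichever thread serves the next request.
            conn = sqlite3.connect(path, check_same_thread=False)
            conn.execute("PRAGMA journal_mode=WAL")
            self._pool.put(conn)

    def get_db(self):
        """Borrow a connection, blocking until one is free."""
        return self._pool.get()

    def return_db(self, conn):
        """Hand a borrowed connection back to the pool."""
        self._pool.put(conn)

    def close_all(self):
        """Close every connection currently sitting in the pool."""
        while not self._pool.empty():
            self._pool.get_nowait().close()
```

Callers would pair `get_db()` with `return_db()` inside try/finally so that a failing handler cannot drain the pool.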
Derick Phan
6981d39ddd
Normalize URLs to prevent duplicate indexing
clean_url() now canonicalizes: http→https, strips www., removes
trailing slashes, drops default ports, and sorts query params.
Prevents the same page from being indexed multiple times under
different URL variations.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-26 11:34:15 -07:00
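The canonicalization rules listed in 6981d39ddd can be sketched with `urllib.parse`. The function name `clean_url` comes from the commit, but the body below is only a guess at one way to apply the five rules; it is not the project's actual code, and it drops any user:password part of the URL as a simplification.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

DEFAULT_PORTS = {"http": 80, "https": 443}


def clean_url(url: str) -> str:
    """Canonicalize a URL so trivial variations map to one string."""
    parts = urlsplit(url.strip())
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]            # strip www.
    if ":" in host:
        host = f"[{host}]"         # restore brackets on IPv6 literals
    port = parts.port
    if port is not None and port != DEFAULT_PORTS.get(scheme):
        host = f"{host}:{port}"    # keep only non-default ports
    if scheme == "http":
        scheme = "https"           # http -> https
    path = parts.path.rstrip("/")  # remove trailing slashes
    pairs = sorted(parse_qsl(parts.query, keep_blank_values=True))
    return urlunsplit((scheme, host, path, urlencode(pairs), parts.fragment))
```

With these rules, `http://www.Example.com:80/path/?b=2&a=1` and `https://example.com/path?a=1&b=2` produce the same key, which is what prevents the duplicate index rows.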
Derick Phan
d2cb0d00bc
Add README with setup, usage, architecture, and security docs
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-26 11:29:14 -07:00
Derick Phan
86e4c1f151
Fix index_url using wrong page_id after upsert
lastrowid returns 0 when ON CONFLICT DO UPDATE fires on an existing
row, causing links to not be cleaned up or associated correctly on
re-index. Now fetches the actual row ID with a SELECT after upsert.
Also adds try/finally for connection safety.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-26 11:24:01 -07:00
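The pattern described in 86e4c1f151, looking up the row id with a SELECT after an upsert instead of trusting `lastrowid`, might look like this. The `pages` table layout and the `upsert_page` helper are assumptions for the example; only the overall approach is taken from the commit message.

```python
import sqlite3


def upsert_page(conn: sqlite3.Connection, url: str, title: str) -> int:
    """Insert or update a page and return its actual row id."""
    conn.execute(
        "INSERT INTO pages (url, title) VALUES (?, ?) "
        "ON CONFLICT(url) DO UPDATE SET title = excluded.title",
        (url, title),
    )
    # Fetch the id explicitly: cursor.lastrowid is not a reliable source
    # for it when the ON CONFLICT branch updates an existing row.
    row = conn.execute("SELECT id FROM pages WHERE url = ?", (url,)).fetchone()
    return row[0]


def index_example() -> tuple:
    """Run two upserts for the same URL and return both ids."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE pages (id INTEGER PRIMARY KEY, url TEXT UNIQUE, title TEXT)"
        )
        first = upsert_page(conn, "https://example.com", "Old title")
        second = upsert_page(conn, "https://example.com", "New title")
        return first, second
    finally:
        conn.close()  # try/finally guarantees the connection is released
```

Both calls return the same id, so link cleanup and association on re-index operate on the existing row.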
Derick Phan
c10aa7955c
Fix SSRF redirect bypass, identity permissions, error leakage, and DB connection leaks
- SSRF: disable automatic redirects, manually follow up to 5 hops with
  IP re-validation at each step to prevent redirect-to-localhost bypass
- Identity file: enforce 0600 permissions on tinyweb_identity at load
  and creation to prevent other users from reading the private key
- Error messages: replace raw exception strings with generic messages
  to avoid leaking internal paths/hostnames to the UI
- DB connections: wrap all get_db() usage in try/finally to guarantee
  close() even when handlers throw mid-operation

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-26 11:18:47 -07:00
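For the identity-file item in c10aa7955c, enforcing 0600 at both load and creation could be done along these lines on a POSIX system. The helper names are hypothetical, and the sketch says nothing about how the project actually reads or writes `tinyweb_identity`.

```python
import os
import stat


def enforce_private_permissions(path: str) -> None:
    """Tighten path to owner read/write only (0600) if it is looser."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    if mode != 0o600:
        os.chmod(path, 0o600)


def create_private_file(path: str, data: bytes) -> None:
    """Write data to path, creating the file with 0600 from the start."""
    # Passing the mode to os.open avoids a window where a newly created
    # file is briefly readable by other users.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "wb") as handle:
        handle.write(data)
    # The mode argument only applies on creation, so fix up files that
    # already existed with wider permissions.
    enforce_private_permissions(path)
```

Calling `enforce_private_permissions()` at load time covers identity files that were created before the fix.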
Derick Phan
d5f2d01651
Harden security: bookmark auth, CSP headers, per-session CSRF, and more
- Bookmark endpoint now requires a secret token (stored in settings)
- Style reset moved from GET to POST with CSRF protection
- Open redirect prevention in _redirect() helper
- Import capped at 100 URLs to prevent abuse
- page_tags cleaned up on delete + PRAGMA foreign_keys enabled
- CSP, X-Frame-Options, X-Content-Type-Options on all responses
- CSRF tokens now per-session via double-submit cookie pattern
- Tag names URL-decoded for special characters
- Gateway forwards cookies in request data

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-26 11:10:37 -07:00
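The double-submit cookie pattern named in d5f2d01651 works by issuing a random token as a cookie, echoing it in each form as a hidden field, and rejecting any POST where the two differ. A minimal sketch of the token handling follows; the function names are made up for the example, and cookie and form plumbing are left out.

```python
import hmac
import secrets


def new_csrf_token() -> str:
    """Generate an unguessable per-session token."""
    return secrets.token_urlsafe(32)


def csrf_valid(cookie_token, form_token) -> bool:
    """Return True only if both tokens are present and identical."""
    if not cookie_token or not form_token:
        return False
    # compare_digest takes time independent of where the inputs differ.
    return hmac.compare_digest(cookie_token.encode(), form_token.encode())
```

A cross-site page can make the browser send the cookie, but it cannot read the cookie's value to fill in the matching form field, which is why the comparison blocks forged requests.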
Derick Phan
9ddecf71db
Add security hardening: CSRF, SSRF, FTS5, and DELETE via POST
- CSRF: Generate random token at startup, include as hidden field in
  all 11 POST forms, validate at top of POST dispatch (returns 403)
- SSRF: Block private/internal IP ranges (127/8, 10/8, 172.16/12,
  192.168/16, 169.254/16, ::1, fc00::/7) by resolving hostname before
  fetch. Remove verify=False from requests.get().
- DELETE: Change /delete/<id> from GET (instant delete) to GET
  (confirmation page) + POST (actual delete) to prevent accidental
  deletion from prefetchers/crawlers.
- FTS5: Wrap search input in double quotes to neutralize FTS5
  operators (AND, OR, NOT, *, column:). Add try/except fallback.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-26 10:54:22 -07:00
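The SSRF check in 9ddecf71db resolves the hostname first and refuses to fetch if any resulting address is in a private or internal range. The sketch below uses the ranges listed in the commit with the standard `ipaddress` module. The function names are assumptions, and the IPv4-mapped IPv6 handling is an extra precaution not mentioned in the commit.

```python
import ipaddress
import socket

# Ranges taken from the commit message.
BLOCKED_NETWORKS = [
    ipaddress.ip_network(net)
    for net in (
        "127.0.0.0/8", "10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16",
        "169.254.0.0/16", "::1/128", "fc00::/7",
    )
]


def is_blocked_ip(ip_text: str) -> bool:
    """Return True if the address falls in a blocked range."""
    ip = ipaddress.ip_address(ip_text)
    # Extra precaution (not in the commit): unwrap ::ffff:a.b.c.d forms.
    if ip.version == 6 and ip.ipv4_mapped is not None:
        ip = ip.ipv4_mapped
    return any(
        ip.version == net.version and ip in net for net in BLOCKED_NETWORKS
    )


def host_is_safe(hostname: str, port: int = 443) -> bool:
    """Resolve hostname and require every address to be outside the blocklist."""
    infos = socket.getaddrinfo(hostname, port, proto=socket.IPPROTO_TCP)
    # Drop any IPv6 scope id ("%eth0") before parsing the address.
    return all(not is_blocked_ip(info[4][0].split("%")[0]) for info in infos)
```

Checking every resolved address, not just the first, matters because a hostname can return both a public and an internal record.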
Derick Phan
9c4ed9ac9e
Add themes folder with kodama template and gitignore index.db
Save the custom kodama template to themes/kodama.html so it's
version-controlled as a file rather than only living in the database.
Stop tracking index.db since it's runtime data, not source code.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-26 10:11:32 -07:00
Derick Phan
862a383101
Add kodama tree spirit overlay and clean up orphaned remote pages
Add animated kodama (tree spirits from Princess Mononoke) to the
custom template as a canvas overlay. Each spirit has unique organic
proportions: rock-like blob head shapes, varied eye spacing/size,
optional mouths and arms, and a soft luminous glow. They fade in/out,
bob gently, and occasionally rattle their heads.

Also removed 3 orphaned remote_pages rows from deleted subscriptions.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-26 10:08:17 -07:00
Derick Phan
b17988fc95
Fix custom template rendering and ensure customize page uses default layout
Add use_default parameter to wrap_page/respond so the customize page
always renders with the default template (preventing a broken custom
template from locking out the editor). Also fix the stored custom
template: add <!DOCTYPE html> to prevent quirks mode and remove
newlines inside CSS cursor data URIs that caused CSS parse errors.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-26 09:45:42 -07:00
Derick Phan
8741c2fffb
Add custom HTML template editor and clean up UI
- Replace CSS-only customization with full HTML template editing
- Users edit the entire page wrapper with {{content}} placeholder
- Add /style?reset escape hatch to recover from broken templates
- Move nav links to template, remove redundant nav from search page
- Delete remote pages when unsubscribing from an instance

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-26 09:04:23 -07:00
Derick Phan
4df0ef03f5
Add CLAUDE.md with project architecture and conventions
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-26 08:17:38 -07:00