Derick Phan 8dffd8ccea
Add data-loss guards and first-run empty state
- Bulk delete now routes through a server-rendered confirmation page
  listing the selected titles; a `confirmed=1` form field is required
  before pages are actually deleted. Mirrors the single-delete flow.
- Reset-template button gains a JS confirm() so stray clicks don't wipe
  the custom template.
- Homepage shows a short, neutral empty-state block when the index has
  zero pages and no query — just names what tinyweb is and links to
  /add, /style, and /subscriptions as equal options.
- /about gains a "your data" section explaining what lives in
  ~/.tinyweb/ (identity file, index.db), what losing each costs, and
  how /export differs from a full backup.
- README gains a "Backups" subsection mirroring the /about copy.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 09:38:07 -07:00
.forgejo/workflows Fixed workflow build 2026-04-11 07:08:41 +00:00
themes Privacy hardening: degoogle, security headers, referrer protection 2026-04-08 10:11:57 -07:00
.dockerignore Harden network and privacy defaults; fix several bugs 2026-04-23 15:37:45 -07:00
.gitignore Add hybrid semantic search with optional cross-encoder reranking 2026-03-27 03:24:41 -07:00
app.py Harden network and privacy defaults; fix several bugs 2026-04-23 15:37:45 -07:00
CLAUDE.md Add CLAUDE.md with project architecture and conventions 2026-03-26 08:17:38 -07:00
db.py Harden network and privacy defaults; fix several bugs 2026-04-23 15:37:45 -07:00
docker-compose.yml Add entrypoint script for configurable Reticulum networking in Docker 2026-03-26 18:44:26 -07:00
Dockerfile Fixed workflow build 2026-04-11 07:12:26 +00:00
embeddings.py Harden network and privacy defaults; fix several bugs 2026-04-23 15:37:45 -07:00
entrypoint.sh Harden network and privacy defaults; fix several bugs 2026-04-23 15:37:45 -07:00
gateway.py Harden network and privacy defaults; fix several bugs 2026-04-23 15:37:45 -07:00
handlers.py Add data-loss guards and first-run empty state 2026-04-24 09:38:07 -07:00
LICENSE Add PyInstaller builds, AGPLv3 license, transport node selection, and rmap.world link 2026-04-08 04:36:28 +00:00
pyinstaller.spec Add PyInstaller builds, AGPLv3 license, transport node selection, and rmap.world link 2026-04-08 04:36:28 +00:00
README.md Add data-loss guards and first-run empty state 2026-04-24 09:38:07 -07:00
requirements.txt Add hybrid semantic search with optional cross-encoder reranking 2026-03-27 03:24:41 -07:00
rns_client.py Add LoRa support with background sync and settings UI 2026-04-22 08:47:09 -07:00
templates.py Privacy hardening: degoogle, security headers, referrer protection 2026-04-08 10:11:57 -07:00

TinyWeb

A personal, decentralized search engine built on the Reticulum mesh network. Curate your own index of web pages, search it locally, and share collections with friends over an encrypted mesh. No algorithms, no ads, no tracking.

Features

  • Personal search index — Save pages you find valuable, search them with full-text search (SQLite FTS5)
  • Tagging — Organize saved pages with comma-separated tags
  • Bookmarklet — One-click indexing from any browser tab
  • Subscriptions — Subscribe to friends' TinyWeb instances over Reticulum and search their indexes alongside yours
  • Custom templates — Full HTML/CSS/JS template editor to personalize your instance
  • Import/export — JSON-based backup and restore
  • Mesh-native — Works over Reticulum without the internet; encrypted and decentralized by default

Performance & Scale

Search Speed

Pages indexed   Search speed   Notes
1,000           ~50ms          Fast local FTS5
10,000          ~50-100ms      Full-text search
100,000         ~100-200ms     Combined BM25 + semantic
500,000         ~200-400ms     With semantic enabled
1,000,000       ~300-500ms     Hybrid search

Times are estimates for combined BM25 + semantic search. Actual performance varies by hardware, storage type (SSD/HDD), and search complexity.

Concurrent Connections

  • Database pool: 16 simultaneous connections
  • Suitable for a single user plus a few subscriptions

Export

  • Paginated at 10,000 pages per request
  • Use ?batch=N to export in chunks: /export?batch=0, /export?batch=1, etc.
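
The batch parameter lends itself to a simple loop. The sketch below is a minimal illustration, not TinyWeb's own client. It assumes the instance is reachable at http://127.0.0.1:8080, that each batch comes back as a JSON array of pages, and that an empty array marks the end of the export. This README confirms none of those response details, so inspect a real /export response before relying on it. The names batch_url, fetch_batch, export_all, and max_batches are hypothetical.

```python
import json
from urllib.request import urlopen

BASE_URL = "http://127.0.0.1:8080"  # default local address; adjust if you changed the port


def batch_url(batch, base_url=BASE_URL):
    """Build the export URL for one batch, e.g. /export?batch=0."""
    return f"{base_url}/export?batch={batch}"


def fetch_batch(batch):
    """Download one batch. ASSUMPTION: the body is a JSON array of pages."""
    with urlopen(batch_url(batch)) as response:
        return json.load(response)


def export_all(fetch=fetch_batch, max_batches=1000):
    """Collect batches 0, 1, 2, ... until one comes back empty.

    ASSUMPTION: an empty batch signals the end of the export.
    max_batches is a hypothetical safety limit, not a TinyWeb setting.
    """
    pages = []
    for batch in range(max_batches):
        chunk = fetch(batch)
        if not chunk:
            break
        pages.extend(chunk)
    return pages
```

Passing the fetch function as a parameter keeps the loop testable without a running instance: any callable that maps a batch number to a list can stand in for the HTTP call.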

Download (pre-built binaries)

Download the latest release for your platform from the Releases page:

Platform   File
Windows    TinyWeb-windows-x64.exe
macOS      TinyWeb-macos-arm64
Linux      TinyWeb-linux-x64

Run the downloaded file — no installation required.

Docker

Pull and run TinyWeb from the container registry:

docker run -p 8080:8080 registry.derickphan.com/tinyweb:latest

Or with a specific version:

docker run -p 8080:8080 registry.derickphan.com/tinyweb:v0.1.0

Docker Compose

services:
  tinyweb:
    image: registry.derickphan.com/tinyweb:latest
    ports:
      - "8080:8080"
    volumes:
      - tinyweb-data:/data

volumes:
  tinyweb-data:

Run with docker compose up -d.

Storage Estimates

These estimates assume an average of ~15KB of stored content per page:

Pages       Database   Embeddings*   Total
10,000      150MB      80MB          ~250MB
100,000     1.5GB      800MB         ~2.5GB
500,000     7.5GB      4GB           ~12GB
1,000,000   15GB       8GB           ~25GB

*Embeddings are stored only when semantic search is enabled. Enabling compression (Settings > Search > AI) reduces embedding storage by ~50%.
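
The figures in the table above follow from simple per-page arithmetic. The sketch below reproduces them under two assumptions: the ~15KB per page stated above, and roughly 8KB of embeddings per page, which is not stated directly but is implied by the table (80MB for 10,000 pages). The function name estimate_storage_mb is hypothetical, and decimal units (1MB = 1000KB) are used to match the table.

```python
CONTENT_KB_PER_PAGE = 15    # stated average page size
EMBEDDING_KB_PER_PAGE = 8   # ASSUMPTION: implied by the table (80MB / 10,000 pages)


def estimate_storage_mb(pages, semantic=True, compressed=False):
    """Return (database_mb, embeddings_mb, total_mb) using decimal units."""
    database_mb = pages * CONTENT_KB_PER_PAGE / 1000
    embeddings_mb = pages * EMBEDDING_KB_PER_PAGE / 1000 if semantic else 0.0
    if compressed:
        embeddings_mb *= 0.5  # the ~50% saving quoted for compressed embeddings
    return database_mb, embeddings_mb, database_mb + embeddings_mb
```

For 10,000 pages this gives 150MB of database and 80MB of embeddings, a raw sum of 230MB. The Total column in the table is somewhat higher than the raw sum.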

Data storage

Local (Python/binary)

Your data is stored in ~/.tinyweb/:

File               Description
index.db           SQLite database with your indexed pages
tinyweb_identity   Your Reticulum identity (keep safe!)
models/            Downloaded AI models for semantic search
index.hnsw         Semantic search index

Keeping data in ~/.tinyweb/ lets it persist across upgrades and stay separate from the application.

Backups

Back up the whole ~/.tinyweb/ directory periodically. The two files that matter:

  • tinyweb_identity is your permanent mesh identity. If you lose it, your destination hash changes and every subscriber has to re-subscribe to the new one. Keep it somewhere you trust; the file is 0600 by default.
  • index.db is your full reading history — every page, note, tag, and synced remote page. Losing it loses everything you've curated.

models/ and index.hnsw are re-derivable (the model will re-download, and the HNSW index rebuilds from the database on next startup with semantic search enabled) so they don't need to be backed up.

The /export page produces a JSON dump of your pages. It's a migration aid — it doesn't preserve your identity file, your custom template, or subscription state. A full restore needs a copy of ~/.tinyweb/.
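
A backup of the two irreplaceable files can be scripted. The following is a minimal sketch, not a tool that ships with TinyWeb: the function name backup_tinyweb and the destination directory are hypothetical. It copies the identity file as-is and snapshots index.db through SQLite's online backup API (exposed in Python as sqlite3.Connection.backup), which gives a consistent copy even if the database is in use.

```python
import shutil
import sqlite3
from pathlib import Path


def backup_tinyweb(data_dir, dest_dir):
    """Copy the two irreplaceable files from data_dir into dest_dir."""
    data_dir, dest_dir = Path(data_dir), Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)

    # copy2 keeps the file's permission bits (0600 by default, per the text above).
    shutil.copy2(data_dir / "tinyweb_identity", dest_dir / "tinyweb_identity")

    # Use SQLite's online backup API rather than a raw file copy, so the
    # snapshot stays consistent even if the database is being written to.
    source = sqlite3.connect(data_dir / "index.db")
    target = sqlite3.connect(dest_dir / "index.db")
    try:
        source.backup(target)
    finally:
        target.close()
        source.close()
    return dest_dir
```

A call such as backup_tinyweb(Path.home() / ".tinyweb", "/path/to/backup") would cover the local install described above. This sketch does not copy models/ or index.hnsw, which are re-derivable, and it does not cover anything else that may live in ~/.tinyweb/.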

Docker

Data is stored in the /data volume inside the container. Use a volume mount to persist data:

docker run -p 8080:8080 -v tinyweb-data:/data registry.derickphan.com/tinyweb:latest

Or use Docker Compose (see above); data persists in the named volume.

Command line options

./TinyWeb --version          # Show version
./TinyWeb -p 9000            # Use port 9000 instead of default 8080
./TinyWeb --bind 0.0.0.0     # Expose the web UI to your LAN (see warning below)

By default, the web UI binds to 127.0.0.1 and is only reachable from the machine running TinyWeb. The UI has no authentication — anyone who can reach the port can read, add, and delete entries, and change settings. Only pass --bind 0.0.0.0 if you fully trust your network, or put TinyWeb behind an authenticating reverse proxy.
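
The distinction between a local-only and an exposed bind address can be checked programmatically. The sketch below is illustrative only and is not taken from TinyWeb's source. The helper name is_loopback_bind is hypothetical, and it shows how a wrapper script might decide whether to print a warning before launching.

```python
import ipaddress


def is_loopback_bind(host):
    """Return True if host is a loopback address such as 127.0.0.1 or ::1."""
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Not an IP literal (e.g. a hostname); treat it as exposed to be safe.
        return False
```

With this check, 127.0.0.1 counts as local, while 0.0.0.0 and any LAN address count as exposed and should only be used behind authentication or on a trusted network.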

Getting started

pip install -r requirements.txt
python app.py

This starts the Reticulum server and an HTTP gateway on http://127.0.0.1:8080. Open it in your browser. The UI is localhost-only by default; see --bind under Command line options if you want to reach it from another machine.

Your destination hash is printed on startup — share it with friends so they can subscribe to your index.

Remote gateway

To browse a remote TinyWeb instance without running your own index:

python gateway.py <destination_hash>

This connects over Reticulum and serves the remote instance at http://localhost:8080.

How it works

  1. Save pages — Use the /add form or the bookmarklet (found on /style) to index any URL
  2. Search — Full-text search across your saved pages, linked pages from trusted sites, and synced subscriptions
  3. Subscribe — Add a friend's destination hash on /subscriptions to sync their shared index
  4. Customize — Edit your site name, HTML template, and sharing settings on /style

Project structure

app.py          — Entry point: boots Reticulum, starts HTTP gateway
gateway.py      — HTTP-to-RNS bridge (local or remote dispatch)
handlers.py     — Route dispatcher and all request handlers
db.py           — SQLite database, FTS5, URL fetching, SSRF protection
templates.py    — HTML template rendering and escaping
rns_client.py   — Reticulum client for fetching remote site lists
themes/         — Saved HTML templates (e.g. kodama.html)

Security

The web UI has no authentication. It is bound to 127.0.0.1 by default, so only processes on the local machine can reach it. If you pass --bind 0.0.0.0 (or run inside a container with a published port), anyone who can reach that address can fully control your instance — reading private entries, changing settings, and modifying the HTML template (which runs in your browser). Put TinyWeb behind a reverse proxy with auth before exposing it beyond localhost.

Other hardening measures:

  • CSRF protection — All POST forms use per-session tokens via double-submit cookies
  • SSRF prevention — URL fetching validates hostnames against private IP ranges, with redirect re-validation
  • FTS5 injection prevention — Search queries are sanitized before passing to SQLite MATCH
  • Content Security Policy — CSP headers on all HTML responses restrict script/style/frame sources
  • XSS escaping — All user-supplied content is HTML-escaped before rendering
  • Bookmark authentication — The bookmarklet endpoint requires a secret token
  • Identity file protection — The Reticulum identity key is restricted to owner-only permissions (0600)
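
To illustrate the SSRF prevention item above, here is a minimal sketch of validating a hostname against private IP ranges. It is not the code in db.py. The function names is_public_address and host_is_safe are hypothetical, and the exact set of rejected ranges is an assumption. As the list notes, a real fetcher would need to repeat this check for every redirect target.

```python
import ipaddress
import socket


def is_public_address(ip_text):
    """Return True only for addresses that are routable on the public internet."""
    ip = ipaddress.ip_address(ip_text)
    return not (
        ip.is_private or ip.is_loopback or ip.is_link_local
        or ip.is_multicast or ip.is_reserved or ip.is_unspecified
    )


def host_is_safe(hostname):
    """Resolve hostname and reject it if any resolved address is non-public."""
    try:
        infos = socket.getaddrinfo(hostname, None)
    except socket.gaierror:
        return False  # unresolvable hosts are rejected
    # Drop any IPv6 zone suffix ("%eth0") before parsing the address.
    addresses = {info[4][0].split("%")[0] for info in infos}
    return bool(addresses) and all(is_public_address(a) for a in addresses)
```

Rejecting a host when any of its addresses is non-public, rather than accepting it when any is public, is the conservative choice: it prevents a hostname with mixed records from reaching an internal service.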

Maintenance

Database Vacuum

Over time, deleted pages leave empty space in the database. Run the vacuum tool periodically to reclaim space:

  1. Go to /style in your browser
  2. Click "vacuum database" at the bottom of the page
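
For readers curious about what reclaiming space involves, the sketch below runs SQLite's VACUUM command directly on a database file. It assumes the "vacuum database" button does something equivalent, which this README does not confirm. The function name vacuum_database is hypothetical. Prefer the button for normal use, and avoid running this on a live index.db while TinyWeb is writing to it.

```python
import os
import sqlite3


def vacuum_database(db_path):
    """Run VACUUM on db_path and return (size_before, size_after) in bytes."""
    size_before = os.path.getsize(db_path)
    conn = sqlite3.connect(db_path)
    try:
        # VACUUM rebuilds the file and cannot run inside an open transaction,
        # so it is issued on a fresh connection with nothing pending.
        conn.execute("VACUUM")
    finally:
        conn.close()
    return size_before, os.path.getsize(db_path)
```

The returned sizes show how much space was reclaimed. After many deletions the second value should be noticeably smaller than the first.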

Optional Compression

To reduce storage for semantic search embeddings (~50% savings):

  1. Go to /style > Search > AI
  2. Enable "compress embeddings"
  3. Re-index your existing pages so that compression is also applied to embeddings that were already stored

Dependencies

Philosophy

TinyWeb is built for the slow web — intentionality over speed, human curation over algorithmic feeds, privacy over surveillance, and community over corporations. Every page in your index was saved because you found it valuable, not because an algorithm told you to click.