Commit 8ecb963be4 (parent 552311b730)
4 changed files with 172 additions and 29 deletions

README.md (56 lines changed)
@@ -12,6 +12,30 @@ A personal, decentralized search engine built on the [Reticulum](https://reticul
- **Import/export** — JSON-based backup and restore
- **Mesh-native** — Works over Reticulum without the internet; encrypted and decentralized by default

## Performance & Scale

### Search Speed

| Pages indexed | Search speed | Notes                    |
|---------------|--------------|--------------------------|
| 1,000         | ~50ms        | Fast local FTS5          |
| 10,000        | ~50-100ms    | Full-text search         |
| 100,000       | ~100-200ms   | Combined BM25 + semantic |
| 500,000       | ~200-400ms   | With semantic enabled    |
| 1,000,000     | ~300-500ms   | Hybrid search            |

*Times are estimates for combined BM25 + semantic search. Actual performance varies by hardware, storage type (SSD/HDD), and search complexity.*
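
The README does not say how the BM25 and semantic rankings are combined. As a rough illustration only, the sketch below uses reciprocal rank fusion (RRF), a common technique swapped in here for the example and not confirmed to be what TinyWeb does. The function name, the page IDs, and `k=60` are all assumptions.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists of page IDs into one hybrid ranking.

    Each page scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k=60 is a conventional default, not a
    TinyWeb setting.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, page_id in enumerate(ranking, start=1):
            scores[page_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by page ID for determinism.
    return sorted(scores, key=lambda page_id: (-scores[page_id], page_id))


# Hypothetical results for one query: keyword (BM25) order and semantic order.
bm25_ranking = ["page_a", "page_b", "page_c"]
semantic_ranking = ["page_b", "page_d", "page_a"]
hybrid = reciprocal_rank_fusion([bm25_ranking, semantic_ranking])
```

Pages that rank well in both lists rise to the top, which is the usual reason for running the two searches together despite the extra latency.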

### Concurrent Connections

- Database pool: 16 simultaneous connections
- Suitable for a single user plus a few subscriptions
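
To make the 16-connection limit concrete, here is a minimal sketch of a fixed-size pool, assuming SQLite as the backing database. The `ConnectionPool` class and its methods are hypothetical and are not taken from the TinyWeb source.

```python
import queue
import sqlite3
from contextlib import contextmanager


class ConnectionPool:
    """A fixed-size pool; callers block when all connections are in use."""

    def __init__(self, db_path, size=16):
        self._idle = queue.Queue(maxsize=size)
        for _ in range(size):
            # check_same_thread=False lets a connection move between threads;
            # the pool hands each connection to one caller at a time.
            self._idle.put(sqlite3.connect(db_path, check_same_thread=False))

    @contextmanager
    def connection(self, timeout=5.0):
        conn = self._idle.get(timeout=timeout)  # raises queue.Empty if exhausted
        try:
            yield conn
        finally:
            self._idle.put(conn)  # always return the connection to the pool

    def close(self):
        while not self._idle.empty():
            self._idle.get_nowait().close()


pool = ConnectionPool(":memory:", size=16)
with pool.connection() as conn:
    answer = conn.execute("SELECT 1").fetchone()[0]
```

With a pool like this, a 17th simultaneous request waits (or times out) instead of opening another connection, which is why the limit suits a single user plus a few subscriptions.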

### Export

- Paginated at 10,000 pages per request
- Use `?batch=N` to export in chunks: `/export?batch=0`, `/export?batch=1`, etc.
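
A client can walk the batches in a loop. The sketch below assumes each batch comes back as a JSON array and that an empty array marks the end of the export; neither detail is stated in the README. The helper names and the base URL in the comment are hypothetical.

```python
import json
from urllib.request import urlopen


def fetch_batch(base_url, batch):
    """Fetch one export batch. Assumes the endpoint returns a JSON array."""
    with urlopen(f"{base_url}/export?batch={batch}") as response:
        return json.load(response)


def export_all(fetch, max_batches=1000):
    """Collect batches 0, 1, 2, ... until an empty batch comes back."""
    pages = []
    for batch in range(max_batches):
        chunk = fetch(batch)
        if not chunk:
            break
        pages.extend(chunk)
    return pages


# Offline stand-in for the server: 25 fake pages served 10 per batch.
fake_pages = [{"id": i} for i in range(25)]


def fake_fetch(batch, size=10):
    return fake_pages[batch * size:(batch + 1) * size]


exported = export_all(fake_fetch)
# Against a running instance (the URL is a placeholder):
# exported = export_all(lambda n: fetch_batch("http://localhost:8000", n))
```

Passing the fetch function in as a parameter keeps the loop testable without a running server.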

## Download (pre-built binaries)

Download the latest release for your platform from the [Releases](https://git.derickphan.com/lichenblankie/tinyweb/releases) page:

@@ -55,6 +79,21 @@ volumes:

Run with `docker compose up -d`.

### Storage Estimates

Estimates assume an average of ~15KB of stored content per page:

| Pages     | Database | Embeddings\* | Total  |
|-----------|----------|--------------|--------|
| 10,000    | 150MB    | 80MB         | ~250MB |
| 100,000   | 1.5GB    | 800MB        | ~2.5GB |
| 500,000   | 7.5GB    | 4GB          | ~12GB  |
| 1,000,000 | 15GB     | 8GB          | ~25GB  |

\*Embeddings are stored only when semantic search is enabled.

Enable optional compression in Settings > Search > AI to reduce embedding storage by ~50%.
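
The arithmetic behind the table can be sketched as follows. The 15KB figure is the stated average; the 8KB of embeddings per page is only implied by the table (80MB / 10,000 pages), and the function name is hypothetical.

```python
KB = 1000  # decimal units, matching the table (10,000 x 15KB = 150MB)
DB_BYTES_PER_PAGE = 15 * KB        # stated average content size
EMBEDDING_BYTES_PER_PAGE = 8 * KB  # implied by the table: 80MB / 10,000 pages


def estimate_storage(pages, semantic=True, compressed=False):
    """Return (database_bytes, embedding_bytes, total_bytes) for a page count."""
    database = pages * DB_BYTES_PER_PAGE
    embeddings = pages * EMBEDDING_BYTES_PER_PAGE if semantic else 0
    if compressed:
        embeddings //= 2  # the ~50% saving quoted for compression
    return database, embeddings, database + embeddings


db, emb, total = estimate_storage(100_000)
```

For 100,000 pages this gives 1.5GB + 0.8GB = 2.3GB, slightly under the table's ~2.5GB, so the Total column appears to be rounded up to leave some headroom.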

## Data storage

### Local (Python/binary)

@@ -139,6 +178,23 @@ TinyWeb includes several hardening measures:
- **Bookmark authentication** — The bookmarklet endpoint requires a secret token
- **Identity file protection** — The Reticulum identity key is restricted to owner-only permissions (0600)

## Maintenance

### Database Vacuum

Over time, deleted pages leave empty space in the database. Run the vacuum tool periodically to reclaim space:

1. Go to `/style` in your browser
2. Click "vacuum database" at the bottom of the page
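
What the button runs internally is not shown in this diff. Assuming the database is SQLite (suggested by the FTS5 mention earlier in the README), the effect can be reproduced with the standard `VACUUM` statement. The table name and row sizes below are made up for the demonstration.

```python
import os
import sqlite3
import tempfile

db_path = os.path.join(tempfile.mkdtemp(), "demo.db")
conn = sqlite3.connect(db_path)
conn.execute("CREATE TABLE pages (id INTEGER PRIMARY KEY, body BLOB)")
conn.executemany(
    "INSERT INTO pages (body) VALUES (?)",
    [(b"x" * 1024,) for _ in range(2000)],
)
conn.commit()

# Deleting rows frees pages inside the file but does not shrink it.
conn.execute("DELETE FROM pages")
conn.commit()
size_before = os.path.getsize(db_path)

# VACUUM rebuilds the file without the free pages.
# It must run outside a transaction, hence the commit above.
conn.execute("VACUUM")
size_after = os.path.getsize(db_path)
conn.close()
```

Comparing `size_before` with `size_after` shows the reclaimed space; on a large database the rebuild can take a while and temporarily needs extra disk space.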

### Optional Compression

To reduce storage for semantic search embeddings (~50% savings):

1. Go to `/style` > Search > AI
2. Enable "compress embeddings"
3. Re-index existing pages so that compression also applies to embeddings already stored
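
The README does not say how the compression works. One way to get a ~50% saving, shown here purely as an assumption, is to store each embedding value as a 16-bit float instead of a 32-bit float. The 384-dimension vector is a made-up example.

```python
import struct

# A hypothetical 384-dimensional embedding with values in [-1, 1).
embedding = [((i % 200) - 100) / 100 for i in range(384)]

full = struct.pack(f"<{len(embedding)}f", *embedding)  # float32, 4 bytes each
half = struct.pack(f"<{len(embedding)}e", *embedding)  # float16, 2 bytes each

# Round-trip the compact form to see how much precision is lost.
restored = struct.unpack(f"<{len(embedding)}e", half)
max_error = max(abs(a - b) for a, b in zip(embedding, restored))
```

Whatever scheme is used, stored vectors have to be rewritten in the new format, which is consistent with the re-index step above.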

## Dependencies

- [requests](https://docs.python-requests.org/) — HTTP fetching