CHANGELOG
All notable changes to this project are documented here.
The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.
Unreleased
Added — v0.2 RAG overhaul (schema v2)
- scripts/rag/indexer.py rewritten end-to-end. Batched async embeddings via Ollama's /api/embed (plural) endpoint replace the per-chunk serial round-trips that dominated wall time. Batch size is profile-driven (Pi=16, laptop=64, workstation=128). With lxml HTML parsing, larger 2000-char chunks, and title-prefixed embeddings, dense indexing is 10–30× faster on CPU and saturates a GPU on a laptop. Resume support: a crashed run picks up at the next last_entry_id, and a UNIQUE(zim_name, article_path, chunk_idx) constraint with ON CONFLICT DO UPDATE prevents the duplicate-rows footgun the v0.1 indexer had when a partial run was restarted.
- Schema v2 (meta.schema_version=2). Chunk text is no longer duplicated into the DB; the indexer stores (char_offset, char_len) and the RAG server reads the text from the backing ZIM lazily at query time via a shared textproc.html_to_text extractor. The extractor is versioned (meta.extractor_version) and the server refuses to start on mismatch, pointing the user at scripts/reindex.sh --force. On Wikipedia-class archives this cuts index size by ~60% before quantization.
- Vector quantization: int8 is the new default storage format for embeddings (4× smaller than the v0.1 float32, <1 MTEB-point recall drop on cosine-normalised models). float32 remains available via RAG_QUANTIZATION=float32 for workstations that don't want any recall hit. Stored in meta.quantization; the server packs the query vector to match.
- Hybrid retrieval (off by default, on for Pi profile). ZIMs at or above RAG_HYBRID_THRESHOLD_GB (default 4 GB when hybrid is enabled) are registered with mode='bm25' in indexed_zims instead of dense-indexed. At query time the RAG server queries each BM25 ZIM via libzim's built-in Xapian search and merges the candidates with the dense top-K through reciprocal rank fusion. This skips dense indexing entirely for multi-100-GB archives: a Pi 5 with a USB SSD can now hold the comprehensive bundle in a useful, searchable state in hours instead of weeks.
- Profile presets: RAG_PROFILE=pi|laptop|workstation. A single knob bundles sensible defaults for chunk size, quantization, hybrid mode and threshold, batch size, and embed model. Individual RAG_* env vars / CLI flags still override. scripts/bootstrap.sh --profile and scripts/reindex.sh --profile thread the choice through, and compose/docker-compose.pi.yml defaults to the pi profile so a fresh Pi bootstrap gets the right tradeoffs without flags. Profiles are documented in docs/rag-optimization.md.
- scripts/rag/textproc.py, quant.py, profiles.py: new shared modules. textproc is the single source of truth for HTML → text and chunk-offset computation; lxml is the parser. quant exposes pack / normalize / degenerate-vector helpers across float32 and int8 modes. profiles defines the preset table.
- Custom bundle support: scripts/fetch-bundle.sh custom --add … and scripts/bootstrap.sh --bundle custom --add …. Users can append any ZIM to bundles/custom/manifest.json by passing a full URL or a Kiwix library handle (e.g. wikipedia_en_simple_all_maxi_2026-03); handles are resolved against the Kiwix project-prefix → download directory map in scripts/build-custom-manifest.py. The download + verify path reuses the existing fetch flow; sizes are populated from HEAD requests at manifest build time. bundles/custom/manifest.json is gitignored (per-user, not checked in); bundles/custom/LICENSE.md ships a template reminding users they own the license-tracking burden for self-supplied archives.
- scripts/rag/server.py /status endpoint now reports the active index meta (schema_version, embed_model, embed_dim, quantization, chunk_size, extractor_version) plus the list of BM25-registered ZIMs, so the landing page can display the active retrieval mode honestly.
- docs/rag-optimization.md: new doc walking through the profile matrix, the schema-v2 storage savings math, quantization trade-offs, hybrid retrieval semantics, custom-bundle workflow, and migration notes from a v0.1 index. Indexed in DOCS.md; linked from README.md, docs/bundles/README.md, and the install guides.
- lxml==5.3.0 pinned in scripts/rag/requirements.txt; libxml2 / libxslt1.1 added to the rag Dockerfile.
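
The reciprocal rank fusion step named in the hybrid-retrieval entry can be sketched roughly as below. This is an illustration under stated assumptions, not the project's code: the function name, the k=60 constant (a conventional choice), and the sample hit lists are all hypothetical.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60, top_n=5):
    """Fuse several ranked lists of document ids into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k=60 is an assumed, conventional constant.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by id for determinism.
    fused = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [doc_id for doc_id, _ in fused[:top_n]]


# Hypothetical candidates: one list from dense search, one from BM25.
dense_hits = ["aspirin", "ibuprofen", "paracetamol"]
bm25_hits = ["aspirin", "naproxen", "ibuprofen"]
fused = reciprocal_rank_fusion([dense_hits, bm25_hits])
print(fused)
```

Documents that rank well in both lists rise to the top, while a document found by only one retriever still gets a place in the merged result. Because only ranks are used, the dense distances and BM25 scores never need to be put on a common scale.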
Changed
- scripts/reindex.sh learns --profile pi|laptop|workstation and passes it to the indexer. The existing --level quick|standard|full flag still controls the article cap.
- scripts/bootstrap.sh shows the indexing profile in the platform-summary prompt and writes it to compose/.env as RAG_PROFILE. The /status line at the end of bootstrap reports BM25-mode ZIMs separately from dense ZIMs in the coverage table.
- compose/.env.example, compose/docker-compose.yml, compose/docker-compose.pi.yml: added RAG_PROFILE, RAG_QUANTIZATION, RAG_CHUNK_SIZE, RAG_CHUNK_OVERLAP, RAG_BATCH_SIZE, RAG_HYBRID, RAG_HYBRID_THRESHOLD_GB, and RAG_BM25_K. Pi compose defaults to the pi profile + hybrid mode at 4 GB.
- RAG_MAX_ARTICLES default changed from 5000 to 0 in compose/.env.example. The platform-aware bootstrap defaults are unchanged; only the manual-edit fallback shifts. Unlimited indexing is now the recommended default with the new pipeline because the per-article cost is no longer prohibitive.
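
How a profile preset and the individual RAG_* variables might combine can be sketched as follows. The batch sizes and the Pi-only hybrid default come from the entries above; the dictionary layout, the function name, the fallback to the laptop profile, and the accepted spellings of a true value are assumptions made for illustration.

```python
# Batch sizes and the Pi-only hybrid default are taken from the changelog;
# everything else in this table and function is a hypothetical layout.
PROFILES = {
    "pi": {"batch_size": 16, "hybrid": True},
    "laptop": {"batch_size": 64, "hybrid": False},
    "workstation": {"batch_size": 128, "hybrid": False},
}

TRUE_VALUES = ("1", "true", "yes", "on")  # assumed spellings of "enabled"


def resolve_settings(env):
    """Start from the profile preset, then let explicit RAG_* values win."""
    profile = env.get("RAG_PROFILE", "laptop")  # assumed fallback profile
    settings = dict(PROFILES[profile])
    if "RAG_BATCH_SIZE" in env:
        settings["batch_size"] = int(env["RAG_BATCH_SIZE"])
    if "RAG_HYBRID" in env:
        settings["hybrid"] = env["RAG_HYBRID"].lower() in TRUE_VALUES
    return settings


print(resolve_settings({"RAG_PROFILE": "pi"}))
print(resolve_settings({"RAG_PROFILE": "pi", "RAG_BATCH_SIZE": "32"}))
```

In real use the mapping passed in would be os.environ; taking it as a parameter keeps the sketch easy to test.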
Migration from v0.1
A v0.1 index (meta.schema_version unset or != 2) is not compatible with the new server. The indexer refuses to start on mismatch and prints the exact reindex.sh --force command. There is no in-place migration: the chunk text column was dropped and vectors are quantized differently. Pi installs that built a v0.1 index should expect a one-time rebuild, which the new pipeline finishes in a fraction of the original time anyway.
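
A start-up guard of the kind described above might look like the sketch below. The key/value layout of the meta table, the function name, and the error wording are assumptions; only the schema_version key, the value 2, and the reindex.sh --force remedy come from this changelog.

```python
import sqlite3

EXPECTED_SCHEMA_VERSION = "2"


def check_schema(conn):
    """Raise RuntimeError unless the index reports schema version 2."""
    try:
        # Hypothetical layout: meta is a key/value table.
        row = conn.execute(
            "SELECT value FROM meta WHERE key = 'schema_version'"
        ).fetchone()
    except sqlite3.OperationalError:
        row = None  # no meta table at all: treat it as a v0.1 index
    found = None if row is None else str(row[0])
    if found != EXPECTED_SCHEMA_VERSION:
        raise RuntimeError(
            f"index schema_version is {found!r}, expected "
            f"{EXPECTED_SCHEMA_VERSION!r}; rebuild with scripts/reindex.sh --force"
        )


# A v2 index passes silently.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE meta (key TEXT PRIMARY KEY, value TEXT)")
conn.execute("INSERT INTO meta VALUES ('schema_version', '2')")
check_schema(conn)
```

Treating a missing table the same as a wrong version matches the "unset or != 2" wording: both cases end in the same rebuild instruction.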
Added (pre-overhaul backlog, still applies)
- scripts/make-zim.sh: wrapper that turns a directory of user-supplied documents (markdown, docx, txt, rst, html, …) into a ZIM file via pinned Docker images (pandoc/minimal:3.5.0 + ghcr.io/openzim/zim-tools:3.4.2). No host installs required beyond Docker. Stages source into a temp dir so the user's tree is never mutated, runs pandoc over convertible extensions, auto-generates an index.html listing when none exists, then invokes zimwriterfs with metadata flags (--title, --description, --creator, --publisher, --language, etc.). Optional --install copies the resulting .zim into $ALLARKIVE_ZIM_DIR and restarts the kiwix container; --reindex additionally chains into scripts/reindex.sh so the new content becomes searchable via the RAG layer. Surfaces the resulting file size and SHA-256 so users can pin them if they plan to redistribute. Addresses a recurring talk-Q&A question: "can I add my own docs?"
- docs/bundles/custom-docs.md: companion doc covering ZIM format internals (clusters, Zstandard, Xapian indexes, the URL/title pointer lists) at the level needed to understand the workflow, then walks through scripts/make-zim.sh for the document-conversion path and zimit for the website-mirror path. Includes a "what gets converted, what doesn't" matrix (PDFs are passed through but not full-text indexed unless converted), a hand-rolled equivalent of the script for users who prefer not to use the wrapper, the two-index gotcha (Kiwix-side Xapian vs RAG-side vector index; only the wrapper's --reindex handles the latter), a license/trust reminder section pointing at THREAT_MODEL.md, and a troubleshooting table. Referenced from docs/bundles/README.md and indexed in DOCS.md.
- docs/index.html, docs/styles.css, docs/.nojekyll: public-facing landing page intended to be served by GitHub Pages from main/docs. Short project explanation, install snippet, links to the hosted docs. Distinct from the local landing page at landing/, which remains the in-stack entry point.
- bundles/minimal/LICENSE.md, bundles/balanced/LICENSE.md, bundles/comprehensive/LICENSE.md: per-bundle license summaries listing each ZIM's source, license, and attribution as required by CLAUDE.md.
- docs/TROUBLESHOOTING.md: new central operational reference. Covers: the cap/coverage story ("minimal" means small download, not small content); native-Ollama setup for Apple Silicon speedup; WSL2 GPU passthrough; the formerly-fatal sqlite-vec errors and how the indexer/server now handle them; cap-aware bootstrap re-runs; pace tiers per hardware; sanity-check commands for "is photosynthesis actually in my index"; demo-question selection. Linked from README.md, DOCS.md, all four install guides, and docs/deployment/pi-text-only.md.
- scripts/bootstrap.sh --max-articles <N> and --full-index (alias for --max-articles 0) let users control the per-ZIM article cap from bootstrap, no longer requiring them to edit compose/.env or run the indexer manually. Default cap is now platform-aware: 3000 on Pi (CPU-bound), 0 (unlimited) on mac / linux / wsl because GPU/Metal acceleration makes full coverage feasible. The previous global default of 5000 was randomly sampling 1% of ZIMs like Wikipedia (463k+ entries) and silently missing demo-critical articles like Photosynthesis. Bootstrap writes the chosen cap to compose/.env as RAG_MAX_ARTICLES, so manual docker compose exec rag python indexer.py runs honour the same setting. Surfaces in the platform-summary prompt as index cap: ….
- scripts/bootstrap.sh --no-model enables search-only mode: skips the chat-model pull, writes CHAT_MODEL=__none__ into compose/.env, and the RAG service (scripts/rag/server.py) detects the sentinel and returns retrieved passages as a numbered citation list instead of calling Ollama's /api/chat. The embedding model is still pulled because indexing needs it. Surfaces in the platform-summary as model : (none — search-only mode). Useful for low-RAM Pi setups or "search the archive, no AI" deployments.
- scripts/rag/server.py /status endpoint now reports search_only: true|false. landing/app.js reads this on each status refresh and hides the "Ask AI" UI (nav link + mode toggle) when no chat model is configured, forcing the page into search-archive mode. The model line in the status bar reads search-only (no chat model) in that state.
- bundles/comprehensive/manifest.json populated SHA-256 hashes for the five ZIMs that previously had "sha256": "" (Wikipedia maxi 115 GB, Gutenberg 206 GB, Stack Overflow 75 GB, SuperUser, Math Stack Exchange). Without these, scripts/fetch-bundle.sh would refuse to verify the downloaded files per the project policy in CLAUDE.md.
- docs/deployment/pi-text-only.md gained two recipes: "Pi 5 + large external SSD + tiny model + comprehensive ZIMs" (explicit flags override the platform default of minimal+1.5b) and "search-only mode" (--no-model).
- scripts/bootstrap.sh learned a --platform <mac|linux|pi|wsl> flag with auto detection (Darwin → mac, /proc/device-tree/model → pi, /proc/version WSL marker → wsl, else linux). Each profile sets the compose file, data-dir base, default bundle, and default chat model. RAM probe auto-downgrades to the minimal bundle + qwen2.5:1.5b when total RAM < 6 GB and the user hasn't pinned bundle/model. Confirmation prompt shows detected platform/RAM/GPU and the chosen defaults before continuing; --yes / -y skips it for CI. On macOS without a host-side Ollama, the script now prints an "install native Ollama for Metal speedup" hint, since Dockerized Ollama on Darwin is CPU-only. On WSL2 with no NVIDIA GPU detected, it warns that indexing will be CPU-bound (recommends --bundle minimal or nvidia-container-toolkit), since GPU passthrough is the only fast path inside Docker Desktop on Windows. --pi is preserved as an alias for --platform pi.
- scripts/rag/indexer.py cap-aware resume: the indexer compares each ZIM's previously-recorded article_count in indexed_zims against the archive's total entries and the new effective cap. If the prior run was capped and the new cap allows more articles (e.g. user re-ran with --full-index or just raised --max-articles), the ZIM is dropped and re-indexed. ZIMs that were fully covered the first time are still skipped. This means ./scripts/bootstrap.sh Just Works after raising the cap: no manual _drop_zim cleanup needed, and partial-coverage ZIMs no longer hide indefinitely behind a stale completion marker.
- scripts/rag/indexer.py MediaWiki HTML extraction fix. The previous _html_to_text used class_=re.compile("(toc|sidebar|navbox|reflist|mw-references|catlinks)"), an unanchored regex that BeautifulSoup matched against the joined class string. Modern Wikipedia ZIMs put Vector-skin classes like vector-toc-not-available / vector-feature-toc-pinned-clientpref-0 on <body>, all containing the substring toc. The unanchored regex matched the body wrapper, .decompose() killed the entire document, and _html_to_text returned 0 characters. Every WikiMed article was being skipped via if len(text) < 50: continue, while titles slipped through, resulting in ~8,700 chunks of useless title-only fragments instead of real content. Now uses anchored per-class matching (_STRIP_CLASS_PREFIXES) that won't fire on substring collisions. Verified on Aspirin (61k chars extracted) and Photosynthesis (54k chars extracted).
- scripts/rag/prompt.py system template softened. The previous wording ("If no passage is relevant to the question, respond with exactly: no sources found for this question.") caused qwen2.5:7b to refuse on tangentially-relevant passages; e.g. "Adverse effects of aspirin" was judged not-an-answer to "What is aspirin used for?". The new wording instructs the model to synthesise partial answers from related passages and reserve the "no sources" fallback for cases where the passages have no connection to the topic at all. Same [N] citation rule, same refusal text, just less trigger-happy.
- scripts/rag/indexer.py hardening: a single bad embedding (zero magnitude, NaN/inf, or one sqlite-vec rejects) no longer crashes the whole indexer. The bad chunk is logged and skipped, indexing continues. Previously, one malformed vector from Ollama would raise sqlite3.OperationalError: Internal sqlite-vec error: could not write vector blob and kill the process.
- scripts/rag/server.py opens index.db in read-only URI mode with a 30-second busy_timeout. This stops the FastAPI server's long-lived connection from contending with the indexer's writes on sqlite-vec's internal shadow tables, which previously surfaced as sqlite3.OperationalError: locking protocol mid-indexing, especially pronounced once Ollama was Metal-accelerated and writes got fast. Indexing can now run via docker compose exec rag python indexer.py without first stopping the rag service.
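
The difference between the unanchored regex and anchored per-class matching in the MediaWiki extraction fix can be shown with a small pure-Python sketch. It does not use BeautifulSoup and is not the indexer's code: the helper names are hypothetical, and the prefix tuple is an assumption that simply reuses the alternatives from the old regex.

```python
import re

# The old pattern, quoted from the entry above.
UNANCHORED = re.compile("(toc|sidebar|navbox|reflist|mw-references|catlinks)")

# Assumed contents; the real _STRIP_CLASS_PREFIXES may differ.
STRIP_CLASS_PREFIXES = (
    "toc", "sidebar", "navbox", "reflist", "mw-references", "catlinks",
)


def old_should_strip(classes):
    """Search the joined class string, so any substring hit counts."""
    return bool(UNANCHORED.search(" ".join(classes)))


def new_should_strip(classes):
    """Match each class on its own, anchored at the start of the name."""
    return any(cls.startswith(STRIP_CLASS_PREFIXES) for cls in classes)


body_classes = [
    "vector-toc-not-available",
    "vector-feature-toc-pinned-clientpref-0",
]
print(old_should_strip(body_classes))  # True: "toc" appears as a substring
print(new_should_strip(body_classes))  # False: no class starts with a prefix
print(new_should_strip(["toc"]))       # True: a real table of contents
```

With the anchored check, the body element keeps its content while genuine navigation blocks are still removed.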
Changed
- docs/TROUBLESHOOTING.md gained five new sections from the May 12–13 hardening work: (1) "no sources found despite WikiMed indexed" pointing at the MediaWiki extraction bug + sanity check, (2) cap-aware logic re-indexing ZIMs you thought were full, with the article_count=999999 pin workaround, (3) indexer + FastAPI server can't share index.db even with read-only mode, and the canonical stop / run --rm / start dance, (4) auto-shutdown after long unattended runs via pmset schedule + caffeinate, (5) "Why is index.db bigger than my ZIM files?": observed ratios per ZIM type, the per-chunk overhead breakdown, and v0.2 mitigations to consider.
- docs/ARCHITECTURE.md Decisions to revisit in v0.2+ picks up matching bullets for the index-storage overhead and the cap-aware all_entry_count comparison bug. The two earlier duplicate ### Added headers under [Unreleased] are also collapsed.
- scripts/bootstrap.sh now runs docker compose build rag on every invocation before up. docker compose up reuses locally-tagged images even when the source files behind them have changed, so prior bootstrap re-runs were silently shipping stale server.py / indexer.py until the user remembered to manually rebuild. Cached layers make the build near-instant (~5s) when nothing changed.
- scripts/bootstrap.sh final summary now prints chunk counts per archive from the RAG index.db, so a fresh user can see exactly what got indexed rather than discovering an empty/partial index only when a query returns "no sources found".
- Install guides (README.md, docs/install/{laptop,macos,windows,server}.md, docs/deployment/pi-text-only.md, docs/index.html) and docs/ARCHITECTURE.md now state that RAG indexing runs after the stack starts and takes hours on CPU for the balanced bundle, that the indexer is resumable and idempotent, and that Kiwix browsing works immediately while RAG coverage grows in the background. Replaces stale "10–30 minutes for the balanced bundle" claims in docs/install/server.md and docs/deployment/pi-text-only.md.
- All Docker images in compose/docker-compose.yml, compose/docker-compose.pi.yml, and compose/docker-compose.pi-archive.yml are now pinned to SHA-256 digests (multi-arch manifest digests fetched via docker buildx imagetools inspect). Images affected: kiwix-serve 3.8.2, ollama 0.22.1, open-webui 0.9.2, nginx 1.27.3-alpine.
- Bundle manifests promoted from "version": "draft" to "version": "1.0"; ZIM selection finalised and model choices resolved: minimal → WikiMed + iFixit, qwen2.5:3b; balanced → Wikipedia text + WikiMed + iFixit + Gutenberg + SuperUser, qwen2.5:7b; comprehensive → adds Wikipedia with images + Stack Overflow + Math SE, qwen2.5:7b.
- docs/install/laptop.md: rewritten from Milestone 2 draft to use docker-compose as the primary install path; manual-install appendix note preserved.
- docs/install/server.md: headless Linux server install guide; covers Docker Engine install, SSH port forwarding for browser access, data directory setup, and systemd service unit.
- docs/install/macos.md: macOS install guide; covers Docker Desktop, Apple Silicon Metal acceleration, resource limits, and file-sharing permissions.
- docs/install/windows.md: Windows install guide via WSL2 + Docker Desktop; covers WSL2 install, memory configuration, filesystem performance notes, NVIDIA GPU passthrough, and Defender exclusions.
- docs/bundles/README.md: bundle reference: contents, ZIM sizes, license summary, how to add custom ZIMs, and indexing time estimates for each bundle.
- docs/deployment/lan-access.md: opt-in LAN access guide; Caddy and nginx reverse-proxy examples, basic auth setup, firewall rules, and security posture summary.
- docs/THREAT_MODEL.md: renamed from THREAD_MODEL.MD (filename typo fix); added build-learnings section covering: Open WebUI auth-off default, RAG "no sources" enforcement, Docker bridge isolation, RAG API key not being a security boundary, image digest pinning status, and iFixit commercial-use restriction.
Changed
- compose/docker-compose.pi-archive.yml: archive-only compose stack for a dedicated Raspberry Pi kiwix node (deployment pattern C); single service with a KIWIX_BIND variable for toggling between loopback-only and LAN access without editing the compose file.
- docs/deployment/pi-text-only.md: full walkthrough for running the AllArkive stack (kiwix + Ollama + RAG + landing page) on a Raspberry Pi 4/5 with a USB SSD; covers OS imaging, SSD mount, Docker install, bootstrap, smoke tests, and ARM/low-RAM troubleshooting.
- docs/deployment/pi-archive-only.md: walkthrough for a dedicated kiwix archive node on a Pi; covers LAN exposure, firewall rules, ZIM bundle download, and verification steps.
- docs/deployment/split.md: guide for the split topology: AI (Ollama + RAG) on a laptop, archive (kiwix) on a separate Pi; covers NFS mount, index copy alternatives, network configuration, and citation URL consistency.
Changed
- compose/docker-compose.pi.yml: added OPENAI_API_BASE_URLS and OPENAI_API_KEYS to the open-webui service so allarkive-rag appears as a selectable model (matching the laptop stack); added depends_on: rag with condition: service_healthy to open-webui; added condition: service_healthy to rag and landing depends_on blocks; extended Ollama start_period to 90 s; changed default CHAT_MODEL from qwen2.5:7b to qwen2.5:1.5b to fit Pi 4 with 4 GB RAM; added model size guidance comment.
- landing/index.html, landing/style.css, landing/app.js: local landing page served at http://localhost:8080; single-page with Search/Chat and Manage sections; dark mode via prefers-color-scheme; no external assets; all system fonts.
- landing/nginx.conf: nginx config serving static files on port 8080; proxies /api/rag/ to the RAG service so the browser makes same-origin API calls.
- scripts/rag/server.py: /status endpoint returning binding mode, installed archives (name, size, mtime), archive count and total size, chat/embed model names, and RAG index readiness; consumed by the landing page status line and Manage section.
- compose/docker-compose.yml: landing service (nginx:1.27.3-alpine) on LANDING_PORT (default 8080); depends on rag being healthy.
- compose/docker-compose.pi.yml: rag and landing services added, matching the laptop stack with Pi-appropriate timeouts and a lower RAG_MAX_ARTICLES cap.
- compose/.env.example: LANDING_PORT=8080.
- scripts/rag/server.py: FastAPI RAG service exposing an OpenAI-compatible /v1/chat/completions endpoint; every response is grounded in retrieved passages or refused with an explicit no-sources message; supports streaming via SSE.
- scripts/rag/indexer.py: CLI indexer that reads ZIM files via libzim, chunks article text, embeds with Ollama (nomic-embed-text), and stores normalised float32 vectors in an sqlite-vec index at $ALLARKIVE_DATA_DIR/index/index.db. Idempotent: re-run safely; skips unchanged ZIMs; --force to rebuild.
- scripts/rag/prompt.py: system prompt template requiring [N] citation markers and a hard refusal when no relevant passages are found.
- scripts/rag/citations.py: post-processor that rewrites [N] markers to Markdown links pointing at the local kiwix article viewer.
- scripts/rag/Dockerfile: multi-arch (linux/amd64, linux/arm64) Python 3.11 image for the RAG service.
- scripts/rag/requirements.txt: pinned Python dependencies for the RAG service.
- compose/docker-compose.yml: RAG service (allarkive-rag:0.1.0) added to the default stack; Open WebUI wired to the RAG service via OPENAI_API_BASE_URLS so allarkive-rag appears as a selectable model alongside Ollama models.
- compose/.env.example: RAG tuning knobs: EMBED_MODEL, CHAT_MODEL, RAG_TOP_K, RAG_MAX_DISTANCE, RAG_MAX_ARTICLES, KIWIX_PUBLIC_URL.
- scripts/bootstrap.sh: pulls nomic-embed-text embedding model and runs the indexer automatically on first boot.
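
The citation post-processing described for scripts/rag/citations.py could work along the lines of the sketch below. It is an illustration only: the function name, the base URL, and the article path format are hypothetical, and markers with no matching source are simply left untouched.

```python
import re

CITATION = re.compile(r"\[(\d+)\]")


def link_citations(answer, sources, kiwix_base):
    """Rewrite [N] markers into Markdown links to the N-th source.

    sources is a list of article paths (hypothetical format), 1-indexed
    by the markers in the answer text.
    """
    def replace(match):
        n = int(match.group(1))
        if 1 <= n <= len(sources):
            return f"[{n}]({kiwix_base}/{sources[n - 1]})"
        return match.group(0)  # unknown marker: leave it as written

    return CITATION.sub(replace, answer)


linked = link_citations(
    "Aspirin reduces fever [1] and pain [3].",
    ["wikimed/A/Aspirin"],          # hypothetical article path
    "http://kiwix.example",         # hypothetical viewer base URL
)
print(linked)
```

Leaving out-of-range markers alone keeps a model's stray citation visible instead of silently producing a broken link.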
Changed
- compose/docker-compose.yml: open-webui now depends on rag being healthy before starting, ensuring the model is visible at first launch.
- scripts/bootstrap.sh: summary screen shows RAG port and re-index instructions.
- docs/ARCHITECTURE.md: filled in Milestone 4 decisions (vector DB, embedding model, Open WebUI integration approach).
- TODO.md: Milestone 4 CC tasks marked done; vector DB and embedding model open questions resolved.
- compose/docker-compose.yml: default stack with kiwix-serve, Ollama, and Open WebUI; all services bound to 127.0.0.1; CPU-only by default with a --profile gpu opt-in for NVIDIA GPU acceleration; telemetry disabled on Open WebUI; health checks on every service.
- compose/docker-compose.pi.yml: Raspberry Pi variant (deployment pattern B); data root defaults to /mnt/ssd/allarkive; extended healthcheck grace periods; Ollama memory cap via OLLAMA_MEMORY_LIMIT.
- compose/.env.example: documented environment template covering ports, data directory, Open WebUI secret key, and Pi overrides.
- scripts/bootstrap.sh: first-run setup: checks prerequisites, creates data directories, fetches the default bundle, starts the compose stack, and pulls the default Ollama model with streamed progress.
- scripts/fetch-bundle.sh: downloads and SHA-256 verifies ZIM files listed in a bundle manifest; resumes partial downloads; pins new checksums back to the manifest after first verification.
- bundles/balanced/manifest.json: draft balanced bundle: Wikipedia EN nopic, WikiMed, iFixit, Project Gutenberg, SuperUser (~30 GB total ZIMs).
- bundles/minimal/manifest.json: draft minimal bundle: WikiMed + iFixit (~3.5 GB total ZIMs; Pi-friendly).
- bundles/comprehensive/manifest.json: draft comprehensive bundle: Wikipedia EN with images, WikiMed, iFixit, Gutenberg, Stack Overflow, SuperUser, Math SE (~130 GB+ total ZIMs).
- docs/install/laptop.md: manual install guide (draft v0) covering Ollama, Kiwix, and Open WebUI on Ubuntu 22.04/24.04 LTS; includes smoke tests, port summary, and troubleshooting. To be validated and corrected during Milestone 2.
- Initial repository scaffolding.
- CLAUDE.md master instructions for agent-driven development.
- ARCHITECTURE.md describing the three-layer design.
- DESIGN.md covering tone, voice, and landing page direction.
- THREAT_MODEL.md documenting what the project does and does not protect against.
- ROADMAP.md separating v0.1 scope from v0.2+ ideas.
- TODO.md, DOCS.md, GOVERNANCE.md.
- CONTRIBUTING.md, CODE_OF_CONDUCT.md, SECURITY.md.
- LICENSE: AGPL-3.0 full text.
- .gitignore covering Python, Node, Docker, OS clutter, and ZIM archives.
- .editorconfig with per-filetype indent and line-ending rules.
- .github/workflows/ci.yml: markdown lint, shellcheck, compose validation, Python lint.
- .github/workflows/mirror-codeberg.yml: push mirror to Codeberg on every main commit.
- .github/ISSUE_TEMPLATE/: bug, feature, and bundle-proposal templates.
- .github/PULL_REQUEST_TEMPLATE.md.
- Directory skeleton: compose/, scripts/rag/, bundles/, landing/, docs/install/, docs/bundles/, docs/deployment/.
- License rationale paragraph in CONTRIBUTING.md: documents why AGPL-3.0 was chosen over MIT/Apache-2.0.
Changed
- Nothing yet.
Deprecated
- Nothing yet.
Removed
- Nothing yet.
Fixed
- Nothing yet.
Security
- Nothing yet.
0.1.0 — Unreleased
First public release. Scope:
- One-command Docker install of Kiwix, Ollama, and Open WebUI.
- Default starter knowledge bundle (English Wikipedia text-only, a medical reference wiki, iFixit, Project Gutenberg).
- RAG pipeline with citation-required prompts.
- Local landing page (search archive, chat with AI, manage bundles).
- Documentation: install guides for laptop and server; deployment patterns for Pi text-only, Pi archive-only, and split.
- Demo GIF in the README.
- Codeberg mirror automated via GitHub Action.
- AGPL-3.0 licensed glue code; bundled content keeps original licenses.