Installing AllArkive on macOS
This guide installs AllArkive on macOS using Docker Desktop and the docker-compose stack. Tested on macOS 13 (Ventura) and 14 (Sonoma) on both Intel and Apple Silicon.
Time estimate: 15–30 minutes setup, then waiting for downloads. After the stack starts, RAG indexing runs in the background — expect several hours for the balanced bundle on CPU (much faster if Ollama is using your Apple Silicon GPU via Metal). See Indexing takes hours below.
Quick start (automated)
git clone https://github.com/Clupai8o0/allarkive.git
cd allarkive
cp compose/.env.example compose/.env
openssl rand -hex 32 # copy into WEBUI_SECRET_KEY= in compose/.env
nano compose/.env
./scripts/bootstrap.sh --bundle balanced
On macOS, bootstrap.sh automatically uses ~/allarkive-data as the data directory (avoids the /var/lib/ permission issue). No extra config needed.
The manual steps below are equivalent — follow them for more control.
Indexing takes hours — leave it running
When bootstrap.sh finishes, the RAG indexer
keeps running in the rag container, embedding
every ZIM chunk through Ollama. Realistic times on a modern Mac:
- minimal bundle: 10–20 minutes
- balanced bundle: 2–6 hours (Apple Silicon, Metal-accelerated)
- comprehensive bundle: overnight
The indexer is resumable and idempotent — close the
lid, sleep the Mac, or reboot, and re-running bootstrap.sh
(or docker compose exec rag python indexer.py) picks up
where it left off.
Kiwix browsing at http://localhost:8081 works
immediately. RAG answers improve as coverage grows —
"no sources found" early on is expected for topics not yet indexed.
Watch progress:
docker compose -f compose/docker-compose.yml logs -f rag
If you have Apple Silicon, install Ollama natively for 5–10× faster embedding: Docker Desktop on macOS can't see Metal, so the Dockerized Ollama falls back to CPU. See docs/TROUBLESHOOTING.md for the full setup, plus coverage-cap tuning, common errors, and how to expand RAG coverage by re-running bootstrap.
Prerequisites
Hardware
| | Minimum | Recommended |
|---|---|---|
| RAM | 8 GB | 16 GB |
| Free disk | 10 GB (minimal bundle + model) | 30 GB (balanced bundle + model) |
Apple Silicon (M1/M2/M3/M4) is supported. Ollama uses Metal when installed natively on Apple Silicon, so no NVIDIA GPU is needed; the Dockerized Ollama runs on CPU.
Software
Docker Desktop for Mac (version 4.20 or later). Download from https://www.docker.com/products/docker-desktop/. After installing, open Docker Desktop and confirm it is running (the whale icon appears in the menu bar). Verify in a terminal:
docker compose version
Homebrew (optional but useful for git, openssl):
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"
Or use the Xcode command-line tools:
xcode-select --install
Step 1: Clone the repository
git clone https://github.com/Clupai8o0/allarkive.git
cd allarkive
Step 2: Set up configuration
cp compose/.env.example compose/.env
Generate a secret key:
openssl rand -hex 32
Open compose/.env in your editor and paste the result into WEBUI_SECRET_KEY=.
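If you would rather not paste the key by hand, the generate-and-paste step can be scripted. The sketch below is one possible approach, not something the repository ships: `set_webui_secret` is a hypothetical helper name, and it assumes the env file uses plain `KEY=value` lines.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the AllArkive repo): writes a fresh
# 64-character hex key into WEBUI_SECRET_KEY= in the given env file.
set_webui_secret() {
  local env_file="$1" key tmp
  # Same openssl command as above; falls back to /dev/urandom if openssl is missing.
  key="$(openssl rand -hex 32 2>/dev/null || od -An -tx1 -N32 /dev/urandom | tr -d ' \n')"
  [ "${#key}" -eq 64 ] || return 1
  tmp="$(mktemp)" || return 1
  # Keep every line except an existing WEBUI_SECRET_KEY entry, then append the new one.
  grep -v '^WEBUI_SECRET_KEY=' "$env_file" > "$tmp" || true
  printf 'WEBUI_SECRET_KEY=%s\n' "$key" >> "$tmp"
  mv "$tmp" "$env_file"
}

# Usage (from the repository root):
#   set_webui_secret compose/.env
```

The helper rewrites the file through a temporary copy rather than `sed -i`, because the in-place flag behaves differently on macOS and Linux.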
Data directory on macOS
The default data directory is /var/lib/allarkive. On
macOS you will need to either create this path or use a directory under
your home folder.
Option A: Use your home directory (simplest).
Set in compose/.env:
ALLARKIVE_DATA_DIR=/Users/YOUR_USERNAME/allarkive-data
Then create it:
mkdir -p ~/allarkive-data/{zim,index,models,data}
Option B: Use /var/lib/allarkive (matches Linux docs).
sudo mkdir -p /var/lib/allarkive/{zim,index,models,data}
sudo chown -R "$USER" /var/lib/allarkive
Step 3: Docker Desktop resource limits
By default, Docker Desktop allocates 50% of CPU and 50% of RAM. For
the balanced bundle with qwen2.5:7b, allocate
at least 8 GB RAM to Docker.
Open Docker Desktop → Settings → Resources and set:
- Memory: 10 GB (or more if you have it)
- CPUs: at least 4
Click Apply & Restart.
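To confirm the new limit took effect, you can compare the memory Docker reports against a threshold. This is a minimal sketch: `mem_at_least_gb` is a hypothetical helper, and it assumes `docker info --format '{{.MemTotal}}'` prints the VM's memory in bytes. Docker usually reports slightly less than the configured value, which is one reason to set 10 GB when the target is 8 GB.

```shell
# Hypothetical helper: succeeds when a byte count is at least MIN_GB gibibytes.
mem_at_least_gb() {
  local bytes="$1" min_gb="$2"
  [ "$bytes" -ge $((min_gb * 1024 * 1024 * 1024)) ]
}

# Usage (requires a running Docker Desktop):
#   mem_at_least_gb "$(docker info --format '{{.MemTotal}}')" 8 \
#     || echo "Docker has less than 8 GB; raise it in Settings → Resources"
```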
Step 4: Fetch a bundle
./scripts/fetch-bundle.sh balanced
The script downloads ZIM files to $ALLARKIVE_DATA_DIR/zim/ and verifies checksums. A failed checksum stops the script; do not proceed if this happens.
| Bundle | Contents | Disk (ZIMs only) |
|---|---|---|
| minimal | WikiMed + iFixit | ~4 GB |
| balanced | Wikipedia (mini) + WikiMed + iFixit + SuperUser + Unix SE + Ask Ubuntu | ~23 GB |
| comprehensive | Full Wikipedia (images) + Gutenberg + Stack Exchange | ~330 GB |
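Because the larger bundles need a lot of space, it can help to check free disk before fetching. The following is a sketch under stated assumptions: `has_free_gb` is a hypothetical helper, and it assumes the filesystem name reported by `df` contains no spaces, so the fourth column is the available space.

```shell
# Hypothetical helper: succeeds when DIR's filesystem has at least NEED_GB free.
has_free_gb() {
  local dir="$1" need_gb="$2" free_kb
  # -P forces one POSIX-format line per filesystem; -k reports 1024-byte blocks.
  free_kb="$(df -Pk "$dir" | awk 'NR == 2 { print $4 }')"
  [ "$free_kb" -ge $((need_gb * 1024 * 1024)) ]
}

# Usage: ZIM sizes from the table above (4, 23, or 330 GB).
#   has_free_gb "$HOME/allarkive-data" 23 || echo "Not enough free disk for balanced"
```

The table lists ZIM sizes only, so leave extra headroom for the models and the RAG index.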
Step 5: Start the stack
cd compose/
docker compose up -d
On first run, Docker does two things before services start:
- Builds the RAG image from source (scripts/rag/), which takes 2–4 minutes.
- Pulls the remaining images (kiwix-serve, Ollama, Open WebUI, nginx).
Subsequent starts skip both steps and are fast.
Watch progress:
docker compose logs -f
Wait for all containers to report healthy:
docker compose ps
Total first-run time: 5–15 minutes depending on network.
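Instead of re-running docker compose ps by hand, the wait can be scripted as a polling loop. The sketch below is a generic retry helper, not part of the repository: `wait_until` is a hypothetical name, and the usage line assumes the landing page answers on port 8080 once the stack is up.

```shell
# Hypothetical helper: run a command up to TRIES times, DELAY seconds apart,
# and succeed as soon as the command does.
wait_until() {
  local tries="$1" delay="$2" i=0
  shift 2
  while [ "$i" -lt "$tries" ]; do
    "$@" && return 0
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}

# Usage: wait up to 5 minutes for the landing page on port 8080.
#   wait_until 60 5 curl -fsS -o /dev/null http://localhost:8080
```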
Step 6: Pull AI models
Two models are needed — the chat model and the embedding model used by the RAG indexer. Pull them before indexing:
# Chat model (~4 GB for qwen2.5:7b):
docker compose exec ollama ollama pull qwen2.5:7b
# Embedding model (~270 MB):
docker compose exec ollama ollama pull nomic-embed-text
Both pulls resume automatically if interrupted.
Apple Silicon note: the Dockerized Ollama used in these steps runs on CPU, because Docker Desktop on macOS cannot access Metal. For GPU acceleration, install Ollama natively (see docs/TROUBLESHOOTING.md).
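Before starting the indexer, you may want to confirm that both models are present. This is a minimal sketch: `has_model` is a hypothetical helper that reads the output of `ollama list` on stdin, and it assumes the usual layout with a header row and the model name in the first column.

```shell
# Hypothetical helper: reads `ollama list` output on stdin and succeeds if
# MODEL is listed. Ollama shows untagged names with a ":latest" suffix.
has_model() {
  awk -v m="$1" '
    NR > 1 && ($1 == m || $1 == m ":latest") { found = 1 }
    END { exit !found }
  '
}

# Usage (run from compose/; -T disables the TTY so the output pipes cleanly):
#   docker compose exec -T ollama ollama list | has_model qwen2.5:7b
#   docker compose exec -T ollama ollama list | has_model nomic-embed-text
```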
Step 7: Index the archive
docker compose exec rag python indexer.py \
--zim-dir /data \
--index-dir /index \
--ollama-url http://ollama:11434
To force a full rebuild of an existing index:
docker compose exec rag python indexer.py \
--zim-dir /data \
--index-dir /index \
--ollama-url http://ollama:11434 \
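--force
Indexing time depends on the bundle and hardware: roughly 10–20 minutes for the minimal bundle and several hours for the balanced bundle (see Indexing takes hours above). The index persists in $ALLARKIVE_DATA_DIR/index/ across restarts.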
Step 8: Open the landing page
Visit http://localhost:8080 in your browser.
Confirm the status line shows your archive size and model name. Test a search and an AI question to verify citations are working.
Apple Silicon notes
A natively installed Ollama detects Apple Silicon automatically and uses Metal for GPU acceleration, which makes inference significantly faster than on CPU. The Dockerized Ollama cannot access Metal under Docker Desktop and runs on CPU.
The Docker images are built as multi-arch manifests
(linux/amd64 + linux/arm64). Docker Desktop on
Apple Silicon runs the native arm64 images — no Rosetta emulation for
any service, including Kiwix and Open WebUI.
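To see which of the two image variants your Mac should be pulling, you can map the machine architecture to the Docker platform string. The sketch below uses a hypothetical helper, `docker_platform_for`; note that a terminal running under Rosetta reports x86_64 even on Apple Silicon.

```shell
# Hypothetical helper: map a `uname -m` value to the Docker platform string.
docker_platform_for() {
  case "$1" in
    arm64|aarch64) echo "linux/arm64" ;;
    x86_64|amd64)  echo "linux/amd64" ;;
    *)             echo "unknown"; return 1 ;;
  esac
}

# Usage: prints linux/arm64 on Apple Silicon, linux/amd64 on Intel.
#   docker_platform_for "$(uname -m)"
```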
What bootstrap.sh does on macOS
bootstrap.sh on macOS automatically uses
~/allarkive-data as the data directory (avoids
/var/lib/ permission issues), saves paths to
~/.config/allarkive/config.json for future runs, detects
port conflicts and auto-assigns alternatives, and pulls both the chat
model and embedding model before running the indexer. The manual steps
above are the exact equivalent.
Cleanup and uninstall
Use scripts/cleanup.sh. Nothing is deleted unless you
explicitly ask.
| Command | What it removes |
|---|---|
| ./scripts/cleanup.sh | Stops and removes containers only. Data and images kept. |
| ./scripts/cleanup.sh --images | Also removes Docker images (re-pulled on next start). |
| ./scripts/cleanup.sh --data | Also deletes ~/allarkive-data: ZIMs, models, RAG index, Open WebUI DB. Irreversible. Prompts before deleting. |
| ./scripts/cleanup.sh --all | --images + --data. Full wipe. Prompts before deleting. |
After a full wipe, start fresh with
./scripts/bootstrap.sh --bundle balanced.
Port summary
| Service | Port | Bound to |
|---|---|---|
| Landing page | 8080 | 127.0.0.1 |
| kiwix-serve | 8081 | 127.0.0.1 |
| Open WebUI | 3000 | 127.0.0.1 |
| Ollama | 11434 | 127.0.0.1 |
| RAG service | 8000 | 127.0.0.1 |
Troubleshooting
Docker Desktop is not running
Click the whale icon in the menu bar and wait for it to show
"Running". Then retry docker compose up -d.
RAG image build fails
The RAG image is built from scripts/rag/ on first run.
If it fails:
- Check you have internet access during build (downloads Python packages)
- Check Docker Desktop has enough disk (Settings → Resources → Disk image size)
To retry:
docker compose build rag && docker compose up -d
bind: address already in use on port 8080
Port 8080 is commonly used by dev servers. Change
LANDING_PORT=8082 (or any free port) in
compose/.env and restart.
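Finding a free port can be scripted rather than guessed. The following is a sketch, assuming it is run with bash (it relies on bash's /dev/tcp redirection, which zsh and plain sh do not provide); `port_in_use` and `first_free_port` are hypothetical helpers, not part of the repository.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds if something accepts TCP connections on
# 127.0.0.1:PORT. Relies on bash's /dev/tcp redirection (not zsh or sh).
port_in_use() {
  (exec 3<>"/dev/tcp/127.0.0.1/$1") 2>/dev/null
}

# Print the first free port at or after START (gives up after 100 candidates).
first_free_port() {
  local port="$1" last=$(( $1 + 100 ))
  while [ "$port" -lt "$last" ]; do
    port_in_use "$port" || { echo "$port"; return 0; }
    port=$((port + 1))
  done
  return 1
}

# Usage: first_free_port 8082   # then set LANDING_PORT to the printed value
```

The probe only detects listeners reachable on 127.0.0.1, which matches how the stack binds its ports.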
Model inference is very slow
Check Docker Desktop memory allocation (see Step 3). Also check that
Docker Desktop is not competing with other memory-heavy apps. On Apple
Silicon, make sure Rosetta is not running the Ollama container (it
should not be, but you can verify with
docker inspect allarkive-ollama-1 | grep -i platform).
Kiwix volumes not mounting
On macOS, Docker Desktop must have permission to access the directory
you set as ALLARKIVE_DATA_DIR. Go to Docker Desktop
→ Settings → Resources → File sharing and add the parent
directory of your data path.
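It is also worth confirming that the data directory has the expected layout before digging into file-sharing settings. This is a minimal sketch: `check_data_dir` is a hypothetical helper, and the subdirectory names are the ones created in Step 2.

```shell
# Hypothetical helper: report which expected subdirectories are missing.
check_data_dir() {
  local root="$1" missing=0 sub
  for sub in zim index models data; do
    if [ ! -d "$root/$sub" ]; then
      echo "missing: $root/$sub"
      missing=1
    fi
  done
  return "$missing"
}

# Usage:
#   check_data_dir "$HOME/allarkive-data"
```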
WEBUI_SECRET_KEY error
Open compose/.env and confirm
WEBUI_SECRET_KEY= is set to a 64-character hex string. If
missing, generate one: openssl rand -hex 32.
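The format check can be automated. The sketch below uses a hypothetical helper, `check_webui_secret`, and assumes the value is written unquoted with no trailing whitespace; a quoted or padded value would be reported as malformed.

```shell
# Hypothetical helper: succeeds if ENV_FILE sets WEBUI_SECRET_KEY to exactly
# 64 hexadecimal characters.
check_webui_secret() {
  grep -Eq '^WEBUI_SECRET_KEY=[0-9a-fA-F]{64}$' "$1"
}

# Usage:
#   check_webui_secret compose/.env || echo "WEBUI_SECRET_KEY is missing or malformed"
```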