
Installing AllArkive on macOS

This guide installs AllArkive on macOS using Docker Desktop and the docker-compose stack. Tested on macOS 13 (Ventura) and 14 (Sonoma) on both Intel and Apple Silicon.

Time estimate: 15–30 minutes of setup, plus download time. After the stack starts, RAG indexing runs in the background. Expect several hours for the balanced bundle (faster when a natively installed Ollama uses your Apple Silicon GPU via Metal). See "Indexing takes hours" below.


Quick start (automated)

git clone https://github.com/Clupai8o0/allarkive.git
cd allarkive
cp compose/.env.example compose/.env
openssl rand -hex 32  # copy into WEBUI_SECRET_KEY= in compose/.env
nano compose/.env
./scripts/bootstrap.sh --bundle balanced

On macOS, bootstrap.sh automatically uses ~/allarkive-data as the data directory (avoids the /var/lib/ permission issue). No extra config needed.

The manual steps below are equivalent — follow them for more control.

Indexing takes hours — leave it running

When bootstrap.sh finishes, the RAG indexer keeps running in the rag container, embedding every ZIM chunk through Ollama. Realistic times on a modern Mac:

  • minimal bundle: 10–20 minutes
  • balanced bundle: 2–6 hours (Apple Silicon, Metal-accelerated)
  • comprehensive bundle: overnight

The indexer is resumable and idempotent — close the lid, sleep the Mac, or reboot, and re-running bootstrap.sh (or docker compose exec rag python indexer.py) picks up where it left off.

Kiwix browsing at http://localhost:8081 works immediately. RAG answers improve as coverage grows — "no sources found" early on is expected for topics not yet indexed. Watch progress:

docker compose -f compose/docker-compose.yml logs -f rag

If you have Apple Silicon, install Ollama natively for 5–10× faster embedding — Docker Desktop on macOS can't see Metal, so the Dockerized Ollama falls back to CPU. See docs/TROUBLESHOOTING.md for the full setup, plus coverage-cap tuning, common errors, and how to expand RAG coverage by re-running bootstrap.


Prerequisites

Hardware

             Minimum                           Recommended
RAM          8 GB                              16 GB
Free disk    10 GB (minimal bundle + model)    30 GB (balanced bundle + model)

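Apple Silicon (M1/M2/M3/M4) is supported, and no NVIDIA GPU is needed. A natively installed Ollama runs models on the GPU via Metal; the Dockerized Ollama runs on CPU, because Docker Desktop on macOS cannot access Metal.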

Software

  • Docker Desktop for Mac (version 4.20 or later). Download from https://www.docker.com/products/docker-desktop/.

    After installing, open Docker Desktop and confirm it is running (the whale icon appears in the menu bar). Verify in a terminal:

    docker compose version

  • Homebrew (optional but useful for git, openssl):

    /bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"

    Or use the Xcode command-line tools:

    xcode-select --install

Step 1: Clone the repository

git clone https://github.com/Clupai8o0/allarkive.git
cd allarkive

Step 2: Set up configuration

cp compose/.env.example compose/.env

Generate a secret key:

openssl rand -hex 32

Open compose/.env in your editor and paste the result into WEBUI_SECRET_KEY=.
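If you would rather not paste the key by hand, the two steps above can be combined. The following is a minimal sketch, not part of the repository: set_webui_secret_key is a hypothetical helper name, and it assumes the env file uses plain KEY=value lines as in compose/.env.example.

```shell
# Hypothetical helper (not shipped with AllArkive): generate a key and
# store it in WEBUI_SECRET_KEY= inside the given env file.
set_webui_secret_key() {
  env_file="$1"
  # 32 random bytes as 64 hex characters; fall back to /dev/urandom
  # if openssl is not on the PATH.
  key="$(openssl rand -hex 32 2>/dev/null)" ||
    key="$(od -An -N32 -tx1 /dev/urandom | tr -d ' \n')"
  tmp="$(mktemp)" || return 1
  # Replace an existing WEBUI_SECRET_KEY line, or append one if absent.
  awk -v key="$key" '
    /^WEBUI_SECRET_KEY=/ { print "WEBUI_SECRET_KEY=" key; found = 1; next }
    { print }
    END { if (!found) print "WEBUI_SECRET_KEY=" key }
  ' "$env_file" > "$tmp" || { rm -f "$tmp"; return 1; }
  # Copy back in place so the env file keeps its permissions.
  cat "$tmp" > "$env_file"
  rm -f "$tmp"
}

# Usage, from the repository root:
#   set_webui_secret_key compose/.env
```

Running it a second time replaces the key, which would invalidate existing Open WebUI sessions, so it is best used once during setup.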

Data directory on macOS

The default data directory is /var/lib/allarkive. On macOS you will need to either create this path or use a directory under your home folder.

Option A — Use your home directory (simplest):

Set in compose/.env:

ALLARKIVE_DATA_DIR=/Users/YOUR_USERNAME/allarkive-data

Then create it:

mkdir -p ~/allarkive-data/{zim,index,models,data}

Option B — Use /var/lib/allarkive (matches Linux docs):

sudo mkdir -p /var/lib/allarkive/{zim,index,models,data}
sudo chown -R "$USER" /var/lib/allarkive
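Either option ends with the same four subdirectories. As a rough sketch, the check below creates the layout under a chosen root and confirms the current user can write to it. init_data_dir is a hypothetical name used only for illustration; the subdirectory names are the ones from the commands above.

```shell
# Hypothetical helper: create the data layout and confirm it is writable.
# Defaults to ~/allarkive-data when no path is given.
init_data_dir() {
  root="${1:-$HOME/allarkive-data}"
  for sub in zim index models data; do
    mkdir -p "$root/$sub" || return 1
  done
  # Docker bind mounts need a directory the current user can write to.
  if [ ! -w "$root" ]; then
    echo "not writable: $root" >&2
    return 1
  fi
  # Print the path so it can be copied into ALLARKIVE_DATA_DIR=.
  echo "$root"
}

# Usage:
#   init_data_dir                      # Option A, home directory
#   init_data_dir /var/lib/allarkive   # Option B, after the sudo steps
```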

Step 3: Docker Desktop resource limits

By default, Docker Desktop allocates 50% of CPU and 50% of RAM. For the balanced bundle with qwen2.5:7b, allocate at least 8 GB RAM to Docker.

Open Docker Desktop → Settings → Resources and set:

  • Memory: 10 GB (or more if you have it)
  • CPUs: at least 4

Click Apply & Restart.
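To confirm the new limit took effect, you can ask the Docker engine how much memory it sees. This is a sketch under assumptions: docker_memory_ok is a hypothetical helper, and because the engine typically reports slightly less than the value set in the Settings slider, the example threshold is 7.5 rather than 8.

```shell
# Hypothetical helper: succeed if the Docker engine reports at least
# the given amount of memory (in GiB, decimals allowed).
docker_memory_ok() {
  min_gib="$1"
  # MemTotal is the memory visible to the Docker VM, in bytes.
  bytes="$(docker info --format '{{.MemTotal}}')" || return 1
  awk -v b="$bytes" -v min="$min_gib" \
    'BEGIN { exit !(b / 1073741824 >= min) }'
}

# Usage:
#   docker_memory_ok 7.5 || echo "Raise the memory limit in Docker Desktop"
```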


Step 4: Fetch a bundle

./scripts/fetch-bundle.sh balanced

The script downloads ZIM files to $ALLARKIVE_DATA_DIR/zim/ and verifies checksums. A failed checksum stops the script — do not proceed if this happens.

Bundle           Contents                                                                  Disk (ZIMs only)
minimal          WikiMed + iFixit                                                          ~4 GB
balanced         Wikipedia (mini) + WikiMed + iFixit + SuperUser + Unix SE + Ask Ubuntu    ~23 GB
comprehensive    Full Wikipedia (images) + Gutenberg + Stack Exchange                      ~330 GB
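Before starting a large download, it can help to compare the bundle size with the free space on the target volume. The sketch below hard-codes the approximate ZIM sizes from the table above; the function names are hypothetical, and the check ignores the extra space needed for models and the index.

```shell
# Approximate ZIM sizes in GB, copied from the bundle table.
bundle_size_gb() {
  case "$1" in
    minimal) echo 4 ;;
    balanced) echo 23 ;;
    comprehensive) echo 330 ;;
    *) echo "unknown bundle: $1" >&2; return 1 ;;
  esac
}

# Free space, in whole GB, on the volume that holds the given path.
free_disk_gb() {
  df -Pk "$1" | awk 'NR == 2 { print int($4 / 1024 / 1024) }'
}

# Succeed if the bundle's ZIM files would fit at the given path.
bundle_fits() {
  need="$(bundle_size_gb "$1")" || return 1
  have="$(free_disk_gb "$2")"
  [ "$have" -ge "$need" ]
}

# Usage:
#   bundle_fits balanced ~/allarkive-data || echo "Not enough free disk"
```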

Step 5: Start the stack

cd compose/
docker compose up -d

On first run, Docker does two things before services start:

  1. Builds the RAG image from source (scripts/rag/) — 2–4 minutes.
  2. Pulls the remaining images (kiwix-serve, Ollama, Open WebUI, nginx).

Subsequent starts skip both steps and are fast.

Watch progress:

docker compose logs -f

Wait for all containers to report healthy:

docker compose ps

Total first-run time: 5–15 minutes depending on network.
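Instead of re-running docker compose ps by hand, a small polling loop can wait until the stack responds. This is a generic sketch: wait_until is a hypothetical helper, and the usage line assumes the landing page answers with a success status on port 8080 once the stack is up.

```shell
# Hypothetical helper: retry a command until it succeeds or a timeout
# (in seconds) passes. Usage: wait_until <timeout> <interval> <command...>
wait_until() {
  timeout="$1"
  interval="$2"
  shift 2
  elapsed=0
  until "$@"; do
    if [ "$elapsed" -ge "$timeout" ]; then
      return 1
    fi
    sleep "$interval"
    elapsed=$((elapsed + interval))
  done
}

# Usage: wait up to 15 minutes, checking every 5 seconds.
#   wait_until 900 5 curl -fsS -o /dev/null http://localhost:8080 \
#     && echo "Landing page is up"
```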


Step 6: Pull AI models

Two models are needed — the chat model and the embedding model used by the RAG indexer. Pull them before indexing:

# Chat model (~4 GB for qwen2.5:7b):
docker compose exec ollama ollama pull qwen2.5:7b

# Embedding model (~270 MB):
docker compose exec ollama ollama pull nomic-embed-text

Both pulls resume automatically if interrupted.
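To confirm both models are present before indexing, you can filter the output of ollama list. The sketch below assumes the usual tabular output with a header row and the model name in the first column; model_listed is a hypothetical helper name.

```shell
# Hypothetical helper: read `ollama list` output on stdin and succeed
# if the named model appears. A bare name also matches its :latest tag.
model_listed() {
  awk -v want="$1" '
    NR > 1 {
      name = $1
      sub(/:latest$/, "", name)
      if ($1 == want || name == want) found = 1
    }
    END { exit !found }
  '
}

# Usage, from compose/:
#   docker compose exec ollama ollama list | model_listed qwen2.5:7b
#   docker compose exec ollama ollama list | model_listed nomic-embed-text
```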

Apple Silicon note: the Ollama container used in these steps runs on CPU, because Docker Desktop on macOS cannot access Metal. For GPU-accelerated inference and embedding, install Ollama natively; see docs/TROUBLESHOOTING.md.


Step 7: Index the archive

docker compose exec rag python indexer.py \
    --zim-dir /data \
    --index-dir /index \
    --ollama-url http://ollama:11434

To force a full rebuild of an existing index:

docker compose exec rag python indexer.py \
    --zim-dir /data \
    --index-dir /index \
    --ollama-url http://ollama:11434 \
    --force

Indexing takes roughly 10–20 minutes for the minimal bundle and 2–6 hours for the balanced bundle (see "Indexing takes hours" above). The index persists in $ALLARKIVE_DATA_DIR/index/ across restarts.


Step 8: Open the landing page

Visit http://localhost:8080 in your browser.

Confirm the status line shows your archive size and model name. Test a search and an AI question to verify citations are working.


Apple Silicon notes

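Ollama uses Metal for GPU acceleration only when it is installed natively on the Mac, where inference and embedding are significantly faster than on CPU. The Dockerized Ollama in this stack cannot see Metal and runs on CPU; see docs/TROUBLESHOOTING.md for the native setup.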

The Docker images are built as multi-arch manifests (linux/amd64 + linux/arm64). Docker Desktop on Apple Silicon runs the native arm64 images — no Rosetta emulation for any service, including Kiwix and Open WebUI.


What bootstrap.sh does on macOS

On macOS, bootstrap.sh automatically:

  • uses ~/allarkive-data as the data directory (avoids /var/lib/ permission issues)
  • saves paths to ~/.config/allarkive/config.json for future runs
  • detects port conflicts and auto-assigns alternatives
  • pulls both the chat model and the embedding model before running the indexer

The manual steps above are the exact equivalent.


Cleanup and uninstall

Use scripts/cleanup.sh. Nothing is deleted unless you explicitly ask.

Command                          What it removes
./scripts/cleanup.sh             Stops and removes containers only. Data and images kept.
./scripts/cleanup.sh --images    Also removes Docker images (re-pulled on next start).
./scripts/cleanup.sh --data      Also deletes ~/allarkive-data: ZIMs, models, RAG index, Open WebUI DB. Irreversible. Prompts before deleting.
./scripts/cleanup.sh --all       --images + --data. Full wipe. Prompts before deleting.

After a full wipe, start fresh with ./scripts/bootstrap.sh --bundle balanced.


Port summary

Service         Port     Bound to
Landing page    8080     127.0.0.1
kiwix-serve     8081     127.0.0.1
Open WebUI      3000     127.0.0.1
Ollama          11434    127.0.0.1
RAG service     8000     127.0.0.1
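A quick way to see which services are reachable is to probe each port in the table. The sketch below assumes the default ports (adjust the list if you changed any in compose/.env) and treats any HTTP response as "up", since not every service necessarily serves a page at /. check_services is a hypothetical name.

```shell
# Hypothetical helper: probe each default port on 127.0.0.1 and report
# which services answer. Returns non-zero if any service is unreachable.
check_services() {
  failed=0
  for entry in landing:8080 kiwix-serve:8081 open-webui:3000 ollama:11434 rag:8000; do
    name="${entry%%:*}"
    port="${entry##*:}"
    # No -f flag: any HTTP reply, even a 404, counts as reachable.
    if curl -s -o /dev/null --max-time 3 "http://127.0.0.1:$port/"; then
      echo "ok    $name ($port)"
    else
      echo "down  $name ($port)"
      failed=1
    fi
  done
  return "$failed"
}

# Usage:
#   check_services
```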

Troubleshooting

Docker Desktop is not running

Click the whale icon in the menu bar and wait for it to show "Running". Then retry docker compose up -d.

RAG image build fails

The RAG image is built from scripts/rag/ on first run. If it fails:

  • Check you have internet access during build (downloads Python packages)
  • Check Docker Desktop has enough disk (Settings → Resources → Disk image size)

To retry: docker compose build rag && docker compose up -d

bind: address already in use on port 8080

Port 8080 is commonly used by dev servers. Set LANDING_PORT=8082 (or any free port) in compose/.env, restart the stack, and open the landing page on the new port.
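To pick a free port before editing compose/.env, you can check candidates with lsof, which ships with macOS. This is a sketch with hypothetical helper names; note that without sudo, lsof may not show listeners owned by other users.

```shell
# Succeed if some process is listening on the given TCP port.
port_in_use() {
  lsof -nP -iTCP:"$1" -sTCP:LISTEN >/dev/null 2>&1
}

# Print the first candidate port that nothing is listening on.
first_free_port() {
  for port in "$@"; do
    if ! port_in_use "$port"; then
      echo "$port"
      return 0
    fi
  done
  return 1
}

# Usage:
#   first_free_port 8080 8082 8083   # use the result for LANDING_PORT=
```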

Model inference is very slow

Check Docker Desktop's memory allocation (see Step 3) and make sure Docker Desktop is not competing with other memory-heavy apps. On Apple Silicon, the Dockerized Ollama runs on CPU, so a native Ollama install is considerably faster (see docs/TROUBLESHOOTING.md). Also confirm that the Ollama container is not running under Rosetta emulation. It should not be, but you can verify with docker inspect allarkive-ollama-1 | grep -i platform.

Kiwix volumes not mounting

On macOS, Docker Desktop must have permission to access the directory you set as ALLARKIVE_DATA_DIR. Go to Docker Desktop → Settings → Resources → File sharing and add the parent directory of your data path.

WEBUI_SECRET_KEY error

Open compose/.env and confirm WEBUI_SECRET_KEY= is set to a 64-character hex string. If missing, generate one: openssl rand -hex 32.
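The format check can be scripted. The sketch below assumes the value is written unquoted on a single line; secret_key_valid is a hypothetical helper name. It also catches trailing spaces or a truncated paste.

```shell
# Hypothetical helper: succeed if the env file contains a
# WEBUI_SECRET_KEY line with exactly 64 hex characters.
secret_key_valid() {
  grep -Eq '^WEBUI_SECRET_KEY=[0-9a-fA-F]{64}$' "$1"
}

# Usage, from the repository root:
#   secret_key_valid compose/.env || echo "WEBUI_SECRET_KEY is missing or malformed"
```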

Source: docs/install/macos.md.