A reproducible MediaWiki stack powered by Docker. It includes a custom MediaWiki image, MariaDB, Memcached with a web UI, Elasticsearch + CirrusSearch for full‑text search, ClamAV for virus scanning of uploads, scheduled sitemap generation, RottenLinks updates and CirrusSearch indexing, short URLs, and an Apache reverse proxy for the MemcachePHP UI.
What you get
- Opinionated MediaWiki image (`mediawiki-custom`) with sensible defaults
- Short URLs (`/wiki/...`)
- Elasticsearch + CirrusSearch for search and RelatedArticles
- ClamAV virus scanning of uploads
- Automatic sitemap generation (cron via supercronic)
- Automatic RottenLinks updates (cron via supercronic)
- Memcached + MemcachePHP admin UI (proxied at `/memcacheui/`)
- Clear environment-driven configuration and volume persistence
- Cookie banner via the CookieConsent extension
- SEO improvements via the WikiSEO extension
- The following MediaWiki extensions are included:
  - Lockdown
  - Description2
  - RelatedArticles
  - MobileFrontend
  - CirrusSearch & Elastica
  - HitCounters & TopTenPages
  - RottenLinks
  - WikiCategoryTagCloud
  - CookieConsent
  - DynamicPageList
  - WikiSEO
This repo builds a custom MediaWiki image and composes a full stack:
- MediaWiki (based on the official MediaWiki image) with configurable extensions/skins
- MariaDB for the wiki database
- ClamAV for virus scanning
- Memcached plus a MemcachePHP admin UI (reverse-proxied via Apache)
- Elasticsearch + CirrusSearch/Elastica for search and suggestions
- Supercronic to run scheduled jobs (e.g., sitemap generation, optional link checks)
Target use-cases: local development and small/medium server deployments.
- MediaWiki: custom image `mediawiki-custom` (Apache + PHP 8.x), short URLs, scripts in `resources/mediawiki`
- MariaDB: official image (11.x), persistent volume for data
- ClamAV: official multi-arch stable Debian image, scans uploads for viruses on the fly
- Memcached: caching backend used by MediaWiki
- MemcachePHP: tiny admin UI, reverse-proxied by Apache under `/memcacheui/`
- Elasticsearch (single node): for CirrusSearch integration
- Networks: `app-nw` (front), `backend-nw` (internal)
Port defaults:
- Wiki: `http://localhost:${MW_HTTP_PORT:-8080}`
- MemcachePHP: proxied as `/memcacheui/` or short alias `/mcui/` on the wiki host, e.g. `http://localhost:${MW_HTTP_PORT:-8080}/mcui`
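
The `${MW_HTTP_PORT:-8080}` notation is the usual shell/Compose default expansion: the value of `MW_HTTP_PORT` is used when it is set and non-empty, otherwise `8080`. A minimal shell sketch of that rule follows; the `wiki_url` helper is hypothetical and not part of this repo.

```shell
#!/bin/sh
# Hypothetical helper: print the wiki URL using the same default-port rule
# as the Compose file (${MW_HTTP_PORT:-8080}).
wiki_url() {
  printf 'http://localhost:%s\n' "${MW_HTTP_PORT:-8080}"
}

unset MW_HTTP_PORT
wiki_url            # falls back to port 8080

MW_HTTP_PORT=9090
wiki_url            # uses the overridden port 9090
```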
Prerequisites: Docker + Docker Compose
1) Prepare environment
Create a local `.env` (reuse an existing one, or start from `.env.example`) with at least:
MW_HTTP_PORT=8080
MARIADB_ROOT_PASSWORD=R00tPassword
TZ=Europe/Berlin
# (optional) MemcachePHP UI
MEMCACHEPHP_ADMIN_USER=admin
MEMCACHEPHP_ADMIN_PASS=supersecret
# (recommended) Server URL for installer / sitemap
MW_SERVER_URL=http://localhost:8080

2) Start the stack with one of these commands:
# build the images yourself
docker compose -f docker-compose.dev.yml --env-file .env up -d --build
# use prebuilt images
docker compose -f docker-compose.yml --env-file .env up -d --build

3) First-run install
The entrypoint runs `resources/mediawiki/mw-default-setup.sh`, which:
- creates/extends `LocalSettings.php` at `${MW_CONFIG_FILE}`
- enables short URLs (`/wiki/$1` and action paths)
- applies PHP/upload size limits
- wires Memcached, VisualEditor-friendly rewrites
- configures basic CirrusSearch settings to talk to Elasticsearch
- (if configured) sets up sitemap generation & Apache redirect for `/sitemap.xml`
This happens when:
- you mount a conf folder for `LocalSettings.php`:
  - ./data/mediawiki/conf/mw_local_settings:/var/www/html/conf/:rw
- you include `.env.mwsetup` in your `docker-compose.yml`:
  ...
  env_file:
    - .env.mwsetup
  ...
  volumes:
    - ./data/mediawiki/conf/mw_local_settings:/var/www/html/conf/:rw
You can later mount a host-side `LocalSettings.php` (see Volumes & persistence).
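
As a rough illustration of the first-run decision, an entrypoint could check whether the config file already exists before running the installer. This is only a sketch, assuming that a non-empty file at `MW_CONFIG_FILE` means "already installed"; the function name `needs_install` is hypothetical and the real `mw-default-setup.sh` may use a different test.

```shell
#!/bin/sh
# Hypothetical check: succeed (exit 0) when no usable config file exists yet.
# Assumption: a non-empty file at MW_CONFIG_FILE means the wiki is installed.
needs_install() {
  config_file=${MW_CONFIG_FILE:-/var/www/html/LocalSettings.php}
  [ ! -s "$config_file" ]
}

if needs_install; then
  echo "no config found: installer would run"
else
  echo "config present: installer would be skipped"
fi
```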
The stack is environment-first. Most knobs are set via environment: in Compose, optionally complemented by an env_file:.
- `MW_CONFIG_FILE`: path of the wiki config file inside the container (default `/var/www/html/LocalSettings.php`)
- `MW_LANG`: default language (e.g., `de`)
- `MW_SERVER_URL`: canonical base URL used by maintenance scripts and sitemap
- `MW_DEFAULT_SKIN`: default skin code (e.g., `vector-2022`, `minerva`)
- Build-time fetch lists (git-cloned during image build):
  - `MW_INSTALL_EXTENSIONS`: space-separated extension names or Git URLs
  - `MW_INSTALL_SKINS`: space-separated skin names or Git URLs
- Runtime activation (appended to `LocalSettings.php` on container start):
  - `MW_ACTIVE_EXTENSIONS`: space-separated canonical extension names
  - `MW_ACTIVE_SKINS`: space-separated skin codes (e.g., `vector-2022 deskmessmirrored`)
The image contains helper logic to clone by branch with fallback or to pin to a specific commit (see `Dockerfile-mediawiki`).
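
The branch-with-fallback idea can be sketched in a few lines of shell. This is not the code from `Dockerfile-mediawiki`; the function name `clone_with_fallback` and the branch names in the usage comment are assumptions for illustration.

```shell
#!/bin/sh
# Hypothetical helper: try each branch in order and keep the first one that
# clones successfully. Prints the branch that was used.
# $1 = repository URL or path, $2 = destination directory, $3... = branches.
clone_with_fallback() {
  repo=$1
  dest=$2
  shift 2
  for branch in "$@"; do
    if git clone --quiet --depth 1 --branch "$branch" "$repo" "$dest" 2>/dev/null; then
      printf '%s\n' "$branch"
      return 0
    fi
  done
  return 1
}

# Example call (not executed here; URL and branches are illustrative):
#   clone_with_fallback "$EXT_REPO_URL" extensions/Lockdown REL1_43 master
```

Pinning to a commit would be a separate step after cloning (for example a `git checkout` of the pinned hash), which trades automatic updates for reproducible builds.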
- `ES_JAVA_OPTS`: e.g., `-Xms512m -Xmx512m` (or `-Xms1g -Xmx1g`)
- `discovery.type=single-node`: set in Compose by default
- Ensure the container RAM matches your heap (heap ≈ 50% of container memory).
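
The 50% rule of thumb is simple arithmetic: halve the container memory limit and use the result for both `-Xms` and `-Xmx`. A small sketch, where the helper name `es_heap_opts` is hypothetical and the input is the container limit in MiB:

```shell
#!/bin/sh
# Hypothetical helper: derive ES_JAVA_OPTS from the container memory limit
# (in MiB), using the ~50% heap rule of thumb.
es_heap_opts() {
  container_mb=$1
  heap_mb=$((container_mb / 2))
  printf '%s\n' "-Xms${heap_mb}m -Xmx${heap_mb}m"
}

es_heap_opts 1024   # prints: -Xms512m -Xmx512m
es_heap_opts 2048   # prints: -Xms1024m -Xmx1024m
```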
- `MW_SITEMAP_GENERATION`: `true|false`
- `MW_SITEMAP_CRON`: default `"20 */12 * * *"`
- `MW_SITEMAP_SERVER`: e.g., `https://www.example.com`
- `MW_SITEMAP_URLPATH`: e.g., `sitemap/` (creates files under `/var/www/html/sitemap/…`)
- `MW_SITEMAP_SKIP_REDIRECTS`: `true|false`
- `MW_SITEMAP_RUN_ON_START`: `true|false`
- `MW_SITEMAP_IDENTIFIER`: identifier inserted into file names, e.g., `wiki`
- `MEMCACHEPHP_SERVERS`: e.g., `memcached:11211`
- `MEMCACHEPHP_ADMIN_USER`, `MEMCACHEPHP_ADMIN_PASS`, `MEMCACHEPHP_HTTP_PORT`
- `MARIADB_ROOT_PASSWORD`: root password for MariaDB
- Wiki database/user/pass are applied by the installer (`mw-default-setup.sh`) via flags.
- `COMPOSE_PROFILES`: set to `clamav` to enable the ClamAV container
- `CLAMAV_ENABLED`: enables the virus-scan MediaWiki configuration in `LocalSettings.php`
- `TZ`: e.g., `UTC` or `Europe/Berlin`
Precedence: values in the Compose `environment:` section override `env_file` entries of the same name. Consider using Compose `secrets:` for sensitive values like the DB root password.
- `data_mw_db:/var/lib/mysql`: MariaDB data (persistent)
- `data_mw_images:/var/www/html/images`: MediaWiki uploads (persistent)
- `clamav_db:/var/lib/clamav`: ClamAV virus signatures and data (persistent)
- `data_esdata:/usr/share/elasticsearch/data`: index data (persistent)
LocalSettings.php
- Default path in container: `/var/www/html/LocalSettings.php`
- You can mount your host file to this path or to `/var/www/html/conf/…` (then set `MW_CONFIG_FILE` accordingly).
Sitemaps
- Default files are emitted under `/var/www/html/` or inside `/var/www/html/<MW_SITEMAP_URLPATH>/` if you set `MW_SITEMAP_URLPATH` (e.g., `/var/www/html/sitemap/sitemap-index-<id>.xml`).
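
To make the path rule concrete, here is a small sketch that computes where the sitemap index would be expected on disk. The helper `sitemap_index_path` is hypothetical; the docroot `/var/www/html` and the `sitemap-index-<id>.xml` naming are taken from the description above.

```shell
#!/bin/sh
# Hypothetical helper: compute the expected location of the sitemap index.
# $1 = identifier (the <id> part), $2 = MW_SITEMAP_URLPATH (may be empty).
sitemap_index_path() {
  id=$1
  urlpath=${2:-}
  urlpath=${urlpath#/}    # drop one leading slash, if any
  urlpath=${urlpath%/}    # drop one trailing slash, if any
  if [ -n "$urlpath" ]; then
    printf '/var/www/html/%s/sitemap-index-%s.xml\n' "$urlpath" "$id"
  else
    printf '/var/www/html/sitemap-index-%s.xml\n' "$id"
  fi
}

sitemap_index_path wiki sitemap/   # prints: /var/www/html/sitemap/sitemap-index-wiki.xml
sitemap_index_path wiki            # prints: /var/www/html/sitemap-index-wiki.xml
```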
- Rewrites map `/wiki/<Title>` to the MediaWiki front controller, and actions like `/wiki/edit/<Title>` are supported.
- VisualEditor friendly: consider `AllowEncodedSlashes NoDecode` in your Apache conf for REST-style endpoints.
- MemcachePHP UI is reverse-proxied by Apache to the `memcachephp` container under convenient paths:
  - `/memcacheui/` (canonical)
  - plus optional helpers like `/memcache` or `/mcui` → redirect to `/memcacheui/`
- Proper `X-Forwarded-*` headers and `ProxyPassReverseCookiePath` are configured in the provided conf.
- Fetching at build time: The Dockerfile supports cloning Wikimedia-hosted extensions/skins by branch (with a fallback), or by pinned commit.
- Activation at runtime: The entrypoint reads `MW_ACTIVE_EXTENSIONS` and `MW_ACTIVE_SKINS` and appends `wfLoadExtension()` / skin settings to `LocalSettings.php` if not present.
- Typical set included/tested in this stack:
  - Extensions: CookieConsent, MobileFrontend, CirrusSearch, Elastica, RelatedArticles, Lockdown, Description2, WikiCategoryTagCloud, RottenLinks (optional), etc.
  - Skins: Vector 2022, Minerva, Timeless, MonoBook, DeskMessMirrored, …
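
The "append if not present" behaviour of runtime activation can be sketched as follows. This is an illustration under assumptions, not the actual entrypoint: the function name `activate_extensions` and the exact `wfLoadExtension( 'Name' );` line format it searches for are hypothetical choices.

```shell
#!/bin/sh
# Hypothetical helper: append a wfLoadExtension() line for each extension
# name unless the settings file already contains that exact line.
# $1 = settings file, $2... = extension names.
activate_extensions() {
  settings_file=$1
  shift
  for ext in "$@"; do
    line="wfLoadExtension( '$ext' );"
    if ! grep -qF "$line" "$settings_file"; then
      printf '%s\n' "$line" >> "$settings_file"
    fi
  done
}

# Demo against a throwaway file that already loads Lockdown.
tmp=$(mktemp)
printf '%s\n' '<?php' "wfLoadExtension( 'Lockdown' );" > "$tmp"
MW_ACTIVE_EXTENSIONS="Lockdown CirrusSearch Elastica"
# Intentional word splitting: the variable is a space-separated list.
activate_extensions "$tmp" $MW_ACTIVE_EXTENSIONS
activate_extensions "$tmp" $MW_ACTIVE_EXTENSIONS   # second run adds nothing
cat "$tmp"
```

Because the check is idempotent, restarting the container does not duplicate entries in `LocalSettings.php`.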
Some extensions (e.g., CirrusSearch/Elastica) require `composer install` inside the MediaWiki directory to provide `vendor/autoload.php`. The image takes care of running Composer where necessary.
- Elasticsearch runs as a single node, configured for MediaWiki.
- The entrypoint/script applies the CirrusSearch settings to `LocalSettings.php` and points MediaWiki to the ES host (`elasticsearch`).
- Initial index: use the helper script to bootstrap and build indices:
  docker compose exec -T mediawiki sh -lc '/usr/local/bin/generate-elasticindex.sh'
  This runs through `UpdateSearchIndexConfig`, `ForceSearchIndex`, and optionally `UpdateSuggesterIndex` once the metastore exists.
- Memory sizing: start with `ES_JAVA_OPTS=-Xms512m -Xmx512m`; raise to `1g` if you see sustained heap pressure. Keep the heap at roughly 50% of container RAM.
- Verification:
  - MediaWiki API search using Cirrus backend (after index): queries should return results; RelatedArticles can use Cirrus as backend.
  - Elasticsearch health: `GET /_cluster/health` should be `yellow` or `green` in single-node mode.
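
The health check above can be scripted. The sketch below only interprets a `/_cluster/health` response body; the helper name `cluster_health_ok` is hypothetical, and the sample JSON is made up for the demo.

```shell
#!/bin/sh
# Hypothetical helper: accept a /_cluster/health JSON body and succeed when
# the status is green or yellow (yellow is typical for a single node, since
# replica shards have no second node to be assigned to).
cluster_health_ok() {
  status=$(printf '%s\n' "$1" |
    sed -n 's/.*"status"[[:space:]]*:[[:space:]]*"\([a-z]*\)".*/\1/p')
  case $status in
    green|yellow) return 0 ;;
    *) return 1 ;;
  esac
}

sample='{"cluster_name":"docker-cluster","status":"yellow","number_of_nodes":1}'
if cluster_health_ok "$sample"; then
  echo "cluster usable"
fi
```

Inside the stack, the body could come from something like `curl -fsS http://elasticsearch:9200/_cluster/health`, assuming the default HTTP port 9200 and that `curl` is available in the container you run it from.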
- Script: `/usr/local/bin/generate-sitemap.sh`
  - Adds flags only when corresponding env vars are set (e.g., `--server`, `--urlpath`, `--skip-redirects`).
  - Runs as `www-data` and ensures the output directory exists/has proper ownership.
- Apache 301 for `/sitemap.xml`:
  - If `MW_SITEMAP_URLPATH` is set (e.g., `sitemap/`), redirect to `/<urlpath>/sitemap-index-<id>.xml`
  - If not set, redirect to `/sitemap-index-<id>.xml` in the docroot.
- Scheduling via supercronic:
  - `MW_SITEMAP_GENERATION=true`
  - `MW_SITEMAP_CRON="20 */12 * * *"` (example)
  - Similar pattern can be used for optional link-check jobs (RottenLinks), guarded by a file existence check.
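
The "flags only when the variable is set" behaviour can be sketched like this. It is a simplified illustration, not the contents of `generate-sitemap.sh`: the function name `build_sitemap_args` is hypothetical, and values containing spaces are not handled.

```shell
#!/bin/sh
# Hypothetical sketch: assemble sitemap generator flags from MW_SITEMAP_*
# variables, adding each flag only when its variable is set.
build_sitemap_args() {
  args=""
  if [ -n "${MW_SITEMAP_SERVER:-}" ]; then
    args="$args --server=$MW_SITEMAP_SERVER"
  fi
  if [ -n "${MW_SITEMAP_URLPATH:-}" ]; then
    args="$args --urlpath=$MW_SITEMAP_URLPATH"
  fi
  if [ "${MW_SITEMAP_SKIP_REDIRECTS:-false}" = "true" ]; then
    args="$args --skip-redirects"
  fi
  printf '%s\n' "${args# }"   # strip the leading space
}

MW_SITEMAP_SERVER=https://www.example.com
MW_SITEMAP_URLPATH=sitemap/
MW_SITEMAP_SKIP_REDIRECTS=true
build_sitemap_args
# prints: --server=https://www.example.com --urlpath=sitemap/ --skip-redirects
```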
This enables on-upload antivirus scanning in MediaWiki using ClamAV (clamd) with the built‑in MediaWiki antivirus interface.
Use a dedicated Compose profile and an environment flag:
# Without ClamAV
docker compose up -d
# With ClamAV enabled (profile + app flag)
CLAMAV_ENABLED=true COMPOSE_PROFILES=clamav docker compose up -d

Recommended .env entries:
CLAMAV_ENABLED=true
COMPOSE_PROFILES=clamav
# Optional host/port if you do not use the default service/port
CLAMAV_HOST=clamav
CLAMAV_PORT=3310

Add a robust boolean toggle and the ClamAV command mapping:
# Antivirus integration (ClamAV)
$clamavEnabled = filter_var(getenv('CLAMAV_ENABLED') ?: 'false', FILTER_VALIDATE_BOOLEAN);
if ($clamavEnabled) {
    $wgAntivirus = 'clamav';
    // Block uploads when the scanner cannot be reached.
    $wgAntivirusRequired = true;
    $wgAntivirusSetup['clamav'] = [
        'command' => '/usr/bin/clamdscan --no-summary --stdout --config-file=/etc/clamav/clamd.remote.conf %f',
        // Map clamdscan exit codes to MediaWiki scan results.
        'codemap' => [
            0 => AV_NO_VIRUS,
            1 => AV_VIRUS_FOUND,
            2 => AV_SCAN_FAILED,
            '*' => AV_SCAN_FAILED,
        ],
    ];
} else {
    $wgAntivirus = false;
}

# From the MediaWiki container: clamd ready?
echo PING | nc -w 3 clamav 3310 # expect: PONG
# EICAR test (harmless signature): expect exit code 1 (infected)
cat > /tmp/eicar.com.txt <<'EOF'
X5O!P%@AP[4\PZX54(P^)7CC)7}$EICAR-STANDARD-ANTIVIRUS-TEST-FILE!$H+H*
EOF
/usr/bin/clamdscan --no-summary --stdout --config-file=/etc/clamav/clamd.remote.conf /tmp/eicar.com.txt; echo $?

- Ensure `clamdscan` is installed in the MediaWiki container (Debian/Ubuntu: `apt-get install clamdscan`).
- If you upload large files, increase `StreamMaxLength` in `clamd.conf` (e.g., `StreamMaxLength 200M`) and restart the `clamav` service.
- With `$wgAntivirusRequired = true`, uploads are blocked if the scanner is unreachable (safer default).
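
When testing by hand, it can help to translate the `clamdscan` exit code the same way the codemap does. A small sketch, where the helper name `describe_scan_result` is hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: translate a clamdscan exit code using the same
# mapping as the codemap (0 = clean, 1 = virus found, anything else = failure).
describe_scan_result() {
  case $1 in
    0) echo "AV_NO_VIRUS" ;;
    1) echo "AV_VIRUS_FOUND" ;;
    *) echo "AV_SCAN_FAILED" ;;
  esac
}

describe_scan_result 0   # prints: AV_NO_VIRUS
describe_scan_result 1   # prints: AV_VIRUS_FOUND
describe_scan_result 2   # prints: AV_SCAN_FAILED
```

For example, running the EICAR scan and then `describe_scan_result $?` should report `AV_VIRUS_FOUND` when the scanner is reachable.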
- Suggested policy (example): git tag `v1.43.1` → container images tagged as `1.43.1`, `1.43`, `latest`
- Dependabot: keep Docker base images and GitHub Actions up to date.
- Watchtower labels (optional): permit automatic updates for selected services.
- Extensions: prefer pinned commits for stability; otherwise, branch fallback logic fetches latest for the chosen branch.
- Secrets: consider Docker `secrets:` for DB root password and admin creds.
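
The suggested tagging policy (git tag `v1.43.1` producing `1.43.1`, `1.43` and `latest`) can be derived with plain parameter expansion. A minimal sketch, assuming tags always look like `vMAJOR.MINOR.PATCH`; the helper name `image_tags` is hypothetical and how the tags are pushed is left to your CI.

```shell
#!/bin/sh
# Hypothetical helper: derive image tags from a git tag of the form
# vMAJOR.MINOR.PATCH. Prints one tag per line.
image_tags() {
  version=${1#v}        # v1.43.1 -> 1.43.1
  minor=${version%.*}   # 1.43.1  -> 1.43
  printf '%s\n' "$version" "$minor" latest
}

image_tags v1.43.1
# prints three lines: 1.43.1, 1.43, latest
```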
- VisualEditor & short URLs: ensure Apache allows encoded slashes: `AllowEncodedSlashes NoDecode` in your site conf.
- Sitemap redirect is 301 but file 404: confirm the sitemap script created `sitemap-index-<id>.xml` at the expected path (`/var/www/html` or `/var/www/html/<urlpath>`), and ownership is `www-data`.
- CirrusSearch says `Elastica\Client not found`: run Composer so `vendor/autoload.php` exists for Elastica; ensure both CirrusSearch and Elastica ran `composer install`.
- Elasticsearch heap pressure: check `/_cat/nodes?h=heap.percent,heap.current,heap.max` and logs for `CircuitBreakingException`; increase `ES_JAVA_OPTS` if needed.
- LocalSettings mount blocks installer: if you mount an existing `LocalSettings.php` in Compose, the automated installer is skipped; adjust `docker-compose.*.yml` for CI vs. dev.
- env_file vs environment precedence: values in `environment:` override `env_file` for the same key.
.
├─ Dockerfile-mediawiki
├─ Dockerfile-memcachephp
├─ docker-compose.dev.yml          # main compose for dev/prod
├─ docker-compose.ci.yml           # (optional) CI-oriented overrides
├─ resources/
│  └─ mediawiki/
│     ├─ mw-default-setup.sh       # creates/amends LocalSettings, short URLs, Cirrus, etc.
│     ├─ generate-sitemap.sh       # sitemap generator (runs as www-data)
│     └─ generate-elasticindex.sh  # Cirrus/Elasticsearch bootstrap + index
├─ conf/
│  ├─ mediawiki-rewrites.conf      # short URLs, VE-friendly settings
│  └─ memcachephp-proxy.conf       # reverse proxy for /memcacheui/
└─ data/
   └─ mediawiki/
      └─ conf/                     # mounted configs (LocalSettings.php, robots.txt, .htaccess, …)

This project builds on:
- MediaWiki (Wikimedia Foundation & contributors)
- Elasticsearch, CirrusSearch, Elastica
- MariaDB, Memcached, MemcachePHP
- supercronic