Config Baker Image
The config baker container may be used to execute all sorts of tasks around setting up, preparing and finalizing an instance of the Dataverse software. Its focus is bootstrapping non-initialized installations.
Supported Image Tags
This image is sourced from the main upstream code repository of the Dataverse software. Development and maintenance of the image’s code happens there (again, by the community). Community-supported image tags are based on the two most important upstream branches:
- The unstable tag corresponds to the develop branch, where pull requests are merged. (Dockerfile)
- The alpha tag corresponds to the master branch, where releases are cut from. (Dockerfile)
Image Contents
This image contains some crucial parts to make a freshly baked Dataverse installation usable.
Scripts
| Script | Description |
|---|---|
| | Run an initialization script contained in a persona. See |
| | Fixes filesystem permissions. The app and Solr containers run as non-privileged users and might need adjusted filesystem permissions on mounted volumes to be able to write data. Run without parameters to see usage details. |
| | Default script when running the container without parameters. Lists available scripts and details about them. |
| | Update a Solr |
Solr Template
In addition, at /template a Solr Configset is available, ready for Dataverse usage with a tuned core config and schema. Providing this template to a vanilla Solr image and using solr-precreate with it will create the necessary Solr search index.
The solrconfig.xml and schema.xml are included from the upstream project conf/solr/... folder. You are obviously free to provide such a template in some other way, maybe tuned for your purposes. As a start, the contained script update-fields.sh may be used to edit the field definitions.
Build Instructions
From here on, we assume you have Docker, Docker Desktop, Moby or some remote Docker host configured, up and running. Note: You need to use Maven when building this image, as we collate selective files from different places of the upstream repository. (Building with pure Docker Compose does not support this kind of selection.)
By default, building the application image also creates a new config baker image. Simply execute the Maven module's packaging target with the “container” profile activated from the project's Git root to build the image:
mvn -Pct package
If you specifically want to build a config baker image only, try
mvn -Pct package -Ddocker.filter=dev_bootstrap
The build of config baker involves copying Solr configset files. The Solr version used is inherited from Maven, acting as the single source of truth. Also, the tag of the image should correspond to the tag of the application image, as their usage is intertwined.
Some additional notes, using Maven parameters to change the build and use …:
- … a different tag only: add -Dconf.image.tag=tag. Note: default is ${app.image.tag}, which defaults to unstable
- … a different image name and tag: add -Dconf.image=name:tag. Note: default is gdcc/configbaker:${conf.image.tag}
- … a different image registry than Docker Hub: add -Ddocker.registry=registry.example.org (see also DMP docs on registries)
- … a different Solr version: use -Dsolr.version=x.y.z
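The parameters above can be combined in a single invocation. The following is a minimal sketch that only assembles and prints the Maven command line (a dry run). The function name build_configbaker_cmd and its positional arguments are hypothetical helpers for illustration, and mytag and x.y.z are placeholder values.

```shell
#!/bin/sh
# Sketch: assemble a config baker build command from optional overrides.
# build_configbaker_cmd is a hypothetical helper, not part of the image or the build.
# Arguments (all optional): $1 = image tag, $2 = Solr version, $3 = image registry
build_configbaker_cmd() (
  tag="${1:-}"
  solr_version="${2:-}"
  registry="${3:-}"

  # Base command for building the config baker image only.
  cmd="mvn -Pct package -Ddocker.filter=dev_bootstrap"

  # Append each override only when a value was given.
  if [ -n "$tag" ]; then cmd="$cmd -Dconf.image.tag=$tag"; fi
  if [ -n "$solr_version" ]; then cmd="$cmd -Dsolr.version=$solr_version"; fi
  if [ -n "$registry" ]; then cmd="$cmd -Ddocker.registry=$registry"; fi

  printf '%s\n' "$cmd"
)

# Dry run: print the command instead of executing it.
build_configbaker_cmd mytag x.y.z
```

Printing the command first makes it easy to review which defaults are being overridden before running the actual build.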
Processor Architecture and Multiarch
This image is published as a “multi-arch image”, supporting the most common architectures Dataverse usually runs on: AMD64 (Windows/Linux/…) and ARM64 (Apple M1/M2), by using Maven Docker Plugin’s BuildX mode.
Building the image via mvn -Pct package, etc. will only build for the architecture of the Docker machine’s CPU. Only mvn -Pct deploy -Ddocker.platforms=linux/amd64,linux/arm64 will trigger building on all enabled architectures. Yet, to enable building with non-native code on your build machine, you will need to set up a cross-platform builder.
On Linux, you should install qemu-user-static (preferably via your package management) on the host and run docker run --rm --privileged multiarch/qemu-user-static --reset -p yes to enable that builder. The Docker plugin will set up everything else for you.
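To see which of the two published platforms would require emulation on a given build machine, a small check like the following may help. It is a sketch under the assumption that uname -m reports x86_64/amd64 or aarch64/arm64 on the host. The function names docker_platform_for and needs_emulation are hypothetical helpers, not part of the image or the Maven build.

```shell
#!/bin/sh
# Sketch: map a machine architecture (as reported by `uname -m`) to a Docker platform string.
docker_platform_for() {
  case "$1" in
    x86_64|amd64)  echo "linux/amd64" ;;
    aarch64|arm64) echo "linux/arm64" ;;
    *)             echo "unsupported" ;;
  esac
}

# Print every target platform that is NOT native to the given architecture.
# These are the platforms that need a cross-platform (QEMU) builder.
needs_emulation() (
  native="$(docker_platform_for "$1")"
  for platform in linux/amd64 linux/arm64; do
    [ "$platform" = "$native" ] || echo "$platform"
  done
)

host_arch="$(uname -m)"
echo "native platform: $(docker_platform_for "$host_arch")"
echo "emulation needed for: $(needs_emulation "$host_arch" | tr '\n' ' ')"
```

If the second line lists a platform you intend to build for, the QEMU setup described above is required on that machine.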
Tunables
This image has no tunable runtime parameters yet.
Locations
| Location | Value | Description |
|---|---|---|
| | | Place to store the scripts. Part of |
| | | Place where the Solr Configset resides to create an index core from it. |
| | | Stores the bootstrapping personas in sub-folders. |
| | | Minimal set of scripts and data from upstream |
Exposed Ports
This image contains no runnable services yet, so no ports are exposed.
Entry & Extension Points
The entrypoint of this image is pinned to dumb-init to safeguard signal handling. You may feed any script or executable to it as command.
By using our released images as a base image, you can add your own scripting, personas, Solr configset and so on, adapting any aspect you need changed.
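As a rough illustration of extending the image, the sketch below prints a two-line Dockerfile that copies a custom persona into a derived image. It assumes personas live under /scripts/bootstrap (the path used by the bind mount in the examples of this page). The function name write_dockerfile and the persona name mypersona are hypothetical.

```shell
#!/bin/sh
# Sketch: print a Dockerfile that extends the config baker image with a custom persona.
# write_dockerfile is a hypothetical helper for illustration only.
# $1 = base image tag, $2 = persona directory name (relative to the build context)
write_dockerfile() {
  cat <<EOF
FROM gdcc/configbaker:$1
# Add the custom persona to the image (assumed persona location: /scripts/bootstrap)
COPY $2 /scripts/bootstrap/$2
EOF
}

# Print the Dockerfile to stdout for review.
write_dockerfile unstable mypersona
```

Redirecting the output to a file named Dockerfile and running docker build in the directory containing the persona folder would then produce the derived image. A bind mount, as shown in the examples, achieves the same without building anything.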
Examples
Docker Compose snippet to wait for Dataverse deployment and execute bootstrapping using a custom persona you added by bind mounting (as an alternative to extending the image):
bootstrap:
  image: gdcc/configbaker:unstable
  restart: "no"
  command:
    - bootstrap.sh
    - mypersona
  volumes:
    - ./mypersona:/scripts/bootstrap/mypersona
  networks:
    - dataverse
Docker Compose snippet to prepare execution of Solr and copy your custom configset you added by bind mounting (instead of an extension). Note that solr-precreate will not overwrite an already existing core! To update the config of an existing core, you need to mount the right volume with the stateful data!
solr_initializer:
  container_name: solr_initializer
  image: gdcc/configbaker:unstable
  restart: "no"
  command:
    - sh
    - -c
    # Fix permissions on the Solr volume, then copy the configset to the shared folder
    - "fix-fs-perms.sh solr && cp -a /template/* /solr-template"
  volumes:
    - ./volumes/solr/data:/var/solr
    - ./volumes/solr/conf:/solr-template
    - /tmp/my-generated-configset:/template

solr:
  container_name: solr
  hostname: solr
  image: solr:${SOLR_VERSION}
  depends_on:
    # Must match the name of the initializer service defined above
    - solr_initializer
  restart: on-failure
  ports:
    - "8983:8983"
  networks:
    - dataverse
  command:
    - "solr-precreate"
    - "collection1"
    - "/template"
  volumes:
    - ./volumes/solr/data:/var/solr
    - ./volumes/solr/conf:/template