Configuration¶
Warning
For usage on Kubernetes, all configuration can be stored in a ConfigMap.
Configuring Dataverse is done in multiple ways:
- Database connection, mail gateway, bootstrap values, etc. used in scripts are read from environment variables directly.
  - Use these variable names in your ConfigMap.
  - See default.config for a list of these special ones.
- Things like file storage, networking, most DOI settings, etc. are basic system settings and can be set via Java system properties, residing in the Glassfish domain configuration.
  - See JVM Options on how to configure these.
  - See Dataverse Installation Guide: JVM options for a complete list of all available options.
- More options are stored in the database and configured via API and/or UI.
  - See Database Settings on how to configure these.
  - See Dataverse Installation Guide: Database settings for an exhaustive list of all available options.
Note
All of this should be streamlined into an easier-to-use configuration system. See https://github.com/IQSS/dataverse/issues/5293 for more. Please leave a comment there if you feel the same.
System Properties: JVM options¶
The basic idea is to map environment variables to Java system properties each time a Dataverse container starts with the default entrypoint (being the application server).
- Simply pick a JVM option from the list and replace “.” with “_” (“-” is not allowed in env var names!).
- Put the transformed name as a key into ConfigMap.data.
- Add your value. Be sure to use simple strings only: no numbers, no complex types. Wrap any value that YAML would otherwise read as a number or boolean in double quotes (" ").
Note
Currently many JVM options have dashes in them, and a dash is not an allowed character in an environment variable name. As a workaround, replace any dash with __ (double underscore). It will be transformed back into a - internally when the container starts. See the example below.
Examples (see also Full configuration example below):
data:
  dataverse_fqdn: data.example.org
  dataverse_siteUrl: https://\${dataverse.fqdn}
  dataverse_auth_password__reset__timeout__in__minutes: "30"
  doi_baseurlstring: https://mds.test.datacite.org
  doi_username: EXAMPLEORG.TEST
  dataverse_files_directory: /data
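The name mapping described above can be sketched in a few lines of shell. This is an illustration only, not the container's actual entrypoint code; the function name `env_to_jvm_option` is made up for this example.

```shell
#!/bin/sh
# Hypothetical helper (not part of the image): turn a ConfigMap key into
# the JVM option name it stands for.
env_to_jvm_option() {
  # Order matters: restore dashes from "__" first, then turn the
  # remaining single underscores into dots.
  printf '%s\n' "$1" | sed -e 's/__/-/g' -e 's/_/./g'
}

env_to_jvm_option dataverse_auth_password__reset__timeout__in__minutes
# prints: dataverse.auth.password-reset-timeout-in-minutes
```

Replacing the double underscores first is what keeps the two rules from colliding: done the other way round, every __ would already have become two dots.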
Warning
DO NOT USE THIS ConfigMap FOR PASSWORDS! Those are done via Kubernetes Secrets, see Credentials and Secrets.
Database Options: Using curl¶
As database settings are persisted in the database, they don’t need to be set every time the container starts. To be consistent and easy to use, the same ConfigMap used for JVM options can be used for these settings, but you need to create a Job or even a CronJob to apply them.
Note
Of course you can choose to use your own tools and scripts for this. Basically it’s just curl calls to the Admin API.
Provide settings¶
- Pick a Database setting.
- Remove the leading : and replace it with db_. Keep the Pascal case!
- Put the transformed name as a key into ConfigMap.data.
. - Add your value, which can be any value you see in the docs. Keep in mind: when you need to use JSON, format it as a string!
- When you need to delete a setting, just provide an empty value.
Examples (see also Full configuration example below):
data:
  doi_username: EXAMPLEORG.TEST
  db_DoiProvider: DataCite
  db_Protocol: doi
  db_Authority: "10.12345"
  db_Shoulder: EXAMPLE/
  db_StatusMessageHeader: "Example.org is not yet in production"
  db_StatusMessageText: "<br />Please do not save any real data, only use for testing and sneak-peek."
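As a rough sketch of what applying these keys amounts to, the shell function below maps a db_ key back to its setting name and issues the matching Admin API call: PUT to set a value, DELETE when the value is empty. The function name and the `DATAVERSE_URL` and `CURL` variables are assumptions made for this example; the project's own configuration job may work differently.

```shell
#!/bin/sh
# Assumption: the Admin API is reachable at this address.
DATAVERSE_URL="${DATAVERSE_URL:-http://localhost:8080}"
# Overridable so the function can be exercised without sending requests.
CURL="${CURL:-curl}"

# Hypothetical helper: apply_db_setting <configmap-key> <value>
apply_db_setting() {
  # db_DoiProvider -> :DoiProvider (the Pascal case is kept as-is)
  name=":${1#db_}"
  if [ -z "$2" ]; then
    # An empty value means: delete the setting.
    "$CURL" -sS -X DELETE "$DATAVERSE_URL/api/admin/settings/$name"
  else
    "$CURL" -sS -X PUT -d "$2" "$DATAVERSE_URL/api/admin/settings/$name"
  fi
}
```

For instance, `apply_db_setting db_Protocol doi` would send `doi` to `/api/admin/settings/:Protocol`, and `apply_db_setting db_StatusMessageText ''` would remove that setting.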
Warning
DO NOT USE THIS ConfigMap FOR PASSWORDS! Those are done via Kubernetes Secrets, see Credentials and Secrets.
Apply settings¶
Remember: you will need to update your ConfigMap when you want to apply changes. Think about which file you keep the map in: having it in two locations is a bad idea. It’s always a good idea to put it in revision control.
# Update ConfigMap:
kubectl apply -f k8s/dataverse/configmap.yaml
# Deploy a new config job:
kubectl create -f k8s/dataverse/jobs/configure.yaml
You might consider providing a CronJob for scheduled, regular updates.
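If the Job defined in configure.yaml has a fixed name, a second `kubectl create` will fail while the previous, completed Job still exists. One way around that, sketched here under that assumption, is to delete the old Job first and then wait for the new one to finish. The function name and the timeout are arbitrary choices for this example.

```shell
#!/bin/sh
# Hypothetical wrapper around the two commands shown above.
reapply_config() {
  kubectl apply -f k8s/dataverse/configmap.yaml || return 1
  # Remove a previous run, if any (assumes the Job has a fixed name).
  kubectl delete -f k8s/dataverse/jobs/configure.yaml --ignore-not-found || return 1
  kubectl create -f k8s/dataverse/jobs/configure.yaml || return 1
  # Block until the Job reports completion or the timeout expires.
  kubectl wait --for=condition=complete --timeout=300s \
    -f k8s/dataverse/jobs/configure.yaml
}
```

Calling `reapply_config` from the repository root runs the four steps in order and stops at the first one that fails.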
Details of the configuration job¶
Alternative approaches¶
There’s also stakater/Reloader, a tool to auto-reload resources. To use it, you will need to 1) change the metadata of deployments (easy) and 2) change the image to include an entrypoint script that runs the configuration script on boot (not so easy). Feel free to leave feedback in the project if you are interested in having this built in.
Full configuration example¶
Below you can find an example ConfigMap using all three types of variables:
---
kind: ConfigMap
apiVersion: v1
metadata:
  name: dataverse
  labels:
    app.kubernetes.io/name: configmap
    app.kubernetes.io/version: "1.0"
    app.kubernetes.io/component: configmap
    app.kubernetes.io/part-of: dataverse
    app.kubernetes.io/managed-by: kubectl
data:
  ### GENERAL SETTINGS
  HOST_DNS_ADDRESS: data.example.org
  POSTGRES_DATABASE: dataverse
  dataverse_fqdn: data.example.org
  dataverse_siteUrl: https://\${dataverse.fqdn}
  dataverse_auth_password__reset__timeout__in__minutes: "30"
  ### CONTACT SETTINGS
  CONTACT_MAIL: rdm@example.org
  ADMIN_MAIL: &admin Example - Research Data Management <rdm@example.org>
  db_SystemEmail: *admin
  ### DOI SETTINGS
  doi_baseurlstring: https://mds.test.datacite.org
  doi_username: EXAMPLEORG.TEST
  db_DoiProvider: DataCite
  db_Protocol: doi
  db_Authority: "10.12345"
  db_Shoulder: EXAMPLE/
  ### FILE STORAGE
  dataverse_files_directory: /data
  dataverse_files_storage__driver__id: "s3"
  dataverse_files_s3__custom__endpoint__url: http://minio:9000
  dataverse_files_s3__bucket__name: dataverse
  # required for Minio!
  dataverse_files_s3__path__style__access: "true"
  ### CUSTOMIZATION
  db_StatusMessageHeader: "Example.org is not yet in production"
  db_StatusMessageText: "<br />Please do not save any real data, only use for testing and sneak-peek."
Sane defaults: default.config¶
Some things need sane defaults, which can be found in default.config (see below). You might find those useful as an example for your personally tuned ConfigMap.
# Variables used in resource creation
HOST_DNS_ADDRESS=${HOST_DNS_ADDRESS:-localhost}
POSTGRES_SERVER=${POSTGRES_SERVER:-postgresql}
POSTGRES_PORT=${POSTGRES_PORT:-5432}
POSTGRES_USER=${POSTGRES_USER:-dataverse}
POSTGRES_DATABASE=${POSTGRES_DATABASE:-${POSTGRES_USER}}
MAIL_SERVER=${MAIL_SERVER:-postfix}
CONTACT_MAIL=${CONTACT_MAIL:-"dataverse-k8s-contact@mailinator.com"}
ADMIN_MAIL=${ADMIN_MAIL:-"Dataverse on K8S <dataverse-k8s-admin@mailinator.com>"}
ADMIN_PASSWORD=${ADMIN_PASSWORD:-admin1}
MAX_RAM_PERCENTAGE=${MAX_RAM_PERCENTAGE:-25}
# System properties based Dataverse configuration options
# (Exporting needed as they cannot be seen by `env` otherwise)
export dataverse_files_directory=${dataverse_files_directory:-/data}
export dataverse_rserve_host=${dataverse_rserve_host:-rserve}
export dataverse_rserve_port=${dataverse_rserve_port:-6311}
export dataverse_rserve_user=${dataverse_rserve_user:-rserve}
export dataverse_rserve_password='${ALIAS=rserve_password_alias}'
export dataverse_fqdn=${dataverse_fqdn:-${HOST_DNS_ADDRESS}}
export dataverse_siteUrl=${dataverse_siteUrl:-"http://\${dataverse.fqdn}:8080"}
export dataverse_auth_password__reset__timeout__in__minutes=${dataverse_auth_password__reset__timeout__in__minutes:-60}
export dataverse_timerServer=${dataverse_timerServer:-true}
export doi_username=${doi_username:-test}
export doi_password='${ALIAS=doi_password_alias}'
export doi_baseurlstring=${doi_baseurlstring:-http://mds.test.datacite.org}
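To illustrate how these defaults behave, the sketch below reuses the `${VAR:-default}` pattern from the file for the two database variables. The function name `resolve_db_name` is made up for this example; it runs in a subshell so it does not change the caller's environment.

```shell
#!/bin/sh
# Hypothetical demo: resolve the database name the way default.config does.
resolve_db_name() (
  # Fall back to "dataverse" when POSTGRES_USER is unset or empty.
  POSTGRES_USER=${POSTGRES_USER:-dataverse}
  # The database name in turn falls back to the (resolved) user name.
  POSTGRES_DATABASE=${POSTGRES_DATABASE:-${POSTGRES_USER}}
  printf '%s\n' "$POSTGRES_DATABASE"
)
```

With neither variable set, the function prints `dataverse`; with only `POSTGRES_USER=dv` set, it prints `dv`. Because `:-` treats an empty string like an unset variable, an empty value in the ConfigMap also falls back to the default.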