Configuration

Warning

DO. NOT. SET. ANY. PLAIN. PASSWORD. IN. ANY. ConfigMap.
(Or Deployment, Pod, …) Use Secrets for this. See Credentials and Secrets.

For usage on Kubernetes, all configuration can be stored in a ConfigMap.

Configuring Dataverse is done in multiple ways:

  1. Database connection, mail gateway, bootstrap values, etc. used in scripts are read directly from environment variables.

    • Use these variable names in your ConfigMap.

    • See default.config for a list of these special ones.

  2. Things like file storage, networking, most DOI options, etc. are basic system settings and can be set via Java system properties, which reside in the Glassfish domain configuration.

  3. More options are stored in the database and configured via API and/or UI.

Note

All of this should be streamlined into an easier-to-use configuration system. See https://github.com/IQSS/dataverse/issues/5293 for more. Please leave a comment there if you feel the same.

System Properties: JVM options

The basic idea is to map environment variables to Java system properties each time a Dataverse container starts with the default entrypoint (being the application server).

  1. Simply pick a JVM option from the list and replace “.” with “_” (“-” is not allowed in env var names!).

  2. Put the transformed name as a key into the ConfigMap.data.

  3. Add your value. Use simple strings only: no numbers, no complex types. Quote any value that YAML would otherwise read as another type (for example "30" or "true").

Note

Many JVM options currently contain dashes, which are not allowed in environment variable names. As a workaround, replace each dash with __ (double underscore). It is transformed back into a - internally when the container starts. See example below.

Examples (see below Full configuration example):

data:
  ### GENERAL SETTINGS
  dataverse_fqdn: data.example.org
  dataverse_siteUrl: https://\${dataverse.fqdn}
  dataverse_auth_password__reset__timeout__in__minutes: "30"

  ### DOI SETTINGS
  doi_baseurlstring: https://mds.test.datacite.org
  doi_username: EXAMPLEORG.TEST
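
The name mapping used in the example above can be sketched in a few lines of shell. This is only an illustration of the rule described here (double underscore becomes a dash, every remaining underscore becomes a dot). The function name env_to_jvm_option is hypothetical, and the entrypoint in the image may implement the transformation differently.

```shell
# Hypothetical helper (not taken from the image): derive the JVM option
# name from an environment variable name.
env_to_jvm_option() {
  # 1) "__" stands for a dash, 2) every remaining "_" stands for a dot.
  printf '%s\n' "$1" | sed -e 's/__/-/g' -e 's/_/./g'
}

env_to_jvm_option dataverse_fqdn
env_to_jvm_option dataverse_auth_password__reset__timeout__in__minutes
```

The order of the two substitutions matters: replacing single underscores first would also consume the double underscores that are meant to become dashes.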

Warning

DO NOT USE THIS ConfigMap FOR PASSWORDS! Those are done via Kubernetes Secrets, see Credentials and Secrets.

Database Options: Using curl

As database settings are persistent in, well, the database, they don’t need to be set every time the container starts. To be consistent and easy to use, the same ConfigMap used for JVM options can be used for these settings, but you need to create a Job or even a CronJob to apply them.

Note

Of course you can choose to use your own tools and scripts for this. Basically it’s just curl calls to the Admin API.

Provide settings

  1. Pick a Database setting

  2. Remove the : and replace it with db_. Keep the Pascal case!

  3. Put the transformed name as a key into the ConfigMap.data.

  4. Add your value, which can be any value you see in the docs. Keep in mind: when you need to use JSON, format it as a string!

  5. When you need to delete a setting, just provide an empty value.

Examples (see below Full configuration example):

data:
  ### DOI SETTINGS
  db_DoiProvider: DataCite
  db_Protocol: doi
  db_Authority: "10.12345"
  db_Shoulder: EXAMPLE/

  ### CUSTOMIZATION
  db_StatusMessageHeader: "Example.org is not yet in production"
  db_StatusMessageText: "<br />Please do not save any real data, only use for testing and sneak-peek."
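
For readers who prefer their own scripts, the following sketch shows how one db_ key could be turned into an Admin API call. It only prints the curl command instead of running it. The function name db_setting_command and the DATAVERSE_URL variable (with its localhost default) are assumptions for illustration; the configuration job shipped with this project may work differently, and access to the Admin API may need additional credentials in your setup.

```shell
# Hypothetical helper: print the curl call for one "db_*" key.
# DATAVERSE_URL is an assumed variable, defaulting to a local instance.
DATAVERSE_URL="${DATAVERSE_URL:-http://localhost:8080}"

db_setting_command() {
  key="$1"
  value="$2"
  name=":${key#db_}"   # db_DoiProvider -> :DoiProvider
  url="${DATAVERSE_URL}/api/admin/settings/${name}"
  if [ -z "$value" ]; then
    # An empty value means: delete the setting.
    printf "curl -sS -X DELETE '%s'\n" "$url"
  else
    # Otherwise store the value (values containing single quotes
    # would need extra escaping in the printed command).
    printf "curl -sS -X PUT -d '%s' '%s'\n" "$value" "$url"
  fi
}

db_setting_command db_DoiProvider DataCite
db_setting_command db_Shoulder ""
```

Printing the command first makes it easy to review what would be sent before piping the output to a shell.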

Warning

DO NOT USE THIS ConfigMap FOR PASSWORDS! Those are done via Kubernetes Secrets, see Credentials and Secrets.

Apply settings

Remember: you will need to update your ConfigMap when you want to apply changes. Decide on a single file in which to keep the map; having it in two locations is a bad idea. It’s always a good idea to put it under revision control.

# Update ConfigMap:
kubectl apply -f path/to/your/configmap.yaml
# Deploy a new config job:
kubectl create -f https://gitcdn.link/repo/IQSS/dataverse-kubernetes/release/k8s/dataverse/jobs/configure.yaml

You might consider providing a CronJob for scheduled, regular updates.

See also

The Bootstrap Job also applies the initial settings for you, so there is no need to run a configuration job until you change your configuration again.

Details of the configuration job

@startuml
!includeurl "https://raw.githubusercontent.com/michiel/plantuml-kubernetes-sprites/master/resource/k8s-sprites-unlabeled-25pct.iuml"

actor User
participant "<color:#royalblue><$secret></color>\nSecrets" as S
participant "<color:#royalblue><$cm></color>\nConfigMap" as CM
participant "<color:#royalblue><$pod></color>\nPostgreSQL" as P
participant "<color:#royalblue><$pod></color>\nDataverse" as D
participant "<color:#royalblue><$job></color>\nConfigure Job" as CJ

create CJ
User -> CJ: Deploy Configure Job
S -> CJ: Pass API key
CM -> CJ: Pass settings
CJ <<-->> D: wait for
...After Dataverse ready......
CJ -> D: Configure Dataverse DB-based\nsettings via API
activate D
D -> P: Store settings
return
destroy CJ
@enduml
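
The "wait for" step in the diagram above could look roughly like the loop below when written by hand. This is a sketch under assumptions: the function name wait_for_dataverse, the retry count and the delay are illustrative, and the version endpoint of the Dataverse API is used here as a simple readiness probe. The actual job may check readiness in another way.

```shell
# Sketch of the "wait for" step. Assumptions: URL, retry count and delay
# are illustrative; /api/info/version serves as the readiness probe.
wait_for_dataverse() {
  url="$1"
  retries="${2:-30}"
  delay="${3:-10}"
  i=0
  while [ "$i" -lt "$retries" ]; do
    # -f makes curl return non-zero on HTTP errors (4xx/5xx).
    if curl -sSf -o /dev/null "${url}/api/info/version"; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  echo "Dataverse not reachable at ${url} after ${retries} attempts" >&2
  return 1
}
```

A call such as wait_for_dataverse http://dataverse:8080 (the service name and port are assumptions) would then gate the curl calls that apply the settings.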

Alternative approaches

There’s also stakater/Reloader, a tool to auto-reload resources. To use it, you will need to 1) change metadata of deployments (easy) and 2) change the image to include an entrypoint script that runs the configuration script on boot (not so easy). Feel free to leave feedback in the project if you are interested in having this built in.

Full configuration example

Below you can find an example ConfigMap using all three types of variables.

See also

Most likely you’ll be interested in configuration of External Authentication, too. It’s left out here to stay in focus.

configmap.yaml
---
kind: ConfigMap
apiVersion: v1
metadata:
  name: dataverse
  labels:
    app.kubernetes.io/name: configmap
    app.kubernetes.io/version: "1.0"
    app.kubernetes.io/component: configmap
    app.kubernetes.io/part-of: dataverse
    app.kubernetes.io/managed-by: kubectl
data:
  ### GENERAL SETTINGS
  POSTGRES_DATABASE: dataverse
  dataverse_fqdn: data.example.org
  dataverse_siteUrl: https://\${dataverse.fqdn}
  dataverse_auth_password__reset__timeout__in__minutes: "30"

  ### CONTACT SETTINGS
  # Sender address of all mails sent by Dataverse
  MAIL_FROMADDRESS: "do-not-reply@example.org"
  # Root dataverse contact
  CONTACT_MAIL: rdm@example.org
  # Installation contact
  db_SystemEmail: "Example - Research Data Management <rdm@example.org>"

  ### DOI SETTINGS
  doi_baseurlstring: https://mds.test.datacite.org
  doi_username: EXAMPLEORG.TEST
  db_DoiProvider: DataCite
  db_Protocol: doi
  db_Authority: "10.12345"
  db_Shoulder: EXAMPLE/

  ### FILE STORAGE
  dataverse_files_directory: /data
  dataverse_files_storage__driver__id: "myremote"
  dataverse_files_myremote_type: "s3"
  dataverse_files_myremote_label: "My Remote S3 Object Store"
  dataverse_files_myremote_custom__endpoint__url: http://minio:9000
  dataverse_files_myremote_bucket__name: dataverse
  # required for Minio!
  dataverse_files_myremote_path__style__access: "true"

  ### CUSTOMIZATION
  db_StatusMessageHeader: "Example.org is not yet in production"
  db_StatusMessageText: "<br />Please do not save any real data, only use for testing and sneak-peek."
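
To see which of the three mechanisms a key in the ConfigMap above belongs to, a small classifier like the following may help. It relies only on the naming conventions described on this page and assumes that JVM options start with dataverse_ or doi_, as they do in the examples. The function name classify_key is hypothetical.

```shell
# Hypothetical helper: tell which mechanism a ConfigMap key belongs to,
# based only on the naming conventions described on this page.
classify_key() {
  case "$1" in
    db_*)              echo "database setting (Admin API)" ;;
    dataverse_*|doi_*) echo "JVM option (system property)" ;;
    *)                 echo "environment variable (read by scripts)" ;;
  esac
}

for key in POSTGRES_DATABASE dataverse_fqdn doi_username db_Protocol; do
  printf '%s: %s\n' "$key" "$(classify_key "$key")"
done
```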

Sane defaults: default.config

Some things need sane defaults, which can be found in default.config (see below). You might find those useful as an example for your personally tuned ConfigMap.

default.config
# Variables used in resource creation

POSTGRES_SERVER=${POSTGRES_SERVER:-postgresql}
POSTGRES_PORT=${POSTGRES_PORT:-5432}
POSTGRES_USER=${POSTGRES_USER:-dataverse}
POSTGRES_DATABASE=${POSTGRES_DATABASE:-${POSTGRES_USER}}
MAIL_SERVER=${MAIL_SERVER:-"postfix"}
MAIL_FROMADDRESS=${MAIL_FROMADDRESS:-"do-not-reply@mailinator.com"}
CONTACT_MAIL=${CONTACT_MAIL:-"dataverse-k8s-contact@mailinator.com"}
ENABLE_JMX_EXPORT=${ENABLE_JMX_EXPORT:-0}
JMX_EXPORTER_PORT=${JMX_EXPORTER_PORT:-8081}
JMX_EXPORTER_CONFIG=${JMX_EXPORTER_CONFIG:-"${HOME}/jmx_exporter_config.yaml"}

#####   #####   #####   #####   #####   #####   #####   #####   #####   #####
# System properties based Dataverse configuration options
# (Exporting needed as they cannot be seen by `env` otherwise)

export dataverse_files_directory=${dataverse_files_directory:-/data}
export dataverse_files_storage__driver__id=${dataverse_files_storage__driver__id:-local}

if [ "${dataverse_files_storage__driver__id}" = "local" ]; then
  export dataverse_files_local_type=${dataverse_files_local_type:-file}
  export dataverse_files_local_label=${dataverse_files_local_label:-Local}
  export dataverse_files_local_directory=${dataverse_files_local_directory:-/data}
fi

export dataverse_rserve_host=${dataverse_rserve_host:-rserve}
export dataverse_rserve_port=${dataverse_rserve_port:-6311}
export dataverse_rserve_user=${dataverse_rserve_user:-rserve}
export dataverse_rserve_password='${ALIAS=rserve_password_alias}'
export dataverse_fqdn=${dataverse_fqdn:-"localhost"}
export dataverse_siteUrl=${dataverse_siteUrl:-"http://\${dataverse.fqdn}:8080"}
export dataverse_auth_password__reset__timeout__in__minutes=${dataverse_auth_password__reset__timeout__in__minutes:-60}
export dataverse_timerServer=${dataverse_timerServer:-true}

export doi_username=${doi_username:-test}
export doi_password='${ALIAS=doi_password_alias}'
export doi_baseurlstring=${doi_baseurlstring:-http://mds.test.datacite.org}

#####   #####   #####   #####   #####   #####   #####   #####   #####   #####
# Database based Dataverse configuration options
# (Exporting needed as they cannot be seen by `env` otherwise)

export db_SystemEmail=${db_SystemEmail:-"Dataverse on K8S <dataverse-k8s-admin@mailinator.com>"}
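
The file above relies on the shell's ${VAR:-default} parameter expansion: the default is used only when the variable is unset or empty, so any value provided through the ConfigMap takes precedence. A minimal illustration, reusing names and values from default.config and the example ConfigMap:

```shell
# Illustration of the ${VAR:-default} pattern; values mirror default.config.
unset POSTGRES_USER POSTGRES_DATABASE
POSTGRES_USER=${POSTGRES_USER:-dataverse}
POSTGRES_DATABASE=${POSTGRES_DATABASE:-${POSTGRES_USER}}
echo "$POSTGRES_USER / $POSTGRES_DATABASE"   # both fall back to "dataverse"

# A value coming from the ConfigMap wins over the default.
dataverse_fqdn=data.example.org
dataverse_fqdn=${dataverse_fqdn:-"localhost"}
echo "$dataverse_fqdn"                       # still data.example.org
```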