Add Dockerfile and OpenShift template
Needs Review · Public

Authored by csomh on Jun 7 2017, 10:11 AM.

Details

Reviewers
dcallagh
Group Reviewers
resultsdb
Summary

Add a Dockerfile to build an image from the rpm, and an OpenShift template.

Based on the work done by dcallagh for waiverdb[0].

At this point the resultsdb image is expected to be available in the
internal registry of the cluster.

[0] https://pagure.io/waiverdb/pull-request/46

Test Plan
  1. Start and configure a local OpenShift cluster:
oc cluster up
oc login -u system:admin
oadm policy add-role-to-user system:registry developer
oadm policy add-role-to-user system:image-builder developer
  2. Log in to the internal registry of the cluster:
oc login -u developer -p developer
docker login -u developer -p $(oc whoami -t) 172.30.1.1:5000
  3. Build, tag and push the resultsdb image to the internal registry:
docker build -f openshift/Dockerfile --tag resultsdb --build-arg resultsdb_rpm=resultsdb-2.0.2-1.fc25.noarch.rpm .
docker tag resultsdb 172.30.1.1:5000/myproject/resultsdb:latest
docker push 172.30.1.1:5000/myproject/resultsdb:latest
  4. Create the environment from the template:
oc process -f openshift/resultsdb-test-template.yaml -p TEST_ID=123 -p RESULTSDB_APP_VERSION=latest | oc apply -f -

Diff Detail

Repository
rRSDB resultsdb
Branch
openshift (branched from develop)
Lint
Lint Skipped (Excuse: n/a)
Unit
Unit Tests Skipped
Build Status
Buildable 1175
Build 1175: arc lint + arc unit
csomh created this revision.Jun 7 2017, 10:11 AM
csomh edited the test plan for this revision. (Show Details)Jun 7 2017, 10:14 AM
dcallagh added inline comments.Jun 8 2017, 12:32 AM
openshift/Dockerfile
20

I guess for cleanliness this should also remove the RPM package from /tmp? I think I missed that in the WaiverDB one too.

openshift/resultsdb-test-template.yaml
52

So... who or what is filling in the variables here like ${DATABASE_PASSWORD}? Is that a feature of the config parser that resultsdb is using? Because OpenShift itself has no templating functionality for config files, right? That was why I had to add support in WaiverDB for passing in the db password and Flask secret key through the environment instead of the config file.

openshift/run_app.sh
23

Does resultsdb care what user it runs as? Does it actually require a user named resultsdb to exist? Does it write anything to the filesystem ever? Surely it doesn't?

I personally think this nss_wrapper approach is a lot hackier and messier than just telling the container to run as an arbitrary user id (as in USER 1001).

31

I think the right way to do this is to have a separate container/pod specifically for populating or upgrading the database, and have OpenShift fire that off as part of each new deployment. The problem with doing it here inside the app itself is that it's racy (you will be starting multiple copies of the app and each one will be trying to do the init_db). WaiverDB makes this same mistake right now and we need to fix it in there too.

Anyway -- not necessarily something that you need to fix now. This should be okay as a starting point.

csomh added inline comments.Jun 8 2017, 10:16 AM
openshift/Dockerfile
20

Right, I'll add this.

openshift/resultsdb-test-template.yaml
52

It's oc process doing the magic. If you run something like:

oc process -f openshift/resultsdb-test-template.yaml -p TEST_ID=123 -p RESULTSDB_APP_VERSION=latest --local -o yaml

you can check that all these are nicely replaced with the template parameters.

(Actually I did not know about this either, I just assumed that works this way :) )
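
For illustration, a stripped-down template sketch (object names and the parameter set here are hypothetical, not copied from the diff) showing how `oc process` fills in `${...}` references from the declared parameters before the objects ever reach the cluster:

```yaml
apiVersion: v1
kind: Template
metadata:
  name: resultsdb-test-template
parameters:
- name: TEST_ID
  required: true
- name: DATABASE_PASSWORD
  # oc process can also generate values, e.g. a random password:
  generate: expression
  from: "[a-zA-Z0-9]{32}"
objects:
- apiVersion: v1
  kind: ConfigMap
  metadata:
    name: resultsdb-config-${TEST_ID}
  data:
    settings.py: |
      # Substituted by oc process, not by OpenShift or the app:
      SQLALCHEMY_DATABASE_URI = 'postgresql://resultsdb:${DATABASE_PASSWORD}@database/resultsdb'
```

Running `oc process --local -o yaml` on such a file shows the fully substituted objects without touching the cluster.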

176

What about this? This expects the image to be pushed to the internal registry of the cluster. I would say this is more portable than referencing an external registry, but it hard-codes the registry IP and project name - I should check if there are some built-in template variables that could be used for these.

openshift/run_app.sh
23

I've used the nss_wrapper approach because this was the recommended way in the docs, although I do agree that it's hackish and messy.
Besides this, I cannot think of a good reason why the 'USER 1001' approach wouldn't work in this case (maybe it will be less future-proof), so I'll make this change.

31

I think the right way to do this is to have a separate container/pod specifically for populating or upgrading the database, and have OpenShift fire that off as part of each new deployment.

Wouldn't this stop us from doing rolling releases? Shouldn't resultsdb be able to handle a situation where the database has been updated, but there are still some pods running older versions?

You are right about the racy aspect, though; I'll have to search for some ideas on how this could be handled.

csomh updated this revision to Diff 3062.Jun 9 2017, 8:07 AM

Consider comments

  • Delete resultsdb rpm after yum install
  • Set user ID instead of using nss wrapper
csomh marked 6 inline comments as done.Jun 9 2017, 8:09 AM
csomh updated this revision to Diff 3063.Jun 9 2017, 11:20 AM

Use ImageStream for API container

This avoids hard-coding the internal registry IP and port and
makes the OpenShift project in which the application is created
configurable.

csomh marked an inline comment as done.Jun 9 2017, 11:22 AM
dcallagh added inline comments.Jun 13 2017, 5:49 AM
openshift/Dockerfile
31

So this means that the container will end up running the app inside a Flask development server, which is explicitly not designed for production use:

http://flask.pocoo.org/docs/0.12/api/#flask.Flask.run

"Do not use run() in a production setting. It is not intended to meet security and performance requirements for a production server. Instead, see Deployment Options for WSGI server recommendations."

So this is okay for now, if this container is only intended for test environments. But if we want to run the container in production it will need to use a proper server, like gunicorn.
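
For reference, a gunicorn-based setup could be driven by a small config file along these lines (a sketch only; the `resultsdb:app` module path and the values are assumptions, not taken from the diff):

```python
# gunicorn.conf.py -- loaded via: gunicorn -c gunicorn.conf.py resultsdb:app
# (the resultsdb:app module path is an assumption for illustration)

# Listen on all interfaces inside the container.
bind = "0.0.0.0:5001"

# A handful of sync workers; tune per pod CPU/memory limits.
workers = 4

# Send access and error logs to stdout/stderr so OpenShift collects them.
accesslog = "-"
errorlog = "-"
```
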

openshift/resultsdb-test-template.yaml
52

Oh yeah I see. So that means this won't actually be usable as is for a "real" deployment (prod or prod-like) since it would mean the secret is embedded directly inside the config map. It should be passed in as an env var to the container instead, which might mean patching ResultsDB to accept its secrets from environment variables the way I had to do for WaiverDB as well.

But I guess this is okay as a first step, just for test environments.
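
For context, the env-var approach described above could look roughly like this in a Flask-style settings module (a hedged sketch, not resultsdb's actual code; variable names and the connection string are assumptions):

```python
import os

# Hypothetical sketch: read secrets from the environment so they can be
# injected via the pod spec (e.g. from an OpenShift Secret) instead of
# being embedded in a ConfigMap.
SECRET_KEY = os.environ.get('SECRET_KEY', 'replace-me')
DATABASE_PASSWORD = os.environ.get('DATABASE_PASSWORD', '')

# The password is interpolated at startup, so the config file itself
# never has to contain the secret value.
SQLALCHEMY_DATABASE_URI = (
    'postgresql://resultsdb:%s@database:5432/resultsdb' % DATABASE_PASSWORD
)
```
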

176

Just bear in mind that the so-called Open Platform we have access to inside of Red Hat does not expose its internal registry. That is why for WaiverDB I am using that other, separate registry... Realistically there is no reason this couldn't just go to the public Docker Hub though. Just need to find (or register) a good namespace. Maybe there is already something that Fedora infra tools are published under...

openshift/run_app.sh
31

Yeah it means for rolling deployments you actually have to design the db schema migrations to be always forwards- and backwards-compatible with whichever app versions still exist. That's how you achieve outage-free updates... You have to first add *new* schema elements, apply that to the db, let the application code be updated, and then only after all old versions are no longer running can you *remove* old schema elements. It's more work, but it's how you have to write a so-called "cloud native" app. Right now I guess ResultsDB's migrations are not designed that way though. I'm not sure if there is any good answer.
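
The expand-then-contract pattern described above can be sketched with a toy example (sqlite3 stands in here purely for illustration; the real migrations would be Alembic on PostgreSQL, and the column names are made up):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE result (id INTEGER PRIMARY KEY, outcome TEXT)")
conn.execute("INSERT INTO result (outcome) VALUES ('PASSED')")

# Expand step: add the new column while old app versions are still
# running. Old code keeps reading/writing 'outcome'; new code uses both.
conn.execute("ALTER TABLE result ADD COLUMN outcome_v2 TEXT")
conn.execute("UPDATE result SET outcome_v2 = outcome")

# Both old and new readers now work against the same schema. Only after
# every old pod is gone would a later migration drop the old column
# (the "contract" step).
rows = conn.execute("SELECT outcome, outcome_v2 FROM result").fetchall()
print(rows)
```
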

dcallagh added inline comments.Jun 13 2017, 5:52 AM
openshift/run_app.sh
23

Btw not sure if you already figured this out... the nss_wrapper stuff is mentioned in the OpenShift docs *in addition* to the USER directive. nss_wrapper is only needed in case the application needs to perform getent lookups of its own username (which a normal application will never need to do). But regardless whether you use nss_wrapper, you still need USER 1001 or any other unprivileged user id. OpenShift won't let the container run as root (which is the default if USER is absent).

OpenShift might have been letting you get away with this when you ran your own local cluster, but a real deployment (like the Open Platform internally inside Red Hat) doesn't permit it.
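
A minimal sketch of the USER approach being recommended (the package name and command here are assumptions, not the actual Dockerfile in the diff):

```dockerfile
FROM centos:7
RUN yum -y install resultsdb && yum clean all
# Any non-root UID works. A real OpenShift cluster typically overrides it
# with a random UID from the project's range anyway, so the app must not
# rely on a named user existing in /etc/passwd (which is exactly the gap
# nss_wrapper papers over).
USER 1001
CMD ["resultsdb", "run"]
```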

Just fyi - I don't have any further input or comments on this. Feel free to merge once you decide what you want to do.

csomh updated this revision to Diff 3088.Jul 3 2017, 2:05 PM

Use gunicorn to run app

This switches to using gunicorn instead of flask.run() to run the app.

It also switches to storing settings.py in a Secret instead of a ConfigMap,
in order to keep secret information under control.

The template now asks for the image name instead of the version, so that a
resultsdb image stored in an external registry can also be used.

Note that this is not a production-ready solution, as resultsdb
requires authorization to be handled by the server. In current deployments
this is done using httpd configuration, but I was not able to find
a similar solution with gunicorn.

csomh updated this revision to Diff 3089.Jul 3 2017, 4:13 PM

Fix branch

csomh updated this revision to Diff 3099.Fri, Jul 28, 11:22 AM

Use mod_wsgi-express as a server

This allows using authorization on the server side
(a feature gunicorn does not have).

This setup works as follows:

  • resultsdb instances associated with a publicly exposed service will accept only GET requests;
  • resultsdb instances associated with an internal-only service will accept any kind of request and can be accessed from other pods running in the cluster via the internal service name.

Signed-off-by: Hunor Csomortáni <csomh@redhat.com>

Seems fine to me, particularly if it works inside OpenShift. :-)

It might be a bit confusing that it either includes a config volume with httpd config to *restrict* access, or omits the volume to implicitly *permit* access. It's not immediately obvious that there are quite important access control limitations based on the presence or absence of that volume... at first glance it kind of just looks like a mistake that it's missing from one of the services. Might be worth at least adding a comment to explain that, inside the service definition where the config volume is being omitted.
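
For context, the kind of httpd restriction that volume would mount could look like this (a sketch only; the actual config in the diff may differ):

```apache
# Mounted only into the publicly exposed pods: reject anything that is
# not a read-only request, so external clients cannot POST results.
<Location "/">
    <LimitExcept GET HEAD OPTIONS>
        Require all denied
    </LimitExcept>
</Location>
```

Internal-only pods simply omit this file, so all methods pass through.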

openshift/Dockerfile
31

This is a bit unfortunate... I guess the problem is that the CentOS 7 mod_wsgi is too old to ship the mod_wsgi-express script? And even the Fedora rawhide package doesn't include it, perhaps by mistake...

Could you file an RFE against the Fedora package, so that in future we could switch this to using packaged mod_wsgi instead of downloading stuff from PyPI?