Add Dockerfile and OpenShift template
Needs Review · Public

Authored by csomh on Jun 7 2017, 10:11 AM.

Details

Reviewers
dcallagh
Group Reviewers
resultsdb
Summary

Add a Dockerfile to build an image from the rpm, and an OpenShift template.

Based on the work done by dcallagh for waiverdb[0].

At this point the resultsdb image is expected to be available in the
internal registry of the cluster.

[0] https://pagure.io/waiverdb/pull-request/46

Test Plan
  1. Start and configure a local OpenShift cluster:
oc cluster up
oc login -u system:admin
oadm policy add-role-to-user system:registry developer
oadm policy add-role-to-user system:image-builder developer
  2. Log in to the internal registry of the cluster:
oc login -u developer -p developer
docker login -u developer -p $(oc whoami -t) 172.30.1.1:5000
  3. Build, tag and push the resultsdb image to the internal registry:
docker build -f openshift/Dockerfile --tag resultsdb --build-arg resultsdb_rpm=resultsdb-2.0.2-1.fc25.noarch.rpm .
docker tag resultsdb 172.30.1.1:5000/myproject/resultsdb:latest
docker push 172.30.1.1:5000/myproject/resultsdb:latest
  4. Create the environment from the template:
oc process -f openshift/resultsdb-test-template.yaml -p TEST_ID=123 -p RESULTSDB_APP_VERSION=latest | oc apply -f -

Diff Detail

Repository
rRSDB resultsdb
Branch
openshift (branched from develop)
Lint
Lint Skipped (Excuse: n/a)
Unit
Unit Tests Skipped
Build Status
Buildable 1175
Build 1175: arc lint + arc unit
csomh created this revision.Jun 7 2017, 10:11 AM
csomh edited the test plan for this revision. (Show Details)Jun 7 2017, 10:14 AM
dcallagh added inline comments.Jun 8 2017, 12:32 AM
openshift/Dockerfile
20

I guess for cleanliness this should also remove the RPM package from /tmp? I think I missed that in the WaiverDB one too.

openshift/resultsdb-test-template.yaml
52

So... who or what is filling in the variables here like ${DATABASE_PASSWORD}? Is that a feature of the config parser that resultsdb is using? Because OpenShift itself has no templating functionality for config files, right? That was why I had to add support in WaiverDB for passing in the db password and Flask secret key through the environment instead of the config file.

openshift/run_app.sh
23

Does resultsdb care what user it runs as? Does it actually require a user named resultsdb to exist? Does it write anything to the filesystem ever? Surely it doesn't?

I personally think this nss_wrapper approach is a lot hackier and messier than just telling the container to run as an arbitrary user id (as in USER 1001).

31

I think the right way to do this is to have a separate container/pod specifically for populating or upgrading the database, and have OpenShift fire that off as part of each new deployment. The problem with doing it here inside the app itself is that it's racy (you will be starting multiple copies of the app and each one will be trying to do the init_db). WaiverDB makes this same mistake right now and we need to fix it in there too.

Anyway -- not necessarily something that you need to fix now. This should be okay as a starting point.

csomh added inline comments.Jun 8 2017, 10:16 AM
openshift/Dockerfile
20

Right, I'll add this.

openshift/resultsdb-test-template.yaml
52

It's oc process doing the magic. If you run something like:

oc process -f openshift/resultsdb-test-template.yaml -p TEST_ID=123 -p RESULTSDB_APP_VERSION=latest --local -o yaml

you can check that all these are nicely replaced with the template parameters.

(Actually I did not know about this either, I just assumed that works this way :) )
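
For illustration, a stripped-down template sketch (object names and the parameter set here are hypothetical, not copied from the diff) showing how `oc process` fills in `${...}` references from the declared parameters before the objects ever reach the cluster:

```yaml
apiVersion: v1
kind: Template
metadata:
  name: resultsdb-test-template
parameters:
- name: TEST_ID
  required: true
- name: DATABASE_PASSWORD
  # oc process can also generate values, e.g. a random password:
  generate: expression
  from: "[a-zA-Z0-9]{32}"
objects:
- apiVersion: v1
  kind: ConfigMap
  metadata:
    name: resultsdb-config-${TEST_ID}
  data:
    settings.py: |
      # Substituted by oc process, not by OpenShift or the app:
      SQLALCHEMY_DATABASE_URI = 'postgresql://resultsdb:${DATABASE_PASSWORD}@database/resultsdb'
```

Running `oc process --local -o yaml` on such a file shows the fully substituted objects without touching the cluster.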

176

What about this? This expects the image to be pushed to the internal registry of the cluster. I would say this is more portable than referencing an external registry, but it hard-codes the registry IP and project name - I should check if there are some built-in template variables that could be used for these.

openshift/run_app.sh
23

I've used the nss_wrapper approach because this was the recommended way in the docs, although I do agree that it's hackish and messy.
Besides this, I cannot think of a good reason why the 'USER 1001' approach wouldn't work in this case (maybe it will be less future-proof), so I'll make this change.

31

I think the right way to do this is to have a separate container/pod specifically for populating or upgrading the database, and have OpenShift fire that off as part of each new deployment.

Wouldn't this stop us from doing rolling releases? Shouldn't resultsdb be able to handle a situation where the database has been updated, but there are still some pods running older versions?

You are right about the racy aspect, though; I'll have to search for some ideas on how this could be handled.

csomh updated this revision to Diff 3062.Jun 9 2017, 8:07 AM

Consider comments

  • Delete resultsdb rpm after yum install
  • Set user ID instead of using nss wrapper
csomh marked 6 inline comments as done.Jun 9 2017, 8:09 AM
csomh updated this revision to Diff 3063.Jun 9 2017, 11:20 AM

Use ImageStream for API container

This avoids hard-coding the internal registry IP and port and
makes the OpenShift project in which the application is created
configurable.

csomh marked an inline comment as done.Jun 9 2017, 11:22 AM
dcallagh added inline comments.Jun 13 2017, 5:49 AM
openshift/Dockerfile
31

So this means that the container will end up running the app inside a Flask development server, which is explicitly not designed for production use:

http://flask.pocoo.org/docs/0.12/api/#flask.Flask.run

"Do not use run() in a production setting. It is not intended to meet security and performance requirements for a production server. Instead, see Deployment Options for WSGI server recommendations."

So this is okay for now, if this container is only intended for test environments. But if we want to run the container in production it will need to use a proper server, like gunicorn.
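
For reference, a gunicorn-based setup could be driven by a small config file along these lines (a sketch only; the `resultsdb:app` module path and the values are assumptions, not taken from the diff):

```python
# gunicorn.conf.py -- loaded via: gunicorn -c gunicorn.conf.py resultsdb:app
# (the resultsdb:app module path is an assumption for illustration)

# Listen on all interfaces inside the container.
bind = "0.0.0.0:5001"

# A handful of sync workers; tune per pod CPU/memory limits.
workers = 4

# Send access and error logs to stdout/stderr so OpenShift collects them.
accesslog = "-"
errorlog = "-"
```
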

openshift/resultsdb-test-template.yaml
52

Oh yeah I see. So that means this won't actually be usable as is for a "real" deployment (prod or prod-like) since it would mean the secret is embedded directly inside the config map. It should be passed in as an env var to the container instead, which might mean patching ResultsDB to accept its secrets from environment variables the way I had to do for WaiverDB as well.

But I guess this is okay as a first step, just for test environments.
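
For context, the env-var approach described above could look roughly like this in a Flask-style settings module (a hedged sketch, not resultsdb's actual code; variable names and the connection string are assumptions):

```python
import os

# Hypothetical sketch: read secrets from the environment so they can be
# injected via the pod spec (e.g. from an OpenShift Secret) instead of
# being embedded in a ConfigMap.
SECRET_KEY = os.environ.get('SECRET_KEY', 'replace-me')
DATABASE_PASSWORD = os.environ.get('DATABASE_PASSWORD', '')

# The password is interpolated at startup, so the config file itself
# never has to contain the secret value.
SQLALCHEMY_DATABASE_URI = (
    'postgresql://resultsdb:%s@database:5432/resultsdb' % DATABASE_PASSWORD
)
```
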

176

Just bear in mind that the so-called Open Platform we have access to inside of Red Hat does not expose its internal registry. That is why for WaiverDB I am using that other, separate registry... Realistically there is no reason this couldn't just go to the public Docker Hub though. Just need to find (or register) a good namespace. Maybe there is already something that Fedora infra tools are published under...

openshift/run_app.sh
31

Yeah it means for rolling deployments you actually have to design the db schema migrations to be always forwards- and backwards-compatible with whichever app versions still exist. That's how you achieve outage-free updates... You have to first add *new* schema elements, apply that to the db, let the application code be updated, and then only after all old versions are no longer running can you *remove* old schema elements. It's more work, but it's how you have to write a so-called "cloud native" app. Right now I guess ResultsDB's migrations are not designed that way though. I'm not sure if there is any good answer.
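
The expand-then-contract pattern described above can be sketched with a toy example (sqlite3 stands in here purely for illustration; the real migrations would be Alembic on PostgreSQL, and the column names are made up):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE result (id INTEGER PRIMARY KEY, outcome TEXT)")
conn.execute("INSERT INTO result (outcome) VALUES ('PASSED')")

# Expand step: add the new column while old app versions are still
# running. Old code keeps reading/writing 'outcome'; new code uses both.
conn.execute("ALTER TABLE result ADD COLUMN outcome_v2 TEXT")
conn.execute("UPDATE result SET outcome_v2 = outcome")

# Both old and new readers now work against the same schema. Only after
# every old pod is gone would a later migration drop the old column
# (the "contract" step).
rows = conn.execute("SELECT outcome, outcome_v2 FROM result").fetchall()
print(rows)
```
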

dcallagh added inline comments.Jun 13 2017, 5:52 AM
openshift/run_app.sh
23

Btw not sure if you already figured this out... the nss_wrapper stuff is mentioned in the OpenShift docs *in addition* to the USER directive. nss_wrapper is only needed in case the application needs to perform getent lookups of its own username (which a normal application will never need to do). But regardless whether you use nss_wrapper, you still need USER 1001 or any other unprivileged user id. OpenShift won't let the container run as root (which is the default if USER is absent).

OpenShift might have been letting you get away with this when you ran your own local cluster, but a real deployment (like the Open Platform internally inside Red Hat) doesn't permit it.
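
A minimal sketch of the USER approach being recommended (the package name and command here are assumptions, not the actual Dockerfile in the diff):

```dockerfile
FROM centos:7
RUN yum -y install resultsdb && yum clean all
# Any non-root UID works. A real OpenShift cluster typically overrides it
# with a random UID from the project's range anyway, so the app must not
# rely on a named user existing in /etc/passwd (which is exactly the gap
# nss_wrapper papers over).
USER 1001
CMD ["resultsdb", "run"]
```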

Just fyi - I don't have any further input or comments on this. Feel free to merge once you decide what you want to do.

csomh updated this revision to Diff 3088.Jul 3 2017, 2:05 PM

Use gunicorn to run app

This switches to using gunicorn instead of flask.run() to run the app.

It also switches to storing settings.py in a Secret instead of a ConfigMap,
in order to keep secret information under control.

The template now asks for the image name instead of the version, so that a
resultsdb image stored in an external registry can also be used.

Note that this is not a production-ready solution, as resultsdb
requires authorization to be handled by the server. In current deployments
this is done using httpd configuration, but I was not able to find
a similar solution with gunicorn.

csomh updated this revision to Diff 3089.Jul 3 2017, 4:13 PM

Fix branch

csomh updated this revision to Diff 3099.Fri, Jul 28, 11:22 AM

Use mod_wsgi-express as a server

This allows using authorization on the server side
(a feature gunicorn does not have).

This setup works as follows:

  • resultsdb instances associated with a publicly exposed service will accept only GET requests;
  • resultsdb instances associated with an internal-only service will accept any kind of request and can be accessed from other pods running in the cluster via the internal service name.

Signed-off-by: Hunor Csomortáni <csomh@redhat.com>

Seems fine to me, particularly if it works inside OpenShift. :-)

It might be a bit confusing that it either includes a config volume with httpd config to *restrict* access, or omits the volume to implicitly *permit* access. It's not immediately obvious that there are quite important access control limitations based on the presence or absence of that volume... at first glance it kind of just looks like a mistake that it's missing from one of the services. Might be worth at least adding a comment to explain that, inside the service definition where the config volume is being omitted.
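
For context, the kind of httpd restriction that volume would mount could look like this (a sketch only; the actual config in the diff may differ):

```apache
# Mounted only into the publicly exposed pods: reject anything that is
# not a read-only request, so external clients cannot POST results.
<Location "/">
    <LimitExcept GET HEAD OPTIONS>
        Require all denied
    </LimitExcept>
</Location>
```

Internal-only pods simply omit this file, so all methods pass through.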

openshift/Dockerfile
31

This is a bit unfortunate... I guess the problem is that the CentOS 7 mod_wsgi is too old to ship the mod_wsgi-express script? And even the Fedora rawhide package doesn't include it, perhaps by mistake...

Could you file an RFE against the Fedora package, so that in future we could switch this to using packaged mod_wsgi instead of downloading stuff from PyPI?