Allow resultsdb url to end both with and without slash
ClosedPublic

Authored by mkrizek on May 13 2015, 9:19 AM.

Details

Summary

Fixes ResultsDBapiException: u'Not found (HTTP 404)' when 'resultsdb_server: http://0.0.0.0:5000/api/v1.0/'. Worked fine without the ending slash.

Test Plan

Set resultsdb url so that it ends with slash.

Diff Detail

Repository
rRSAPI resultsdb_api
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.
mkrizek retitled this revision from to Allow resultsdb url to end both with and without slash.May 13 2015, 9:19 AM
mkrizek updated this object.
mkrizek edited the test plan for this revision. (Show Details)
mkrizek added reviewers: tflink, kparal, jskladan.
jskladan accepted this revision.May 13 2015, 9:45 AM
This revision is now accepted and ready to land.May 13 2015, 9:45 AM
kparal requested changes to this revision.May 13 2015, 11:19 AM

@mkrizek, can you please supply details of the crash? @jskladan just tried it manually without having this patch, and it seems to work for him. Also, double slash in URL should work the same as double slash in filesystem path - it should be ignored. This patch seems to be a workaround for something that should work. And it seems to work on dev, see:

http://taskotron-dev.fedoraproject.org/resultsdb_api//api//v1.0//jobs

This revision now requires changes to proceed.May 13 2015, 11:19 AM

There is actually Not Found error if you try to access the root of the project:
http://taskotron-dev.fedoraproject.org/resultsdb_api
instead of this:
http://taskotron-dev.fedoraproject.org/resultsdb_api/

but that is most probably just apache configuration, not resultsdb_api problem. Josef says. :)

In D373#6778, @kparal wrote:

There is actually Not Found error if you try to access the root of the project:
http://taskotron-dev.fedoraproject.org/resultsdb_api
instead of this:
http://taskotron-dev.fedoraproject.org/resultsdb_api/

but that is most probably just apache configuration, not resultsdb_api problem. Josef says. :)

Yeah just realized it's probably just flask's web server that is not capable of handling that? Er...

Yeah just realized it's probably just flask's web server that is not capable of handling that? Er...

I'm not sure if it can be configured and how. But it seems to me that this patch doesn't solve that issue anyway -> if you try to access the root URL without a trailing slash, you'll still get 404. If you try to access some subpage, it works even with a double slash. Or am I missing something?

In D373#6782, @kparal wrote:

Yeah just realized it's probably just flask's web server that is not capable of handling that? Er...

I'm not sure if it can be configured and how. But it seems to me that this patch doesn't solve that issue anyway -> if you try to access the root URL without a trailing slash, you'll still get 404. If you try to access some subpage, it works even with a double slash. Or am I missing something?

Er, you completely got me lost in this. :) The point of this patch is to prevent crash when running libtaskotron/resultsb in dev environment and you put a trailing slash into resultsdb url in taskotron.yaml conf, which was my case and had to figure out why:

$ curl http://0.0.0.0:5000/api/v1.0//jobs
{
  "message": "Not found"
$ curl http://0.0.0.0:5000/api/v1.0/jobs
{
  "data": [
    {
      "end_time": null, 
      "href": "http://0.0.0.0:5000/api/v1.0/jobs/2", 
      "id": 2, 
      "name": null, 
      "ref_url": "http://example.com/someref", 
      "results": null, 
      "results_count": 0, 
      "start_time": null, 
      "status": null, 
      "uuid": null
    },
}

So this just prevents putting a double slash into the url and genereting:

[resultsdb_api] 12:30:24 WARNING Received HTTP failure status code 404 for request: http://0.0.0.0:5000/api/v1.0//jobs
[libtaskotron] 12:30:24 CRITICAL Traceback (most recent call last):
  File "runtask.py", line 10, in <module>
    runner.main()
  File "/home/mkrizek/devel/libtaskotron/libtaskotron/runner.py", line 263, in main
    task_runner.run()
  File "/home/mkrizek/devel/libtaskotron/libtaskotron/runner.py", line 61, in run
    self.do_actions()
  File "/home/mkrizek/devel/libtaskotron/libtaskotron/runner.py", line 145, in do_actions
    self.do_single_action(action)
  File "/home/mkrizek/devel/libtaskotron/libtaskotron/runner.py", line 123, in do_single_action
    self.envdata)
  File "/home/mkrizek/devel/libtaskotron/libtaskotron/directives/resultsdb_directive.py", line 194, in process
    uuid=env_data['uuid'],
  File "/home/mkrizek/devel/libtaskotron/libtaskotron/directives/resultsdb_directive.py", line 144, in create_resultsdb_job
    uuid=uuid)
  File "/home/mkrizek/devel/resultsdb/resultsdb_api/resultsdb_api.py", line 77, in create_job
    self.__raise_on_error(r)
  File "/home/mkrizek/devel/resultsdb/resultsdb_api/resultsdb_api.py", line 61, in __raise_on_error
    '%s (HTTP %s)' % (r.json()['message'], r.status_code), r)
ResultsDBapiException: u'Not found (HTTP 404)'

Or is it just problem on my machine?

@mkrizek - that is actually caused by Werkzeug (the code used in app.run()), and is not present while deployed via mod_wsgi (so @kparal was not able to reproduce on dev). It probably makes sense to go forward with the patch anyway - it just makes the library more forgiving, so I don't see a problem with it.

In D373#6788, @jskladan wrote:

@mkrizek - that is actually caused by Werkzeug (the code used in app.run()), and is not present while deployed via mod_wsgi (so @kparal was not able to reproduce on dev). It probably makes sense to go forward with the patch anyway - it just makes the library more forgiving, so I don't see a problem with it.

ah, never mind, that is what I get for not refreshing the page before posting...

I think this is the werkzeug bug, I added a comment:
https://github.com/mitsuhiko/werkzeug/issues/491#issuecomment-101663798

@mkrizek I wonder, wouldn't be easier to just strip the trailing slash while loading resultsdb_server from the configuration file? It would save you from the _urljoin() hacks on multiple places.

In D373#6790, @kparal wrote:

I think this is the werkzeug bug, I added a comment:
https://github.com/mitsuhiko/werkzeug/issues/491#issuecomment-101663798

@mkrizek I wonder, wouldn't be easier to just strip the trailing slash while loading resultsdb_server from the configuration file? It would save you from the _urljoin() hacks on multiple places.

Theoretically, what if I use resultsdb_api from my own script that doesn't do the slash stripping?

Theoretically, what if I use resultsdb_api from my own script that doesn't do the slash stripping?

Fair point. Would it help to strip the slash here?

class ResultsDBapi(object):
    def __init__(self, api_url):
        self.url = api_url

Preferably with a comment describing why we're doing it.

mkrizek updated this revision to Diff 991.May 15 2015, 6:58 AM
  • Revert "Allow resultsdb url to end both with and without slash"
  • Remove trailing slashes from url before it's used
kparal accepted this revision.May 26 2015, 7:55 AM

Easy and simple! Thumbs up.

This revision is now accepted and ready to land.May 26 2015, 7:55 AM
This revision was automatically updated to reflect the committed changes.