Replace urlgrabber with requests
ClosedPublic

Authored by mkrizek on Nov 30 2015, 2:41 PM.

Details

Summary

Also replace dingus with mock in unittests.

Test Plan

Functests pass. Rpmlint worked. Downloading a 1gig file worked as well.

Diff Detail

Repository
rLTRN libtaskotron
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.
mkrizek retitled this revision from to Replace urlgrabber with requests.Nov 30 2015, 2:41 PM
mkrizek updated this object.
mkrizek edited the test plan for this revision. (Show Details)
mkrizek added reviewers: tflink, kparal, jskladan, lbrabec.
mkrizek added inline comments.Nov 30 2015, 2:43 PM
libtaskotron/file_utils.py
174

I am more than fine with getting rid of the progressbar and not adding another dep.

tflink requested changes to this revision.Nov 30 2015, 3:47 PM

When I run tasks locally with this patch, execution blows up with:

$ python runtask.py -i python-faker-0.5.3-7.fc24 -t koji_build --local ../task-rpmlint/rpmlint.yml
[libtaskotron] 15:43:05 INFO    Execution started at: 2015-11-30 15:43:05 UTC
[libtaskotron] 15:43:05 DEBUG   Using libtaskotron 0.4.5
[libtaskotron] 15:43:05 DEBUG   Parsed arguments: Namespace(arch=None, debug=False, item='python-faker-0.5.3-7.fc24', jobid='-1', libvirt=False, local=True, no_destroy=False, override=[], patch=None, ssh=None, ssh_privkey=None, task=['../task-rpmlint/rpmlint.yml'], type='koji_build', uuid='20151130_154305_785666')
[libtaskotron] 15:43:05 DEBUG   Using config file: /home/tflink/code/taskotron/libtaskotron/conf/taskotron.yaml
[libtaskotron] 15:43:05 WARNING Unknown option "bodhi_server" in the config file /home/tflink/code/taskotron/libtaskotron/conf/taskotron.yaml
[libtaskotron] 15:43:05 DEBUG   Using config profile: development
[libtaskotron:logger.py:184] 2015-11-30 15:43:05 DEBUG   Stream logging enabled with level: DEBUG
[libtaskotron:logger.py:261] 2015-11-30 15:43:05 DEBUG   File logging enabled with level DEBUG into: /var/log/taskotron/taskotron-tflink.log
[libtaskotron:overlord.py:89] 2015-11-30 15:43:05 INFO    Task artifacts will be saved in: /var/lib/taskotron/artifacts/20151130_154305_785666
[libtaskotron:overlord.py:60] 2015-11-30 15:43:05 DEBUG   Forcing local execution (option --local)
[libtaskotron:overlord.py:73] 2015-11-30 15:43:05 DEBUG   Execution mode: local
[libtaskotron:logger.py:261] 2015-11-30 15:43:05 DEBUG   File logging enabled with level DEBUG into: /var/lib/taskotron/artifacts/20151130_154305_785666/taskotron.log
[libtaskotron:rpm_utils.py:146] 2015-11-30 15:43:05 DEBUG   Running command: dnf --assumeno --disableplugin=noroot install rpmlint
[libtaskotron:rpm_utils.py:150] 2015-11-30 15:43:06 DEBUG   Last metadata expiration check performed 0:01:59 ago on Mon Nov 30 08:41:06 2015.
Package rpmlint-1.8-2.fc23.noarch is already installed, skipping.
Dependencies resolved.
Nothing to do.
Complete!

[libtaskotron:executor.py:84] 2015-11-30 15:43:06 DEBUG   Current workdir: /var/tmp/taskotron/tflink/task-ipYhQS
[libtaskotron:executor.py:143] 2015-11-30 15:43:06 DEBUG   Executing directive: koji
[libtaskotron:koji_utils.py:160] 2015-11-30 15:43:06 INFO    Querying Koji for a list of RPMS for: python-faker-0.5.3-7.fc24
[libtaskotron:koji_utils.py:220] 2015-11-30 15:43:07 INFO    Fetching 4 RPMs for: python-faker-0.5.3-7.fc24 (into /var/tmp/taskotron/tflink/task-ipYhQS)
[libtaskotron:file_utils.py:102] 2015-11-30 15:43:07 DEBUG   Cached file /var/cache/taskotron/python-faker-doc-0.5.3-7.fc24.noarch.rpm differs from its online version. Redownloading.
[libtaskotron:file_utils.py:106] 2015-11-30 15:43:07 DEBUG   Downloading (cached): http://kojipkgs.fedoraproject.org/packages/python-faker/0.5.3/7.fc24/noarch/python-faker-doc-0.5.3-7.fc24.noarch.rpm
[libtaskotron:logger.py:88] 2015-11-30 15:43:07 CRITICAL Traceback (most recent call last):
  File "runtask.py", line 10, in <module>
    main.main()
  File "/home/tflink/code/taskotron/libtaskotron/libtaskotron/main.py", line 151, in main
    overlord.start()
  File "/home/tflink/code/taskotron/libtaskotron/libtaskotron/overlord.py", line 95, in start
    runner.execute()
  File "/home/tflink/code/taskotron/libtaskotron/libtaskotron/executor.py", line 56, in execute
    self._run()
  File "/home/tflink/code/taskotron/libtaskotron/libtaskotron/executor.py", line 93, in _run
    self._do_actions()
  File "/home/tflink/code/taskotron/libtaskotron/libtaskotron/executor.py", line 131, in _do_actions
    self._do_single_action(action)
  File "/home/tflink/code/taskotron/libtaskotron/libtaskotron/executor.py", line 152, in _do_single_action
    self.arg_data)
  File "/home/tflink/code/taskotron/libtaskotron/libtaskotron/directives/koji_directive.py", line 165, in process
    src=self.src)
  File "/home/tflink/code/taskotron/libtaskotron/libtaskotron/ext/fedora/koji_utils.py", line 228, in get_nvr_rpms
    rpm_file = file_utils.download(url, dest, cachedir=cachedir)
  File "/home/tflink/code/taskotron/libtaskotron/libtaskotron/file_utils.py", line 108, in download
    download_file(url, dl_dest)
  File "/home/tflink/code/taskotron/libtaskotron/libtaskotron/file_utils.py", line 159, in download_file
    max_value=(int(total_len) / chunk_size))  
TypeError: __init__() got an unexpected keyword argument 'max_value'

Is anyone else seeing this or is this a local setup issue?

libtaskotron.spec
6–7

This is a code change, it should have a version bump instead of just a revision bump

libtaskotron/file_utils.py
174

I have a slight preference for not adding another dep but it's not a huge deal

This revision now requires changes to proceed.Nov 30 2015, 3:47 PM
In D668#12754, @tflink wrote:

When I run tasks locally with this patch, execution blows up with:

$ python runtask.py -i python-faker-0.5.3-7.fc24 -t koji_build --local ../task-rpmlint/rpmlint.yml
[libtaskotron] 15:43:05 INFO    Execution started at: 2015-11-30 15:43:05 UTC
[libtaskotron] 15:43:05 DEBUG   Using libtaskotron 0.4.5
[libtaskotron] 15:43:05 DEBUG   Parsed arguments: Namespace(arch=None, debug=False, item='python-faker-0.5.3-7.fc24', jobid='-1', libvirt=False, local=True, no_destroy=False, override=[], patch=None, ssh=None, ssh_privkey=None, task=['../task-rpmlint/rpmlint.yml'], type='koji_build', uuid='20151130_154305_785666')
[libtaskotron] 15:43:05 DEBUG   Using config file: /home/tflink/code/taskotron/libtaskotron/conf/taskotron.yaml
[libtaskotron] 15:43:05 WARNING Unknown option "bodhi_server" in the config file /home/tflink/code/taskotron/libtaskotron/conf/taskotron.yaml
[libtaskotron] 15:43:05 DEBUG   Using config profile: development
[libtaskotron:logger.py:184] 2015-11-30 15:43:05 DEBUG   Stream logging enabled with level: DEBUG
[libtaskotron:logger.py:261] 2015-11-30 15:43:05 DEBUG   File logging enabled with level DEBUG into: /var/log/taskotron/taskotron-tflink.log
[libtaskotron:overlord.py:89] 2015-11-30 15:43:05 INFO    Task artifacts will be saved in: /var/lib/taskotron/artifacts/20151130_154305_785666
[libtaskotron:overlord.py:60] 2015-11-30 15:43:05 DEBUG   Forcing local execution (option --local)
[libtaskotron:overlord.py:73] 2015-11-30 15:43:05 DEBUG   Execution mode: local
[libtaskotron:logger.py:261] 2015-11-30 15:43:05 DEBUG   File logging enabled with level DEBUG into: /var/lib/taskotron/artifacts/20151130_154305_785666/taskotron.log
[libtaskotron:rpm_utils.py:146] 2015-11-30 15:43:05 DEBUG   Running command: dnf --assumeno --disableplugin=noroot install rpmlint
[libtaskotron:rpm_utils.py:150] 2015-11-30 15:43:06 DEBUG   Last metadata expiration check performed 0:01:59 ago on Mon Nov 30 08:41:06 2015.
Package rpmlint-1.8-2.fc23.noarch is already installed, skipping.
Dependencies resolved.
Nothing to do.
Complete!

[libtaskotron:executor.py:84] 2015-11-30 15:43:06 DEBUG   Current workdir: /var/tmp/taskotron/tflink/task-ipYhQS
[libtaskotron:executor.py:143] 2015-11-30 15:43:06 DEBUG   Executing directive: koji
[libtaskotron:koji_utils.py:160] 2015-11-30 15:43:06 INFO    Querying Koji for a list of RPMS for: python-faker-0.5.3-7.fc24
[libtaskotron:koji_utils.py:220] 2015-11-30 15:43:07 INFO    Fetching 4 RPMs for: python-faker-0.5.3-7.fc24 (into /var/tmp/taskotron/tflink/task-ipYhQS)
[libtaskotron:file_utils.py:102] 2015-11-30 15:43:07 DEBUG   Cached file /var/cache/taskotron/python-faker-doc-0.5.3-7.fc24.noarch.rpm differs from its online version. Redownloading.
[libtaskotron:file_utils.py:106] 2015-11-30 15:43:07 DEBUG   Downloading (cached): http://kojipkgs.fedoraproject.org/packages/python-faker/0.5.3/7.fc24/noarch/python-faker-doc-0.5.3-7.fc24.noarch.rpm
[libtaskotron:logger.py:88] 2015-11-30 15:43:07 CRITICAL Traceback (most recent call last):
  File "runtask.py", line 10, in <module>
    main.main()
  File "/home/tflink/code/taskotron/libtaskotron/libtaskotron/main.py", line 151, in main
    overlord.start()
  File "/home/tflink/code/taskotron/libtaskotron/libtaskotron/overlord.py", line 95, in start
    runner.execute()
  File "/home/tflink/code/taskotron/libtaskotron/libtaskotron/executor.py", line 56, in execute
    self._run()
  File "/home/tflink/code/taskotron/libtaskotron/libtaskotron/executor.py", line 93, in _run
    self._do_actions()
  File "/home/tflink/code/taskotron/libtaskotron/libtaskotron/executor.py", line 131, in _do_actions
    self._do_single_action(action)
  File "/home/tflink/code/taskotron/libtaskotron/libtaskotron/executor.py", line 152, in _do_single_action
    self.arg_data)
  File "/home/tflink/code/taskotron/libtaskotron/libtaskotron/directives/koji_directive.py", line 165, in process
    src=self.src)
  File "/home/tflink/code/taskotron/libtaskotron/libtaskotron/ext/fedora/koji_utils.py", line 228, in get_nvr_rpms
    rpm_file = file_utils.download(url, dest, cachedir=cachedir)
  File "/home/tflink/code/taskotron/libtaskotron/libtaskotron/file_utils.py", line 108, in download
    download_file(url, dl_dest)
  File "/home/tflink/code/taskotron/libtaskotron/libtaskotron/file_utils.py", line 159, in download_file
    max_value=(int(total_len) / chunk_size))  
TypeError: __init__() got an unexpected keyword argument 'max_value'

Is anyone else seeing this or is this a local setup issue?

ok, I was able to reproduce it with python-progressbar rpm. It doesn't happen with the version from pip, hm. Will investigate.

mkrizek updated this revision to Diff 1730.Nov 30 2015, 6:10 PM

Fixed progressbar crash

mkrizek updated this object.Nov 30 2015, 6:13 PM
mkrizek edited the test plan for this revision. (Show Details)
tflink accepted this revision.Nov 30 2015, 6:55 PM
This revision is now accepted and ready to land.Nov 30 2015, 6:55 PM
jskladan accepted this revision.Dec 1 2015, 10:23 AM
kparal added inline comments.Dec 2 2015, 2:14 PM
libtaskotron/file_utils.py
17–19

It might be more obvious what these variables refer to if we prefixed them with REQUESTS_.

Personally I'd go with 30sec timeout. Long timeouts are a drag.

68–69

I guess this doesn't use keep-alive connection. It would be great if we could use it here as well. It would speed up downloading files when caching is used (developer machines).

76–77

Resuming partially downloaded files is supported.

I'm not saying this is important, but is it really supported with requests? If not, we should remove this comment.

148–150

I find a bit confusing to have download() and download_file() functions with the same description Download a file. Would it make sense to make this a private function?

159

We have to keep session around in order to use HTTP Keep-Alive:
http://docs.python-requests.org/en/latest/user/advanced/#keep-alive

Therefore as a global attribute, I guess.

kparal added inline comments.Dec 2 2015, 2:26 PM
libtaskotron/file_utils.py
76–77

I tested this and it seems it can't resume a download by default. It seems it can be done manually:
http://stackoverflow.com/questions/22894211/how-to-resume-file-download-in-python
But I consider this not that important and it's better to not support it than to have more complex code to maintain, I'd say.

mkrizek updated this revision to Diff 1741.Dec 2 2015, 9:06 PM
mkrizek edited the test plan for this revision. (Show Details)

Keep using one session

kparal accepted this revision.Dec 3 2015, 10:14 AM

I tested the new patch and it's very fast now, even for a large number of small files. Great job. A few nitpicks below, but feel free to commit right away, whether you adjust it or not.

libtaskotron/file_utils.py
20

Similarly to above, I'd call this _requests_session to indicate what this is related to :) This is file_utils,py, so this is not directly related to just downloading (we might even argue it's not the right module for it), and we have requests-unrelated methods here as well.

23

It might be a good idea to mention in the docstring that this is a singleton getter and that max_retries argument is applied only if the session doesn't exist and is created for the very first time. Alternatively, mounting of those adapters would have to be adjusting to be applied every time somebody calls this method.

This revision was automatically updated to reflect the committed changes.