download RPM headers - feed RPM file to rpmUtils using a pipe
Abandoned, Public

Authored by garretraziel on Sep 9 2014, 10:53 AM.

Details

Summary

Use python-rpm, a named pipe and threading
to download RPM headers.
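
A minimal sketch of the general idea, not the code from this diff: a writer thread streams downloaded bytes into a named pipe while the main thread hands the read end to a header parser. The names `feed_through_fifo`, `chunks` and `parse_header` are hypothetical; in real use `parse_header` would presumably wrap python-rpm's `ts.hdrFromFdno()` and `chunks` would be produced lazily from the HTTP response.

```python
import os
import tempfile
import threading


def feed_through_fifo(chunks, parse_header):
    """Stream `chunks` (an iterable of bytes) into a FIFO and parse from it.

    `parse_header` is a hypothetical callback that takes a file descriptor
    and returns as soon as it has read what it needs.
    """
    tmpdir = tempfile.mkdtemp()
    fifo_path = os.path.join(tmpdir, "rpm_fifo")
    os.mkfifo(fifo_path)

    def writer():
        try:
            # Opening for writing blocks until the reader opens its end.
            with open(fifo_path, "wb") as fifo:
                for chunk in chunks:
                    fifo.write(chunk)
        except BrokenPipeError:
            # The reader closed the pipe early; stop feeding data.
            pass

    thread = threading.Thread(target=writer)
    thread.start()
    try:
        fd = os.open(fifo_path, os.O_RDONLY)
        try:
            return parse_header(fd)
        finally:
            os.close(fd)
    finally:
        thread.join()
        os.remove(fifo_path)
        os.rmdir(tmpdir)
```

If the parser stops reading after the header, the writer hits a broken pipe and, provided `chunks` is a lazy iterator, the rest of the file is never requested.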

Test Plan

Add headers_only=true to the koji directive in task-depcheck,
then run depcheck.

Diff Detail

Repository: rLTRN libtaskotron
Branch: feature/T226-headers_alt
Lint: No Linters Available
Unit: No Unit Test Coverage
garretraziel retitled this revision from "" to "use different approach for downloading rpm headers". Sep 9 2014, 10:53 AM
garretraziel updated this object.
garretraziel edited the test plan for this revision.
garretraziel added reviewers: tflink, kparal, jskladan.

Note: This isn't a complete solution - I am only publishing it for discussion.

jdulaney added a subscriber: jdulaney.
jdulaney commandeered this revision. Sep 9 2014, 4:31 PM
jdulaney edited reviewers, added: garretraziel; removed: jdulaney.

Code looks good, testing.

Note: msimacek from koschei also implemented this approach, independently of me; here is his solution.

garretraziel commandeered this revision. Oct 8 2014, 12:33 PM
garretraziel edited reviewers, added: jdulaney; removed: garretraziel.
kparal retitled this revision from "use different approach for downloading rpm headers" to "download RPM headers - feed RPM file to rpmUtils using a pipe". Oct 9 2014, 1:52 PM
kparal added a comment. Oct 9 2014, 3:46 PM

I like this solution much more than D223. That doesn't mean I have no reservations about the current code (I have quite a few), but I think the concept is good.

There are some possible improvements, like using an unnamed pipe, similar to the linked koschei script (which is not easy to understand, however). Also, if there are concerns about using threads, I think we can do this in a single thread:

  1. download a fixed amount of data, e.g. 10kB (let's measure the usual rpm header size and use this value)
  2. try to pass it to ts.hdrFromFdno()
  3. if it fails, download another chunk, e.g. twice the size of the one in #1
  4. try to pass it to ts.hdrFromFdno()
  5. if it worked, close and clean up everything
  6. if it didn't work, bail out (return None or something) and fall back to full RPM download

All nice and simple, in a single thread. It does not support downloading an arbitrary amount of data, but I think we don't want to support that anyway. If the header is not found in a reasonably small chunk of initial data, we want to print a warning and fall back to regular methods.
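
The steps above could look roughly like the following sketch. It is only an illustration under stated assumptions: `fetch_range(start, length)` is a hypothetical callable returning bytes of the remote file (e.g. via an HTTP Range request), and `parse_header(fd)` is a hypothetical wrapper around `ts.hdrFromFdno()` that is assumed to raise an exception on truncated input. The 10 kB default is the example value from step 1, not a measured one.

```python
import tempfile


def header_from_chunks(fetch_range, parse_header, first_chunk=10 * 1024):
    """Try to parse a header from the first one or two chunks of a file.

    Returns whatever parse_header returns, or None so that the caller
    can print a warning and fall back to a full download.
    """
    data = b""
    # Steps 1 and 3: a first chunk, then another one twice as large.
    for length in (first_chunk, 2 * first_chunk):
        data += fetch_range(len(data), length)
        with tempfile.TemporaryFile() as tmp:
            tmp.write(data)
            tmp.flush()
            tmp.seek(0)
            try:
                # Steps 2 and 4: hand the partial file to the parser.
                return parse_header(tmp.fileno())
            except Exception:
                # Assumption: any parser error means "not enough data yet".
                continue
    # Step 6: give up after two attempts.
    return None
```

Catching every exception is deliberately coarse here; real code would want to catch only the error the parser raises for incomplete data, so that genuinely broken headers can be reported separately.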

@tflink, you had the strongest reservations about the RPM header approach; what do you think about this patch? Of course, if we detect a production environment, we can always download full RPMs, if we want to. But given the number of RPMs we test every day on our dev machines, I think that after running this there for a week or two we will be reasonably sure whether it has any potential problems.

As this seems to be abandoned anyway, how about closing the revision?

I would still like to incorporate either this or D223. It has not been a pressing issue in the past, because we now have rpm caching in dev mode, but from time to time it is still a PITA to work on depcheck and wait for the initial download of several GB. We also had issues with slow downloads in production, and soon we will need to speed up our checks, because our task queues start to fill up rapidly, especially during freeze periods. I think the major issue here is not "this might cause weird issues" but rather "can we detect broken headers and in that case download the full file instead?". And it seems to me that the answer is yes. Not to mention we could even store the rpm headers as artifacts now that we have support for it, and use them for reproducing issues if needed.

Sure, but the Diff is almost 8 months old, and nothing happened. If you do not feel like closing it, I'll remove myself from reviewers - it annoyed me on the front page long enough :)

jskladan added a subscriber: jskladan.
garretraziel abandoned this revision. Aug 27 2015, 12:28 PM

It looks like there are currently other things to do. I will abandon this revision; we can open it again when we need it.