Threads in anaconda? No! David Cantrell INTRODUCTION Threads make a lot of people run screaming. That's entirely understandable because thread concurrency can be a pain. In this short document, I want to explain why threads are in anaconda and how to work with them in the code. MISCONCEPTIONS Just to make sure everyone is on the same page, threads are similar to processes. The big advantage we get is easier shared data structures. Threads can communicate over more methods than just signals. But, multithreaded does not mean that we are taking every operation out to a separate thread. ANACONDA THREADS The idea for anaconda threads is to simplify life for things that can or need to run parallel to other operations. So we will reserve the use of threads for tasks that fit in to this category well (the logging system, for instance) and keep the bulk of the installer in the main thread. THREADS AND PYTHON Python has a nice model for threading. Threads in Python extend the threading.Thread class. So an easy way to identify something that will run or can be run as a thread is seeing a class definition like this: class SomeClass(threading.Thread): You still have your __init__ method for the constructor, but threads also have a run() method and a join() method (there are others, but I will just discuss these). The run() method is called when you start the thread. This is where you want to do the work. Normally this happens in the class constructor, but in threads we need that separated out to a different method. The join() method is to block execution of other threads. Whatever you put in the join() method will run and other threads will be blocked while it runs. Now, this method is only run when you call it explicitly from another thread, so think of it as similar to waitpid(). Python has the thread and threading modules. Use threading as it's built on top of thread and provides a threading system similar to the POSIX thread model. More information: http://docs.python.org/lib/module-threading.html THREAD NAMES Threads have names in Python. They are automatically assigned or you can specify the name. For anaconda it probably makes more sense to name our threads since we won't be launching more than one for the same task. The convention I'm using is: NameThr For example: RelNotesThr The name is arbitrary, but we should probably have some sort of consistency. PYGTK AND THREADS GTK+ presents the biggest challenge for threads, but it's not impossible. We will be allowing GTK+ calls from any thread, so we have to call threads_init() in gui.py as the first thing: gtk.gdk.threads_init() After this, you can use Python threads as you normally would. When you call gtk.main(), you need to call it like this: gtk.threads_enter() gtk.main() gtk.threads_leave() Calls to PyGTK methods or fiddling with GTK objects...all that has to be wrapped in threads_enter/threads_leave calls. The suggested syntax is: gtk.threads_enter() try: # do stuff finally: gtk.threads_leave() Suggested reading: http://www.async.com.br/faq/pygtk/index.py?req=show&file=faq20.006.htp http://developer.gnome.org/doc/API/2.0/gdk/gdk-Threads.html I will try to expand this document as I move through more threading code. Email me if you have any questions. -- David Cantrell