Showing posts with label multiprocessing. Show all posts


Python multiprocessing ignores cgroups

At work, I noticed a Python job that was badly overloading a server. The job requested 16 CPU cores and was running gensim.models.ldamulticore with 15 workers. However, the load on that server was many times higher than that, even though a cgroups cpuset restricted the process to 16 cores.

It turns out that gensim.models.ldamulticore uses Python's multiprocessing module. That module decides how many worker processes to run based on the total number of CPU cores the operating system reports for the machine (multiprocessing.cpu_count()). This completely bypasses the limits imposed by cgroups.

There is currently an open enhancement request to add a new function to multiprocessing for requesting the number of usable CPU cores rather than the total number of CPU cores.
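In the meantime, the difference between total and usable cores can be checked by hand. The following is a minimal sketch, assuming Linux and Python 3.3 or later, where os.sched_getaffinity is available. The helper name usable_cpu_count is made up for illustration and is not part of multiprocessing. A cpuset restriction should show up in the affinity mask, but CPU time quotas set through cgroups are not reflected here.

```python
import os


def usable_cpu_count():
    """Return the number of CPUs this process is allowed to run on.

    Hypothetical helper for illustration; not part of the standard library.
    """
    try:
        # Linux-only: the set of CPUs the calling process (pid 0 = self)
        # may be scheduled on. A cpuset restriction narrows this set.
        return len(os.sched_getaffinity(0))
    except AttributeError:
        # Platforms without sched_getaffinity: fall back to the machine total.
        return os.cpu_count() or 1


if __name__ == "__main__":
    print("total CPUs: ", os.cpu_count())
    print("usable CPUs:", usable_cpu_count())
```

A value like this could then be passed explicitly as the worker count instead of relying on the default derived from the machine total.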


High Performance Python

At PyCon 2012, Ian Ozsvald showed how to write high performance Python. The key is understanding performance through profiling. In his introductory remarks, he tells how he came to work in Python after years of doing industry AI research in C++. It's the same reason I started using Python extensively, and I've known several other people who adopted Python for much the same reason:
"I was more productive at the end of the first day using Python to parse SAX than I was after 5 years as being senior dev using C++"
Anyway, he has a blog post about his talk, with the slides and links to further material. The source is on GitHub; get it by doing
git clone git://
The first case study he presents is converting old Fortran X-ray diffraction code to Python/Cython, then optimizing the Python within the first day for an order-of-magnitude speedup. Further optimization was done using other tools, getting to a final speedup of 300x on the pure Python numpy code.

As with all performance tuning, the key is profiling the code to understand exactly where the code spends its time.
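As a starting point, here is a minimal sketch of profiling with the standard library's cProfile and pstats modules. It is not taken from the talk: slow_sum_of_squares and profile_call are made-up names, and the loop is just something deliberately naive to measure.

```python
import cProfile
import io
import pstats


def slow_sum_of_squares(n):
    """Deliberately naive loop, used only as something to profile."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def profile_call(func, *args, top=5):
    """Run func(*args) under cProfile; return (result, report text)."""
    profiler = cProfile.Profile()
    # runcall enables the profiler, calls func, then disables it again.
    result = profiler.runcall(func, *args)
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    # Sort by cumulative time and print only the top few entries.
    stats.sort_stats("cumulative").print_stats(top)
    return result, stream.getvalue()


if __name__ == "__main__":
    result, report = profile_call(slow_sum_of_squares, 200_000)
    print(result)
    print(report)
```

The report lists call counts and per-function times, which is usually enough to tell which function deserves attention first. For a whole script, python -m cProfile -s cumulative followed by the script name gives a similar report without changing the code.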