EuroPython 2014 – Python under the hood

This is a summary of a few talks that deal with “under the hood” topics; the talks partially overlap. The topics include memory allocation and management, inheritance, overridden built-in methods, etc.

Relevant talks –
  • The magic of attribute access by Petr Viktorin
  • Performance Python for Numerical Algorithms by Yves
  • Metaprogramming, from Decorators to Macros by Andrea Crotti
  • Everything you always wanted to know about Memory in Python but were afraid to ask by Piotr Przymus
Practical summary –
  • __slots__ class attribute – limits the memory allocated for each object by replacing the per-instance __dict__ with a fixed set of attributes.
  • Strings – the empty string and strings of length 1 are saved as constants. Use intern (or sys.intern in Python 3.x) on strings to avoid allocating several string objects with the same value; this makes memory usage more efficient and string comparison quicker.
  • Numerical algorithms – the way a 2-dimensional array is allocated and stored (row-wise or column-wise) has a great impact on the performance of different algorithms (even for a simple sum function).
  • Working with the GPU – there are packages that process some of the data on the GPU. This is efficient when the data is big and less efficient when the data is small, since copying all the data to the GPU has some overhead.
  • CPython uses the C “malloc” function to re-allocate space when lists / dictionaries / sets grow or shrink. On one hand this function can be overridden; on the other hand one can try to avoid costly operations that trigger re-allocation, or use more efficient data structures, e.g. a list instead of a dictionary where possible.
  • Note the garbage collector! The Python garbage collector is based on reference counting. Overriding the __del__ method may disrupt the garbage collector.
  • Suggested profiling and monitoring tools – psutil, memory_profiler, objgraph, RunSnakeRun + Meliae, valgrind
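To make the __slots__ point above concrete, here is a minimal sketch. The two Point classes are hypothetical names invented for this illustration, and the sizes reported by sys.getsizeof are only a rough, CPython-specific approximation of the real memory footprint.

```python
import sys


class PointWithDict:
    """Ordinary class: every instance carries its own __dict__."""

    def __init__(self, x, y):
        self.x = x
        self.y = y


class PointWithSlots:
    """Same data, but attributes live in fixed slots instead of a __dict__."""

    __slots__ = ("x", "y")

    def __init__(self, x, y):
        self.x = x
        self.y = y


plain = PointWithDict(1, 2)
slotted = PointWithSlots(1, 2)

# The slotted instance has no per-instance dictionary at all.
print("plain has __dict__:  ", hasattr(plain, "__dict__"))
print("slotted has __dict__:", hasattr(slotted, "__dict__"))

# Rough per-instance cost: the object itself plus its dictionary, if any.
plain_size = sys.getsizeof(plain) + sys.getsizeof(plain.__dict__)
slotted_size = sys.getsizeof(slotted)
print("approx. bytes, plain:  ", plain_size)
print("approx. bytes, slotted:", slotted_size)

# The price: attributes not listed in __slots__ cannot be added later.
try:
    slotted.z = 3
    added_extra_attribute = True
except AttributeError:
    added_extra_attribute = False
print("could add a new attribute:", added_extra_attribute)
```

The saving per object is small, so this mostly pays off when a program creates a very large number of instances of the same class.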
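The string interning tip can be sketched as follows, assuming Python 3 (sys.intern). The strings are built at run time on purpose, so that the compiler cannot merge them into one constant; the variable names are made up for the example.

```python
import sys

# Build two equal strings at run time, so they start out as separate objects.
parts = ["euro", "python", "2014"]
first = "-".join(parts)
second = "-".join(parts)

print("equal values: ", first == second)
print("same object:  ", first is second)  # typically False in CPython

# sys.intern returns one canonical object for each distinct string value.
first_interned = sys.intern(first)
second_interned = sys.intern(second)

print("same object after intern:", first_interned is second_interned)
```

Once both names point at the same object, only one copy of the text has to be kept, and an equality check can succeed on identity without comparing character by character.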
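The row-wise versus column-wise remark is easier to see with a small example. This sketch uses the standard library array module as a stand-in for a numerical package (an assumption of mine, not what the talk used), and the helper function names are hypothetical. It stores the same 3x4 matrix in both layouts and sums one row in each.

```python
from array import array

ROWS, COLS = 3, 4
values = [[r * COLS + c for c in range(COLS)] for r in range(ROWS)]

# Row-major (C order): element (r, c) lives at offset r * COLS + c.
row_major = array("d", (values[r][c] for r in range(ROWS) for c in range(COLS)))

# Column-major (Fortran order): element (r, c) lives at offset c * ROWS + r.
col_major = array("d", (values[r][c] for c in range(COLS) for r in range(ROWS)))


def row_sum_row_major(flat, r):
    # One row is a contiguous run of COLS elements.
    return sum(flat[r * COLS:(r + 1) * COLS])


def row_sum_col_major(flat, r):
    # The same row is scattered: every ROWS-th element, starting at offset r.
    return sum(flat[r::ROWS])


for r in range(ROWS):
    print(r, row_sum_row_major(row_major, r), row_sum_col_major(col_major, r))
```

Both functions return the same sums, but the first reads neighbouring memory while the second jumps through it. On large arrays that access pattern is what makes the same algorithm faster or slower depending on the layout.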
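For the garbage collector note, a small sketch of what "based on reference counting" means in practice. It assumes CPython; Node is a hypothetical class, and the automatic collector is switched off temporarily only to make the demo deterministic.

```python
import gc
import weakref


class Node:
    """Hypothetical class used only for this illustration."""


gc.disable()  # no automatic cycle collection while the demo runs
try:
    # Case 1: a single object is freed as soon as its last reference goes away.
    single = Node()
    single_ref = weakref.ref(single)
    del single
    freed_by_refcount = single_ref() is None

    # Case 2: two objects that reference each other keep each other alive.
    left, right = Node(), Node()
    left.other, right.other = right, left
    cycle_ref = weakref.ref(left)
    del left, right
    alive_before_collect = cycle_ref() is not None

    # Only an explicit (or automatic) collector run reclaims the cycle.
    gc.collect()
    freed_after_collect = cycle_ref() is None
finally:
    gc.enable()

print("freed by reference counting alone:", freed_by_refcount)
print("cycle still alive before collect: ", alive_before_collect)
print("cycle freed after gc.collect():   ", freed_after_collect)
```

Reference cycles are the case where reference counting alone is not enough, so they are a good place to look first when memory usage keeps growing.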
The bottom line of all of these – “knowledge itself is power”: knowing the internals and the impact of what we are doing can lead to significant improvements.
There are always several ways to do things, and each has pros and cons that suit a specific case. Some of them are simple to implement and use, and can contribute to a great improvement in both running time and memory usage. On the other hand, some of these suggestions are a real “shot in the foot”, causing memory leaks and other unexpected behavior. Beware.
