<?xml version="1.0"?>
<feed xmlns="http://www.w3.org/2005/Atom">

  <title>Stein Magnus Jodal</title>
  <link href="http://www.jodal.no/"/>
  <link type="application/atom+xml" rel="self" href="http://www.jodal.no/atom.xml"/>
  <updated>2012-05-06T01:12:52+02:00</updated>
  <id>http://www.jodal.no/</id>
  <author>
    <name>Stein Magnus Jodal</name>
    <email>stein.magnus@jodal.no</email>
    <uri>http://www.jodal.no/</uri>
  </author>

  
    
      <entry>
        <id>http://www.jodal.no/2011/10/19/speeding-up-a-django-web-site-without-touching-the-code/</id>
        <link type="text/html" rel="alternate" href="http://www.jodal.no/2011/10/19/speeding-up-a-django-web-site-without-touching-the-code/"/>
        <title>Speeding up a Django web site without touching the code</title>
        <updated>2011-10-19T00:00:00+02:00</updated>
        <content type="html">&lt;p&gt;I’ve recently been tweaking my server setup for a Django 1.3 web site with the
goal of making it a bit faster. Of course, there is a lot of speed to gain by
improving e.g. the number of database queries needed to render a web page, but
the server setup also has an effect on the web site performance. This is a log
of my findings.&lt;/p&gt;

&lt;p&gt;All measurements have been done
using the &lt;a href=&quot;http://httpd.apache.org/docs/2.2/programs/ab.html&quot;&gt;ab&lt;/a&gt; tool from
Apache using the arguments &lt;code&gt;-n 200 -c 20&lt;/code&gt;, which means that each case has
been tested with 20 concurrent requests, up to 200 requests in total. The tests
were run from a different machine than the web server, with around 45 ms RTT to the
server. This is not a scientific measurement, but good enough to let me quickly
test my assumptions on what increases or decreases performance.&lt;/p&gt;

&lt;p&gt;The Django app isn’t particularly optimized in itself, so I don’t care much
about the low number of requests per second (req/s) that it manages to process.
The main point here is the relative improvement with each change to the server
setup.&lt;/p&gt;

&lt;p&gt;The baseline setup is a
&lt;a href=&quot;http://www.linode.com/?r=3919f35863b90f73ab3181921b5d1a4eadf39ba1&quot;&gt;Linode 1024 VPS&lt;/a&gt;
(Referral link: I get USD 20 off my bill if you sign up and remain a customer
for 90 days), running Apache 2.2.14 with mpm-itk, mod_wsgi in daemon mode with
maximum 50 threads and restart every 10000 requests, SSL using mod_ssl, and
PostgreSQL 8.4.8 as the database. For the given Django app and hardware, this
setup is strolling along at &lt;strong&gt;4.0 req/s&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;With &lt;a href=&quot;http://senko.net/en/django-nginx-gunicorn/&quot;&gt;this blog post&lt;/a&gt; as reference,
I switched from Apache+mod_wsgi to using &lt;a href=&quot;http://nginx.org/&quot;&gt;nginx&lt;/a&gt; 0.7.5 as
SSL terminator, for serving static media, and as a proxy in front of
&lt;a href=&quot;http://gunicorn.org/&quot;&gt;Gunicorn&lt;/a&gt; 0.13.4. Gunicorn is a WSGI HTTP server,
hosting the Django site. The Linode VPS has access to four CPU cores (n=4), so
I set up nginx with 4 workers (n) and Gunicorn with 9 workers (2n+1). Different
values for these settings are sometimes recommended, but this is what I’m
currently using. This setup resulted in an increase to &lt;strong&gt;9.0 req/s&lt;/strong&gt;.&lt;/p&gt;
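&lt;p&gt;As a rough sketch of the 2n+1 rule, and not the actual file from this server:
Gunicorn can read its settings from a plain Python file, so the worker count
can be derived from the number of cores. The file name
&lt;code&gt;gunicorn.conf.py&lt;/code&gt; and the bind address are assumptions made for
illustration.&lt;/p&gt;

```python
# Hypothetical gunicorn.conf.py. The bind address is an assumption and is
# not taken from the setup described in this post.
import multiprocessing

cpu_cores = multiprocessing.cpu_count()  # n; 4 on the VPS described above

# The 2n+1 rule of thumb used in this post.
workers = 2 * cpu_cores + 1

# Local address that nginx would proxy requests to (assumed).
bind = "127.0.0.1:8000"
```

&lt;p&gt;With four cores this gives the nine workers mentioned above.&lt;/p&gt;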

&lt;p&gt;A nice improvement, but I changed multiple components here, so I don’t know
exactly what helped. It would be interesting to test e.g. Apache with
mod_proxy in front of Gunicorn, as well as different numbers of nginx and
Gunicorn workers. The nginx version is also a bit old, because I used the one
packaged in Ubuntu 10.04 LTS. I should give nginx 1.0.x a spin.&lt;/p&gt;

&lt;p&gt;Next up, I added
&lt;a href=&quot;http://pgbouncer.projects.postgresql.org/doc/usage.html&quot;&gt;pgbouncer&lt;/a&gt; 1.3.1 (as
packaged in Ubuntu 10.04 LTS, latest is 1.4.2) as a PostgreSQL connection
pooler. I let pgbouncer do session pooling, which is the safest choice and the
default. Then I changed the Django app settings to use pgbouncer at port 6432,
instead of connecting directly to PostgreSQL’s port
5432. This increased the performance further to &lt;strong&gt;10.5 req/s&lt;/strong&gt;.&lt;/p&gt;
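&lt;p&gt;On the Django side, the switch is a one-line change of port in the database
settings. The following is a minimal sketch of what that part of a Django 1.3
&lt;code&gt;settings.py&lt;/code&gt; could look like; the database name, user, and host
are placeholders, not the values used on this site.&lt;/p&gt;

```python
# Sketch of the database part of a Django 1.3 settings.py.
# NAME, USER, PASSWORD and HOST are placeholders for illustration.
PGBOUNCER_PORT = "6432"   # pgbouncer's listening port
POSTGRESQL_PORT = "5432"  # PostgreSQL's own default port

DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.postgresql_psycopg2",
        "NAME": "mysite",        # placeholder
        "USER": "mysite",        # placeholder
        "PASSWORD": "",          # placeholder
        "HOST": "127.0.0.1",     # assumes pgbouncer runs on the same machine
        "PORT": PGBOUNCER_PORT,  # was POSTGRESQL_PORT before the change
    }
}
```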

&lt;p&gt;Then, I started looking at SSL performance, even though it wasn’t the
bottleneck. I learned a lot about SSL performance, but didn’t improve the test
results at all. Some key points were:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;
    &lt;p&gt;nginx defaults to offering Diffie-Hellman Ephemeral (DHE) which takes a lot
of resources. Notably, the SSL terminators
&lt;a href=&quot;https://github.com/bumptech/stud&quot;&gt;stud&lt;/a&gt; and
&lt;a href=&quot;http://www.stunnel.org/&quot;&gt;stunnel&lt;/a&gt; do not use DHE.
See &lt;a href=&quot;http://matt.io/entry/ur&quot;&gt;this blog post&lt;/a&gt; for more details and how to
turn off DHE in nginx.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;If you’re using AES, you can process
&lt;a href=&quot;http://vincent.bernat.im/en/blog/2011-ssl-benchmark.html#ciphers_and_key_sizes&quot;&gt;five times&lt;/a&gt;
as many requests with a 1024-bit key compared to a 2048-bit key. I use a 2048-bit
key.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;64-bit OS and userland
&lt;a href=&quot;http://vincent.bernat.im/en/blog/2011-ssl-benchmark.html#32bit_vs_64bit&quot;&gt;doubles the connections&lt;/a&gt;
per second compared to 32-bit. My VPS is stuck at 32-bit for historical reasons.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;&lt;a href=&quot;http://vincent.bernat.im/en/blog/2011-ssl-benchmark.html#session_cache&quot;&gt;SSL session reuse&lt;/a&gt;
eliminates one round-trip for subsequent connections. I set this up, but my
test setup only uses fresh connections, so this improvement isn’t visible in
the test results.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Browsers will go a long way to get hold of missing certificates in the
certificate chain between known CA certificates and the site’s certificate.
To avoid having the browser make requests to other sites to find missing
certificates, make sure all certificates in the chain are provided by your
server.&lt;/p&gt;

    &lt;p&gt;If you’re switching from Apache to nginx, note that Apache uses separate
files for your SSL certificate and the SSL certificate chain, while nginx
wants these two files to be concatenated to a single file, with your SSL
certificate first.&lt;/p&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Next, I read about
&lt;a href=&quot;http://thebuild.com/blog/2009/11/07/django-postgresql-and-transaction-management/&quot;&gt;transaction management&lt;/a&gt;
and the use of
&lt;a href=&quot;http://thebuild.com/blog/2009/11/07/django-postgresql-and-autocommit/&quot;&gt;autocommit&lt;/a&gt;
in Django. The Django site I’m testing is read-heavy, with almost no database
writes at all. It doesn’t use Django’s transaction middleware, which means that
each select/update/insert happens in its own transaction instead of having one
database transaction spanning the entire Django view function.&lt;/p&gt;

&lt;p&gt;Since I’m using PostgreSQL &amp;gt;= 8.2, which supports &lt;code&gt;INSERT ... RETURNING&lt;/code&gt;, I can
turn on &lt;code&gt;autocommit&lt;/code&gt; in the Django settings, and keep the transaction
semantics of a default Django setup without the transaction middleware. Turning
on &lt;code&gt;autocommit&lt;/code&gt; makes PostgreSQL wrap each query with a transaction, instead of
Django adding explicit &lt;code&gt;BEGIN&lt;/code&gt;, and &lt;code&gt;COMMIT&lt;/code&gt; or &lt;code&gt;ROLLBACK&lt;/code&gt; statements around
each and every query. Somewhat surprisingly, this reduced the performance to
&lt;strong&gt;9.2 req/s&lt;/strong&gt;. Explanations as to why this reduced the performance are welcome.&lt;/p&gt;
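&lt;p&gt;For reference, in Django 1.3 the &lt;code&gt;autocommit&lt;/code&gt; switch for the
&lt;code&gt;postgresql_psycopg2&lt;/code&gt; backend lives in the &lt;code&gt;OPTIONS&lt;/code&gt;
dictionary of the database settings. A minimal sketch, where the database name
is a placeholder:&lt;/p&gt;

```python
# Sketch of enabling autocommit in a Django 1.3 settings.py.
# The database name is a placeholder for illustration.
DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.postgresql_psycopg2",
        "NAME": "mysite",  # placeholder
        "OPTIONS": {
            # Let PostgreSQL wrap each query in its own transaction.
            "autocommit": True,
        },
    }
}
```

&lt;p&gt;Removing the &lt;code&gt;OPTIONS&lt;/code&gt; entry again restores the default
behaviour.&lt;/p&gt;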

&lt;p&gt;Reverting the &lt;code&gt;autocommit&lt;/code&gt; change, I got back to &lt;strong&gt;10.5 req/s&lt;/strong&gt;. Then I tried
tuning the PostgreSQL configuration using the
&lt;a href=&quot;http://pgtune.projects.postgresql.org/&quot;&gt;pgtune&lt;/a&gt; tool. I went for the web
profile, with autodetection of the amount of memory (1024 MB):&lt;/p&gt;

&lt;p&gt;pgtune changed the following settings:&lt;/p&gt;

&lt;p&gt;After restarting PostgreSQL with the updated settings, the performance
increased to &lt;strong&gt;11.7 req/s&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;To summarize: in a few hours, I’ve learned a lot about SSL performance tuning,
and–without touching any application code–I’ve almost tripled the number of
requests that the site can handle. The performance still isn’t &lt;em&gt;great&lt;/em&gt;, but
it’s a lot better than what I started with, and the setup is still far from
perfect.&lt;/p&gt;
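&lt;p&gt;The relative gains are easy to recompute from the req/s figures quoted in
this post. The short script below does just that; the labels are my own
shorthand for each step.&lt;/p&gt;

```python
# Requests per second reported in this post after each change.
measurements = [
    ("Apache + mod_wsgi (baseline)", 4.0),
    ("nginx + Gunicorn", 9.0),
    ("+ pgbouncer", 10.5),
    ("+ autocommit (later reverted)", 9.2),
    ("+ pgtune web profile", 11.7),
]

baseline = measurements[0][1]

# Speedup of each setup relative to the baseline.
speedups = {name: rate / baseline for name, rate in measurements}

for name, factor in speedups.items():
    print("{:32s} {:.2f}x".format(name, factor))
```

&lt;p&gt;The final setup handles 11.7 / 4.0, or roughly 2.9 times, as many requests
per second as the baseline.&lt;/p&gt;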

&lt;p&gt;To get further speed improvements, I would mainly look into three areas:
adding page (or block) caching where appropriate, logging database queries and
tweaking the numerous or slow ones, and tuning the PostgreSQL settings further.
But that’s for another time.&lt;/p&gt;

&lt;p&gt;If you have suggestions for other server setup tweaks, please share them in the
comments, and I’ll try them out.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Updated:&lt;/em&gt; Removed the “mean response time” numbers, which are simply (time of
full test run) / (number of requests). They just told us the same as req/s in a
less intuitive way. The other interesting number here is the perceived latency
for a single user/request. I’ll make sure to include it in future posts.&lt;/p&gt;
</content>
        
          <category term="python"/>
        
          <category term="django"/>
        
          <category term="performance"/>
        
      </entry>
    
  
    
      <entry>
        <id>http://www.jodal.no/2011/10/06/traversable-attributes-in-pykka/</id>
        <link type="text/html" rel="alternate" href="http://www.jodal.no/2011/10/06/traversable-attributes-in-pykka/"/>
        <title>Traversable attributes in Pykka</title>
        <updated>2011-10-06T00:00:00+02:00</updated>
        <content type="html">&lt;p&gt;In &lt;a href=&quot;http://jodal.github.com/pykka/&quot;&gt;Pykka&lt;/a&gt; 0.13–which was released almost two
weeks ago–traversing the attributes of an actor is about 8.3 times faster than
it used to be. To paraphrase Apple: “8.3X faster. That’s amazing!” &lt;em&gt;(Update:
This was written a couple of hours before the news of Jobs’ passing arrived.
May he continue to inspire us.)&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;So, what are “traversable attributes”? Let’s take a few steps back.&lt;/p&gt;

&lt;p&gt;If we were a conservative actor adhering strictly to the
&lt;a href=&quot;http://en.wikipedia.org/wiki/Actor_model&quot;&gt;actor model&lt;/a&gt;, we surely wouldn’t
share our attributes with anybody else. We would expect other actors to send
serializable messages to us, asking nicely to get the value of the attribute,
or maybe asking for something else. Of course, the other actors wouldn’t even
know the attribute existed unless we told them, and even then they wouldn’t
ever dream of requesting a reference to our attribute or altering the attribute
directly. It would be indecent. It would break the rules of the actor model. It
would be unsafe.&lt;/p&gt;

&lt;p&gt;When using Pykka, you can keep to the traditional way of passing messages back
and forth between the actors. You start the actor by calling the
&lt;code&gt;Actor.start()&lt;/code&gt; class method, which returns an &lt;code&gt;ActorRef&lt;/code&gt; object. This object
can safely be passed around and even shared between threads. The &lt;code&gt;ActorRef&lt;/code&gt;
object has two methods for sending messages to the actor, called
&lt;code&gt;send_one_way()&lt;/code&gt; and &lt;code&gt;send_request_reply()&lt;/code&gt;. This is nice enough by itself, and
it gives you a way to build concurrent applications which is easier to reason
about–just as promised by advocates of the actor model–than when you do the
thread and lock management dance. You can quickly hack together a simple actor
implementation like this from scratch for each and every application you make
using e.g. &lt;code&gt;Thread&lt;/code&gt; and &lt;code&gt;Queue&lt;/code&gt;. I’ve done this a couple of times, and it
works.&lt;/p&gt;
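&lt;p&gt;To give an idea of how small such a hand-rolled actor can be, here is a
minimal sketch using &lt;code&gt;threading.Thread&lt;/code&gt; and
&lt;code&gt;queue.Queue&lt;/code&gt; (the Python 3 module names). It is a toy and not
Pykka’s implementation; the method names merely mirror the two
&lt;code&gt;ActorRef&lt;/code&gt; methods mentioned above, and &lt;code&gt;MiniActor&lt;/code&gt;
and &lt;code&gt;Adder&lt;/code&gt; are made-up names.&lt;/p&gt;

```python
import queue
import threading


class MiniActor(object):
    """Toy actor: one thread, one inbox, dict messages. Not Pykka's code."""

    def __init__(self):
        self.inbox = queue.Queue()
        self.thread = threading.Thread(target=self._loop)
        self.thread.daemon = True
        self.thread.start()

    def _loop(self):
        # Handle one message at a time, so handlers never run concurrently.
        while True:
            message, reply_to = self.inbox.get()
            if message.get("command") == "stop":
                break
            result = self.on_receive(message)
            if reply_to is not None:
                reply_to.put(result)

    def on_receive(self, message):
        # Subclasses override this; the default echoes the message back.
        return message

    def send_one_way(self, message):
        # Fire and forget: nobody waits for a result.
        self.inbox.put((message, None))

    def send_request_reply(self, message, timeout=1.0):
        # Block until the actor has put its answer in a private reply queue.
        reply_to = queue.Queue(maxsize=1)
        self.inbox.put((message, reply_to))
        return reply_to.get(timeout=timeout)


class Adder(MiniActor):
    def on_receive(self, message):
        return message["a"] + message["b"]


adder = Adder()
answer = adder.send_request_reply({"a": 1, "b": 2})
adder.send_one_way({"command": "stop"})
```

&lt;p&gt;All state lives in the actor’s own thread, and the only things crossing
thread boundaries are the messages and replies on the queues.&lt;/p&gt;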

&lt;p&gt;But, I wanted a bit more, so I created Pykka.&lt;/p&gt;

&lt;p&gt;First, I wanted to get rid of verbose dict messages all over my code base.
I just wanted to call regular methods and access regular attributes on regular
objects. Pykka provides a safe way of doing this, called &lt;code&gt;ActorProxy&lt;/code&gt;. An
&lt;code&gt;ActorProxy&lt;/code&gt; is nothing more than a wrapper around an &lt;code&gt;ActorRef&lt;/code&gt;. It does all
its magic by sending messages to the actor, just like you used to do yourself.&lt;/p&gt;

&lt;p&gt;Second, I wanted to be able to organize actors like regular code, e.g. by
splitting them into multiple classes. Imagine a running actor &lt;code&gt;a&lt;/code&gt; which has
the attribute &lt;code&gt;b&lt;/code&gt;. The “subobject” &lt;code&gt;b&lt;/code&gt; has the method &lt;code&gt;c()&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;If you call &lt;code&gt;a.b.c()&lt;/code&gt;, the following happens:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;
    &lt;p&gt;We send a message to actor &lt;code&gt;a&lt;/code&gt; requesting attribute &lt;code&gt;b&lt;/code&gt;, and immediately get
a future object back which is our handle to the result which will be
available in the future.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Actor &lt;code&gt;a&lt;/code&gt; gets the message, looks up attribute &lt;code&gt;b&lt;/code&gt;, and returns a copy of
the object referenced by the &lt;code&gt;b&lt;/code&gt; attribute.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;We call &lt;code&gt;c()&lt;/code&gt; on the future, but the &lt;code&gt;Future&lt;/code&gt; class doesn’t have an
attribute called &lt;code&gt;c&lt;/code&gt;, so it fails. Alternatively, we use the future
correctly and call &lt;code&gt;get()&lt;/code&gt; on the future to get the real result, a copy of
&lt;code&gt;b&lt;/code&gt;. Then we call &lt;code&gt;c()&lt;/code&gt; on the copy of &lt;code&gt;b&lt;/code&gt;.  The method &lt;code&gt;c()&lt;/code&gt; is now
running, but it is running in the caller’s thread, and not in the actor &lt;code&gt;a&lt;/code&gt;
like I wanted it to do.&lt;/p&gt;
  &lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The simple attribute access that the &lt;code&gt;ActorProxy&lt;/code&gt; provides isn’t enough to make
this work.&lt;/p&gt;

&lt;p&gt;To make the &lt;code&gt;a.b.c()&lt;/code&gt; method call be executed in the actor &lt;code&gt;a&lt;/code&gt; instead of the
caller’s thread, we need to traverse attribute &lt;code&gt;b&lt;/code&gt; without having it returned
to us, so that we can get to &lt;code&gt;c()&lt;/code&gt; while still inside the actor &lt;code&gt;a&lt;/code&gt;, and call
its method &lt;code&gt;c()&lt;/code&gt;. We need what we in Pykka call &lt;em&gt;traversable attributes&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;To make an attribute traversable, the only thing we need to do is to mark it as
such by adding the attribute &lt;code&gt;pykka_traversable&lt;/code&gt; to the traversable attribute:&lt;/p&gt;

&lt;p&gt;When you access a regular attribute of a Pykka actor, you just get a future
object, which, when you call &lt;code&gt;get()&lt;/code&gt; on it, will return a copy of the
attribute. When you access a traversable attribute of a Pykka actor, you get a
brand new &lt;code&gt;ActorProxy&lt;/code&gt; which wraps the same &lt;code&gt;ActorRef&lt;/code&gt;, but method calls and
attribute accesses on the new proxy object will work on the actor’s attribute
instead of the actor itself.&lt;/p&gt;
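&lt;p&gt;The difference between the two kinds of attribute access can be illustrated
with a toy proxy written without Pykka. Only the marker name
&lt;code&gt;pykka_traversable&lt;/code&gt; comes from Pykka; &lt;code&gt;Playback&lt;/code&gt;,
&lt;code&gt;Backend&lt;/code&gt;, and &lt;code&gt;ToyProxy&lt;/code&gt; are hypothetical names, and
the sketch leaves out threads, messages, and futures entirely.&lt;/p&gt;

```python
class Playback(object):
    pykka_traversable = True  # the marker named in this post

    def play(self):
        return "playing"


class Backend(object):
    """Plays the role of actor a; a real actor would run in its own thread."""

    def __init__(self):
        self.playback = Playback()  # attribute b, marked traversable
        self.name = "toy backend"   # a regular attribute

    def lookup(self, path):
        # Follow the attribute path, starting from the actor itself.
        target = self
        for name in path:
            target = getattr(target, name)
        return target


class ToyProxy(object):
    def __init__(self, actor, path=()):
        self._actor = actor
        self._path = path

    def __getattr__(self, name):
        path = self._path + (name,)
        attr = self._actor.lookup(path)
        if getattr(attr, "pykka_traversable", False):
            # Traversable: hand out a brand new proxy for the same actor,
            # rooted one level further down the attribute path.
            return ToyProxy(self._actor, path)
        # Regular attribute: hand out the value (Pykka would give a future).
        return attr


proxy = ToyProxy(Backend())
result = proxy.playback.play()
```

&lt;p&gt;Note that every access to &lt;code&gt;proxy.playback&lt;/code&gt; in this sketch
builds a new &lt;code&gt;ToyProxy&lt;/code&gt; object.&lt;/p&gt;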

&lt;h3 id=&quot;speeding-up-access-to-traversible-attributes&quot;&gt;Speeding up access to traversable attributes&lt;/h3&gt;

&lt;p&gt;If you’re still following, you may be wondering how we sped up access to
traversable attributes by a factor of 8.3. The answer is a few lines up: “you
get a brand new &lt;code&gt;ActorProxy&lt;/code&gt;.”&lt;/p&gt;

&lt;p&gt;&lt;em&gt;So, why should that matter?&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;If you split your actor into multiple classes using traversable attributes,
you’re probably going to use each traversable attribute more than once. Maybe
really often. Turns out, creating brand new &lt;code&gt;ActorProxy&lt;/code&gt; objects for the same
attribute over and over again is kind of wasteful.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;How did you find out?&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://github.com/sandos&quot;&gt;John Bäckstrand&lt;/a&gt; was irritated by
&lt;a href=&quot;http://www.mopidy.com/&quot;&gt;Mopidy&lt;/a&gt; being almost unusable on his slow system, and
attacked the problem in the scientific way: by measuring where the bottleneck
was. John quickly pointed out that access to second-level attributes, which
required the traversal of a traversable attribute, was five times slower than
access to first-level attributes, which didn’t involve traversable attributes.
This observation made it obvious that the creation of new
&lt;code&gt;ActorProxy&lt;/code&gt; objects whenever we accessed traversable attributes–even though
the proxy objects didn’t contain any state and were fully reusable–probably
needed refinement.&lt;/p&gt;

&lt;p&gt;To be sure we fixed the issue, we started by writing a performance test which
compared attribute access with and without the traversal of a traversable
attribute.&lt;/p&gt;

&lt;p&gt;Then, the &lt;a href=&quot;https://github.com/jodal/pykka/commit/e0ffcf42c19be53b66a85d3041f7330180e368e6&quot;&gt;fix&lt;/a&gt;
was short and easy: Cache and reuse &lt;code&gt;ActorProxy&lt;/code&gt; objects.&lt;/p&gt;
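&lt;p&gt;The idea behind the fix can be sketched in a few lines. This is a toy
illustration of caching child proxies in a dictionary, not the code from the
linked commit; &lt;code&gt;CachingProxy&lt;/code&gt;, &lt;code&gt;Traversable&lt;/code&gt;, and
&lt;code&gt;Root&lt;/code&gt; are made-up names.&lt;/p&gt;

```python
class Traversable(object):
    pykka_traversable = True


class Root(object):
    def __init__(self):
        self.b = Traversable()


class CachingProxy(object):
    """Toy proxy that reuses child proxies; not the code from the commit."""

    def __init__(self, target):
        self._target = target
        self._children = {}  # attribute name mapped to its existing proxy

    def __getattr__(self, name):
        attr = getattr(self._target, name)
        if not getattr(attr, "pykka_traversable", False):
            return attr
        # Create the child proxy on first access only, then reuse it.
        if name not in self._children:
            self._children[name] = CachingProxy(attr)
        return self._children[name]


proxy = CachingProxy(Root())
first = proxy.b
second = proxy.b
```

&lt;p&gt;Since the proxies hold no state of their own, handing out the same object
on every access is safe.&lt;/p&gt;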

&lt;p&gt;The result was immediate: The performance test for traversable attribute access
showed an 8.3X improvement.&lt;/p&gt;

&lt;p&gt;Mopidy uses Pykka’s traversable attributes heavily to organize its backend code.
Obviously, we try to avoid wiring up lots of actors in Mopidy’s unit tests, but
we’ve been lazy and use some actors in the tests. These five lines of code
inserted at the right place in a dependency made Mopidy’s test suite run 20%
faster, and made John’s use case run 166% faster.&lt;/p&gt;

&lt;p&gt;We could use more five-line patches like that :-)&lt;/p&gt;
</content>
        
          <category term="python"/>
        
          <category term="pykka"/>
        
          <category term="performance"/>
        
      </entry>
    
  
    
      <entry>
        <id>http://www.jodal.no/post/9961954094/hva-alle-utviklere-ma-vite-om-tegnsettenkoding</id>
        <link type="text/html" rel="alternate" href="http://www.jodal.no/post/9961954094/hva-alle-utviklere-ma-vite-om-tegnsettenkoding"/>
        <title>What every developer must know about character encoding</title>
        <updated>2011-09-08T00:00:00+02:00</updated>
        <content type="html">&lt;p&gt;&lt;em&gt;English: This is slides and video from the Norwegian lightning presentation
on character encoding I did at the JavaZone 2011 conference today.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Takk for nok en knall konferanse :-)&lt;/p&gt;

&lt;p&gt;Her er &lt;a href=&quot;http://speakerdeck.com/u/jodal/p/hva-alle-utviklere-ma-vite-om-tegnsettenkoding&quot;&gt;slidene mine&lt;/a&gt;
fra lyntalen om tegnsettenkoding jeg holdt på JavaZone 2011 i dag.
&lt;a href=&quot;https://speakerd.s3.amazonaws.com/presentations/4fa584fdc0729c00220087be/charset-encoding-as-presented-at-jz11.pdf&quot;&gt;PDF-versjon&lt;/a&gt; er også
tilgjengelig, by popular demand.&lt;/p&gt;

&lt;script async=&quot;async&quot; class=&quot;speakerdeck-embed&quot; data-id=&quot;4fa584fdc0729c00220087be&quot; data-ratio=&quot;1.3333333333333333&quot; src=&quot;//speakerdeck.com/assets/embed.js&quot;&gt;&lt;/script&gt;

&lt;p&gt;And here the &lt;a href=&quot;http://vimeo.com/28764541&quot;&gt;video&lt;/a&gt; is out, less than eight hours later. Agile! :-)&lt;/p&gt;

&lt;iframe src=&quot;http://player.vimeo.com/video/28764541?byline=0&amp;amp;portrait=0&amp;amp;color=ffffff&quot; width=&quot;508&quot; height=&quot;143&quot;&gt;
&lt;/iframe&gt;
</content>
        
          <category term="charset"/>
        
          <category term="encoding"/>
        
          <category term="javazone"/>
        
          <category term="talk"/>
        
      </entry>
    
  
    
      <entry>
        <id>http://www.jodal.no/post/6299220692/pyspotify-1-2-released</id>
        <link type="text/html" rel="alternate" href="http://www.jodal.no/post/6299220692/pyspotify-1-2-released"/>
        <title>pyspotify 1.2 released</title>
        <updated>2011-06-08T00:00:00+02:00</updated>
        <content type="html">&lt;p&gt;pyspotify is a Python wrapper for
&lt;a href=&quot;http://developer.spotify.com/en/libspotify/&quot;&gt;libspotify&lt;/a&gt;, which gives
developers access to the &lt;a href=&quot;http://www.spotify.com/&quot;&gt;Spotify&lt;/a&gt; music streaming
service.&lt;/p&gt;

&lt;p&gt;Today, I tagged pyspotify 1.2 at &lt;a href=&quot;https://github.com/mopidy/pyspotify&quot;&gt;GitHub&lt;/a&gt;
and pushed the new release to
&lt;a href=&quot;http://pypi.python.org/pypi/pyspotify/1.2&quot;&gt;PyPI&lt;/a&gt;. I’ve also made deb packages
of libspotify and pyspotify available at
&lt;a href=&quot;http://apt.mopidy.com/&quot;&gt;apt.mopidy.com&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;The 1.2 release brings pyspotify up to date with libspotify 0.0.8, which was
released by Spotify a couple of weeks ago. It also fixes a bunch of memory
issues. pyspotify does not implement the full libspotify API, but a significant
and usable &lt;a href=&quot;http://pyspotify.mopidy.com/docs/master/introduction/#completion-status&quot;&gt;part of
it&lt;/a&gt;.
With pyspotify 1.2 all the old and broken unit tests have been fixed, and a
bunch of new tests have been developed. For the first time, pyspotify is
documented, and it comes with a &lt;a href=&quot;http://pyspotify.mopidy.com/&quot;&gt;new web site&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;This is the first release of pyspotify to PyPI since Doug Winter’s release of
1.1 in April 2010. Since Doug hasn’t found the time to maintain pyspotify
further, Johannes Knutsen and I have been patching pyspotify somewhat ad
hoc, adding just what we needed to keep pyspotify barely working with the new
releases of libspotify and &lt;a href=&quot;http://www.mopidy.com/&quot;&gt;Mopidy&lt;/a&gt;. Back in January, I
explored the use of &lt;a href=&quot;http://www.cython.org/&quot;&gt;Cython&lt;/a&gt; for a new alternative
libspotify binding, &lt;a href=&quot;http://github.com/jodal/spoticy&quot;&gt;spoticy&lt;/a&gt;. Using Cython
for this was a great success as far as I took it, but I had other
Mopidy-related side projects to complete first (aka
&lt;a href=&quot;http://jodal.github.com/pykka/&quot;&gt;Pykka&lt;/a&gt;). Because Mopidy depends heavily on
pyspotify, I also asked Doug to transfer the maintenance of pyspotify to the
Mopidy project, which he finally decided to do now in May.&lt;/p&gt;

&lt;p&gt;In February, Antoine Pierlot-Garcin started sending us patches for our branch
of pyspotify. Since then, he has been steadily improving pyspotify. All the
improvements listed above are due to Antoine’s work, and the 1.2 release is to
his credit.&lt;/p&gt;

&lt;p&gt;I hope this will make the situation around pyspotify clearer, and that we’ll
soon see more projects using pyspotify to do great stuff with the Spotify
service.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Le pyspotify est mort, vive le pyspotify.&lt;/em&gt;&lt;/p&gt;
</content>
        
          <category term="python"/>
        
          <category term="spotify"/>
        
          <category term="mopidy"/>
        
      </entry>
    
  
    
      <entry>
        <id>http://www.jodal.no/post/5779178001/log-from-the-debugging-of-a-segfault</id>
        <link type="text/html" rel="alternate" href="http://www.jodal.no/post/5779178001/log-from-the-debugging-of-a-segfault"/>
        <title>Log from the debugging of a segfault</title>
        <updated>2011-05-24T00:00:00+02:00</updated>
        <content type="html">&lt;p&gt;&lt;em&gt;The following is a cleaned up log I wrote for myself while debugging a bug.
Writing a log while working helps me keep track of the debugging effort in case
I’m interrupted (life, sleep, work, etc.). It also requires me to explain all
findings to myself in fully spelled out sentences, making my thoughts
considerably easier to follow.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;In addition to serving as an example of a personal debug log, I hope it can be
useful as an introduction to debugging segfaults or other low-level bugs.&lt;/em&gt;&lt;/p&gt;

&lt;h3 id=&quot;how-to-reproduce&quot;&gt;How to reproduce&lt;/h3&gt;

&lt;p&gt;I have three computers running Ubuntu 11.04. Mopidy revision 9c23949
consistently crashes with a segfault on one of them when I do the following:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;Start Mopidy.&lt;/li&gt;
  &lt;li&gt;Connect with an MPD client, e.g. &lt;code&gt;ncmpcpp&lt;/code&gt;.&lt;/li&gt;
  &lt;li&gt;Search for anything, e.g. &lt;code&gt;foo&lt;/code&gt;.&lt;/li&gt;
  &lt;li&gt;Wait less than 10 seconds for the segfault to happen. It always happens
directly after a log message from Mopidy’s Spotify backend stating that it
is &lt;code&gt;Updating metadata&lt;/code&gt;.&lt;/li&gt;
&lt;/ol&gt;

&lt;h3 id=&quot;rule-out-the-obvious&quot;&gt;Rule out the obvious&lt;/h3&gt;

&lt;p&gt;I check that I’m actually using the latest versions of the important pieces of
software, not just some old version I installed by hand and forgot about.&lt;/p&gt;

&lt;p&gt;I try uninstalling pyspotify v1.1+mopidy20110405 installed from my own Debian
package, and install the latest revision (c6e2a02) from Git instead. Still the
same error.&lt;/p&gt;

&lt;p&gt;Nobody has reported the same problem, even though both other developers and
users should have been on the same mix of versions for the last couple of
months.&lt;/p&gt;

&lt;h3 id=&quot;digging-in&quot;&gt;Digging in&lt;/h3&gt;

&lt;p&gt;Luckily I can consistently reproduce the segfault in less than 20 seconds of
work. I’m not too familiar with debugging C programs and especially the
infamous segfaults, but I’m taking on this fight.&lt;/p&gt;

&lt;p&gt;I expect the problem to be in pyspotify, as the rest of the code is either
pure Python or more well-tested and broadly used libraries. Also the segfault
consistently happens directly after a log message from Mopidy’s Spotify
backend.&lt;/p&gt;

&lt;p&gt;Antoine Pierlot-Garcin, the new main contributor to pyspotify, says to rebuild pyspotify with
&lt;code&gt;CFLAGS=&quot;-g -O0&quot;&lt;/code&gt;. &lt;code&gt;-g&lt;/code&gt; will include debug information that &lt;code&gt;gdb&lt;/code&gt; will
understand in the resulting binary. &lt;code&gt;-O0&lt;/code&gt; will override the &lt;code&gt;-O2&lt;/code&gt; default
of the pyspotify build system, and turn off any optimizations to ease the
debugging.&lt;/p&gt;

&lt;p&gt;Googling “debugging segfaults” yields a nice
&lt;a href=&quot;http://www.cprogramming.com/debugging/segfaults.html&quot;&gt;howto on debugging segfaults&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;In summary: Get a core dump, inspect it with &lt;code&gt;gdb&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Possible causes: “There are four common mistakes that lead to segmentation
faults: dereferencing NULL, dereferencing an uninitialized pointer,
dereferencing a pointer that has been freed (or deleted, in C++) or that has
gone out of scope (in the case of arrays declared in functions), and writing
off the end of an array.”&lt;/p&gt;

&lt;p&gt;Rerunning Mopidy, nothing more interesting happens, except the usual segfault.
No core dump is produced.&lt;/p&gt;

&lt;p&gt;Hum, how to get a core dump? I’ve done it before, but I can’t remember.
&lt;a href=&quot;http://mihirknows.blogspot.com/2009/03/how-to-get-core-dump.html&quot;&gt;Google helps&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;I try &lt;code&gt;ulimit -c 50000&lt;/code&gt; to get a core dump of maximum 50MB, and then
reproduce the segfault.&lt;/p&gt;

&lt;p&gt;Run &lt;code&gt;python mopidy -v&lt;/code&gt;, connect with &lt;code&gt;ncmpcpp&lt;/code&gt;, search for &lt;code&gt;foo&lt;/code&gt;, wait less
than 10 seconds. Kaboom:&lt;/p&gt;

&lt;p&gt;Loads the core dump up in &lt;code&gt;gdb&lt;/code&gt;:&lt;/p&gt;

&lt;p&gt;&lt;code&gt;gdb&lt;/code&gt; complains that the core dump is truncated, and that it should be
approximately 180MB.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;man bash&lt;/code&gt; and search for &lt;code&gt;ulimit&lt;/code&gt;. Aha. I can specify &lt;code&gt;unlimited&lt;/code&gt;. But
I’m not allowed to? Restart shell, run &lt;code&gt;ulimit -c unlimited&lt;/code&gt; again. Works.&lt;/p&gt;
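&lt;p&gt;A likely explanation for the refusal, as far as I can tell: in bash,
&lt;code&gt;ulimit&lt;/code&gt; without &lt;code&gt;-H&lt;/code&gt; or &lt;code&gt;-S&lt;/code&gt; sets both the
soft and the hard limit, so the earlier &lt;code&gt;ulimit -c 50000&lt;/code&gt; also
lowered the hard limit for that shell, and an unprivileged process can’t raise
a hard limit again. Since Mopidy is a Python program, the same limits can also
be inspected and raised from Python with the standard &lt;code&gt;resource&lt;/code&gt;
module (Unix only). This is a sketch of an alternative to the shell command,
not something I did in this debugging session.&lt;/p&gt;

```python
import resource

# Current soft and hard limits for core dump size, in bytes.
soft, hard = resource.getrlimit(resource.RLIMIT_CORE)

# An unprivileged process may raise its soft limit up to the hard limit,
# but not beyond it, so this call is always permitted.
resource.setrlimit(resource.RLIMIT_CORE, (hard, hard))

new_soft, new_hard = resource.getrlimit(resource.RLIMIT_CORE)

# True only if the hard limit allowed unlimited core dumps to begin with.
unlimited = new_soft == resource.RLIM_INFINITY
```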

&lt;p&gt;Rerun Mopidy to get an untruncated core dump.&lt;/p&gt;

&lt;h3 id=&quot;analyzing-the-core-dump&quot;&gt;Analyzing the core dump&lt;/h3&gt;

&lt;p&gt;Yay! &lt;code&gt;gdb&lt;/code&gt; loads debug symbols from all the linked libraries that have them
available. Notably, the proprietary &lt;code&gt;libspotify&lt;/code&gt; library does not include
debug symbols. So what does &lt;code&gt;gdb&lt;/code&gt; say?&lt;/p&gt;

&lt;h3 id=&quot;verifying-a-hypothesis&quot;&gt;Verifying a hypothesis&lt;/h3&gt;

&lt;p&gt;According to the libspotify docs for the &lt;a href=&quot;http://developer.spotify.com/en/libspotify/docs/group__album.html&quot;&gt;album subsystem&lt;/a&gt;,
&lt;code&gt;sp_album_add_ref&lt;/code&gt; takes an &lt;code&gt;sp_album&lt;/code&gt; struct and increases the reference count
of the album. Looking back at the list of reasons for segfaults from the
segfault debugging howto, it sounds as if we increase the reference count of
&lt;code&gt;NULL&lt;/code&gt;, which obviously isn’t good. Let’s see if that hypothesis is correct…&lt;/p&gt;

&lt;p&gt;Looking in the howto for ways to proceed, &lt;code&gt;backtrace&lt;/code&gt; reveals the events
happening just before the segfault:&lt;/p&gt;

&lt;p&gt;Moving one step up the stack, we get to the pyspotify code:&lt;/p&gt;

&lt;p&gt;Our guess was that we’re increasing the reference count on an album that is
null. Let’s see what &lt;code&gt;album&lt;/code&gt; actually was…&lt;/p&gt;

&lt;p&gt;A pointer to an &lt;code&gt;sp_album&lt;/code&gt; struct at the address &lt;code&gt;0x0&lt;/code&gt;, also known as &lt;code&gt;NULL&lt;/code&gt;.
Hypothesis confirmed.&lt;/p&gt;

&lt;h3 id=&quot;finding-a-solution&quot;&gt;Finding a solution&lt;/h3&gt;

&lt;p&gt;Let’s take a look at the pyspotify code in question:&lt;/p&gt;

&lt;p&gt;We create a pointer to an &lt;code&gt;sp_album&lt;/code&gt; struct. Given a Spotify track, we request a
reference to the related album, and assign the result to our pointer. Then we
create an &lt;code&gt;Album&lt;/code&gt; Python object, before we increase the reference count on the
&lt;code&gt;sp_album&lt;/code&gt; and give the reference to the Python object. Finally, the Python
object is returned.&lt;/p&gt;

&lt;p&gt;Let’s try just returning &lt;code&gt;NULL&lt;/code&gt; if we get &lt;code&gt;NULL&lt;/code&gt; from &lt;code&gt;sp_track_album&lt;/code&gt;:&lt;/p&gt;

&lt;p&gt;If we rebuild pyspotify and try to reproduce the segfault, we now get a
familiar Python traceback instead:&lt;/p&gt;

&lt;p&gt;Note that we get a &lt;code&gt;SystemError&lt;/code&gt; and not e.g. &lt;code&gt;AttributeError: 'NoneType'
object has no attribute 'year'&lt;/code&gt; which we would expect if &lt;code&gt;album()&lt;/code&gt; returned
&lt;code&gt;None&lt;/code&gt;. This is because we return &lt;code&gt;NULL&lt;/code&gt; from &lt;code&gt;Track_album&lt;/code&gt;, which Python
considers an “error return”, without setting an exception first. If we
want to return &lt;code&gt;None&lt;/code&gt; on failure instead of throwing an exception, we can
change &lt;code&gt;track.c&lt;/code&gt; as follows:&lt;/p&gt;

&lt;p&gt;Rebuilding pyspotify again, and reproducing the error, we now get the expected
Python error which we can handle nicely:&lt;/p&gt;

&lt;p&gt;By this point, I consider the bug squashed and ready to be
&lt;a href=&quot;https://github.com/mopidy/pyspotify/issues/12&quot;&gt;reported&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;The only part left is to decide how best to fix it: should &lt;code&gt;album()&lt;/code&gt;
return &lt;code&gt;None&lt;/code&gt;, raise an exception, or maybe return an empty album object?
That’s another story.&lt;/p&gt;
</content>
        
          <category term="python"/>
        
          <category term="programming"/>
        
          <category term="debugging"/>
        
          <category term="segfaults"/>
        
          <category term="gdb"/>
        
      </entry>
    
  
    
      <entry>
        <id>http://www.jodal.no/post/4233456652/fizzbuzz-in-haskell</id>
        <link type="text/html" rel="alternate" href="http://www.jodal.no/post/4233456652/fizzbuzz-in-haskell"/>
        <title>FizzBuzz in Haskell</title>
        <updated>2011-03-31T00:00:00+02:00</updated>
        <content type="html">&lt;p&gt;I’ve recently started reading the free Haskell tutorial &lt;a href=&quot;http://learnyouahaskell.com/&quot;&gt;Learn You A Haskell
for Great Good!&lt;/a&gt; while commuting. So far I’ve
enjoyed the tutorial, and I’ve had a couple of moments where I’ve smiled to myself
on the bus due to typeclasses, etc., which I guess isn’t quite normal behaviour
among the general population ;-)&lt;/p&gt;

&lt;p&gt;I just discovered that the tutorial is going to be released as a book in a
couple of weeks. You should definitely pick it up if you’re interested in
learning a different programming language.&lt;/p&gt;

&lt;p&gt;Here’s my first shot at some Haskell code, implementing the FizzBuzz kata using
guards:&lt;/p&gt;

&lt;p&gt;If you save this code to a file named &lt;code&gt;FizzBuzz.hs&lt;/code&gt;, you can load and run it
in the interactive Haskell interpreter:&lt;/p&gt;

&lt;p&gt;Now the file is loaded, compiled, and the &lt;code&gt;fizzBuzz&lt;/code&gt; function is available in
my current namespace. I use a list comprehension and a range to call &lt;code&gt;fizzBuzz&lt;/code&gt;
20 times over the integers from 1 to 20, collecting the returned strings in a
list:&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Updated 2011-04-10:&lt;/em&gt; Removed special case for 0, as there is no such rule in
FizzBuzz.&lt;/p&gt;
</content>
        
          <category term="programming"/>
        
          <category term="haskell"/>
        
          <category term="kata"/>
        
      </entry>
    
  
    
      <entry>
        <id>http://www.jodal.no/post/4180400974/pykka-and-porting-pykka-to-python-3</id>
        <link type="text/html" rel="alternate" href="http://www.jodal.no/post/4180400974/pykka-and-porting-pykka-to-python-3"/>
        <title>Pykka, and porting Pykka to Python 3</title>
        <updated>2011-03-29T00:00:00+02:00</updated>
        <content type="html">&lt;p&gt;This is somewhat of an introduction to &lt;a href=&quot;http://jodal.github.com/pykka/&quot;&gt;Pykka&lt;/a&gt;,
a Python library implementing the actor model, which I’ve been working on now
and then since November. And it’s still just 300 lines of code!&lt;/p&gt;

&lt;p&gt;The goal of the actor model is to make it easier to develop concurrent programs
by removing all shared state and using messaging for all communication. Removing
shared state and doing lots of messaging doesn’t make anything easier by
itself, but it avoids common problems in concurrent programs like proper lock
usage and data corruption (due to the absence of proper lock usage). Also, I
believe the actor model makes it easier to reason about concurrent programs,
which is a nice property since software development often feels like 10%
development and 91% debugging.&lt;/p&gt;

&lt;p&gt;The goal of Pykka is to implement the actor model for Python as well as
building useful concurrency abstractions on top of the actor model, most
notably Pykka’s actor proxies. Actor proxies let you call the actor’s methods
and read/write to the actor’s public attributes through a regular API, as
opposed to passing messages using the usual &lt;code&gt;send_one_way(message)&lt;/code&gt; and
&lt;code&gt;send_request_reply(message, block, timeout)&lt;/code&gt; methods. Accessing the actor’s
public attributes like this may sound like the reintroduction of shared state
and not in line with the actor model, but the actor proxy has no more access to
the actor than anyone with a regular reference to the running actor has: it can
send messages to the actor and patiently wait for an answer. The request for
reading or writing to the attribute is processed like any other message to the
actor: one at a time. The only difference between a regular object API and
the API of the actor proxy is that all attribute reads and method calls on the
proxy return futures. You can either just &lt;code&gt;get()&lt;/code&gt; the future right away, to
block until the result is available and effectively serialize the execution
of your program, or you can pass the future around your program, delaying &lt;code&gt;get()&lt;/code&gt;
until the moment where you really need the future’s encapsulated value.&lt;/p&gt;
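
&lt;p&gt;To make the proxy idea concrete, here is a minimal sketch built only on the standard library of a recent Python 3. It is not Pykka’s implementation: &lt;code&gt;SketchActor&lt;/code&gt;, &lt;code&gt;SketchProxy&lt;/code&gt;, and &lt;code&gt;Counter&lt;/code&gt; are hypothetical names, and &lt;code&gt;concurrent.futures.Future&lt;/code&gt; stands in for Pykka’s futures, so &lt;code&gt;result()&lt;/code&gt; plays the role of &lt;code&gt;get()&lt;/code&gt;.&lt;/p&gt;

```python
import queue
import threading
from concurrent.futures import Future


class SketchActor:
    """Hypothetical actor: one thread that handles one message at a time."""

    def __init__(self, state):
        self._state = state          # plain object, touched only by the actor thread
        self._inbox = queue.Queue()  # the only way to reach the state
        self._thread = threading.Thread(target=self._loop, daemon=True)

    def start(self):
        self._thread.start()

    def stop(self):
        self._inbox.put(None)        # sentinel that ends the loop
        self._thread.join()

    def ask(self, func):
        """Queue a request and return a future for its result."""
        future = Future()
        self._inbox.put((future, func))
        return future

    def _loop(self):
        while True:
            message = self._inbox.get()
            if message is None:
                break
            future, func = message
            try:
                future.set_result(func(self._state))
            except Exception as error:
                future.set_exception(error)


class SketchProxy:
    """Hypothetical proxy: attribute reads and method calls return futures."""

    def __init__(self, actor, attribute_names):
        self._actor = actor
        self._attribute_names = attribute_names

    def __getattr__(self, name):
        if name in self._attribute_names:
            # Attribute read: handled by the actor like any other message.
            return self._actor.ask(lambda state: getattr(state, name))

        def call(*args, **kwargs):
            return self._actor.ask(
                lambda state: getattr(state, name)(*args, **kwargs))
        return call


class Counter:
    def __init__(self):
        self.value = 0

    def increment(self, amount):
        self.value += amount
        return self.value


actor = SketchActor(Counter())
actor.start()
proxy = SketchProxy(actor, attribute_names={"value"})

pending = proxy.increment(2)            # returns at once with a future
first = pending.result(timeout=5)       # blocks until the actor has replied
second = proxy.value.result(timeout=5)  # attribute read, also via a future
actor.stop()
```

&lt;p&gt;Every request goes through the actor’s inbox, so the &lt;code&gt;Counter&lt;/code&gt; is only ever touched by the actor’s own thread.&lt;/p&gt;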

&lt;p&gt;Currently Pykka has two actor implementations. One is based on threads, with
one thread per actor. The other is based on &lt;a href=&quot;http://www.gevent.org/&quot;&gt;gevent&lt;/a&gt;,
with one lightweight greenlet per actor. In the gevent variant, all actors
share a single thread, but are scheduled by a libevent loop. libevent is a
cross-platform library wrapping whichever event mechanism is the most
efficient one available on your platform. The popular Node.js framework
builds on the similar libev library. The big disadvantage of the gevent actors is that they
don’t play well with non-gevent threads. Sometimes you can’t convert all
threads in your application to actors, especially if those threads are embedded
in support libraries you use, like GStreamer and libspotify, as is the case for
my other current project, the &lt;a href=&quot;http://www.mopidy.com/&quot;&gt;Mopidy&lt;/a&gt; music server.
For those cases, the threading-based actors are there to help you.&lt;/p&gt;

&lt;p&gt;For some examples and more detailed descriptions and documentation, check out
&lt;a href=&quot;http://jodal.github.com/pykka/&quot;&gt;Pykka’s documentation&lt;/a&gt;.&lt;/p&gt;

&lt;h3 id=&quot;porting-to-python-3&quot;&gt;Porting to Python 3&lt;/h3&gt;

&lt;p&gt;gevent isn’t available under Python 3 yet, but it seems like a port of gevent
to Python 3
&lt;a href=&quot;https://groups.google.com/d/topic/gevent/HUmVDfK9ggM/discussion&quot;&gt;is being worked on&lt;/a&gt;.
The upcoming 0.12 release of Pykka brings Python 3 support to the thread-based
actor implementation.&lt;/p&gt;

&lt;p&gt;Porting Pykka to Python 3 was no large feat:&lt;/p&gt;

&lt;p&gt;Pykka is a quite new project, mainly made to scratch my own itch (aren’t all
projects?). I know some people have been playing with it, but I’m not aware of
anybody other than myself using it for a “real” project yet. Thus, I’m quite
lenient about changing the API, at least in minor ways. The current goal is to
make the library strike a nice balance between being useful and consistent. If
I see ways to improve the API, I mostly just do it, but make sure to increase
the version number in the next release. That’s how I got to 0.12 in four
months. ;-) To keep overhead low, I’ve not yet started to maintain a changelog
or migration documentation outside the git log. If you start using Pykka for
anything more than an hour of play, please notify me, and I’ll straighten up.
My point is, not having to maintain backwards compatibility simplifies the
world greatly. That said, I don’t think I’ve changed any of Pykka’s API to add
Python 3 support.&lt;/p&gt;

&lt;p&gt;As my main/original target for Pykka was to use it for Mopidy, which requires
2.6+, Pykka also requires Python 2.6+. 2.6 brings e.g. &lt;code&gt;with&lt;/code&gt; statements, which
are nice when working with locks. The three latest Ubuntu releases all default
to 2.6, and so does OS X as of Snow Leopard. (Heck, some Linux distributions
like Arch Linux are already using Python 3 as the default.) If you increase the
requirement to Python 2.7 (a bit early yet, I think) you can even use lots of
the new modules from Python 3 which have been backported to Python 2.x. Not
having to support older Python versions makes it easier to write idiomatic
Python code that will work mostly unchanged on Python 3. Keep the range of
supported versions as narrow as possible, gradually getting rid of support for
older Python versions, while not excluding users of any recent OS distribution.&lt;/p&gt;
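
&lt;p&gt;As a small illustration of why &lt;code&gt;with&lt;/code&gt; statements are nice for locks, here is a sketch that runs unchanged on both Python 2.6+ and Python 3. The names &lt;code&gt;counter&lt;/code&gt; and &lt;code&gt;add_many&lt;/code&gt; are made up for the example.&lt;/p&gt;

```python
import threading

lock = threading.Lock()
counter = {"value": 0}


def add_many(times):
    for _ in range(times):
        # The lock is acquired here and released when the block exits,
        # even if the body raises an exception.
        with lock:
            counter["value"] += 1


threads = [threading.Thread(target=add_many, args=(1000,)) for _ in range(4)]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()
```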

&lt;p&gt;Given this setting, three things have been really helpful for porting to and
supporting a project on Python 3:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;
    &lt;p&gt;Good test coverage. Pykka is currently at 100% line coverage, and I believe
the branch coverage is quite all right too. 1.2 seconds after running the
tests I’m confident that everything works. That confidence is a &lt;em&gt;great&lt;/em&gt;
feeling. The tools I’ve used are &lt;code&gt;unittest&lt;/code&gt; (not unittest2, yet), the
&lt;code&gt;nosetests&lt;/code&gt; test runner, and the nose coverage plugin.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Have all the Python versions you target available on your system. I recently
upgraded to the alpha version of Ubuntu 11.04, which works nicely so far.
Ubuntu 11.04 changes the default Python version from 2.6.6 to 2.7.1, but more
importantly: it brings Python 3.2 to the table. In other words Python 2.6,
2.7, 3.1, and 3.2 on a single system is just &lt;code&gt;sudo apt-get install python-all
python3-all&lt;/code&gt; away.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;A test runner which makes testing on multiple Python versions a piece of
cake: &lt;a href=&quot;http://codespeak.net/tox/&quot;&gt;tox&lt;/a&gt;. Thanks to tox, it is actually 6
characters less effort to run the tests on four Python versions instead of
running them on one Python version (&lt;code&gt;tox&lt;/code&gt; vs &lt;code&gt;nosetests&lt;/code&gt;). Of course, it takes
more than four times as long to run, but when the time for a single Python
version is 1.2 seconds, I’ll survive the wait, and I’ll even run the tests on
all targeted versions before every commit.&lt;/p&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Porting to Python 3 is not just a one time effort – an effort which probably
is quite small for many idiomatic and well-tested Python projects – but also
an ongoing effort. There is little gain in porting if you’re going to break the
Python 3 support with your next commit. Having confidence in your tests and a
couple of nice tools is all you need to support multiple versions in parallel.&lt;/p&gt;

&lt;p&gt;If you’re not writing tests, I’m sorry. I can’t help you.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Update 2011-03-30:&lt;/em&gt; Pykka 0.12 has now been released.&lt;/p&gt;
</content>
        
          <category term="gevent"/>
        
          <category term="pykka"/>
        
          <category term="python"/>
        
          <category term="tox"/>
        
          <category term="actors"/>
        
      </entry>
    
  
    
      <entry>
        <id>http://www.jodal.no/post/3978132609/multiprocessing-connection-is-not-free</id>
        <link type="text/html" rel="alternate" href="http://www.jodal.no/post/3978132609/multiprocessing-connection-is-not-free"/>
        <title>multiprocessing.Connection is not free</title>
        <updated>2011-03-20T00:00:00+01:00</updated>
        <content type="html">&lt;p&gt;I’ve
&lt;a href=&quot;http://www.jodal.no/2011/03/06/pickling-multiprocessing-connection-objects&quot;&gt;previously written&lt;/a&gt;
about how to wrap &lt;code&gt;multiprocessing.Connection&lt;/code&gt; objects in a class that makes
&lt;code&gt;Connection&lt;/code&gt; objects picklable, and thus transferable over another
&lt;code&gt;Connection&lt;/code&gt;. That approach is still valid, but don’t use it–or &lt;code&gt;Connection&lt;/code&gt;
objects at all–for lots of use-once-and-throw-away connections.&lt;/p&gt;

&lt;p&gt;For &lt;a href=&quot;http://jodal.github.com/pykka/&quot;&gt;Pykka&lt;/a&gt; 0.9 and 0.10 I used wrapped
&lt;code&gt;Connection&lt;/code&gt; objects for implementing futures. It worked great until I tried
using it for my ongoing rewrite of &lt;a href=&quot;http://www.mopidy.com/&quot;&gt;Mopidy&lt;/a&gt; to use
Pykka actors. As soon as I had updated the majority of Mopidy’s tests to use
actors, the test suite started failing after a given number of tests with the
error “too many open files”. This error is due to &lt;code&gt;multiprocessing&lt;/code&gt; using UNIX sockets
in &lt;code&gt;/tmp&lt;/code&gt; to implement its connections. The reason for backing connections
with sockets is to make them work between multiple processes, a core feature of
&lt;code&gt;multiprocessing&lt;/code&gt;, but not a feature that I need.&lt;/p&gt;

&lt;p&gt;In Pykka 0.11 I’ve rewritten the &lt;code&gt;ThreadingFuture&lt;/code&gt; implementation from using
&lt;code&gt;multiprocessing.Connection&lt;/code&gt; to using &lt;code&gt;Queue.Queue&lt;/code&gt; (or &lt;code&gt;queue.Queue&lt;/code&gt; if you
are using Python 3.x). As &lt;code&gt;Queue.Queue&lt;/code&gt; uses the threads’ shared memory and locks instead of
sockets on disk, it will not hit the “too many open files” problem, and it should
also be a bit faster: A given test class in Mopidy took 2.7-2.9 seconds to run
using Pykka 0.10 and &lt;code&gt;multiprocessing.Connection&lt;/code&gt;. The same test class took
1.8-1.9 seconds using Pykka 0.11 and &lt;code&gt;Queue.Queue&lt;/code&gt;. That’s an improvement of 
about one third, and a nice side effect of fixing the “too many open files”
issue.&lt;/p&gt;
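
&lt;p&gt;The idea of a queue-backed future can be sketched in a few lines. This is not Pykka’s &lt;code&gt;ThreadingFuture&lt;/code&gt;, just a hypothetical &lt;code&gt;QueueFuture&lt;/code&gt; written for Python 3 (where the module is named &lt;code&gt;queue&lt;/code&gt;), assuming the value is set exactly once.&lt;/p&gt;

```python
import queue
import threading


class QueueFuture:
    """Hypothetical future backed by an in-process queue.Queue."""

    def __init__(self):
        self._queue = queue.Queue(maxsize=1)

    def set(self, value):
        # Assumed to be called once; a second call would block.
        self._queue.put(value)

    def get(self, timeout=None):
        # Blocks until a value is set; raises queue.Empty on timeout.
        value = self._queue.get(timeout=timeout)
        # Put the value back so that later get() calls see it too.
        self._queue.put(value)
        return value


future = QueueFuture()
threading.Timer(0.1, future.set, args=["done"]).start()
result = future.get(timeout=5)

# Thousands of throw-away futures are fine: no sockets or file
# descriptors are involved, only in-process locks and memory.
many = [QueueFuture() for _ in range(5000)]
```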
</content>
        
          <category term="python"/>
        
          <category term="multiprocessing"/>
        
          <category term="pykka"/>
        
          <category term="mopidy"/>
        
      </entry>
    
  
    
      <entry>
        <id>http://www.jodal.no/post/3877028985/traits-in-python-using-multiple-inheritance</id>
        <link type="text/html" rel="alternate" href="http://www.jodal.no/post/3877028985/traits-in-python-using-multiple-inheritance"/>
        <title>Traits in Python using multiple inheritance</title>
        <updated>2011-03-15T00:00:00+01:00</updated>
        <content type="html">&lt;p&gt;As you probably know, if a class needs to handle two mostly unrelated concerns, the
class should probably be split into two. This way, you will achieve
&lt;a href=&quot;http://en.wikipedia.org/wiki/Cohesion_(computer_science)&quot;&gt;higher cohesion&lt;/a&gt; in your
code, which is generally considered a good thing. Though, there are times where
you need to address two mostly unrelated concerns in a single class.&lt;/p&gt;

&lt;p&gt;The Scala programming language has a concept they call
&lt;a href=&quot;http://www.scala-lang.org/node/126&quot;&gt;traits&lt;/a&gt;. Scala traits can be compared to
Java interfaces, except that Scala traits may be partially or fully
implemented, like abstract classes. As with Java interfaces, you can &lt;em&gt;mix in&lt;/em&gt;
multiple traits in a class. If any of the traits have colliding method
signatures, the last trait mixed in overrides the earlier ones. Just as with
regular inheritance and interfaces, the class &lt;em&gt;must&lt;/em&gt; implement any methods that
have not been implemented by the traits, and &lt;em&gt;may&lt;/em&gt; override any methods that
have already been implemented.&lt;/p&gt;

&lt;p&gt;Traits as a concept for separating reusable parts of your code is not new. Like
most good ideas, someone has been thinking about them previously. According to
&lt;a href=&quot;http://en.wikipedia.org/wiki/Trait_(computer_science)&quot;&gt;Wikipedia&lt;/a&gt; traits
originated in the Self programming language from 1987. Today, multiple
programming languages have variants of traits as native constructs in the
language. One example is Scala traits; another is module mixins in Ruby.&lt;/p&gt;

&lt;p&gt;Given almost 25 years of history, one would think traits would be well known
by now and more widely used for achieving good and reusable code. My first
meeting with traits wasn’t until during an otherwise unrelated course in late
2009 where Jim Coplien introduced me to
&lt;a href=&quot;http://www.artima.com/articles/dci_vision.html&quot;&gt;the DCI architecture&lt;/a&gt; during
lunch, and then again a bit later when I picked up the Scala language. I ask myself: Why
didn’t I learn stuff like this in university?&lt;/p&gt;

&lt;h3 id=&quot;example-mixing-in-the-actor-life-cycle-in-mopidy&quot;&gt;Example: Mixing in the actor life cycle in Mopidy&lt;/h3&gt;

&lt;p&gt;Back to Python. I’m currently introducing &lt;a href=&quot;http://jodal.github.com/pykka/&quot;&gt;Pykka’s actor
model&lt;/a&gt; implementation in the &lt;a href=&quot;http://www.mopidy.com/&quot;&gt;Mopidy music
server&lt;/a&gt; as a replacement for Mopidy’s existing thread
management and inter-thread communication code. I’m replacing untested
application wiring–some reused throughout the application, some custom for
specific problems–with a fully tested library. Using the actor model I apply
the same simple semantics to all concurrent components in the application,
making it simpler to reason about, at least compared to having custom solutions
all over the place. As an added bonus, the current &lt;code&gt;git diff --stat&lt;/code&gt; is at
about -700/+300, leaving 400 fewer lines of code to maintain.&lt;/p&gt;

&lt;p&gt;Mopidy has several extension points where one can add new components such as
music library backends, audio outputs, audio mixers, and frontend servers. When
you implement an extension, you must subclass the corresponding base class,
e.g. the &lt;code&gt;BaseOutput&lt;/code&gt; class, which defines and &lt;a href=&quot;http://www.mopidy.com/docs/master/api/outputs/&quot;&gt;documents the
API&lt;/a&gt; that the extension must
implement for it to be usable by the rest of the system.&lt;/p&gt;

&lt;p&gt;Many of these components interface with external libraries or systems, but we
can’t block the entire system while waiting for these interactions to complete.
We want to be able to respond to requests from MPD clients while we adjust the
volume on the NAD amplifier, or queue the next block of audio data for
playback. Thus, each extension needs to either run in its own thread, or to
dispatch work to a worker thread.&lt;/p&gt;

&lt;p&gt;Clearly, the components have two separate sets of concerns. First of all, they
have some application logic which implements the API, e.g. methods like
&lt;code&gt;play_uri()&lt;/code&gt; and &lt;code&gt;get_volume()&lt;/code&gt;. Secondly, the components have a life cycle,
consisting of methods like &lt;code&gt;start()&lt;/code&gt; and &lt;code&gt;stop()&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;An example from Mopidy is the Last.fm scrobbler, which
&lt;a href=&quot;https://github.com/mopidy/mopidy/blob/5d4c13268f2e93eb638fc6c4af2805b293ee6cbc/mopidy/frontends/lastfm.py&quot;&gt;previously&lt;/a&gt;
was split in two; a main class which implements the frontend API, and a worker
class, which is a thread which contains most of the application logic.&lt;/p&gt;

&lt;p&gt;After refactoring, the code is down to a single class just implementing core
application logic, which no library can replace. The trick? Using multiple
inheritance to mix in the actor life cycle.&lt;/p&gt;

&lt;p&gt;I think this is a beautiful way of separating concerns that need to be part of
the same class. I also think the code reads well:&lt;/p&gt;

&lt;p&gt;The above example is a &lt;em&gt;frontend&lt;/em&gt; extension, which has the trait of also being
an actor. It has some application logic, but also happens to have an actor life cycle.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Disclaimer:&lt;/em&gt; Before refactoring all your code to use multiple inheritance in
Python, get familiar with Python’s Method Resolution Order (MRO) and the
&lt;a href=&quot;http://fuhm.net/super-harmful/&quot;&gt;semantics of &lt;code&gt;super()&lt;/code&gt;&lt;/a&gt; when using multiple
inheritance.&lt;/p&gt;
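
&lt;p&gt;For readers who want to see the shape of the pattern and the resulting MRO, here is a minimal sketch. It is not Mopidy’s or Pykka’s code: &lt;code&gt;BaseFrontend&lt;/code&gt;, &lt;code&gt;LifeCycleMixin&lt;/code&gt;, and &lt;code&gt;ScrobblerFrontend&lt;/code&gt; are hypothetical stand-ins, and the mixin only toggles a flag where a real actor would manage a thread.&lt;/p&gt;

```python
class BaseFrontend:
    """Hypothetical API base class: documents what a frontend must offer."""

    def track_playback_started(self, track):
        raise NotImplementedError


class LifeCycleMixin:
    """Hypothetical stand-in for an actor class providing start() and stop()."""

    def __init__(self):
        super().__init__()  # cooperative: continue along the MRO
        self.running = False

    def start(self):
        self.running = True
        return self

    def stop(self):
        self.running = False


class ScrobblerFrontend(LifeCycleMixin, BaseFrontend):
    """Only application logic lives here; the life cycle is mixed in."""

    def __init__(self):
        super().__init__()
        self.scrobbled = []

    def track_playback_started(self, track):
        self.scrobbled.append(track)


frontend = ScrobblerFrontend().start()
was_running = frontend.running
frontend.track_playback_started("Some track")
frontend.stop()

# The MRO decides which class wins when method names collide.
mro_names = [cls.__name__ for cls in ScrobblerFrontend.__mro__]
```

&lt;p&gt;Here &lt;code&gt;mro_names&lt;/code&gt; lists the mixin before the API base class, so a method defined in both would be taken from the mixin.&lt;/p&gt;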
</content>
        
          <category term="mopidy"/>
        
          <category term="pykka"/>
        
          <category term="python"/>
        
          <category term="scala"/>
        
          <category term="traits"/>
        
      </entry>
    
  
    
      <entry>
        <id>http://www.jodal.no/post/3669476502/pickling-multiprocessing-connection-objects</id>
        <link type="text/html" rel="alternate" href="http://www.jodal.no/post/3669476502/pickling-multiprocessing-connection-objects"/>
        <title>Pickling multiprocessing Connection objects</title>
        <updated>2011-03-06T00:00:00+01:00</updated>
        <content type="html">&lt;p&gt;For safe message-based communication between threads and processes in Python, I
tend to use
&lt;a href=&quot;http://docs.python.org/library/multiprocessing.html&quot;&gt;multiprocessing&lt;/a&gt;’s
&lt;code&gt;Queue&lt;/code&gt; and &lt;code&gt;Pipe&lt;/code&gt;. A pattern often seen is using a queue for sending messages
from multiple producers to a single consumer.&lt;/p&gt;

&lt;p&gt;When a producer wants a response to its message, I create a &lt;code&gt;Pipe&lt;/code&gt; and
piggy-back one end of the &lt;code&gt;Pipe&lt;/code&gt; (a &lt;code&gt;Connection&lt;/code&gt; object) to the message. I use
Python dicts as messages, and use the string “reply_to” as the dictionary key
for the connection objects.&lt;/p&gt;
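
&lt;p&gt;The pattern can be sketched as follows for Python 3. To keep the example short it makes an assumption: producer and consumer are threads in one process, and the inbox is a plain &lt;code&gt;queue.Queue&lt;/code&gt;, so the &lt;code&gt;Connection&lt;/code&gt; is passed by reference and never pickled. The message key &lt;code&gt;number&lt;/code&gt; and the doubling reply are made up for illustration.&lt;/p&gt;

```python
import queue
import threading
from multiprocessing import Pipe

inbox = queue.Queue()


def consumer():
    while True:
        message = inbox.get()
        if message is None:  # sentinel: no more messages
            break
        reply_to = message.get("reply_to")
        if reply_to is not None:
            # The consumer does not know the sender; it just answers
            # on whatever connection came with the message.
            reply_to.send(message["number"] * 2)
            reply_to.close()


worker = threading.Thread(target=consumer)
worker.start()

# One end stays with the producer, the other travels with the message.
receive_end, send_end = Pipe(duplex=False)
inbox.put({"number": 21, "reply_to": send_end})
answer = receive_end.recv() if receive_end.poll(5) else None

inbox.put(None)
worker.join()
receive_end.close()
```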

&lt;p&gt;When the queue consumer processes a message, it doesn’t know who the sender is
or how to reach him. Though, if the message has an attached &lt;code&gt;Connection&lt;/code&gt;
object, the consumer can–almost magically–respond to the sender, across
thread and process boundaries.&lt;/p&gt;

&lt;p&gt;All good? Nope.&lt;/p&gt;

&lt;p&gt;Any message sent through the queues and pipes must be serializable, or
&lt;em&gt;picklable&lt;/em&gt; as we say in Pythonesque. The &lt;code&gt;multiprocessing.Connection&lt;/code&gt; objects
can be serialized, but not unserialized, which means that you will not see an
exception when you create your message, but some time later in the consumer
that tries to respond. The exception does not tell you much, unless you’ve seen
it before:&lt;/p&gt;

&lt;p&gt;This has been a known &lt;a href=&quot;http://bugs.python.org/issue4892&quot;&gt;bug in Python&lt;/a&gt;
for two years. Googling the exception leads to a Stack Overflow question asking
for &lt;a href=&quot;http://stackoverflow.com/questions/1446004/python-2-6-send-connection-object-over-queue-pipe-etc&quot;&gt;a
workaround&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;I’ve usually added a version of the workaround to some util package in my
projects; one function for pickling a connection, and one function for
unpickling a connection. In my code I’ve been forced to manually
pickle/unpickle &lt;code&gt;Connection&lt;/code&gt; objects before putting them on a &lt;code&gt;Queue&lt;/code&gt; or
&lt;code&gt;Pipe&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;This works great most of the time, but not this time. In the Python actor model
library &lt;a href=&quot;http://jodal.github.com/pykka/&quot;&gt;Pykka&lt;/a&gt; I use &lt;code&gt;Connection&lt;/code&gt; objects to
implement futures for thread-based actors, similar to how I use gevent’s
&lt;code&gt;AsyncResult&lt;/code&gt; for gevent-based actors. When someone sets a value on the future,
it is written to one end of a &lt;code&gt;Pipe&lt;/code&gt;. When someone tries to read the future’s
value, they block on the other end of the &lt;code&gt;Pipe&lt;/code&gt; until there is something to
get or a timeout is reached. The problem appeared when I tried to nest futures,
which is likely to happen if an actor, in response to your message, returns a
future result from another actor.  I no longer have the opportunity to babysit
every &lt;code&gt;Connection&lt;/code&gt; object that goes into or comes out of another &lt;code&gt;Connection&lt;/code&gt;.
They need to be able to watch over themselves. As the &lt;code&gt;Connection&lt;/code&gt; class is
implemented in C and is rather closed to changes, my solution was to wrap the
&lt;code&gt;Connection&lt;/code&gt; objects:&lt;/p&gt;

&lt;p&gt;The &lt;code&gt;ConnectionWrapper&lt;/code&gt; class simply implements &lt;code&gt;__reduce__&lt;/code&gt; on the wrapped
&lt;code&gt;Connection&lt;/code&gt; object using multiprocessing’s own &lt;code&gt;reduce_connection&lt;/code&gt; function.
To work like a real &lt;code&gt;Connection&lt;/code&gt; object, it dispatches any attribute access to
the wrapped connection by implementing &lt;code&gt;__getattr__&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;To make sure the connection remains wrapped even after a trip through
&lt;code&gt;pickle.dumps()&lt;/code&gt; and &lt;code&gt;pickle.loads()&lt;/code&gt;, &lt;code&gt;_ConnectionWrapperBuilder&lt;/code&gt; is used for
rebuilding the connection and rewrapping it on deserialization.&lt;/p&gt;
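
&lt;p&gt;The same wrap-and-rebuild technique can be illustrated without &lt;code&gt;multiprocessing&lt;/code&gt;. The sketch below is not the &lt;code&gt;ConnectionWrapper&lt;/code&gt; described above. It applies the idea to an open file, another object that &lt;code&gt;pickle&lt;/code&gt; refuses to serialize, and &lt;code&gt;FileWrapper&lt;/code&gt; and &lt;code&gt;_rebuild_file_wrapper&lt;/code&gt; are hypothetical names. It assumes the code runs as a script, so that &lt;code&gt;pickle&lt;/code&gt; can find the builder function by name.&lt;/p&gt;

```python
import os
import pickle
import tempfile


def _rebuild_file_wrapper(path, mode):
    # Called by pickle on load: reopen the file and wrap it again.
    return FileWrapper(open(path, mode))


class FileWrapper:
    """Hypothetical wrapper that lets an open file survive pickling."""

    def __init__(self, wrapped):
        self._wrapped = wrapped

    def __reduce__(self):
        # Tell pickle how to rebuild us: a callable plus its arguments.
        return (_rebuild_file_wrapper, (self._wrapped.name, self._wrapped.mode))

    def __getattr__(self, name):
        # Only called when normal lookup fails; forward to the wrapped object.
        if name == "_wrapped":
            raise AttributeError(name)
        return getattr(self._wrapped, name)


with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as handle:
    handle.write("hello")
    path = handle.name

original = FileWrapper(open(path, "r"))
copy = pickle.loads(pickle.dumps(original))
is_wrapped = isinstance(copy, FileWrapper)  # still wrapped after the round trip
contents = copy.read()                      # read() is forwarded to the file

original.close()
copy.close()
os.remove(path)
```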

&lt;p&gt;Given this wrapper, you can make your own &lt;code&gt;Pipe&lt;/code&gt; function which creates a new
pipe and wraps the connection objects for you.&lt;/p&gt;

&lt;p&gt;Hopefully this trick will be of help until the bug is fixed in Python.&lt;/p&gt;
</content>
        
          <category term="multiprocessing"/>
        
          <category term="pykka"/>
        
          <category term="python"/>
        
      </entry>
    
  

</feed>

