Wednesday, August 8, 2012

Yet another note on destructors in Python

Here is another catch that may bite you when relying on destructor in Python. Consider the following snippet:

import MySQLdb, MySQLdb.cursors
dbc = MySQLdb.connect()

def execute(query):
    cursor = dbc.cursor()
    return cursor.fetchall() 

So when the cursor object in execute function will be destroyed? C++ coders among us would probably expect that by the time program returns from execute function, cursor's destructor has already been run. That's apparently not true - cursor's reference count will just go to zero when function returns, BUT that does not mean that the garbage collector will recycle it immediately. The actual recycling may be delayed significantly.

That fact can have very buggy and hard-to-debug side effects. Consider the following code:

execute("SELECT 'Hello World'")
cursor = dbc.cursor(MySQLdb.cursors.SSCursor)
cursor.execute("SELECT * FROM HugeTable")
rows = cursor.fetchmany(1000)
while rows:
    rows = cursor.fetchmany(1000)
That above can will randomly spit the following errors:
ProgrammingError: (2014, "Commands out of sync; you can't run this command now")
What happens is the following:

  • execute leaves cursor object for garbage collection
  • We create another cursor and running it for a long time
  • During the second query execution, the garbage collector decides to recycle the cursor object
  • cursor object destructor tries to close the cursor and to free any result sets associated with it
  • And the driver complains that we are trying to execute two operations on one connection simultaneously.

Solution? - Trivial:

def execute(query):
    cursor = dbc.cursor()
    rv = cursor.fetchall() 
    cursor.close() # Close the cursor implicitly
    return rv

The moral:

If your objects require tear-down operations - execute them yourself in a timely manner