Green Threads, Performance Ruby and Mysql

Today I got curious and tried a little experiment. I’m currently responsible for a Ruby application with a performance requirement. Without getting into too much detail, it needs to do a fair amount of analysis on text moving through it, and log that text to a database.

An obvious optimization is to do the analysis and logging in parallel, but Ruby uses green threads (scheduled inside the interpreter, not by the operating system), so this will only help if the Mysql extension is smart enough to yield while it is waiting for a query.

This is the rub of green threading, of course: you’re only really threaded if the calls you are making explicitly support it.

Let’s find out …

require 'mysql'

db = Mysql.connect 'localhost', 'foo', 'bar', 'baz', nil, '/tmp/mysql.sock'

db.query <<-END
create table delete_me (id integer auto_increment not null, primary key(id))
END

# 100 rows, so the three-way join below has a million rows to chew on
insert = "insert into delete_me values #{(['(NULL)'] * 100).join(',')}"
db.query insert

thread = Thread.new do
  while true
    puts 'hello from thread'
    sleep 0.5
  end
end

db.query 'select * from delete_me as one join delete_me as two join delete_me as three'

So. If I see “hello from thread” while the last query is running, then db.query is a good green threads citizen and yields. The result? No dice, I’m afraid. Nothing prints until the query is done.
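
The same probe can be written without a database at all. Here is a rough sketch (ticks_during is a made-up helper name, not part of Ruby or the Mysql module) that counts how often a background thread gets to run while some block of code executes. A count of zero means the block starved every other thread, which is what the query above did.

```ruby
# Hypothetical helper: runs the given block while a background thread
# increments a counter, and returns how many times that thread got to run.
def ticks_during(interval = 0.05)
  ticks = 0
  ticker = Thread.new do
    loop do
      sleep interval
      ticks += 1
    end
  end
  yield
  ticks
ensure
  # Always stop the background thread, even if the block raised.
  if ticker
    ticker.kill
    ticker.join
  end
end

# Kernel#sleep is thread-aware, so the ticker should get several turns here.
puts ticks_during { sleep 0.3 }
```

Swapping the sleep for a call into a C extension is the interesting case: if the count stays at zero, that call is not yielding.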

Given that, what are your options? Clearly the following doesn’t actually save you anything, because the whole interpreter stalls while the writer thread sits inside the query (create_many batches many inserts into one):

writer = Thread.new {LoggedMessage.create_many messages}

Instead, you’re left with:

writer = Process.fork {LoggedMessage.create_many messages}

and later …

Process.waitpid writer
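
Put together, the fork approach might look something like the sketch below. It assumes a Unix-like system (fork is not available everywhere), and write_batch and analyse_while_logging are invented names: write_batch just sleeps to stand in for LoggedMessage.create_many, so nothing here touches a real database.

```ruby
# Hypothetical stand-in for LoggedMessage.create_many: blocks for a while,
# the way a slow insert would.
def write_batch(messages)
  sleep 0.2
  messages.length
end

# Forks a writer, does the "analysis" in the parent meanwhile, then reaps
# the child. Returns the analysis result and whether the child succeeded.
def analyse_while_logging(messages)
  writer = Process.fork do
    # A real child would open its own database connection here; sharing
    # the parent's socket between two processes is asking for trouble.
    write_batch(messages)
    exit!(0) # leave without running at_exit handlers inherited from the parent
  end

  analysed = messages.map { |m| m.upcase } # placeholder for the real analysis

  _pid, status = Process.wait2(writer)
  [analysed, status.success?]
end

analysed, logged = analyse_while_logging(%w[alpha beta gamma])
puts "analysed #{analysed.length} messages, child ok: #{logged}"
```

Checking the exit status matters more than it looks: a forked writer that dies quietly would otherwise lose a whole batch of log rows without the parent noticing.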

Forking a process for every batch is pretty wasteful. Clearly the better way would be to modify the Mysql module to work asynchronously. Maybe some day I’ll spend some of my copious free time on that.
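
For what it’s worth, here is roughly the shape an asynchronous call would take: send the query, then wait for the reply with IO.select, which parks only the calling thread. This is a sketch under assumptions, not the Mysql module’s API: a pipe stands in for the database socket, and wait_for_reply, demo_async_wait and fake_server are invented names.

```ruby
# Hypothetical wait step: blocks only the calling thread until io is readable
# or the timeout expires. Returns the reply line, or nil on timeout.
def wait_for_reply(io, timeout = 5)
  ready = IO.select([io], nil, nil, timeout)
  ready ? io.gets : nil
end

def demo_async_wait
  reader, writer = IO.pipe # stands in for the database socket

  # Hypothetical "server": answers after a short delay.
  fake_server = Thread.new do
    sleep 0.2
    writer.write "1 row\n"
    writer.close
  end

  reply = wait_for_reply(reader)
  fake_server.join
  reader.close
  reply
end

puts demo_async_wait
```

The point of the design is that the waiting happens in Ruby, where the scheduler can see it, rather than deep inside a C call where it cannot.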