Code like a Local

Performance Optimization With Pry-debugger

Last week at Wimdu was a lot of fun. We set out to optimize the search results page, as it’s the main entry point for our visitors and it accounts for quite a big part of the overall user experience. So I teamed up with Jason, Cainã and Christian and spent the whole week on it!

Regarding page load time, we saved a good chunk with the usual suspects – optimizing images, the works. In this post I’ll explain one of the techniques we employed on the server side to shave around two thirds off the average server response time for requests to our search results page.

Finding the low-hanging fruit

We started by looking at our NewRelic stats and found out the actual search (the query we dispatch to our Solr instance) accounted for less than a third of the whole server response time – most of the time was spent rendering the views, which lazy-loaded ActiveRecord models through our presenter objects.

So I began refactoring this by eager-loading all the database interaction (and ideally any other form of IO) in the controller action. After this was done, we had a sequential block of code in the controller action that was easy to profile.

The next step was to find redundant queries to our DB and Redis, especially those made for each offer we display on the search results page (25 per page). One way would have been to comb through the explosion of code paths the controller action leads to – we need to display a lot of information, so that means wading through A/B testing layers, feature switches, heaps of presenters and other moving parts.
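To see why those per-offer queries add up, here is a toy sketch – all names are invented for illustration, not Wimdu’s code – contrasting one round trip per offer with a single batched load:

```ruby
# A fake data store that counts round trips, standing in for the DB/Redis.
class FakeStore
  attr_reader :query_count

  def initialize(rows)
    @rows = rows
    @query_count = 0
  end

  # One round trip per call – like a lazily loaded association.
  def find(id)
    @query_count += 1
    @rows[id]
  end

  # One round trip for the whole page – like eager loading.
  def find_all(ids)
    @query_count += 1
    @rows.values_at(*ids)
  end
end

rows = (1..25).to_h { |i| [i, "offer #{i}"] }

lazy = FakeStore.new(rows)
(1..25).each { |id| lazy.find(id) }   # 25 round trips, one per offer

eager = FakeStore.new(rows)
eager.find_all((1..25).to_a)          # 1 round trip for the whole page

puts "lazy: #{lazy.query_count} queries, eager: #{eager.query_count} query"
# => lazy: 25 queries, eager: 1 query
```

Multiply the lazy version by every association each presenter touches and the rendering time quickly dwarfs the search query itself.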

But in programming, as everyone knows, laziness is a virtue, so I let the computer do the work for me.

Debuggers are better than you (and me)

Given a big enough software project and enough time, there comes a point where no single person knows all the parts. And even if you are really familiar with some parts of the code, the code and the runtime are two very different things. We might know our source code, but debuggers have much more insight into live programs.

So the debugger seemed like the best companion to get insight at runtime. In our project we use pry and pry-debugger, so I wrote an acceptance spec and called binding.pry right before actually executing the simplest possible search.

Redis, ActiveRecord – I need you to get off my lawn

I wanted to set a breakpoint to whatever the Redis rubygem uses to communicate with the Redis server, and looking at the gem source code I figured that’d be Redis::Client#process. So I set a breakpoint:

pry> break Redis::Client#process

And then hit continue so that the spec runs until the first call to Redis::Client#process. There it stops and I can inspect where we are. In Ruby, from any point in the code you can call the Kernel#caller method, which returns the backtrace of the current thread of execution at that point. Taking advantage of this, we can know where we came from:

pry> p caller
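Outside of pry, caller behaves exactly the same way; a minimal standalone sketch:

```ruby
# Kernel#caller returns the current call stack as an array of
# "file:line:in method" strings, excluding the frame it is called from.
def innermost
  caller
end

def middle
  innermost
end

stack = middle
puts stack.first   # the frame that called innermost, i.e. inside middle
```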

Then I thought: if we did this for every point where Redis or ActiveRecord is called (in the latter case by tracing calls to ActiveRecord::Relation#to_a), and maybe dump all the backtraces into a file, we would have a nice starting point to go through all these places in the code and find redundancy to clean up.
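If you don’t have a pry session handy, Ruby’s built-in TracePoint API can collect the same kind of backtraces programmatically. A sketch, with an illustrative Cache class standing in for the Redis client:

```ruby
# Record a backtrace every time a specific method is called,
# using TracePoint's :call event (fires on Ruby method entry).
class Cache
  def read(key)
    "value for #{key}"
  end
end

backtraces = []

trace = TracePoint.new(:call) do |tp|
  # Only record calls to the method we care about.
  backtraces << caller if tp.defined_class == Cache && tp.method_id == :read
end

def fetch_offer(cache, id)
  cache.read("offer:#{id}")
end

trace.enable do
  cache = Cache.new
  3.times { |i| fetch_offer(cache, i) }
end

puts "#{backtraces.size} calls traced"   # => 3 calls traced
backtraces.each { |bt| puts bt.first }   # where each call came from
```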

Automating

Thanks to pry’s scriptability, I quickly found a way to automate this. I created a custom Pry command called trace that takes a method to trace and a filename to log to, with some backtrace sanitizing. It’s all in this gist. Keep in mind that the backtrace-sanitizing regexes should be adapted to your specific case. With the script in place, it was just a matter of typing the following into the pry-debugger session:

pry> trace Redis::Client#process my_redis_calls.log

And in the case of ActiveRecord:

pry> trace ActiveRecord::Relation#to_a my_ar_calls.log
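The trace command itself lives in the gist, but the core trick – redefine a method so every call appends its backtrace to a log file before delegating to the original – can be sketched in plain Ruby. This is an illustrative reimplementation, not the gist’s code:

```ruby
require "tmpdir"

# Wrap klass#method_name so each call logs the caller backtrace to logfile,
# then delegates to the original implementation.
def trace_method(klass, method_name, logfile)
  original = klass.instance_method(method_name)
  klass.define_method(method_name) do |*args, &block|
    File.open(logfile, "a") do |f|
      f.puts "--- #{klass}##{method_name} called from:"
      f.puts caller   # in a real app, sanitize/filter these frames
      f.puts
    end
    original.bind(self).call(*args, &block)
  end
end

# A stand-in class; in the post this role is played by Redis::Client#process.
class Client
  def process(command)
    "ran #{command}"
  end
end

log = File.join(Dir.mktmpdir, "my_calls.log")
trace_method(Client, :process, log)

def run_search(client)
  client.process("GET offers")
end

run_search(Client.new)
puts File.read(log).lines.first   # => --- Client#process called from:
```

After the traced code runs, the log file holds one backtrace per call – exactly the starting point described above for hunting down redundant queries.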

Conclusion

Once we had a file with all the backtraces pointing to calls to Redis and ActiveRecord, finding and fixing redundant queries was (mostly) a piece of cake. What we learned from this: to optimize efficiently, you really need to profile and measure what’s worth optimizing. And every chance you get, just let the computer do the work :)
