We use a small blogging project to explain the features of PowerStation, i.e., the types of performance issues that it discovers.
Loop Invariant Query
A loop invariant query is a query that sits inside a loop but stays the same in every iteration. For example,
[1].each do
user = User.first
user.blog.size
...
end
In this piece of code, the queries User.first and user.blog.size are issued in every loop iteration with unchanged parameters.
Usually there is no need to execute the same query multiple times, especially when the query is slow. To fix this issue, you can move such queries out of the loop:
user = User.first
user.blog.size
[1].each do
...
end
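The effect of hoisting can be sketched without a database. The snippet below is a minimal illustration, not PowerStation code: QueryCounter and its first_user method are hypothetical stand-ins for a model call such as User.first, and they only count how many queries would be issued.

```ruby
# Hypothetical stand-in for a database connection: it only counts queries.
class QueryCounter
  attr_reader :queries

  def initialize
    @queries = 0
  end

  # Plays the role of User.first: one query per call.
  def first_user
    @queries += 1
    { id: 1, name: "alice" }
  end
end

# Loop invariant query left inside the loop: one query per iteration.
def names_in_loop(db, items)
  items.map do |item|
    user = db.first_user
    "#{user[:name]}:#{item}"
  end
end

# Same result with the query hoisted out of the loop: one query in total.
def names_hoisted(db, items)
  user = db.first_user
  items.map { |item| "#{user[:name]}:#{item}" }
end
```

Both methods return the same strings; only the number of simulated queries differs, growing with the number of items in the first version and staying at one in the second.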
Inefficient API
Rails provides many APIs to interact with databases, but some APIs issue more efficient queries than others. PowerStation uses a static checker to look for inefficient API uses and suggests replacement APIs. For instance,
User.where(name: name).any?
The above code issues a query to check whether there exists a user with a specific name. The any? API issues a COUNT query and returns a boolean. However, replacing it with
User.where(name: name).exists?
is more efficient, since exists? issues a LIMIT 1 query, which usually runs faster than COUNT.
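The intuition behind COUNT versus LIMIT 1 can be sketched in plain Ruby. This is only a model under stated assumptions: USERS is a hypothetical in-memory table, and both methods scan it linearly, whereas a real database may use indexes, so the actual gap varies.

```ruby
# Hypothetical in-memory "table"; each hash plays the role of a row.
USERS = [
  { id: 1, name: "alice" },
  { id: 2, name: "bob" },
  { id: 3, name: "alice" },
  { id: 4, name: "carol" }
].freeze

# COUNT-style check: every row is examined before the answer is known.
# Returns [answer, rows_examined].
def any_via_count(rows, name)
  examined = 0
  matches = 0
  rows.each do |row|
    examined += 1
    matches += 1 if row[:name] == name
  end
  [matches > 0, examined]
end

# LIMIT 1-style check: the scan stops at the first matching row.
# Returns [answer, rows_examined].
def exists_via_limit(rows, name)
  examined = 0
  rows.each do |row|
    examined += 1
    return [true, examined] if row[:name] == name
  end
  [false, examined]
end
```

Both checks agree on the boolean answer. The LIMIT 1-style scan can stop early when a match exists, and it degrades to a full scan only when there is no match.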
The following is the list of APIs that PowerStation checks, with their replacements:
any?, to be replaced with (=>) exists?
where.first? => find_by
* => *.except(order)
each.update => update_all
.count => size
.map => .pluck
pluck.sum => sum
.pluck + pluck => SQL UNION
if exists? find else create end => find_or_create_by
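To give a feel for how a pattern-based check over the list above might work, here is a toy sketch. It is not PowerStation's checker, whose implementation is not shown here. The REPLACEMENTS table, its regular expressions, and suggest_replacements are all hypothetical, and matching source text with regular expressions is a rough approximation of real static analysis.

```ruby
# Hypothetical pattern table built from a few entries of the list above.
# Each regular expression is a rough textual approximation, not a real parser.
REPLACEMENTS = [
  [/\.any\?/,                "exists?"],
  [/\.where\(.*\)\.first\b/, "find_by"],
  [/\.pluck\(.*\)\.sum\b/,   "sum"],
  [/\.count\b/,              "size"]
].freeze

# Returns one suggestion per matching pattern for every line of source text.
def suggest_replacements(source)
  suggestions = []
  source.each_line.with_index(1) do |line, number|
    REPLACEMENTS.each do |pattern, replacement|
      next unless line =~ pattern
      suggestions << { line: number, replace_with: replacement }
    end
  end
  suggestions
end
```

For instance, passing a two-line string containing a .any? call and a .pluck(...).sum chain yields one suggestion per line, each carrying the line number and the replacement API.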
Common Subexpression
Some queries share common subexpressions. For example,
user.blogs.size
user.blogs
This piece of code issues two queries: SELECT COUNT(*) from blogs where user_id = ? and SELECT * from blogs where user_id = ?. They share the subexpression where user_id = ? (with the same parameter). Rewriting these queries to compute the common subexpression first may accelerate them. In this example, one can rewrite the code as:
@u = user.blogs.load
@u.size
Only one query is issued: the first line loads the blogs, and in the second line @u.size computes the count from the data already loaded in memory. (Calling @u.count instead would issue a separate COUNT query.)
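The saving can be modeled with a small query-counting class. This is a sketch under assumptions: FakeBlogs is a hypothetical stand-in for an association such as user.blogs, with a size method that reads the in-memory rows once they are loaded and falls back to a COUNT-style query otherwise.

```ruby
# Hypothetical stand-in for an association such as user.blogs.
class FakeBlogs
  attr_reader :queries

  def initialize(rows)
    @rows = rows
    @loaded = nil
    @queries = 0
  end

  # Plays the role of SELECT COUNT(*): always one query.
  def count
    @queries += 1
    @rows.length
  end

  # Plays the role of SELECT *: one query, result kept in memory.
  def load
    unless @loaded
      @queries += 1
      @loaded = @rows.dup
    end
    self
  end

  # Uses the in-memory rows when loaded, otherwise falls back to count.
  def size
    @loaded ? @loaded.length : count
  end

  # Returns the rows, loading them first if needed.
  def to_a
    load
    @loaded
  end
end
```

Calling count and then to_a costs two simulated queries, while calling load, then size, then to_a costs one.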
Inefficient Rendering
Rails provides many handy APIs to render objects, but some of them are not very efficient. For example,
@users = User.all
@users.each do |u|
link_to u.id
end
link_to is a Ruby function that takes a user id and generates a URL link for that id. Such rendering functions are usually non-trivial. However, once one link has been computed, the remaining links can be produced by simply replacing the id in the first link with another id:
@users = User.all
one_link = link_to @users[0].id
@users.each do |u|
one_link.sub(@users[0].id.to_s, u.id.to_s)
end
Since string replacement is much faster than link_to rendering, using string replacement is much more efficient when there are a great number of objects to render. However, such a change may harm code readability, so you may only want to apply it when rendering is the bottleneck.
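The idea can be tried outside Rails with a stand-in renderer. In this sketch, LinkRenderer and link_for are hypothetical substitutes for link_to, and the HTML they produce is invented for illustration. Note the stated limitation: substitution is only safe when the first id's digits appear nowhere else in the rendered link.

```ruby
# Hypothetical stand-in for an expensive rendering helper such as link_to.
class LinkRenderer
  attr_reader :calls

  def initialize
    @calls = 0
  end

  # Renders one link and records that a rendering call happened.
  def link_for(id)
    @calls += 1
    %(<a href="/users/#{id}">user #{id}</a>)
  end
end

# Baseline: one rendering call per id.
def render_each(renderer, ids)
  ids.map { |id| renderer.link_for(id) }
end

# Render once, then derive the other links by string substitution.
# Only safe when the first id's digits appear nowhere else in the link.
def render_by_substitution(renderer, ids)
  return [] if ids.empty?
  first = ids.first.to_s
  template = renderer.link_for(ids.first)
  ids.map { |id| template.gsub(first, id.to_s) }
end
```

Both methods produce the same links for a list of ids, but the second one calls the renderer once regardless of how many ids there are.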
Dead Store Query
A dead store query is a query that is repeatedly issued to load different database contents into the same memory object, while the object is not used between the reloads. For example,
blog.reload
blog.reload
render blog
Here each blog.reload issues a query. However, there is no reference to blog between the two blog.reload calls, so the first reload is unnecessary. Removing it avoids the redundant query:
blog.reload
render blog
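A small model makes the wasted query visible. FakeBlog below is a hypothetical stand-in for a model object; its reload method only counts queries and refreshes the in-memory title from a hash that plays the role of the database row.

```ruby
# Hypothetical stand-in for a model object backed by one database row.
class FakeBlog
  attr_reader :title, :queries

  def initialize(row)
    @row = row            # hash playing the role of the database row
    @title = row[:title]
    @queries = 0
  end

  # Plays the role of blog.reload: one query that overwrites the in-memory copy.
  def reload
    @queries += 1
    @title = @row[:title]
    self
  end
end

# The first reload is a dead store: its result is overwritten before any read.
def render_with_dead_store(blog)
  blog.reload
  blog.reload
  "<h1>#{blog.title}</h1>"
end

# Same output with the dead reload removed.
def render_with_single_reload(blog)
  blog.reload
  "<h1>#{blog.title}</h1>"
end
```

The rendered output is identical in both cases; the first version issues two simulated queries and the second issues one.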
Redundant Data Retrieval
By default, the queries issued by Rails load the full tuple from the database into objects, unless the fields to load are specified explicitly. Usually only a few fields are used, so loading the full tuple is unnecessary. For example,
blogs = Blog.all.order('created_at')
blogs.each { |blog| link_to blog.id }
Only the blog id is used later in the program to produce URL links. However, the query loads the full object. A more efficient way is to project only the fields that are used later:
blogs = Blog.all.order('created_at').select('id')
blogs.each { |blog| link_to blog.id }
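The difference in loaded data can be illustrated with plain hashes. In this sketch, BLOG_ROWS is a hypothetical table, and load_all and load_projected are made-up helpers that mimic an ordered query without and with a projection. They are not Rails APIs.

```ruby
# Hypothetical table; each hash plays the role of a full database tuple.
BLOG_ROWS = [
  { id: 2, title: "Second", body: "long text ...", created_at: 20 },
  { id: 1, title: "First",  body: "long text ...", created_at: 10 }
].freeze

# Plays the role of Blog.all.order('created_at'): full rows, every column.
def load_all(rows)
  rows.sort_by { |row| row[:created_at] }.map(&:dup)
end

# Plays the role of adding .select('id'): same order, only requested columns.
def load_projected(rows, *columns)
  rows.sort_by { |row| row[:created_at] }
      .map { |row| row.select { |key, _| columns.include?(key) } }
end

# Counts how many column values ended up in memory.
def values_loaded(records)
  records.sum(&:size)
end
```

With these two rows, the projected load keeps one value per record instead of four, while the ids and their order stay the same.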