[Rails] N + 1 is evil! If it does occur, resolve it for the time being! !! Is dangerous

N+1

That is the root of all evil! Performance enemy! !! It is a bad existence that you should repel as soon as you find it! !! !!

I think many people think that.

In fact, that's right, and even if you look it up online, you'll find plenty of know-how to eliminate N + 1.

Basically, you can correct it according to the know-how, but in rare cases it is better not to eliminate N + 1, so I will introduce it with a concrete example.

What I want to talk about in this article

What I want to talk about in this article is why N + 1 should be fixed? about it. N + 1 is so famous that I have a lot of know-how to fix it, but I feel that I have forgotten why I should fix it. One thing I would like to mention is that fixing N + 1 is not because we want to reduce the number of queries issued. Because I want to improve performance! In other words, if the performance doesn't improve with fewer queries, you don't need to fix N + 1.

What is N + 1

Let's review the typical N + 1 first.

I will explain using the model below.

class User
  has_many :articles
end

class Article
  belongs_to :user
  has_many :images
end

class Image
  belongs_to :article
end

Consider an API that uses this model to get a list of articles for a particular user. The response is as follows

{
  articles: [
    id: 1
    body: "hogehoge"
    images: [
      {
        id: 1
        alt: "alt"
        src: "https://example.com/hoge1.img"
      }
    ]
  ]
}

Everyone loves this (?) If you use Jbuilder, you will get the following implementation.

class ArticlesController
  def index
    #Get 10 items in descending order of update date(using kaminari)
    @articles = Articles.where(user_id: params[:user_id]).order(updated_at: :desc).page(params[:page]).per(10)
  end
end

# articles/index.json.jbuilder
json.articles do
  json.array!(@articles) do |article|
    json.id article.id
    json.body article.body
    json.images do
      json.array!(article.images) do |image|
        json.id image.id
        json.alt image.alt
        json.src image.src
    end
  end
end

What if I do the above? As shown below, SQL to get the images table will be issued as many as the number of articles.

--1 query to get articles
SELECT `articles` FROM `articles` WHERE `articles`.`user_id` = 1 ORDER BY `articles`.`updated_at` DESC LIMIT 10

--Images acquisition query is issued for the number of articles
SELECT `images`.* FROM `images` WHERE `images`.`article_id` = 1
SELECT `images`.* FROM `images` WHERE `images`.`article_id` = 2
SELECT `images`.* FROM `images` WHERE `images`.`article_id` = 3
SELECT `images`.* FROM `images` WHERE `images`.`article_id` = 4
SELECT `images`.* FROM `images` WHERE `images`.`article_id` = 5
...
SELECT `images`.* FROM `images` WHERE `images`.`article_id` = 10

Rails has a feature to solve this. That is preload, ʻeager_load, ʻincludes. I won't go into these details in this article, but this time I'll use preload to eliminate N + 1.

class ArticlesController
  def index
    # preload(:images)add to
    @articles = Articles.where(user_id: params[:user_id]).preload(:images).order(updated_at: :desc, id: :desc).page(params[:page]).per(10)
  end
end
--1 query to get articles
SELECT `articles` FROM `articles` WHERE `articles`.`user_id` = 1 ORDER BY `articles`.`updated_at` DESC LIMIT 10

--1 query for images acquisition
SELECT `images`.* FROM `images` WHERE `images`.`article_id` in (1, 2, 3, 4, 5, 6, 7, 8, 9, 10)

Queries are expensive to issue, so reducing the number of queries can significantly improve performance. In this example as well, 11 queries have been significantly reduced to 2 queries.

N + 1 repulsion completed! !!

Patterns that should not eliminate N + 1

In the above example, N + 1 was successfully resolved, but if there are many users who have a lot of images, was this correspondence really good? As an extreme example, let's consider the case where an average of 1,000 images are linked per article.

In this case, if the above response is left as it is, an image of 1,000 * 10 = 10,000 will be returned, and the response will become too huge and the performance will deteriorate. Therefore, in most cases, the article list will be changed to a specification that returns a part of the image (such as only the first 5).

Let's change the specifications. I changed the jbuilder part without changing the controller.

# articles/index.json.jbuilder
json.articles do
  json.array!(@articles) do |article|
    json.id article.id
    json.body article.body
    json.images do
      #Get only the first 5
      json.array!(article.images.first(5)) do |image|
        json.id image.id
        json.alt image.alt
        json.src image.src
    end
  end
end

When executed with this, the number of images of the response will be up to 5 for each article. Congratulations ... it won't be! !! !! !!

What's wrong? Looking at the queries, only the same two queries as in the previous example were issued.

--1 query to get articles
SELECT `articles` FROM `articles` WHERE `articles`.`user_id` = 1 ORDER BY `articles`.`updated_at` DESC LIMIT 10

--1 query for images acquisition
SELECT `images`.* FROM `images` WHERE `images`.`article_id` in (1, 2, 3, 4, 5, 6, 7, 8, 9, 10)

Did you understand the problem? In this example, the problem is the images query that resolved N + 1 last time. According to this precondition, 1,000 images are linked to one article. This means that this image query retrieves 10,000 image objects. While ActiveRecord is convenient, the size of the object is very large. Creating 10,000 of those objects consumes a lot of memory. Moreover, this time I will only use the first 5 cards.

So what happens if you remove preload and return to the state before N + 1 was resolved?

--1 query to get articles
SELECT `articles` FROM `articles` WHERE `articles`.`user_id` = 1 ORDER BY `articles`.`updated_at` DESC LIMIT 10

--Images acquisition query is issued for the number of articles
SELECT `images`.* FROM `images` WHERE `images`.`article_id` = 1 limit 5
SELECT `images`.* FROM `images` WHERE `images`.`article_id` = 2 limit 5
SELECT `images`.* FROM `images` WHERE `images`.`article_id` = 3 limit 5
SELECT `images`.* FROM `images` WHERE `images`.`article_id` = 4 limit 5
SELECT `images`.* FROM `images` WHERE `images`.`article_id` = 5 limit 5
...
SELECT `images`.* FROM `images` WHERE `images`.`article_id` = 10 limit 5

The image query will return to be issued as many as the number of articles, but since we will only get 5 each, the number of ActiveRecord objects generated will be significantly reduced from 10,000 to 50.

It depends on the performance of the execution environment, so it is not absolute, but it is often better to save memory than to eliminate N + 1. (* Which response is actually appropriate cannot be known without performance verification in an environment equivalent to the execution environment) If N + 1 is evil! If you think that you absolutely have to fix it, when you see the images query issued as described above, you will add preload, which may significantly increase the memory usage and cause performance degradation. Maybe.

Finally

N + 1 is very famous and easy to see, and Rails can solve it quickly just by adding preload etc., so I think that it is often solved without thinking deeply. However, if you look ahead to has_many as in this example, consider how many cases you can expect before implementing it. If you get all of them without considering the number of cases, it may cause memory shortage, performance degradation, and in the worst case, freeze the server.

Also, I wrote a little at the beginning, but let's recognize the purpose of fixing N + 1. Would you like to confirm that the number of queries has decreased when you fix N + 1 and complete the response? Eliminating N + 1 is not to reduce issuing queries, but to improve performance. Don't just check for fewer queries, but make sure you're improving performance.

From the viewpoint of improving performance, in addition to N + 1, there are various viewpoints such as reduction of the number of processes such as memory usage and loops. Keep in mind that N + 1 is just one way to improve performance, and keep other perspectives in mind.

Recommended Posts

[Rails] N + 1 is evil! If it does occur, resolve it for the time being! !! Is dangerous
Isn't it reflected even if the content is updated in Rails?
Install Amazon Corretto (preview) for the time being
Use Java external library for the time being
Run Dataflow, Java, streaming for the time being
Is it faster if the docker bind mount is readonly?
Command to try using Docker for the time being