Perform a large amount of csv export (output) of log information etc. on the WEB application with Rails application

Too much output data takes time

I made a function to output log information from DB because I want to analyze log information from the client.
As a result of outputting the log after the actual operation started, if the log is about 1 or 2 days, it will take time, but csv could be output. However, it takes too much time for the whole period to be useful. I fell into such a situation. When I actually tried it, it was not output even after 15 minutes, and it was in the state of "Surely this can not be used ...", so I tried to directly hit SQL to output the log.

The method used until then

Let's output DB data to CSV with Rails | Miscellaneous notes that are not already known | note I used to output the common csv output like this using the CSV library. (Is this this? library csv (Ruby 2.7.0 Reference Manual)
This time, issuing a query with raw SQL and outputting a log also uses the CSV library, but the usage is slightly different.
The SQL issued by ActiveRecord is converted to raw SQL using the to_sql method and output.

app/services/large_csv_exporter_service.rb

class LargeCsvExporterService
  require 'csv'

  attr_accessor :table, :column, :host, :username, :password, :database

  def initialize(table, column, host, username, password, database)
    @table = table
    @column = column
    @host = host
    @username = username
    @password = password
    @database = database
  end

  def set_file_name(file_name)
    @csv_file_name = file_name
  end

  def set_query(query)
    results = @client.query(query)
    results
  end

  def write_csv(results)
    # File.write(@csv_file_name, encoding: Encoding::SJIS) unless File.exist? @csv_file_name
    CSV.open("./tmp/csv_export/#{@csv_file_name}", "w:sjis") do |csv|
      csv << results.fields
      results.each do |row|
        csv << row.values
      end
    end
  end

  def export_csv_file
    filepath = "./tmp/csv_export/#{@csv_file_name}"
    stat = File::stat(filepath)
    return filepath, stat, @csv_file_name
  end

  def delete_created_csv_file
    if File.exist?("./tmp/csv_export/#{@csv_file_name}")
      File.delete("./tmp/csv_export/#{@csv_file_name}")
    end
  end

  def set_client
    # cache_rows:False to avoid using memory
    @client = Mysql2::Client.new(
      host: host,
      username: username,
      password: password,
      database: database,
      stream: true,
      cache_rows: false,
    )
  end
end

I'm using service in a controller. From csv_export = LargeCsvExporterService.new ("access_logs", "*", ENV ["DB_D_HOST"], ENV ["DB_D_USERNAME"], "", ENV ["DB_D_NAME"]) around the middle. ʻENV ["DB_D_HOST"] `refers to the DB name or user name of .env (dotenv).
I can't write pretty code, so please take a good look.

app/controllers/admins/access_logs_controller.rb

    def export_csv(params_q)
      @q = AccessLog.order(id: :asc).ransack(params_q)
      user_type_param = params[:q][:user_type].empty? ? ["user", "admin", "supplier"] : params[:q][:user_type].split(':')[1]
      methoda_type_param = params[:q][:method_type] == "ALL_TYPE" ? AccessLog.distinct.pluck(:method_type).compact : params[:q][:method_type]
      access_dates = AccessLog.order(access_date: :asc).to_a
      from_time = params[:q][:created_at_gteq].empty? ? access_dates.first.access_date : params[:q][:created_at_gteq].to_time
      to_time = params[:q][:created_at_lt].empty? ? access_dates.last.access_date : params[:q][:created_at_lt].to_time

      sql_query = AccessLog.where(user_type: user_type_param).where(method_type: methoda_type_param).where(created_at: [from_time..to_time]).to_sql

      csv_export = LargeCsvExporterService.new("access_logs", "*", ENV["DB_D_HOST"], ENV["DB_D_USERNAME"], "", ENV["DB_D_NAME"])
      csv_export.set_file_name("AccessLog_#{Time.zone.now.strftime("%Y%m%d%S")}.csv")
      csv_export.set_client
      # results = csv_export.set_query("SELECT * FROM access_logs WHERE created_at BETWEEN '#{from_time}' AND '#{to_time}' AND method_type = '#{methoda_type_param}' AND user_type = '#{user_type_param}'" )
      results = csv_export.set_query(sql_query)
      csv_export.write_csv(results)
      filepath, stat, csv_file_name = csv_export.export_csv_file
      send_result = send_file(filepath, :filename => csv_file_name, :length => stat.size)
      # csv_export.delete_created_csv_file
    end

The sql query outputs something like this in one line

SELECT `access_logs`.* 
FROM   `access_logs` 
WHERE  `access_logs`.`user_type` = 'user' 
       AND `access_logs`.`method_type` IN ( 'GET', 'POST' ) 
       AND `access_logs`.`created_at` BETWEEN 
           '2020-08-25 00:00:00' AND '2020-09-27 23:55:35' 

Well, like this, I haven't actually measured the time, but now I can output about 100,000 files in about 10 seconds.

Recommended Posts

Perform a large amount of csv export (output) of log information etc. on the WEB application with Rails application
Roughly the flow of web application development with Rails.
Create a large number of records with one command using the Ruby on Rails seeds.rb file
[Rails] Put the same restrictions on multiple pieces of information
The first WEB application with Spring Boot-Making a Pomodoro timer-
Check the operation of two roles with a chat application
A note about the seed function of Ruby on Rails
About the solution to the problem that the log of logback is not output when the web application is stopped
Graph the sensor information of Raspberry Pi in Java and check it with a web browser
The story of the first Rails app refactored with a self-made helper
Comparison of WEB application development with Rails and Java Servlet + JSP
[Rails / Heroku / MySQL] How to reset the DB of Rails application on Heroku
A story that struggled with the introduction of Web Apple Pay
I made a tool to output the difference of CSV file
[Rails] Implementation of CSV export function
[Ruby on Rails] CSV output function
Build a web application with Javalin
[Java] Deploy a web application created with Eclipse + Maven + Ontology on Heroku
Volume of trying to create a Java Web application on Windows Server 2016