Hello. Today's post is mainly for beginners. This time I will introduce a debugging tool called byebug. Unlike binding.pry, you usually do not have to add a gem yourself, because the Gemfile that Rails generated in this period typically already includes byebug. You can debug just by writing byebug inside model or controller code. However, simply writing a bare byebug in a view template will not stop execution there.
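For reference, this is roughly what the relevant Gemfile entry looks like in a freshly generated Rails 5 or 6 application. The project's actual Gemfile is not shown in this post, so treat this as an assumption about the setup rather than a copy of it; if the entry is missing, adding it and running bundle install should be enough.

```ruby
# Gemfile (assumed default entry; check your own project's Gemfile)
group :development, :test do
  # Call 'byebug' anywhere in the code to stop execution and get a debugger console
  gem 'byebug', platforms: [:mri, :mingw, :x64_mingw]
end
```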
scraping.rb
class Scraping < ApplicationRecord
  def self.get_infomation
    require 'mechanize'
    agent = Mechanize.new
    links = []
    current_page = agent.get("https://talent-dictionary.com/s/jobs/3/20")
    elements = current_page.at('.home_talent_list_wrapper')
    boxs = elements.search('.item')
    roks = boxs.search('.right')
    qqqs = roks.search('a')
    eees = qqqs.search('.title')
    eees.each do |eee|
      links << eee.inner_text
    end
    links.each do |link|
      get_personal_infomation('https://talent-dictionary.com/' + link)
    end
  end
  def self.get_personal_infomation(link)
    agent = Mechanize.new
    personal_page = agent.get(link)
    aaas = personal_page.at('.talent_name_wrapper')
    ages = aaas.at('.age').inner_text.delete('age').to_i if aaas.at('.age')
    names = aaas.at('h1').inner_text if aaas.at('h1')
    image_urls = personal_page.at('.main_image img').get_attribute('src') if personal_page.at('.main_image img')
    infomation = Infomation.where(name: names).first_or_initialize
    infomation.age = ages
    infomation.image_url = image_urls
    byebug
    infomation.save
  end
end
Then run Scraping.get_infomation in the Rails console. Execution stops at the byebug line and the surrounding source is printed:
[23, 32] in /home/ec2-user/environment/filebook/app/models/scraping.rb
   23:     ages = aaas.at('.age').inner_text.delete('age').to_i if aaas.at('.age')
   24:     names = aaas.at('h1').inner_text if aaas.at('h1')
   25:     image_urls = personal_page.at('.main_image img').get_attribute('src') if personal_page.at('.main_image img')
   26:     infomation = Infomation.where(name: names).first_or_initialize
   27:     infomation.age = ages
   28:     infomation.image_url = image_urls
   29:     byebug
=> 30:     infomation.save
   31:   end
   32: end
(byebug)
In this way you can pause processing inside the model. At the (byebug) prompt you can type a variable name or a method call to evaluate it.
(byebug) infomation
#<Infomation id: 14, age: 22, name: "Hirose Suzu", image_url: "https://images.talent-dictionary.com/uploads/image...", created_at: "2020-11-01 07:14:44", updated_at: "2020-11-01 07:14:44">
This shows the contents of the record. Next, if you enter infomation.name:
(byebug) infomation.name
"Hirose Suzu"
If you enter personal_page:
(byebug) personal_page
#<Mechanize::Page
{url
#<URI::HTTPS https://talent-dictionary.com/%E5%BA%83%E7%80%AC%E3%81%99%E3%81%9A>}
{meta_refresh}
{title "From the birth of Hirose Suzu to the present-Talent dictionary"}
{iframes
#<Mechanize::Page::Frame
nil
"//www.googletagmanager.com/ns.html?id=GTM-P49Z2S">
#<Mechanize::Page::Frame nil "https://www.youtube.com/embed/hY0oCSd6G78">}
{frames}
{links
#<Mechanize::Page::Link "Talent dictionary" "https://talent-dictionary.com/">
#<Mechanize::Page::Link "Talent dictionary" "https://talent-dictionary.com/">
#<Mechanize::Page::Link
"Hirose Suzu"
"https://talent-dictionary.com/%E5%BA%83%E7%80%AC%E3%81%99%E3%81%9A">
#<Mechanize::Page::Link
""
"https://www.facebook.com/sharer/sharer.php?u=https%3A%2F%2Ftalent-dictionary.com%2F%25E5%25BA%2583%25E7%2580%25AC%25E3%2581%2599%25E3%2581%259A">
#<Mechanize::Page::Link
""
"https://twitter.com/share?url=h
(The rest is omitted.)
In this way you can inspect everything the page object holds.
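Conceptually, what you type at the (byebug) prompt is evaluated against the local scope of the paused method, which is why infomation and personal_page are visible there. The following is a minimal plain-Ruby sketch of that idea using the standard Binding class. It is not byebug's actual implementation; Record and paused_scope are hypothetical stand-ins so that it runs without Rails or any gem.

```ruby
# Hypothetical stand-in for the Infomation model so the sketch runs without Rails.
Record = Struct.new(:name, :age)

# Hypothetical method: builds a local variable, then hands back its own scope.
def paused_scope
  infomation = Record.new("Hirose Suzu", 22)
  binding # capture the method's local scope, roughly what a breakpoint exposes
end

scope = paused_scope

# List the local variables visible at the "paused" point.
puts scope.local_variables.inspect

# Typing `infomation.name` at a prompt amounts to evaluating that text
# against the captured scope.
puts scope.eval("infomation.name")
puts scope.eval("infomation.age + 1")
```

Anything that is a local variable or an instance variable at the paused line can be inspected the same way, which is also why variables assigned after the byebug line are still nil at the prompt.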
You can do the same in a controller.
infomations_controller.rb
def show
  @infomation = Infomation.find(params[:id])
  byebug
  @comments = @infomation.comments
  @favorite = @infomation.favorites
end
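Then start the server and open the show page in the browser. Processing stops at byebug, and the terminal running the server shows: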
[3, 12] in /home/ec2-user/environment/filebook/app/controllers/infomations_controller.rb
    3:     @infomations = Infomation.all
    4:     @infomations_rankings = Infomation.find(Favorite.group(:infomation_id).order('count(infomation_id) desc').limit(3).pluck(:infomation_id))
    5:   end
    6:   def show
    7:     @infomation = Infomation.find(params[:id])
    8:     byebug
=>  9:     @comments = @infomation.comments
   10:     @favorite = @infomation.favorites
   11:   end
   12: end
(byebug)
If you enter @infomation in this state:
(byebug) @infomation
#<Infomation id: 1, age: 21, name: "Kanna Hashimoto", image_url: "https://images.talent-dictionary.com/uploads/image...", created_at: nil, updated_at: nil>
and you can see the contents of the record.
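One practical point: a byebug line pauses every time it is executed, so in get_personal_infomation it stops once for each link in the loop. A common Ruby idiom is to guard the call with a modifier if, for example byebug if ages.nil?, so that it only stops on the records you care about. The sketch below shows the guard pattern in plain Ruby. The people list and pause_here are hypothetical stand-ins for the scraped data and for the byebug call, used only so the example runs without Rails or the gem.

```ruby
# Hypothetical sample data standing in for scraped records.
people = [
  { name: "Example Person A", age: 22 },
  { name: "Example Person B", age: nil }, # missing age: the case worth inspecting
  { name: "Example Person C", age: 21 },
]

paused_on = []
# Hypothetical stand-in for `byebug`: records where execution would have paused.
pause_here = ->(person) { paused_on << person[:name] }

people.each do |person|
  # The modifier-if guards the breakpoint, like `byebug if ages.nil?`.
  pause_here.call(person) if person[:age].nil?
end

puts paused_on.inspect
```

In the real model the stand-in would simply be replaced by byebug itself, with the condition written in terms of ages, names, or whatever value looks wrong.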