I have tried building a graphing system twice before, but the data I used was underwhelming. As a rematch, I want to graph how the view counts of my Qiita articles change over time!
Today I will write the code that gets the view counts of the articles I have posted on Qiita.
https://qiita.com/itaya/items/262eec85e36763497664
I already wrote about scraping Qiita in the article above, so I will basically reuse that code.
crawler.rb
require 'nokogiri'
require 'mechanize'
require 'selenium-webdriver'
# Start a headless Chrome session and return the driver.
def selenium_init
  ua = 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.102 Safari/537.36'
  # caps = Selenium::WebDriver::Remote::Capabilities.chrome('chromeOptions' => { args: ["--user-agent=#{ua}", '--window-size=1280,800', '--incognito'] }) # incognito mode
  caps = Selenium::WebDriver::Remote::Capabilities.chrome('chromeOptions' => { args: ['--headless', '--no-sandbox', '--disable-setuid-sandbox', '--disable-gpu', "--user-agent=#{ua}", '--window-size=1280,800'] })
  # The last expression is the return value, so the caller receives the driver.
  Selenium::WebDriver.for :chrome, desired_capabilities: caps
end
driver = selenium_init
driver.navigate.to 'https://qiita.com/login'
driver.execute_script("document.getElementsByName('identity')[0].value = 'mail address'")
driver.execute_script("document.getElementsByName('password')[0].value = 'password'")
driver.execute_script("document.getElementsByName('commit')[0].click()")
sleep 1
The code above handles logging in.
Next, I get the list of my articles from my profile page, then visit each article to collect its view count.
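One thing to watch in the login step: the address and password are pasted straight into a JavaScript string, so a password containing a quote would break the script. Here is a minimal sketch of one way around that, assuming the credentials live in environment variables. The variable names `QIITA_EMAIL` and `QIITA_PASSWORD` and the helper `set_field_script` are made up for this example.

```ruby
require 'json'

# Hypothetical helper: builds the JavaScript that fills in a form field.
# String#to_json emits a quoted, escaped string literal, so quotes and
# backslashes in the value cannot break out of the script.
def set_field_script(field_name, value)
  "document.getElementsByName(#{field_name.to_json})[0].value = #{value.to_json}"
end

# Assumed environment variable names; the fallbacks are placeholders.
email    = ENV.fetch('QIITA_EMAIL', 'mail address')
password = ENV.fetch('QIITA_PASSWORD', 'password')

puts set_field_script('identity', email)
# With a real driver this would become:
#   driver.execute_script(set_field_script('identity', email))
#   driver.execute_script(set_field_script('password', password))
```

This also keeps the real password out of the source file, which matters if crawler.rb ever ends up in a public repository.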
crawler.rb
# Open my profile page, which lists my articles.
driver.navigate.to 'https://qiita.com/itaya'
sleep 1
doc = Nokogiri::HTML.parse(driver.page_source, nil, 'utf-8')
# Visit each listed article and print its view count.
doc.css('.AllArticleList__Item-mhtjc8-2').each do |div|
  driver.navigate.to "https://qiita.com" + div.css('.AllArticleList__ItemBodyTitle-mhtjc8-6')[0]['href']
  sleep 1
  article_doc = Nokogiri::HTML.parse(driver.page_source, nil, 'utf-8')
  # Keep only the first whitespace-separated token of the header text, which is the count.
  p article_doc.css('.it-Header_pv')[0].text.split(" ")[0]
end
That is roughly how it looks.
Running it prints the following:
"99"
"56"
"218"
"212"
"120"
"107"
"288"
"112"
"213"
"93"
"111"
"128"
"131"
"149"
"383"
"801"
"4629"
"510"
"1086"
The crawler above retrieves the view counts of the articles listed on the first page of my profile. It cannot reach the articles on the second and later pages yet, so I would like to build that part tomorrow...
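For the second and later pages, one possible shape is a loop that keeps requesting pages until one comes back empty. The sketch below is only a guess at how that could look: the `?page=N` URL pattern is an assumption I have not verified here, and `collect_article_paths` is a hypothetical helper. The fetching is passed in as a block, so the example runs with fake data instead of a browser.

```ruby
# Hypothetical helper: walks pages 1, 2, 3, ... and collects article paths
# until a page comes back empty or max_pages is reached.
# The block receives a page URL and must return an array of hrefs.
def collect_article_paths(base_url, max_pages: 20)
  paths = []
  (1..max_pages).each do |page|
    hrefs = yield("#{base_url}?page=#{page}") # assumed URL pattern
    break if hrefs.empty?

    paths.concat(hrefs)
  end
  paths.uniq
end

# Stand-in for the real fetch so the sketch runs without a browser.
fake_pages = {
  'https://qiita.com/itaya?page=1' => ['/itaya/items/aaa', '/itaya/items/bbb'],
  'https://qiita.com/itaya?page=2' => ['/itaya/items/ccc']
}
paths = collect_article_paths('https://qiita.com/itaya') { |url| fake_pages.fetch(url, []) }
p paths
```

In the real crawler, the block would navigate the driver to the URL, parse `driver.page_source` with Nokogiri, and return the `href` of each title link. The `max_pages` cap is there so a page that never comes back empty cannot cause an endless loop.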
**Note: sending too many requests will overload the server, so please be careful!**
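One way to stay polite is to enforce a minimum gap between requests in a single place instead of scattering `sleep 1` around. This is just a sketch: the `Throttle` class is made up for this example, and the 0.2 second interval is only there so the demo finishes quickly. A real crawl should wait longer.

```ruby
# Hypothetical helper: each call to #wait returns no sooner than
# `min_interval` seconds after the previous call to #wait returned.
class Throttle
  def initialize(min_interval)
    @min_interval = min_interval
    @last_time = nil
  end

  def wait
    now = Process.clock_gettime(Process::CLOCK_MONOTONIC)
    if @last_time
      remaining = @min_interval - (now - @last_time)
      sleep(remaining) if remaining > 0
    end
    @last_time = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  end
end

throttle = Throttle.new(0.2) # short interval so the demo finishes quickly
3.times do |i|
  throttle.wait
  puts "request #{i + 1}" # a real crawler would call driver.navigate.to here
end
```

The first call returns immediately, and every later call sleeps only for whatever part of the interval has not already passed while the page was loading and being parsed.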
That said, you could simply use Qiita's API instead... https://qiita.com/api/v2/docs#%E6%8A%95%E7%A8%BF
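For comparison, here is a rough sketch of what the API route might look like. As far as I understand the documentation, the `authenticated_user/items` endpoint, the `page` and `per_page` parameters, the Bearer token header, and the `page_views_count` field all exist, but treat every one of them as an assumption and check the docs linked above. The list endpoint may also return the count as null, in which case each item would have to be fetched individually. The helper names are made up, and nothing is sent over the network here.

```ruby
require 'json'
require 'net/http'
require 'uri'

# Builds (but does not send) a GET request for one page of the
# authenticated user's items. Endpoint and parameters are assumptions
# to be checked against the API documentation.
def build_items_request(token, page: 1, per_page: 20)
  uri = URI('https://qiita.com/api/v2/authenticated_user/items')
  uri.query = URI.encode_www_form(page: page, per_page: per_page)
  request = Net::HTTP::Get.new(uri)
  request['Authorization'] = "Bearer #{token}"
  request
end

# Maps each item's id to its view count (field names are assumptions).
def extract_view_counts(json_text)
  JSON.parse(json_text).map { |item| [item['id'], item['page_views_count']] }.to_h
end

# Made-up sample response so the sketch runs offline.
sample = '[{"id": "aaa", "title": "First", "page_views_count": 99}]'
p extract_view_counts(sample)

request = build_items_request('dummy-token', page: 2)
puts request.path
# Sending it for real would look like:
#   uri = request.uri
#   Net::HTTP.start(uri.host, uri.port, use_ssl: true) { |http| http.request(request) }
```

Compared with driving a browser, this avoids the login form and the CSS class names entirely, so it should be less likely to break when the page layout changes.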
Note: this is day 14 of my daily posting.