In Nokogiri, you can select a node by passing an element name to css or at_css. I wondered how to do the reverse and get the element name from a node that has already been selected, so I looked into it.
Getting the element name of a node is very simple. Suppose you have the following html file.
hello.html
<html>
<head>
<title>hello</title>
<meta charset="UTF-8">
</head>
<body>
<p>Hello</p>
</body>
</html>
Select the p tag with at_css, then call name on the resulting node.
sample.rb
require 'nokogiri'
html = File.read('hello.html') # read the sample file shown above
doc = Nokogiri::HTML.parse(html)
element = doc.at_css('p')
p element.name #=> 'p'
p element.parent.name #=> 'body'
You can also move to another node with parent and similar methods, and get its element name in the same way.
Normally you inspect the HTML structure first and then scrape it, so I don't think there is much demand for getting the element name from a node ^^;